3DVelocity would like to
thank Intel Corp
and especially Mathias Raeck and Graham Palmer for their help
and courtesy in providing this motherboard for review.

Architecture details -
As this isn't strictly
a Pentium 4 review, but rather a look at the i845 chipset,
I'm not going to bog you down too much in the intricacies
of the Pentium 4's architecture, but to understand the importance
of this chipset we need to understand some basics.
| Features | Benefits |
| Processor Core Speeds Up to 2 GHz | Maximum performance for a wide range of emerging Internet, PC and workstation applications |
| Intel® NetBurst™ Microarchitecture, including: 400-MHz System Bus | High bandwidth between the processor and the rest of the system improves throughput and performance |
| 256-KB L2 Advanced Transfer Cache | Enhances performance by providing fast access to heavily used data and instructions |
| Hyper-Pipelined Technology | Extended pipeline stages significantly increase overall throughput |
| Streaming SIMD Extensions 2 | 144 new instructions accelerate operation across a broad range of demanding applications |
| Rapid Execution Engine | Arithmetic Logic Units run at twice the core frequency, speeding execution in this performance critical area |
| 128-Bit Floating Point Port | Floating Point performance boost provides enhanced 3D visualization and scientific calculation |
| SIMD 128-bit Integer | Accelerates video, speech, encryption and imaging/photo processing |
| Execution Trace Cache | Greatly improves instruction cache efficiency, maximizing performance on frequently used sections of software code |
| Advanced Dynamic Execution | Improved branch prediction enhances performance for all 32-bit applications by optimizing instruction sequences |
Intel christened its new
architecture for the P4 "NetBurst", and though I've
not read an official reason for the name, it certainly
has nothing to do with the Internet; at least, the P4 doesn't
seem to be optimised for the Internet other than for video
streaming. It was suggested that the name was chosen to reflect
the way the 'Net was seen by many as new and trendy, but I
don't think any of us see the 'Net as new and trendy in a
particularly high tech way. Perhaps it refers to the P4's ability
to transfer or "burst" data at speed through its
own micro network. Either way, the name isn't really what
matters; it's the technology behind it we want to know about.
Bandwidth -
The Pentium 4 works on
a "quad pumped" system bus. That is, although
the bus clock runs at 100MHz, data is transferred four
times per clock cycle, giving an effective rate of 400MHz.
The upshot of this is that the P4 has a full 3.2GB/Second
of bus bandwidth, comfortably eclipsing the Athlon's
maximum of 2.1GB/Second. This is only half the story however,
as it's pointless having 3.2GB/S of bus bandwidth on tap
if your memory starts choking at less than that figure. Controversial
though it has been, Intel's decision to launch the P4 on the
i850 chipset with dual channel Rambus was actually the only
way to go, as only Rambus (RDRAM) has anywhere near enough
bandwidth to go the distance. In fact, rated at 3.2GB/S,
the dual channel Rambus configuration is the perfect match for the
P4's appetite. What turned people against the idea of RDRAM,
prior to its very public legal battles, was that it needed
to be fitted in pairs, and in identically matched pairs at
that. Given the price of RDRAM when the P4 launched in November
2000, it was never likely to be easy for Intel to sell the
idea of Rambus to a value conscious market.
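For anyone who wants to check the sums, here is a rough sketch of where those bandwidth figures come from. The bus widths and clocks used below (a 64-bit P4 system bus, a 64-bit double pumped 133MHz Athlon bus, and 16-bit PC800 RDRAM channels) are my assumptions rather than numbers quoted above, and the function name is purely illustrative.

```python
def peak_bandwidth_gb_s(clock_mhz, transfers_per_clock, bus_width_bits, channels=1):
    """Peak theoretical bandwidth in GB/s, taking 1 GB as 10**9 bytes."""
    bytes_per_transfer = bus_width_bits / 8
    transfers_per_second = clock_mhz * 1e6 * transfers_per_clock
    return transfers_per_second * bytes_per_transfer * channels / 1e9


# Assumed: 64-bit bus, 100MHz clock, four transfers per clock ("quad pumped").
p4_bus = peak_bandwidth_gb_s(100, 4, 64)

# Assumed: 64-bit bus, 133MHz clock, two transfers per clock.
athlon_bus = peak_bandwidth_gb_s(133, 2, 64)

# Assumed: two 16-bit PC800 RDRAM channels, 400MHz clock, two transfers per clock.
dual_rdram = peak_bandwidth_gb_s(400, 2, 16, channels=2)

print(f"P4 system bus:      {p4_bus:.1f} GB/s")
print(f"Athlon system bus:  {athlon_bus:.1f} GB/s")
print(f"Dual channel RDRAM: {dual_rdram:.1f} GB/s")
```

These are peak theoretical numbers only; real world throughput will always be lower.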
L3 Cache -
It's claimed Intel had
originally planned to strap MB of L3 cache to the P4, but
clearly that never happened, presumably because it would have
meant a move back to a cartridge design, not to mention the extra cost.
L2 Cache -
Sticking with its PIII
naming convention, the P4's L2 cache, or "Advanced
Transfer Cache", remains at 256KB. However, additional
enhancements were introduced to help power through the data,
including 128 byte cache lines and a 256 bit data bus to the
core. The bandwidth on offer is awesome: 48GB/Second
for the 1.5GHz core.
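The 48GB/Second figure falls straight out of the bus width and the core clock. A minimal sketch, assuming the 256 bit bus moves one full transfer on every core clock (the function name is just for illustration):

```python
def l2_bandwidth_gb_s(core_clock_ghz, bus_width_bits=256):
    """Peak L2-to-core bandwidth in GB/s, assuming one transfer per core clock."""
    bytes_per_clock = bus_width_bits / 8  # 256 bits = 32 bytes
    return core_clock_ghz * bytes_per_clock


print(l2_bandwidth_gb_s(1.5))  # 1.5GHz core
```

Because the cache runs in step with the core, the same sum scales linearly with clock speed.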
L1 Cache -
I don't think it's any
great secret that the original plans for the P4 had to be
"stripped down" in order to meet its die size limitations.
Several ambitious features were either dropped or rethought,
and one of those was the L1 cache. Originally planned to consist
of a 16KB data cache and a 12,000 micro-op execution trace
cache (instruction cache), the final silicon had a mere 8KB
data cache, though so far as I'm aware the 12,000 micro-op
execution trace cache remained intact. Why the fancy name?
Well, in a nutshell, the execution trace cache is very closely
tied in with the core's decoders and handles only micro-ops.
These are chunks of code that, unlike x86 instructions, need
no prior decoding. With a latency of only 2 clocks, an
excellent branch prediction unit and a clever compression
algorithm in place, this seemingly tiny amount of cache does
a fine job of keeping the rather long pipeline fed.
Hyper Pipeline -
To enable it to push processor
speeds beyond the 1GHz ceiling encountered using its older
P6 architecture, Intel raised the number of pipeline stages from 10
in the PIII to 20 in the P4. Occasionally, however, the contents of
that pipeline will need to be flushed, typically after a
mispredicted branch, and the more stages there are, the more
work gets flushed (up to 126 instructions
in fact) and the longer it takes to refill. Use of the execution
trace cache aims to keep these occurrences to a minimum, but
when they do occur the delays can be significant, in processor
terms at least.
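To get a feel for why a longer pipeline makes flushes hurt more, here is a very simplified textbook style model of average cycles per instruction. Every number in it is hypothetical and chosen only for illustration, and treating the flush penalty as equal to the pipeline depth is a crude assumption of mine, not a measured figure for either chip.

```python
def effective_cpi(base_cpi, branch_fraction, mispredict_rate, flush_penalty_cycles):
    """Average cycles per instruction once misprediction flushes are included."""
    return base_cpi + branch_fraction * mispredict_rate * flush_penalty_cycles


# Hypothetical workload: 20% of instructions are branches, 5% of those mispredict.
BRANCH_FRACTION = 0.20
MISPREDICT_RATE = 0.05

# Crude assumption: flush penalty roughly equals the number of pipeline stages.
cpi_10_stage = effective_cpi(1.0, BRANCH_FRACTION, MISPREDICT_RATE, 10)
cpi_20_stage = effective_cpi(1.0, BRANCH_FRACTION, MISPREDICT_RATE, 20)

print(f"10 stage pipeline: {cpi_10_stage:.2f} cycles per instruction")
print(f"20 stage pipeline: {cpi_20_stage:.2f} cycles per instruction")
```

In this toy model, doubling the pipeline depth doubles the cost of every misprediction, which is why the longer design has to lean on higher clock speeds and better branch prediction to come out ahead.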
SSE2 -
SSE2 adds 144 new instructions
to the original SSE set. Some will claim that SSE2 is nothing
more than SSE should originally have been, but regardless
of that its power and flexibility are now huge. This is probably
just as well, because one of the other parts of the P4 architecture
to hit the operating theatre floor when it underwent its fat
reduction operation was one of the two floating point units.
This is why we often see such average FPU performance when
running code that can't compensate for this deficit by using
SSE2.
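As a rough illustration of what "packed" SIMD work means, the sketch below shows how many values fit in one 128-bit register and mimics a single packed add in plain Python. It is only a model of the idea and makes no attempt to use real SSE2 instructions; the names are hypothetical.

```python
XMM_BITS = 128  # width of one SSE/SSE2 register


def lanes(element_bits):
    """Number of elements of the given width that fit in one 128-bit register."""
    if XMM_BITS % element_bits != 0:
        raise ValueError("element width must divide 128 evenly")
    return XMM_BITS // element_bits


def packed_add(a, b):
    """Model of one packed add: every lane is summed by a single instruction."""
    if len(a) != len(b):
        raise ValueError("both operands must have the same number of lanes")
    return [x + y for x, y in zip(a, b)]


print(lanes(64))  # double precision floats per register
print(lanes(32))  # single precision floats or 32-bit integers per register
print(lanes(8))   # bytes per register
print(packed_add([1.0, 2.0], [3.0, 4.0]))  # two doubles added in one step
```

Code that can be arranged this way gets several results per instruction, which is how SSE2 aware software can make up for the weaker traditional FPU.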
The Rapid Execution Engine -
At the heart of the rapid
execution engine lie two double pumped ALUs and two double
pumped AGUs. These operate at twice the core frequency but
are only able to cope with simple micro-ops. More complex instructions
need to be channeled through the single slow ALU, and this
actually accounts for the vast majority of data handled.