Krayzie_D_187
06-29-2004, 02:40 PM
Xenon Hardware Overview
By Pete Isensee, Development Lead, Xbox Advanced Technology Group
This documentation
is an early release of the final documentation, which may be changed substantially prior to final
commercial release, and is confidential and proprietary information of MS Corporation. It is
disclosed pursuant to a nondisclosure agreement between the recipient and MS.
“Xenon” is the code name for the successor to the Xbox® game console
from MS. Xenon is expected to launch in 2005. This white paper is designed to provide a brief
overview of the primary hardware features of the console from a game developer’s
standpoint.
Caveats
In some cases, sizes, speeds, and other details of the Xenon
console have not been finalized. Values not yet finalized are identified with a
“+” sign, indicating that the numbers may be larger than indicated here. At the
time of this writing, the final console is many months from entering production. Based on our
experience with Xbox, it’s likely that some of this information will change slightly for the
final console.
For additional information on various hardware components, see the
other relevant white papers.
Hardware Goals
Xenon was designed with the
following goals in mind:
•Focus on innovation in silicon, particularly features
that game developers need. Although all Xenon hardware components are technologically
advanced, the hardware engineering effort has concentrated on digital performance in the
CPU and GPU.
•Maximize general purpose processing performance rather than
fixed-function hardware. This focus on general purpose processing puts the power into the
Xenon software libraries and tools. Rather than being hamstrung by particular hardware
designs, software libraries can support the latest and most efficient techniques.
•Eliminate the performance issues of the past. On Xbox, the primary
bottlenecks were memory and CPU bandwidth. Xenon does not have these limitations.
Basic Hardware Specifications
Xenon is powered by a 3.5+ GHz IBM
PowerPC processor and a 500+ MHz ATI graphics processor. Xenon has 256+ MB of unified
memory. Xenon runs a custom operating system based on MS® Windows NT®, similar to the
Xbox operating system. The graphics interface is a superset of MS® Direct3D® version 9.0.
CPU
The Xenon CPU is a custom processor based on PowerPC technology. The
CPU includes three independent processors (cores) on a single die. Each core runs at 3.5+
GHz. The Xenon CPU can issue two instructions per clock cycle per core. At peak
performance, Xenon can issue 21 billion instructions per second.
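The peak figure follows directly from the numbers in this paragraph. A quick check in Python, assuming the non-final "3.5+ GHz" clock is exactly 3.5 GHz (the function and constant names are illustrative only):

```python
# Peak instruction issue rate, using the figures quoted above.
CORES = 3                  # independent cores on the die
ISSUE_PER_CLOCK = 2        # instructions issued per clock cycle per core
CLOCK_HZ = 3.5e9           # "3.5+ GHz", assumed here to be exactly 3.5 GHz


def peak_issue_rate(cores=CORES, issue=ISSUE_PER_CLOCK, clock_hz=CLOCK_HZ):
    """Return the peak number of instructions issued per second."""
    return cores * issue * clock_hz


if __name__ == "__main__":
    print(peak_issue_rate() / 1e9, "billion instructions/sec")
```

This is an issue-rate ceiling, not a prediction of sustained throughput.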
The Xenon CPU was
designed by IBM in close consultation with the Xbox team, leading to a number of revolutionary
additions, including a dot product instruction for extremely fast vector math and custom security
features built directly into the silicon to prevent piracy and hacking.
Each core has two
symmetric hardware threads (simultaneous multithreading, SMT), for a total of six hardware threads available to games. Not
only does the Xenon CPU include the standard set of PowerPC integer and floating-point
registers (one set per hardware thread), the Xenon CPU also includes 128 vector (VMX)
registers per hardware thread. This astounding number of registers can drastically improve the
speed of common mathematical operations.
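One way to picture six hardware threads is as six workers that each take a slice of a job. The sketch below only illustrates the partitioning idea. It uses Python's standard thread pool, which is not the Xenon threading API, and the helper names split_range and parallel_sum_of_squares are hypothetical:

```python
from concurrent.futures import ThreadPoolExecutor

HARDWARE_THREADS = 3 * 2   # three cores, two hardware threads per core


def split_range(n_items, n_workers=HARDWARE_THREADS):
    """Split range(n_items) into n_workers contiguous (start, stop) chunks."""
    base, extra = divmod(n_items, n_workers)
    chunks, start = [], 0
    for i in range(n_workers):
        # The first `extra` workers take one additional item each.
        stop = start + base + (1 if i < extra else 0)
        chunks.append((start, stop))
        start = stop
    return chunks


def parallel_sum_of_squares(values):
    """Sum the squares of `values`, handing one chunk to each worker."""
    def work(bounds):
        lo, hi = bounds
        return sum(v * v for v in values[lo:hi])

    with ThreadPoolExecutor(max_workers=HARDWARE_THREADS) as pool:
        return sum(pool.map(work, split_range(len(values))))
```

Even chunking like this is the simplest scheme. A real engine would likely weigh chunk sizes by cost and account for two threads sharing one core.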
Each of the three cores includes a 32-KB
L1 instruction cache and a 32-KB L1 data cache. The three cores share a 1-MB L2 cache.
The L2 cache can be locked down in segments to improve performance. The L2 cache also
has the very unusual feature of being directly readable from the GPU, which allows the GPU to
consume geometry and texture data from L2 and main memory simultaneously.
Xenon CPU
instructions are exposed to games through compiler intrinsics, allowing developers to access
the power of the chip using C language notation.
GPU
The Xenon GPU is a
custom 500+ MHz graphics processor from ATI. The shader core has 48 Arithmetic Logic Units
(ALUs) that can execute 64 simultaneous threads on groups of 64 vertices or pixels. ALUs are
automatically and dynamically assigned to either pixel or vertex processing depending on load.
The ALUs can each perform one vector and one scalar operation per clock cycle, for a total of
96 shader operations per clock cycle. Texture loads can be done in parallel to ALU operations.
At peak performance, the GPU can issue 48 billion shader operations per second.
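The shader throughput numbers can be reproduced from the ALU count and clock rate. A small sketch, assuming the non-final "500+ MHz" clock is exactly 500 MHz (names are illustrative):

```python
# Shader throughput from the figures quoted above.
ALUS = 48
OPS_PER_ALU_PER_CLOCK = 2   # one vector operation plus one scalar operation
GPU_CLOCK_HZ = 500e6        # "500+ MHz", assumed here to be exactly 500 MHz


def shader_ops_per_clock():
    """Shader operations issued per GPU clock cycle."""
    return ALUS * OPS_PER_ALU_PER_CLOCK


def peak_shader_ops_per_second(clock_hz=GPU_CLOCK_HZ):
    """Peak shader operations issued per second."""
    return shader_ops_per_clock() * clock_hz
```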
The
GPU has a peak pixel fill rate of 4+ gigapixels/sec (16 gigasamples/sec with 4× antialiasing).
The peak vertex rate is 500+ million vertices/sec. The peak triangle rate is 500+ million
triangles/sec. The interesting point about all of these values is that they’re not just
theoretical—they are attainable with nontrivial shaders.
Xenon is designed for
high-definition output. Included directly on the GPU die is 10+ MB of fast embedded dynamic
RAM (EDRAM). A 720p frame buffer fits very nicely here. Larger frame buffers are also possible
because of hardware-accelerated partitioning and predicated rendering that has little cost
other than additional vertex processing. Along with the extremely fast EDRAM, the GPU also
includes hardware instructions for alpha blending, z-test, and antialiasing.
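To see why a 720p frame buffer fits and when partitioning becomes necessary, the sizes can be estimated as follows. This is a rough sketch under stated assumptions: 8 bytes per sample (color plus z, as quoted in the Memory and Bandwidth section), "10+ MB" taken as exactly 10 × 2^20 bytes, and multisampling assumed to multiply storage by the sample count. Real partition counts would also depend on alignment rules that this paper does not describe.

```python
import math

EDRAM_BYTES = 10 * 1024 * 1024   # "10+ MB", assumed to mean 10 * 2**20 bytes
BYTES_PER_SAMPLE = 8             # color plus z


def frame_buffer_bytes(width, height, samples=1):
    """Storage needed for a color + z frame buffer."""
    return width * height * BYTES_PER_SAMPLE * samples


def partitions_needed(width, height, samples=1):
    """Minimum number of EDRAM-sized partitions covering the frame buffer."""
    return math.ceil(frame_buffer_bytes(width, height, samples) / EDRAM_BYTES)


if __name__ == "__main__":
    for samples in (1, 2, 4):
        print(samples, "x:", partitions_needed(1280, 720, samples), "partition(s)")
```

Under these assumptions a plain 720p buffer needs about 7.4 million bytes and fits in one pass, while 720p with 4× antialiasing needs three partitions.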
The Xenon
graphics architecture is a unique design that implements a superset of Direct3D version 9.0. It
includes a number of important extensions, including additional compressed texture formats
and a flexible tessellation engine. Xenon not only supports high-level shading language (HLSL)
model 3.0 for vertex and pixel shaders but also includes advanced shader features well
beyond model 3.0. For instance, shaders use 32-bit IEEE floating-point math throughout.
Vertex shaders can fetch from textures, and pixel shaders can fetch from vertex streams.
Xenon shaders also have the unique ability to directly access main memory, allowing
techniques that have never before been possible.
As with Xbox, Xenon will support
precompiled push buffers (“command buffers” in Xenon terminology), but to a
much greater extent than the Xbox console does. The Xbox team is exposing and
documenting the command buffer format so that games are able to harness the GPU much
more effectively.
In addition to an extremely powerful GPU, Xenon also includes a very
high-quality resize filter. This filter allows consumers to choose whatever output mode they
desire. Xenon automatically scales the game’s output buffer to the consumer-chosen
resolution.
Memory and Bandwidth
Xenon has 256+ MB of unified memory, equally
accessible to both the GPU and CPU. The main memory controller resides on the GPU (the
same as in the Xbox architecture). It has 22.4+ GB/sec aggregate bandwidth to RAM,
distributed between reads and writes. Aggregate means that the bandwidth may be used for all
reading or all writing or any combination of the two. Translated into game performance, the
GPU can consume a 512×512×32-bpp texture in only 47 microseconds.
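The 47-microsecond figure can be checked by dividing the texture size by the aggregate bandwidth. The sketch assumes 1 GB/sec means 10^9 bytes per second and that the whole 22.4 GB/sec is available to the read (the function name is illustrative):

```python
def texture_read_time_us(width, height, bits_per_pixel,
                         bandwidth_bytes_per_sec=22.4e9):
    """Time in microseconds to stream a texture at the given bandwidth."""
    size_bytes = width * height * bits_per_pixel // 8
    return size_bytes / bandwidth_bytes_per_sec * 1e6


if __name__ == "__main__":
    print(round(texture_read_time_us(512, 512, 32), 1), "microseconds")
```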
The front side
bus (FSB) bandwidth peak is 10.8 GB/sec for reads and 10.8 GB/sec for writes, over 20 times
faster than for Xbox. Note that the 22.4+ GB/sec main memory bandwidth is shared between
the CPU and GPU. If, for example, the CPU is using 2 GB/sec for reading and 1 GB/sec for
writing on the FSB, the GPU has 19.4+ GB/sec available for accessing RAM.
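The sharing example above is simple subtraction, sketched here with the FSB peaks as a sanity check. Values are in GB/sec and the function name is hypothetical:

```python
MAIN_MEMORY_GB_PER_SEC = 22.4   # aggregate, shared by CPU and GPU
FSB_PEAK_GB_PER_SEC = 10.8      # per direction (reads and writes separately)


def gpu_bandwidth_available(cpu_read, cpu_write,
                            total=MAIN_MEMORY_GB_PER_SEC):
    """Main memory bandwidth left for the GPU, in GB/sec."""
    if cpu_read > FSB_PEAK_GB_PER_SEC or cpu_write > FSB_PEAK_GB_PER_SEC:
        raise ValueError("CPU traffic exceeds the FSB peak for one direction")
    return total - (cpu_read + cpu_write)
```

For example, gpu_bandwidth_available(2, 1) gives about 19.4, matching the figure in the text.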
Eight
pixels (where each pixel is color plus z = 8 bytes) can be sent to the EDRAM every GPU clock
cycle, for an EDRAM write bandwidth of 32 GB/sec. Each of these pixels can be expanded
through multisampling to 4 samples, for up to 32 multisampled pixel samples per clock cycle.
With alpha blending, z-test, and z-write enabled, this is equivalent to having 256 GB/sec of
effective bandwidth! The important thing is that frame buffer bandwidth will never slow down
the Xenon GPU.
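The EDRAM figures can be reconstructed as follows. The 32 GB/sec value comes straight from the text. For the 256 GB/sec value, this sketch assumes that blending and z-testing require each sample to be both read and written, which doubles the traffic. That factor of two is an assumption made to reconcile the numbers, not something the paper states.

```python
GPU_CLOCK_HZ = 500e6      # "500+ MHz", assumed here to be exactly 500 MHz
PIXELS_PER_CLOCK = 8
BYTES_PER_PIXEL = 8       # color plus z
SAMPLES_PER_PIXEL = 4     # 4x multisampling
READ_MODIFY_WRITE = 2     # assumption: each sample is read and then written


def edram_write_bandwidth():
    """Bytes per second sent from the GPU to the EDRAM."""
    return PIXELS_PER_CLOCK * BYTES_PER_PIXEL * GPU_CLOCK_HZ


def effective_edram_bandwidth():
    """Equivalent bytes per second with 4x samples, blend, z-test, z-write."""
    return edram_write_bandwidth() * SAMPLES_PER_PIXEL * READ_MODIFY_WRITE
```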
Audio
The Xenon CPU is a superb processor for audio,
particularly with its massive mathematical horsepower and vector register set. The Xenon CPU
can process and encode hundreds of audio channels with sophisticated per-voice and global
effects, all while using a fraction of the power of a single CPU core.
The Xenon system
south bridge also contains a key hardware component for audio—XMA decompression.
XMA is the native Xenon compressed audio format, based on the WMA Pro architecture. XMA
provides sound quality higher than ADPCM at even better compression ratios, typically
6:1–12:1. The south bridge contains a full silicon implementation of the XMA
decompression algorithm, including support for multichannel XMA sources. XMA is processed
by the south bridge into standard PCM format in RAM. All other sound processing (sample rate
conversion, filtering, effects, mixing, and multispeaker encoding) happens on the Xenon CPU.
The lowest-level Xenon audio software layer is XAudio, a new API designed for optimal
digital signal processing. The Xbox Audio Creation Tool (XACT) API from Xbox is also
supported, along with new features such as conditional events, improved parameter control,
and a more flexible 3D audio model.
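To put the XMA compression ratios in perspective, here is a small size estimate. The source format (48 kHz, 16-bit, stereo PCM) is an assumption chosen for illustration, and the function names are hypothetical:

```python
def pcm_bytes(seconds, sample_rate_hz=48000, channels=2, bytes_per_sample=2):
    """Size of uncompressed PCM audio in bytes."""
    return seconds * sample_rate_hz * channels * bytes_per_sample


def xma_size_range(pcm_size_bytes, low_ratio=6, high_ratio=12):
    """Return (smallest, largest) compressed size for the 6:1-12:1 range."""
    return pcm_size_bytes / high_ratio, pcm_size_bytes / low_ratio


if __name__ == "__main__":
    smallest, largest = xma_size_range(pcm_bytes(60))
    print(smallest, largest)
```

Under these assumptions, one minute of audio occupies 11,520,000 bytes as PCM and roughly 0.96 to 1.92 million bytes as XMA.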
Input/Output
As with Xbox, Xenon is
designed to be a multiplayer console. It has built-in networking support including an Ethernet
10/100-BaseT port. It supports up to four controllers. From an audio/video standpoint, Xenon
will support all the same formats as Xbox, including multiple high-definition formats up through
1080i, plus VGA output.
To provide greater flexibility and support a wider
variety of attached devices, the Xenon console includes standard USB 2.0 ports. This feature
allows the console to potentially host storage devices, cameras, microphones, and other
devices.
Storage
The Xenon console is designed around a larger world view of
storage than Xbox was. Games will have access to a variety of storage devices, including
connected devices (memory units, USB storage) and remote devices (networked PCs, Xbox
Live™). At the time of this writing, the decision to include a built-in hard disk in every
Xenon console has not been made. If a hard disk is not included in every console, it will
certainly be available as an integrated add-on component.
Xenon supports up to two
attached memory units (MUs). MUs are connected directly to the console, not to controllers as
on Xbox. The initial size of the MUs is 64 MB, although larger MUs may be available in the
future. MU throughput is expected to be around 8 MB/sec for reads and 1 MB/sec for writes.
The Xenon game disc drive is a 12× DVD, with an expected outer edge throughput of
16+ MB/sec. Latency is expected to be in the neighborhood of 100 ms. The media format will
be similar to Xbox, with approximately 6 GB of usable space on the disc. As on Xbox, media will
be stored on a single side in two 3 GB layers.
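The throughput and latency estimates above translate into load and save times with a simple model: latency plus size divided by throughput. This is a rough sketch that treats throughput as constant. Since the DVD figure is an outer-edge value, reads from elsewhere on the disc would likely be slower. The function name is illustrative.

```python
def transfer_seconds(size_mb, throughput_mb_per_sec, latency_ms=0.0):
    """Estimated time for one transfer: latency plus streaming time."""
    return latency_ms / 1000.0 + size_mb / throughput_mb_per_sec


if __name__ == "__main__":
    # Reading and writing a full 64 MB memory unit.
    print("MU read :", transfer_seconds(64, 8), "s")
    print("MU write:", transfer_seconds(64, 1), "s")
    # One seek plus a 16 MB read at the DVD outer edge.
    print("DVD read:", transfer_seconds(16, 16, latency_ms=100), "s")
```

The write estimate for a full memory unit (about a minute) suggests keeping save data small.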
Industrial Design
The Xenon
industrial design process is well under way, but the final look of the box has not been
determined. The Xenon console will be smaller than the Xbox console.
The standard
Xenon controller will have a look and feel similar to the Xbox controller. The primary changes
are the removal of the Black and White buttons and the addition of shoulder buttons. The
triggers, thumbsticks, D-pad, and primary buttons are essentially unchanged. The controller will
support vibration.
Xenon Development Kit
The Xenon development environment
follows the same model as for Xbox. Game development occurs on the PC. The resulting
executable image is loaded by the Xenon development kit and remotely debugged on the PC.
MS® Visual Studio® version 7.1 continues as the development environment for Xenon.
The Xenon compiler is based on a custom PowerPC back end and the latest MS®
Visual C++® front end. The back end uses technology developed at MS for Windows NT on
PowerPC. The Xenon software group includes a dedicated team of compiler engineers
updating the compiler to support Xenon-specific CPU extensions. This team is also heavily
focused on optimization work.
The Xenon development kit will include accurate DVD
emulation technology to allow developers to gauge precisely the effects of the retail
console disc drive.
Miscellaneous Xenon Hardware Notes
Some additional
notes:
•Xenon is a big-endian system. Both the CPU and GPU process memory in
big-endian mode. Games ported from little-endian systems such as the Xbox or PC need to
account for this in their game asset pipeline.
•Tapping into the power of the
CPU is a daunting task. Writing multithreaded game engines is not trivial. Xenon system
software is designed to take advantage of this processing power wherever possible. The Xbox
Advanced Technology Group (ATG) is also exploring a variety of techniques for offloading
graphics work to the CPU.
•People often ask if Xenon can be backward
compatible with Xbox. Although the architecture of the two consoles is quite different, Xenon
has the processing power to emulate Xbox. Whether Xenon will be backward compatible
involves a variety of factors, not the least of which is the massive development and testing
effort required to allow Xbox games to run on Xenon.
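As an illustration of the big-endian note above, an asset pipeline running on a little-endian PC could convert data at build time. The following is a minimal sketch using Python's standard struct module. The function names are hypothetical and do not come from any Xenon tool.

```python
import struct


def swap32(value):
    """Reverse the byte order of an unsigned 32-bit integer."""
    return struct.unpack(">I", struct.pack("<I", value))[0]


def floats_to_big_endian(values):
    """Pack a sequence of floats as big-endian 32-bit IEEE values."""
    return struct.pack(">%df" % len(values), *values)


if __name__ == "__main__":
    print(hex(swap32(0x11223344)))
    print(floats_to_big_endian([1.0]).hex())
```

Converting in the pipeline rather than at load time keeps byte swapping out of the game's runtime path.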