Clearly in the future we will have more cores to work with. The Intel design is a very different approach to the problem. It will be interesting to see what kind of performance they get out of it once they have real silicon to demo.
The thing that impressed me most about the CUDA architecture is how well it hides memory latency. Rather than relying on a very deep pipeline, a monster cache, and complex branch-prediction logic, it simply keeps many threads ready to run and switches between them with zero overhead, since each thread holds its own slice of a large register file. This works well for problems with lots of data that can be operated on independently. The future will only bring more options. I would like to see Nvidia and ATI agree on a general-purpose computing API so that code can be written once and run on both brands.

--- In amibroker@yahoogroups.com, "rhoemke" <[EMAIL PROTECTED]> wrote:
>
> Maybe it all becomes easier with this here?
>
> http://www.intel.com/pressroom/archive/releases/20080804fact.htm?iid=pr1_releasepri_20080804fact
>
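P.S. As a concrete illustration of the data-parallel model I mean, here is a minimal CUDA sketch (the kernel name and sizes are just my own example, not anything from the press release). Each thread handles one array element; because the launch puts far more threads in flight than there are cores, the hardware can switch to ready warps while other warps are stalled on global-memory loads, which is the latency hiding I was describing.

```cuda
#include <cstdio>

// Each thread scales one element. With thousands of threads resident on
// an SM, the scheduler runs whichever warps are ready while others wait
// on global-memory loads -- no deep pipeline or big cache required.
__global__ void scale(const float *in, float *out, float k, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)                // guard the tail when n is not a multiple of blockDim
        out[i] = k * in[i];
}

int main()
{
    const int n = 1 << 20;    // one million elements
    float *in, *out;
    cudaMalloc(&in,  n * sizeof(float));
    cudaMalloc(&out, n * sizeof(float));

    // Launch far more threads than physical cores: that oversubscription
    // is what gives the hardware spare warps to hide memory latency with.
    int block = 256;
    int grid  = (n + block - 1) / block;
    scale<<<grid, block>>>(in, out, 2.0f, n);
    cudaDeviceSynchronize();

    cudaFree(in);
    cudaFree(out);
    return 0;
}
```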