Interesting result.  It is reasonable to expect newer and future 
processors to have even larger caches; the E8400, for example, 
already has 6MB.

So, there may be an opportunity to get useful work out of the other 
cores during long optimization runs.  Maybe large "work units", such 
as symbol lists or entire optimization passes, to minimize sync 
overhead?
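As a rough sketch of that coarse-grained idea (the worker function and data here are hypothetical stand-ins, not AmiBroker's actual internals), handing each worker one whole symbol at a time means it synchronizes with the pool only twice per symbol, once to receive the data and once to return the result:

```python
from multiprocessing import Pool

def optimize_symbol(args):
    # Hypothetical stand-in for one full optimization pass over a symbol.
    symbol, prices = args
    best = max(prices) - min(prices)  # placeholder "result" of the pass
    return symbol, best

if __name__ == "__main__":
    # One large work unit per symbol: sync overhead is per symbol,
    # not per bar or per intermediate array.
    data = {"AAA": [10.0, 12.5, 11.0], "BBB": [5.0, 4.0, 6.5]}
    with Pool(processes=2) as pool:
        results = dict(pool.map(optimize_symbol, data.items()))
    print(results)
```

The same structure works with threads or processes; the point is only that the unit of work is large relative to the cost of handing it off.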

That also implies that user formula code could be optimized by 
grouping array accesses together, i.e. using a result right after it 
is created, so it can be read while it is still in cache, rather 
than waiting until later in the code.  I will have to try this...
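A sketch of that grouping idea in Python (illustrative only; the function names and data are made up, and AFL's array operations would look different): the first version builds a full intermediate array and walks it again later, while the second consumes each intermediate value as soon as it is produced, keeping the working set small.

```python
def smoothed_signal_separated(prices, n=3):
    # Locality-unfriendly style: build the entire intermediate array,
    # then revisit it later; on large inputs its start may already
    # have been evicted from cache by then.
    diffs = [b - a for a, b in zip(prices, prices[1:])]
    # ... imagine other unrelated work happening here ...
    return [sum(diffs[i:i + n]) / n for i in range(len(diffs) - n + 1)]

def smoothed_signal_fused(prices, n=3):
    # Locality-friendly style: each difference is used right after it
    # is created, so only a window of n values is live at any time.
    out = []
    window = []
    for a, b in zip(prices, prices[1:]):
        window.append(b - a)
        if len(window) == n:
            out.append(sum(window) / n)
            window.pop(0)
    return out
```

Both versions compute the same answer; only the access pattern differs, which is exactly the kind of restructuring that should matter once the arrays outgrow the cache.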



--- In [email protected], "Tomasz Janeczko" <[EMAIL PROTECTED]> 
wrote:
>
> Hello,
> 
> I just ran the same code on my relatively new notebook (Core 2 Duo 
> 2GHz (T7250)), and the loop takes less than 2ns per iteration (3x 
> speedup). So it looks like the data sits entirely inside the cache. 
> This Core 2 has 2MB of cache, which is 4 times more than on the 
> Athlon X2 I've got.
> 
> > If what you say is true, and one core alone fills the memory 
> > bandwidth, then there should be a net loss of performance while 
> > running two copies of ami.  
> 
> It depends on the complexity of the formula and the amount of data 
> per symbol you are using. As each array element is 4 bytes, to fill 
> 4MB of cache you would need 1 million array elements: 100 arrays of 
> 10,000 elements each, or 10 arrays of 100K elements each. Generally 
> speaking, people testing on EOD data, where 10 years is just 2600 
> bars, should see a speed-up. People using very, very long intraday 
> data sets may see degradation, but it should be barely noticeable.
> 
> Best regards,
> Tomasz Janeczko
> amibroker.com
