Playing with CPU caches

2021-08-03 Thread planetis
because your component's already pretty compact. There is no cache thrashing, since you use every member field. Note that ECS is not about this low level of granularity, where you would split its direction into different components. Although that would make sense if you plan to do manual vectorizatio

2021-08-03 Thread yglukhov
@demotomohiro thanks, that nails it. So basically my AoS variant flushes 4 times more memory than actually needed. I guess it's safe to conclude that it's better not to interleave memory that is written with memory that is only read.

2021-08-03 Thread demotomohiro
In `testSOA`, only the content of seq `m` needs to be written to main memory. I think in `testAOS`, not only field `m` in object `S` but also fields `x`, `y`, `z` need to be written to main memory, because they are likely to sit in the same cache line as field `m`. According to this Wikipedia entry:
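
The write-back argument above can be made concrete with a little arithmetic. The following is only a sketch: the `float32` field types, the 64-byte cache line, the element count `N` and the helper `linesTouched` are all assumptions for illustration, not taken from the original benchmark.

```nim
# Sketch only: field types, N and the 64-byte cache line are assumptions,
# not taken from the original benchmark.
const
  CacheLine = 64   # assumed cache line size in bytes
  N = 1024         # hypothetical element count

type
  S = object       # AoS element: x, y, z are read, m is written
    x, y, z, m: float32

  SoA = object     # SoA layout: one seq per field
    x, y, z, m: seq[float32]

proc linesTouched(bytes: int): int =
  ## Number of cache lines needed to cover `bytes` contiguous bytes,
  ## assuming the data starts on a cache line boundary.
  (bytes + CacheLine - 1) div CacheLine

let soa = SoA(m: newSeq[float32](N))

# AoS: every write to `m` dirties the line that also holds x, y, z,
# so the lines of the whole array end up dirty.
let aosDirtyLines = linesTouched(N * sizeof(S))

# SoA: only the `m` seq is written, so only its lines become dirty.
let soaDirtyLines = linesTouched(soa.m.len * sizeof(float32))

echo "AoS dirty lines: ", aosDirtyLines
echo "SoA dirty lines: ", soaDirtyLines
echo "ratio: ", aosDirtyLines div soaDirtyLines
```

With four equally sized fields, the AoS layout dirties four times as many cache lines as the SoA layout under these assumptions.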

2021-08-03 Thread Stefan_Salewski
I think that your test is very special in that you really use all the fields of your object struct in your test. I would assume that for a larger object, with more fields that are not actually used in your test loops, the unused fields would pollute the data cache.
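
To illustrate the cache pollution point, here is a rough sketch that compares how much of each fetched cache line a loop would actually use. The `Big` object, its `unused` field, the `float32` types and the 64-byte cache line are hypothetical and chosen only to make the numbers easy to follow.

```nim
# Sketch only: both object layouts and the cache line size are assumptions.
const CacheLine = 64             # assumed cache line size in bytes

type
  Small = object
    x, y, z, m: float32          # 16 bytes, all used by the loop

  Big = object
    x, y, z, m: float32          # the 16 bytes the loop uses
    unused: array[12, float32]   # hypothetical fields the loop never touches

proc usefulFraction(usedBytes, objBytes: int): float =
  ## Share of each fetched object that the loop actually needs.
  usedBytes / objBytes

const UsedBytes = 4 * sizeof(float32)

echo "Small: ", CacheLine div sizeof(Small), " objects per line, useful fraction ",
  usefulFraction(UsedBytes, sizeof(Small))
echo "Big:   ", CacheLine div sizeof(Big), " objects per line, useful fraction ",
  usefulFraction(UsedBytes, sizeof(Big))
```

Under these assumptions the compact object uses every byte it pulls into the cache, while the larger one uses only a quarter of each line.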

2021-08-03 Thread mratsim
In general, SoA is more cache-friendly than AoS and lends itself better to SIMD optimizations. This is why in video processing, the preferred format is YUV or YV12 instead of RGB. Now regarding your code, hardware prefetchers can follow up to 12 forward streams. It's very easy to catch the pattern

2021-08-03 Thread yglukhov
While migrating our [game engine](https://github.com/yglukhov/rod/) to ECS architecture we set out to validate some simple ideas about CPU caches, and were a bit surprised. I'm hoping someone can explain what's happening here.

    import random, std/monotimes
    const MAX = 9000