On Friday, 4 December 2020 at 14:48:32 UTC, jmh530 wrote:

It looks like all the `sweep_XXX` functions are only defined for contiguous slices, as that would be the default if you define a `Slice!(T, N)`.

How the functions access the data is a big difference. If you compare the `sweep_field` version with the `sweep_naive` version, the `sweep_field` function is able to access through one index, whereas the `sweep_naive` function has to use two in the 2d version and 3 in the 3d version.

Also, the main difference in the NDSlice version is that it uses *built-in* MIR functionality, like how `sweep_ndslice` uses the `each` function from MIR, whereas `sweep_field` uses a for loop. I think this is partially to show that the built-in MIR functionality is as fast as if you tried to do it with a for loop yourself.

I see. Looking at some of the code, the field case is literally doing the indexing calculation right there. I guess ndslice is doing the same thing, just with "Mir magic" in the background? Still, ndslice gets a consistently higher rate of flops than the field case, which is interesting. One thing I've found with these kinds of plots is that introducing a log scale or two, particularly for timing comparisons, can make the differences between methods that look close much clearer. A log plot might show a consistent difference between the timings of ndslice and the field case. Underneath they should be doing essentially the same thing, so teasing out what causes the difference would be interesting. Is Mir doing some more efficient form of the indexing calculation than the bare field calculation?
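To make the one-index versus two-index distinction concrete, here is a minimal sketch in plain D. It is not the benchmark's code: the function names, the four-neighbour sum, and the row-major layout are all assumptions chosen for illustration.

```d
// Hypothetical illustration only; not taken from the sweep_XXX benchmark.

// "Naive" style: array of arrays, two indices per element access.
double sumNeighboursNested(const double[][] u, size_t i, size_t j)
{
    return u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1];
}

// "Field" style: one contiguous row-major buffer, index arithmetic done by hand.
double sumNeighboursFlat(const double[] u, size_t cols, size_t i, size_t j)
{
    immutable k = i * cols + j; // flat index of element (i, j)
    return u[k - cols] + u[k + cols] + u[k - 1] + u[k + 1];
}

unittest
{
    enum size_t cols = 3;
    double[] flat = [0.0, 1, 2, 3, 4, 5, 6, 7, 8];
    double[][] nested = [[0.0, 1, 2], [3.0, 4, 5], [6.0, 7, 8]];

    // Both layouts see the same four neighbours of the centre element.
    assert(sumNeighboursNested(nested, 1, 1) == sumNeighboursFlat(flat, cols, 1, 1));
}
```

In the flat version the neighbours are reached by adding or subtracting a fixed offset from a single index, while the nested version goes through a row lookup first and then a column lookup.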

I'm still not sure why slice is so slow. Doesn't that rely completely on the `opSlice` implementations, that is, on the choice of indexing method and underlying data structure? Isn't it just a symbolic interface behind which you can write whatever you want?
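As a rough sketch of what I mean by "write whatever you want": in D the index operators are ordinary member functions, so the author of a type picks the mapping from indices to storage. The `Grid` type and `makeGrid` helper below are hypothetical and say nothing about how Mir implements its slices.

```d
// Hypothetical 2-D view over a flat buffer; names are illustrative, not Mir's API.
struct Grid
{
    double[] data;     // contiguous row-major storage
    size_t rows, cols;

    // grid[i, j]: the type's author decides how (i, j) maps to storage.
    ref double opIndex(size_t i, size_t j)
    {
        return data[i * cols + j];
    }

    // grid[i]: a view of row i that shares storage with `data` (no copy).
    double[] opIndex(size_t i)
    {
        return data[i * cols .. (i + 1) * cols];
    }
}

// Helper (also hypothetical) that allocates a zero-filled grid.
Grid makeGrid(size_t rows, size_t cols)
{
    auto data = new double[rows * cols];
    data[] = 0.0; // double defaults to NaN in D, so zero it explicitly
    return Grid(data, rows, cols);
}

unittest
{
    auto g = makeGrid(2, 3);
    g[1, 2] = 5.0;            // writes through the ref returned by opIndex
    assert(g.data[5] == 5.0); // 1 * 3 + 2 == 5
    assert(g[1] == [0.0, 0.0, 5.0]);
}
```

If the operators behind the slice version do extra work per access (bounds checks, constructing intermediate views, and so on), that cost would show up in the timings even though the call site looks the same.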
