Re: C++ / Why Iterators Got It All Wrong

jmh530 via Digitalmars-d Wed, 06 Sep 2017 14:46:19 -0700

On Wednesday, 6 September 2017 at 20:24:05 UTC, Enamex wrote:

On Sunday, 3 September 2017 at 09:24:03 UTC, Ilya Yaroshenko wrote:
1. Contiguous tensors. Their data is located contiguously in memory. Single dense memory chunk. All strides between sub-tensors can be computed from lengths.
2. Canonical tensors. Only data for one dimension is dense; other dimensions have strides that cannot be computed from lengths. BLAS matrices are canonical tensors: they have two lengths and one stride.
3. Universal tensors. Each dimension has a stride. NumPy ndarrays are universal tensors.
Can you elaborate?

IMO, it's something that still needs to be explained better in the documentation. I haven't tried to because I'm not 100% on it.


Below is the best I have been able to figure things out:

All Slices in mir can have an iterator, lengths, and strides.

The lengths are always N-dimensional and contain information on the shape of the Slice. So for instance, if the lengths are [3, 4], then N=2 and it is a 2-dimensional slice, with 3 rows and 4 columns.

I left out packs... which are an added complication. Packs can be used to make slices of slices. For the above Slice, the default would simply be that the packs are [2] (the sum of packs must equal N), which means that there is no slice-of-slices going on. If the packs were now [1, 1], then that is like saying you now have a slice of slices. In this case, slice[0] would be a row instead of a scalar. This is what allows you to iterate by row or by column.

So when you're thinking about contiguous, canonical, and universal, the way that lengths and packs work is the same for all of them. The difference is in the strides. Contiguous slices don't have a strides vector. Canonical slices have a strides vector with a length of N-1. Universal slices have a strides vector of length N.
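To make the three cases concrete, here is a rough sketch in plain Python. It is not mir's API: `row_major_strides` and `stored_strides` are names I made up for illustration, and I'm assuming a dense row-major layout with strides counted in elements.

```python
def row_major_strides(lengths):
    """Strides (in elements) of a dense row-major block with these lengths."""
    strides = []
    step = 1
    for n in reversed(lengths):
        strides.append(step)
        step *= n
    return strides[::-1]


def stored_strides(lengths, kind):
    """Hypothetical model of which strides each kind keeps around."""
    full = row_major_strides(lengths)
    if kind == "contiguous":
        return []          # nothing stored; everything follows from lengths
    if kind == "canonical":
        return full[:-1]   # N-1 stored; the last dimension is dense
    if kind == "universal":
        return full        # one stride per dimension
    raise ValueError("unknown kind: " + kind)


lengths = [3, 4]  # the 3-row, 4-column slice from above
for kind in ("contiguous", "canonical", "universal"):
    print(kind, stored_strides(lengths, kind))
```

For a dense block all three describe the same memory. The difference only starts to matter when the strides are no longer the ones you could compute from the lengths.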

So how are the strides used and why do they matter? I'm not sure I grok this part 100%, so I'll do my best. Strides tell you how far apart, in elements, consecutive entries along each dimension are. So for instance, if my slice (call it a) has lengths [2, 3, 4] with strides [12, 4, 1], then a[0] is a [3, 4] matrix, which is where the 12 comes from (3 * 4 = 12). To move the pointer to the start of the next [3, 4] matrix (a[1]) requires moving forward by 12 elements of whatever the type is. This would be a universal slice because it has N=3 lengths and N=3 strides. So if you call a._strides, then it would give you [12, 4, 1]. If a were canonical, calling _strides would give you [12, 4] because _strides for canonical slices has length N-1. Now if a were contiguous instead of universal and you call _strides on it, then it would give you [], because contiguous slices have no strides.
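The arithmetic can be checked with a small plain-Python sketch. Again, this is not mir code: `offset` is a hypothetical helper, and strides are counted in elements as in the example above.

```python
def offset(index, strides):
    """Distance, in elements, from the slice's first element to `index`."""
    return sum(i * s for i, s in zip(index, strides))


strides = [12, 4, 1]  # universal strides for lengths [2, 3, 4]

print(offset((1, 0, 0), strides))  # start of a[1]: 12
print(offset((0, 1, 0), strides))  # start of a[0][1]: 4
print(offset((1, 2, 3), strides))  # last element: 12 + 8 + 3 = 23
```

The last element lands at offset 23, which is what you'd expect for a dense block of 2 * 3 * 4 = 24 elements.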

The memory footprint of the slice appears different for these, with a and a[0] of universal being larger than canonical and contiguous. This largely reflects the storage of the strides data.

Similarly, a[0] has _strides [4, 1] for universal, [4] for canonical, and [] for contiguous. Mir is written in such a way that a[0] behaves the same regardless of the SliceKind. For the most part, this means that it isn't really obvious that there is a difference between them. It matters in some underlying functions, but I haven't needed to do much other than sometimes convert a contiguous slice to universal (though it's not always clear to me why, I just do it).
