Hi all,

I've read some discussions about adding labeled axes, and even ticks, to numpy 
arrays  (such as in Luis' dataarray).

I have recently found that the ability to label axes would be very helpful to 
me, but I'd like to keep the implementation as lightweight as possible.  

I would find this useful because I am writing an ndarray subclass that loads 
image/volume file formats into numpy arrays.  Some of these files might contain 
multiple images/volumes (I'll call them channels), and may also have an 
additional dimension for vectors associated with each pixel/voxel, such as 
color.  The maximum number of dimensions for such an array would then be 5.

Example: data = ndarray([1023,128,128,128,3]) might mean (channels, z, y, x, rgb) 
for one array.  Now, I want to keep as much of numpy's fancy indexing as I can, 
but I am finding it difficult to track the removal of axes that indexing can 
cause.  For example, data[2,2] would return an array of shape (128,128,3), i.e. 
the third z-slice of the third volume in the dataset, but the returned array has 
lost the meaning associated with its axes, so saving it back out would require 
manually relabeling them.  I'd like to track the axis labels as metadata while 
retaining all of numpy's fancy indexing.
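
As a minimal illustration with a plain (unsubclassed) array, scaled down so it 
actually fits in memory, and with the labels just held in an ordinary list 
standing in for the metadata I'd like to attach:

import numpy as np

# scaled-down stand-in for a (channels, z, y, x, rgb) dataset
data = np.zeros((4, 8, 8, 8, 3))
labels = ['channels', 'z', 'y', 'x', 'rgb']

sub = data[2, 2]      # the two integer indices remove the first two axes
print(sub.shape)      # (8, 8, 3)
# nothing in 'sub' records that 'channels' and 'z' were the axes dropped,
# so the labels and the data are now out of sync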

There are two ways I could accomplish this with minimal code on the Python side:

 One would be if indexing the array could always return an array of the same 
dimensionality, i.e. data[2,2] would return an array of shape (1,1,128,128,3).  
I could then delete the labels of the degenerate axes from the metadata and 
return the squeezed array, giving the same result as normal indexing:

import numpy as np

class Data(np.ndarray):
    def __getitem__(self, indices):
        # 'donotcompress' is hypothetical: an option that would keep the
        # degenerate (length-1) axes instead of removing them
        data = np.ndarray.__getitem__(self, indices, donotcompress=True)
        # squeeze out the length-1 axes, as normal indexing would have done
        result = data.squeeze()
        # keep only the labels of the non-degenerate axes
        result.axeslabels = [label for label, dim in
                             zip(self.axeslabels, data.shape) if dim > 1]
        return result

    def __getslice__(self, start, stop):
        # trivial case: a plain slice never removes an axis
        pass
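
If no such "don't compress" option exists, I imagine the same effect could be 
faked by widening integer indices into length-1 slices before handing them to 
ndarray.__getitem__.  A rough, untested sketch of what I mean (it assumes 
axeslabels was set when the file was loaded, and it ignores negative indices, 
Ellipsis, np.newaxis, boolean masks and fancy indexing, as well as axes or 
slices that end up with length 1):

import numpy as np

class Data(np.ndarray):
    def __getitem__(self, indices):
        if not isinstance(indices, tuple):
            indices = (indices,)
        # widen each plain integer index into a length-1 slice so that the
        # base class never removes an axis
        widened = tuple(slice(i, i + 1) if isinstance(i, int) else i
                        for i in indices)
        data = np.ndarray.__getitem__(self, widened)
        # squeeze out the length-1 axes, as normal indexing would have done
        result = data.squeeze()
        # keep only the labels of the axes that survived
        result.axeslabels = [label for label, dim in
                             zip(self.axeslabels, data.shape) if dim > 1]
        return result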

Another approach would be if there were some function in the numpy internals 
that I could use to get the needed information before calling ndarray's 
__getitem__:

class Data(np.ndarray):
    def __getitem__(self, indices):
        # 'uniqueIndicesPerDimension' is hypothetical: something that reports
        # how many entries the index selects along each dimension (presumably
        # it would also need self.shape)
        unique = np.uniqueIndicesPerDimension(indices)
        data = np.ndarray.__getitem__(self, indices)
        data.axeslabels = [label for label, dim in
                           zip(self.axeslabels, unique) if dim > 1]
        return data
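
The closest I've come up with on my own is to let numpy's indexing machinery do 
the shape arithmetic for me on a zero-strided dummy array of the same shape, 
which costs essentially no memory for basic (int/slice) indexing.  This is just 
an untested half-idea, and the helper name is mine:

import numpy as np
from numpy.lib.stride_tricks import as_strided

def result_shape(shape, indices):
    # a dummy array of the requested shape in which every element aliases a
    # single buffer element, so creating and indexing it is essentially free
    dummy = as_strided(np.empty(1), shape=shape, strides=(0,) * len(shape))
    return dummy[indices].shape

result_shape((1023, 128, 128, 128, 3), (2, 2))         # -> (128, 128, 3)
result_shape((1023, 128, 128, 128, 3), np.s_[0:5, 7])  # -> (5, 128, 128, 3)

It only gives the shape of the result, though, not which of the original axes 
were removed, so by itself it doesn't solve the relabeling problem.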

Finally, I could implement my own parser for the passed indices to figure this 
out myself.  This would be bad, since I'd have to recreate a lot of the code 
that must already exist inside numpy, and it would be slower, more error-prone, 
etc.:

class Data(np.ndarray):
    def __getitem__(self, indices):
        # work out how many entries the index selects along each dimension
        counts = self.uniqueDimensionIndices(indices)
        data = np.ndarray.__getitem__(self, indices)
        data.axeslabels = [label for label, dim in
                           zip(self.axeslabels, counts) if dim > 1]
        return data

    def uniqueDimensionIndices(self, indices):
        if isinstance(indices, int):
            indices = (indices,)
        if isinstance(indices, tuple):
            ....
        elif isinstance(indices, list):
            ...
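
For what it's worth, even the easy part of that parser, handling only plain 
ints and slices, already hints at how much of numpy's logic (Ellipsis, 
np.newaxis, boolean masks, fancy indexing with arrays, ...) I would end up 
duplicating.  A sketch of that easy part:

class Data(np.ndarray):
    def uniqueDimensionIndices(self, indices):
        # only plain ints and slices are handled here
        if not isinstance(indices, tuple):
            indices = (indices,)
        # pad with full slices for any trailing, unindexed dimensions
        indices = indices + (slice(None),) * (self.ndim - len(indices))
        counts = []
        for idx, dim in zip(indices, self.shape):
            if isinstance(idx, int):
                counts.append(1)  # an integer index removes the axis
            else:
                # number of entries a slice selects along this axis
                counts.append(len(range(*idx.indices(dim))))
        return tuple(counts)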


Is there anything in the numpy internals already that would allow me to do #1 
or #2?  I don't think #3 is a very good option.

Thanks!

        


