Hi,

This looks useful. What you said about __array__ makes sense, but I didn't see it in the code you linked. Do you know when the Python netCDF4 library will support the numpy array interface directly? I searched for a roadmap but didn't find one. It may be best for me to proceed with a slightly clumsy interface for now and wait until the array interface comes built in.
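A rough sketch of what such an interim interface could look like: the wrapper from earlier in the thread with an __array__ method added, so that np.asarray and np.mean work without explicit indexing. This is only an illustration under assumptions, not what netCDF4 or xray implement. `ComplexView` and `fake_var` are made-up names, and a structured numpy array stands in for a real netCDF4.Variable with a compound (real, imag) type.

```python
import numpy as np


class ComplexView:
    """Wrap a variable whose elements are (real, imag) float64 pairs."""

    def __init__(self, var):
        # `var` is assumed to return a structured ndarray when indexed,
        # as a netCDF4.Variable with a compound type is expected to do.
        self.var = var

    def __getitem__(self, item):
        # Read only the requested slice, then reinterpret it as complex128.
        return self.var[item].view(np.complex128)

    def __array__(self, dtype=None, copy=None):
        # Used by np.asarray, np.mean, etc. Note: this reads the whole variable.
        arr = self[:]
        return arr if dtype is None else arr.astype(dtype)


# Stand-in for the on-disk variable: a structured array with two float64 fields.
pair = np.dtype([("real", "f8"), ("imag", "f8")])
fake_var = np.array([(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)], dtype=pair)

cplx_data = ComplexView(fake_var)
first_two = cplx_data[:2]      # converts only the requested chunk
mean_all = np.mean(cplx_data)  # goes through __array__
```

Slicing still reads only the requested chunk, while anything that calls np.asarray on the wrapper pulls in the full array, so the explicit [:] is no longer needed for whole-array operations.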
Thanks,
Glenn

On Mar 30, 2014 2:18 AM, "Stephan Hoyer" <sho...@gmail.com> wrote:
> Hi Glenn,
>
> Here is a full example of how we wrap a netCDF4.Variable object,
> implementing all of its ndarray-like methods:
>
> https://github.com/akleeman/xray/blob/0c1a963be0542b7303dc875278f3b163a15429c5/src/xray/conventions.py#L91
>
> The __array__ method would be the most relevant one for you: it means that
> numpy knows how to convert the wrapper array into a numpy.ndarray when you
> call np.mean(cplx_data). More generally, any function that calls
> np.asarray(cplx_data) will properly convert the values, which should
> include most functions from well-written libraries (including numpy and
> scipy). netCDF4.Variable doesn't currently have such an __array__ method,
> but it will in the next released version of the library.
>
> The quick and dirty hack to make all numpy methods work (now going beyond
> what the netCDF4 library implements) would be to add something like the
> following:
>
>     def __getattr__(self, attr):
>         return getattr(np.asarray(self), attr)
>
> But this is a little dangerous, since some methods might silently fail or
> give unpredictable results (e.g., those that modify data). It would be
> safer to list the methods you want to implement explicitly, or to just
> liberally use np.asarray. The latter is generally a good practice when
> writing library code, anyways, to catch unusual ndarray subclasses like
> np.matrix.
>
> Stephan
>
>
> On Sat, Mar 29, 2014 at 8:42 PM, G Jones <glenn.calt...@gmail.com> wrote:
>
>> Hi Stephan,
>> Thanks for the reply. I was thinking of something along these lines but
>> was hesitant because while this provides clean access to chunks of the
>> data, you still have to remember to do cplx_data[:].mean() for example in
>> the case that you want cplx_data.mean().
>>
>> I was hoping to basically have all of the ndarray methods at hand without
>> any indexing, but then also being smart about taking advantage of the mmap
>> when possible. But perhaps your solution is the best compromise.
>>
>> Thanks again,
>> Glenn
>>
>> On Mar 29, 2014 10:59 PM, "Stephan Hoyer" <sho...@gmail.com> wrote:
>>
>>> Hi Glenn,
>>>
>>> My usual strategy for this sort of thing is to make a light-weight
>>> wrapper class which reads and converts values when you access them. For
>>> example:
>>>
>>> class WrapComplex(object):
>>>     def __init__(self, nc_var):
>>>         self.nc_var = nc_var
>>>
>>>     def __getitem__(self, item):
>>>         return self.nc_var[item].view('complex')
>>>
>>> nc = netCDF4.Dataset('my.nc')
>>> cplx_data = WrapComplex(nc.groups['mygroup'].variables['cplx_stuff'])
>>>
>>> Now you can index cplx_data (e.g., cplx_data[:10]) and only the values
>>> you need will be read from disk and converted on the fly.
>>>
>>> Hope this helps!
>>>
>>> Cheers,
>>> Stephan
>>>
>>>
>>> On Sat, Mar 29, 2014 at 6:13 PM, G Jones <glenn.calt...@gmail.com> wrote:
>>>
>>>> Hi,
>>>> I am using netCDF4 to store complex data using the recommended strategy
>>>> of creating a compound data type with the real and imaginary parts. This
>>>> all works well, but reading the data into a numpy array is a bit clumsy.
>>>>
>>>> Typically I do:
>>>>
>>>> nc = netCDF4.Dataset('my.nc')
>>>> cplx_data = nc.groups['mygroup'].variables['cplx_stuff'][:].view('complex')
>>>>
>>>> which directly gives a nice complex numpy array. This is OK for small
>>>> arrays, but is wasteful if I only need some chunks of the array because it
>>>> reads all the data in, reducing the utility of the mmap feature of netCDF.
>>>>
>>>> I'm wondering if there is a better way to directly make a numpy array
>>>> view that uses the netcdf variable's memory mapped buffer directly.
>>>> Looking at the Variable class, there is no access to this buffer directly
>>>> which could then be passed to np.ndarray(buffer=...).
>>>>
>>>> Any ideas of simple solutions to this problem?
>>>>
>>>> Thanks,
>>>> Glenn
>>>>
>>>> _______________________________________________
>>>> NumPy-Discussion mailing list
>>>> NumPy-Discussion@scipy.org
>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion