Hi Glenn,

Here is the line in my linked code defining the __array__ method:
https://github.com/akleeman/xray/blob/0c1a963be0542b7303dc875278f3b163a15429c5/src/xray/conventions.py#L152
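For illustration only (this is not copied from the linked file), here is a minimal sketch of a wrapper that has both __getitem__ and __array__. It assumes the wrapped variable returns a structured array with two float64 fields when indexed; a plain numpy structured array named `records` stands in for the netCDF4 variable so the sketch runs without a file on disk.

```python
import numpy as np


class WrapComplex(object):
    """Present a variable of (real, imag) records as complex values."""

    def __init__(self, nc_var):
        # nc_var: any object whose indexing returns a structured ndarray
        # with two float64 fields (an assumption of this sketch).
        self.nc_var = nc_var

    def __getitem__(self, item):
        # Read only the requested records, then reinterpret them as complex.
        return np.asarray(self.nc_var[item]).view('complex128')

    def __array__(self, dtype=None, copy=None):
        # Called by np.asarray(wrapper); this reads the whole variable.
        # `copy` is accepted for compatibility with newer NumPy versions
        # and is ignored here.
        values = self[...]
        return values if dtype is None else values.astype(dtype)


# Hypothetical stand-in for nc.groups['mygroup'].variables['cplx_stuff'].
records = np.zeros(4, dtype=[('real', 'f8'), ('imag', 'f8')])
records['real'] = [1.0, 2.0, 3.0, 4.0]
records['imag'] = [0.5, 0.5, 0.5, 0.5]

cplx_data = WrapComplex(records)
first_two = cplx_data[:2]        # converts only the first two records
mean_value = np.mean(cplx_data)  # numpy converts via __array__ first
```

Indexing converts only the slice you ask for, while anything that goes through np.asarray (such as np.mean) converts the full variable.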
I don't know when Jeff Whitaker will be releasing the next version of netCDF4, but I expect that might be pretty soon if you asked nicely! Otherwise you can always download the development version from GitHub:
https://github.com/Unidata/netcdf4-python

Cheers,
Stephan

On Sun, Mar 30, 2014 at 5:18 AM, G Jones <glenn.calt...@gmail.com> wrote:

> Hi,
> This looks useful. What you said about __array__ makes sense, but I didn't
> see it in the code you linked.
> Do you know when python netcdf4 will support the numpy array interface
> directly? I searched around for a roadmap but didn't find anything. It may
> be best for me to proceed with a slightly clumsy interface for now and wait
> until the array interface is built in for free.
>
> Thanks,
> Glenn
>
> On Mar 30, 2014 2:18 AM, "Stephan Hoyer" <sho...@gmail.com> wrote:
>
>> Hi Glenn,
>>
>> Here is a full example of how we wrap a netCDF4.Variable object,
>> implementing all of its ndarray-like methods:
>>
>> https://github.com/akleeman/xray/blob/0c1a963be0542b7303dc875278f3b163a15429c5/src/xray/conventions.py#L91
>>
>> The __array__ method would be the most relevant one for you: it means
>> that numpy knows how to convert the wrapper array into a numpy.ndarray when
>> you call np.mean(cplx_data). More generally, any function that calls
>> np.asarray(cplx_data) will properly convert the values, which should
>> include most functions from well-written libraries (including numpy and
>> scipy). netCDF4.Variable doesn't currently have such an __array__ method,
>> but it will in the next released version of the library.
>>
>> The quick and dirty hack to make all numpy methods work (now going beyond
>> what the netCDF4 library implements) would be to add something like the
>> following:
>>
>>     def __getattr__(self, attr):
>>         return getattr(np.asarray(self), attr)
>>
>> But this is a little dangerous, since some methods might silently fail or
>> give unpredictable results (e.g., those that modify data). It would be
>> safer to list the methods you want to implement explicitly, or to just
>> liberally use np.asarray. The latter is generally a good practice when
>> writing library code, anyways, to catch unusual ndarray subclasses like
>> np.matrix.
>>
>> Stephan
>>
>>
>> On Sat, Mar 29, 2014 at 8:42 PM, G Jones <glenn.calt...@gmail.com> wrote:
>>
>>> Hi Stephan,
>>> Thanks for the reply. I was thinking of something along these lines but
>>> was hesitant because while this provides clean access to chunks of the
>>> data, you still have to remember to do cplx_data[:].mean() for example in
>>> the case that you want cplx_data.mean().
>>>
>>> I was hoping to basically have all of the ndarray methods at hand
>>> without any indexing, but then also being smart about taking advantage of
>>> the mmap when possible. But perhaps your solution is the best compromise.
>>>
>>> Thanks again,
>>> Glenn
>>>
>>> On Mar 29, 2014 10:59 PM, "Stephan Hoyer" <sho...@gmail.com> wrote:
>>>
>>>> Hi Glenn,
>>>>
>>>> My usual strategy for this sort of thing is to make a light-weight
>>>> wrapper class which reads and converts values when you access them. For
>>>> example:
>>>>
>>>>     class WrapComplex(object):
>>>>         def __init__(self, nc_var):
>>>>             self.nc_var = nc_var
>>>>
>>>>         def __getitem__(self, item):
>>>>             return self.nc_var[item].view('complex')
>>>>
>>>>     nc = netCDF4.Dataset('my.nc')
>>>>     cplx_data = WrapComplex(nc.groups['mygroup'].variables['cplx_stuff'])
>>>>
>>>> Now you can index cplx_data (e.g., cplx_data[:10]) and only the values
>>>> you need will be read from disk and converted on the fly.
>>>>
>>>> Hope this helps!
>>>>
>>>> Cheers,
>>>> Stephan
>>>>
>>>>
>>>> On Sat, Mar 29, 2014 at 6:13 PM, G Jones <glenn.calt...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>> I am using netCDF4 to store complex data using the recommended
>>>>> strategy of creating a compound data type with the real and imaginary
>>>>> parts. This all works well, but reading the data into a numpy array is a
>>>>> bit clumsy.
>>>>>
>>>>> Typically I do:
>>>>>
>>>>>     nc = netCDF4.Dataset('my.nc')
>>>>>     cplx_data = nc.groups['mygroup'].variables['cplx_stuff'][:].view('complex')
>>>>>
>>>>> which directly gives a nice complex numpy array. This is OK for small
>>>>> arrays, but is wasteful if I only need some chunks of the array because it
>>>>> reads all the data in, reducing the utility of the mmap feature of netCDF.
>>>>>
>>>>> I'm wondering if there is a better way to directly make a numpy array
>>>>> view that uses the netcdf variable's memory mapped buffer directly.
>>>>> Looking at the Variable class, there is no access to this buffer directly
>>>>> which could then be passed to np.ndarray(buffer=...).
>>>>>
>>>>> Any ideas of simple solutions to this problem?
>>>>>
>>>>> Thanks,
>>>>> Glenn
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion