[Numpy-discussion] Reading a big netcdf file

2011-08-03 Thread Kiko
Hi. I'm trying to read a big netcdf file (445 Mb) using netcdf4-python. The data are described as: The GEBCO gridded data set is stored in NetCDF as a one dimensional array of 2-byte signed integers that represent integer elevations in metres. The complete data set gives global coverage. It cons
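A minimal sketch of what reading this data set looks like with netcdf4-python; the file name GridOne.grd and variable name z are taken from later messages in this thread, and the 10801 x 21601 reshape is an assumption based on the one-arc-minute global coverage described:

from netCDF4 import Dataset

ds = Dataset('GridOne.grd')         # open read-only
z = ds.variables['z']               # 1-D variable of 2-byte signed integers
print(z.dtype, z.shape)             # no data read from disk yet
elev = z[:].reshape(10801, 21601)   # slicing loads the full ~445 MB array
ds.close()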

Re: [Numpy-discussion] Reading a big netcdf file

2011-08-03 Thread Gökhan Sever
Here are my values for your comparison: test.nc file is about 715 MB. The details are below: In [21]: netCDF4.__version__ Out[21]: '0.9.4' In [22]: np.__version__ Out[22]: '2.0.0.dev-b233716' In [23]: from netCDF4 import Dataset In [24]: f = Dataset("test.nc") In [25]: f.variables['reflectivi
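The same timing test as a standalone script rather than an IPython session (test.nc and the reflectivity variable are from the message above):

import time
from netCDF4 import Dataset

f = Dataset('test.nc')
var = f.variables['reflectivity']
print(var.shape, var.size)

t0 = time.time()
data = var[:]                        # read the whole variable into memory
print('read in %.2f s' % (time.time() - t0))
f.close()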

Re: [Numpy-discussion] Reading a big netcdf file

2011-08-03 Thread Christopher Barker
On 8/3/11 9:30 AM, Kiko wrote: > I'm trying to read a big netcdf file (445 Mb) using netcdf4-python. I've never noticed that netCDF4 was particularly slow for reading (writing can be pretty slow sometimes). How slow is slow? > The data are described as: please post the results of: ncdump -h t
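For anyone without the ncdump tool installed, here is a rough Python stand-in for `ncdump -h` (header only), sketched with netcdf4-python; it prints the same kind of information, not the tool's exact output format:

from netCDF4 import Dataset

ds = Dataset('GridOne.grd')
for name, dim in ds.dimensions.items():
    print('dimension %s = %d' % (name, len(dim)))
for name, var in ds.variables.items():
    print('variable %s %s %s' % (var.dtype, name, var.dimensions))
    for attr in var.ncattrs():
        print('    %s: %s' % (attr, getattr(var, attr)))
ds.close()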

Re: [Numpy-discussion] Reading a big netcdf file

2011-08-03 Thread Gökhan Sever
Just a few extra tests on my side pushing the limits of my system memory: In [34]: k = np.zeros((21601, 10801, 3), dtype='int16') k ndarray 21601x10801x3: 699937203 elems, type `int16`, 1399874406 bytes (1335 Mb) And for the first time my memory explodes with a hard kernel crash: In
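The numbers above can be checked without allocating anything: an array's memory footprint is just the element count times the itemsize.

import numpy as np

shape = (21601, 10801, 3)
elems = np.prod(shape)
nbytes = elems * np.dtype('int16').itemsize
print('%d elems, %d bytes (%.0f Mb)' % (elems, nbytes, nbytes / 2.0**20))
# 699937203 elems, 1399874406 bytes (1335 Mb)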

Re: [Numpy-discussion] Reading a big netcdf file

2011-08-03 Thread Ian Stokes-Rees
On 8/3/11 12:50 PM, Christopher Barker wrote: > As a reference, reading that much data in from a raw file into a numpy > array takes 2.57 on my machine (a rather old Mac, but disks haven't > gotten much faster). 2.57 seconds? or minutes? If seconds, does it actually read the whole thing into mem

Re: [Numpy-discussion] Reading a big netcdf file

2011-08-03 Thread Christopher Barker
On 8/3/11 11:09 AM, Ian Stokes-Rees wrote: > On 8/3/11 12:50 PM, Christopher Barker wrote: >> As a reference, reading that much data in from a raw file into a numpy >> array takes 2.57 on my machine (a rather old Mac, but disks haven't >> gotten much faster). > > 2.57 seconds? or minutes? sorry --

Re: [Numpy-discussion] Reading a big netcdf file

2011-08-03 Thread Christopher Barker
On 8/3/11 9:46 AM, Gökhan Sever wrote: > In [23]: from netCDF4 import Dataset > > In [24]: f = Dataset("test.nc") > > In [25]: f.variables['reflectivity'].shape > Out[25]: (6, 18909, 506) > > In [26]: f.variables['reflectivity'].size > Out[26]: 57407724 > > In [27]: f.variables['re
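The shape and size quoted above fix the in-memory cost once the dtype is known; the dtype line is truncated in the archive, so float32 here is only an assumption:

import numpy as np

size = 6 * 18909 * 506                                 # == 57407724, matches .size
print(size * np.dtype('float32').itemsize / 2.0**20)   # ~219 Mb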

Re: [Numpy-discussion] Reading a big netcdf file

2011-08-03 Thread Gökhan Sever
This is what I get here: In [1]: a = np.zeros((21601, 10801), dtype=np.uint16) In [2]: a.tofile('temp.npa') In [3]: del a In [4]: timeit a = np.fromfile('temp.npa', dtype=np.uint16) 1 loops, best of 3: 251 ms per loop On Wed, Aug 3, 2011 at 10:50 AM, Christopher Barker wrote: > On 8/3/11 9:3
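The same benchmark as a standalone script, using time.time() in place of IPython's %timeit (note this measures a warm-cache read, as the later messages point out):

import time
import numpy as np

a = np.zeros((21601, 10801), dtype=np.uint16)
a.tofile('temp.npa')
del a

t0 = time.time()
a = np.fromfile('temp.npa', dtype=np.uint16)
print('%.3f s' % (time.time() - t0))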

Re: [Numpy-discussion] Reading a big netcdf file

2011-08-03 Thread Gökhan Sever
I think these answer your questions. In [3]: type f.variables['reflectivity'] --> type(f.variables['reflectivity']) Out[3]: <type 'netCDF4.Variable'> In [4]: type f.variables['reflectivity'][:] --> type(f.variables['reflectivity'][:]) Out[4]: <type 'numpy.ndarray'> In [5]: z = f.variables['reflectivity'][:] In [6]: type z --> ty
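The point of the session above, sketched as a script: a netCDF4 Variable is a lazy handle, and slicing it is what actually reads the data and produces a numpy array.

from netCDF4 import Dataset

f = Dataset('test.nc')
v = f.variables['reflectivity']
print(type(v))           # netCDF4.Variable -- nothing in memory yet
z = v[:]                 # the slice triggers the disk read
print(type(z))           # numpy.ndarray (or a masked-array subclass)
f.close()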

Re: [Numpy-discussion] Reading a big netcdf file

2011-08-03 Thread Christopher Barker
On 8/3/11 1:57 PM, Gökhan Sever wrote: > This is what I get here: > > In [1]: a = np.zeros((21601, 10801), dtype=np.uint16) > > In [2]: a.tofile('temp.npa') > > In [3]: del a > > In [4]: timeit a = np.fromfile('temp.npa', dtype=np.uint16) > 1 loops, best of 3: 251 ms per loop so that's about 10 ti

Re: [Numpy-discussion] Reading a big netcdf file

2011-08-03 Thread Gökhan Sever
On Wed, Aug 3, 2011 at 3:15 PM, Christopher Barker wrote: > On 8/3/11 1:57 PM, Gökhan Sever wrote: > > This is what I get here: > > > > In [1]: a = np.zeros((21601, 10801), dtype=np.uint16) > > > > In [2]: a.tofile('temp.npa') > > > > In [3]: del a > > > > In [4]: timeit a = np.fromfile('temp.npa'

Re: [Numpy-discussion] Reading a big netcdf file

2011-08-03 Thread Warren Weckesser
On Wed, Aug 3, 2011 at 4:24 PM, Gökhan Sever wrote: > > > On Wed, Aug 3, 2011 at 3:15 PM, Christopher Barker > wrote: >> >> On 8/3/11 1:57 PM, Gökhan Sever wrote: >> > This is what I get here: >> > >> > In [1]: a = np.zeros((21601, 10801), dtype=np.uint16) >> > >> > In [2]: a.tofile('temp.npa') >

Re: [Numpy-discussion] Reading a big netcdf file

2011-08-03 Thread Eric Firing
On 08/03/2011 11:24 AM, Gökhan Sever wrote: > In [1]: timeit a = np.fromfile('temp.npa', dtype=np.uint16) > 1 loops, best of 3: 263 ms per loop You need to clear your cache and then run timeit with options "-n1 -r1". Eric
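Warren's exact cache-clearing command is cut off in the archive; on Linux the usual trick (as root) is `sync && echo 3 > /proc/sys/vm/drop_caches`. With the cache dropped, time exactly one read, since any repeat would hit the now-warm cache again -- which is what -n1 -r1 avoids in IPython:

import time
import numpy as np

t0 = time.time()
a = np.fromfile('temp.npa', dtype=np.uint16)
print('cold read: %.2f s' % (time.time() - t0))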

Re: [Numpy-discussion] Reading a big netcdf file

2011-08-03 Thread Gökhan Sever
Back to reality. After clearing the cache using Warren's suggestion: In [1]: timeit -n1 -r1 a = np.fromfile('temp.npa', dtype=np.uint16) 1 loops, best of 1: 7.23 s per loop On Wed, Aug 3, 2011 at 4:52 PM, Eric Firing wrote: > On 08/03/2011 11:24 AM, Gökhan Sever wrote: > > > In [1]: timeit a

Re: [Numpy-discussion] Reading a big netcdf file

2011-08-04 Thread Kiko
Hi, all. Thank you very much for your replies. I am running into some issues. If I use the netcdf4-python or scipy.io.netcdf libraries: In [4]: import netCDF4 as n4 In [5]: from scipy.io import netcdf as nS In [6]: import numpy as np In [7]: gebco4 = n4.Dataset('GridOne.grd', 'r') In [8]: gebcoS = nS.n
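The last input line of the session above is cut off; presumably it continues with scipy's netcdf_file. A sketch opening the same file both ways:

import netCDF4 as n4
from scipy.io import netcdf as nS

gebco4 = n4.Dataset('GridOne.grd', 'r')
gebcoS = nS.netcdf_file('GridOne.grd', 'r', mmap=True)

z4 = gebco4.variables['z']      # netCDF4.Variable, lazy
zS = gebcoS.variables['z']      # scipy variable, backed by a memory map
print(z4.shape, zS.shape)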

Re: [Numpy-discussion] Reading a big netcdf file

2011-08-04 Thread Jeff Whitaker
On 8/4/11 4:46 AM, Kiko wrote: Hi, all. Thank you very much for your replies. I am running into some issues. If I use the netcdf4-python or scipy.io.netcdf libraries: In [4]: import netCDF4 as n4 In [5]: from scipy.io import netcdf as nS In [6]: import numpy as np In [7]: gebco4 =

Re: [Numpy-discussion] Reading a big netcdf file

2011-08-04 Thread Christopher Barker
On 8/4/11 3:46 AM, Kiko wrote: > In [9]: z4 = gebco4.variables['z'] > > I got no problems and I have: > > In [14]: type(z4); z4.shape; z4.size > Out[14]: <type 'netCDF4.Variable'> > Out[14]: (233312401,) > Out[14]: 233312401 > > But if I do: > > In [15]: z4 = gebco4.variables['z'][:] > MemoryError > What's the difference
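One way around the MemoryError is to never materialize all 233312401 values at once, and instead work through the lazy variable in slices. A sketch; the block size here is arbitrary:

import numpy as np
from netCDF4 import Dataset

ds = Dataset('GridOne.grd')
z = ds.variables['z']                   # lazy handle, cheap to hold
block = 10 * 21601                      # roughly ten grid rows per read
zmin = None
for start in range(0, z.shape[0], block):
    chunk = z[start:start + block]      # only this slice is read from disk
    m = chunk.min()
    zmin = m if zmin is None else min(zmin, m)
print('minimum elevation:', zmin)
ds.close()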

Re: [Numpy-discussion] Reading a big netcdf file

2011-08-04 Thread Christopher Barker
On 8/3/11 3:56 PM, Gökhan Sever wrote: > Back to reality. After clearing the cache using Warren's suggestion: > > In [1]: timeit -n1 -r1 a = np.fromfile('temp.npa', dtype=np.uint16) > 1 loops, best of 1: 7.23 s per loop yup -- that cache sure can be handy! -Chris -- Christopher Barker, Ph.

Re: [Numpy-discussion] Reading a big netcdf file

2011-08-04 Thread Christopher Barker
On 8/4/11 10:02 AM, Christopher Barker wrote: > On 8/4/11 8:53 AM, Jeff Whitaker wrote: >> Kiko: I think the difference may be that when you read the data with >> netcdf4-python, it tries to unpack the short integers to a float32 >> array. > > Jeff, why is that? is it a netCDF4 convention? I alway
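For reference, what Jeff is describing is netcdf4-python's automatic unpacking: when a variable carries scale_factor/add_offset (or missing-value) attributes, slicing returns scaled float32 data rather than the packed shorts. It can be switched off per variable; a sketch, assuming the GEBCO z variable is packed this way:

from netCDF4 import Dataset

ds = Dataset('GridOne.grd')
z = ds.variables['z']
z.set_auto_maskandscale(False)   # hand back the raw int16 values as stored
raw = z[:1000]                   # a small slice of packed shorts
print(raw.dtype)                 # int16
ds.close()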