Re: [Numpy-discussion] Designing a new storage format for numpy recarrays

2009-10-30 Thread Anne Archibald
2009/10/30 Stephen Simmons:
> I should clarify what I meant.
>
> Suppose I have a recarray with 50 fields and want to read just one of
> those fields. PyTables/HDF will read in the compressed data for chunks
> of complete rows, decompress the full 50 fields, and then give me back
> the data f

Re: [Numpy-discussion] Designing a new storage format for numpy recarrays

2009-10-30 Thread Stephen Simmons
I should clarify what I meant.

Suppose I have a recarray with 50 fields and want to read just one of those fields. PyTables/HDF will read in the compressed data for chunks of complete rows, decompress the full 50 fields, and then give me back the data for just one field. I'm after a solut
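To make the access pattern concrete, here is a minimal sketch of one possible layout that avoids decompressing every field: each field is compressed as its own stream, so reading one field touches only that stream. This is plain standard-library Python, not PyTables, HDF5, or numpy code, and it is not necessarily the design being asked for. The names `FIELDS`, `ROWS`, `pack_by_field`, and `read_field` are hypothetical, and the 3-field table is only a stand-in for a 50-field recarray.

```python
import struct
import zlib

# Hypothetical toy table: 3 float64 fields standing in for a 50-field recarray.
FIELDS = ("a", "b", "c")
ROWS = [(float(i), float(i) * 2.0, float(i) * 3.0) for i in range(1000)]


def pack_by_field(rows, fields):
    """Compress each field's values as its own independent zlib stream."""
    blobs = {}
    for idx, name in enumerate(fields):
        column = [row[idx] for row in rows]
        # Little-endian float64 values, packed contiguously per field.
        raw = struct.pack(f"<{len(column)}d", *column)
        blobs[name] = zlib.compress(raw)
    return blobs


def read_field(blobs, name):
    """Decompress only the requested field; the other streams stay untouched."""
    raw = zlib.decompress(blobs[name])
    return list(struct.unpack(f"<{len(raw) // 8}d", raw))


blobs = pack_by_field(ROWS, FIELDS)
column_b = read_field(blobs, "b")  # only the "b" stream is decompressed
```

The trade-off in this sketch is that reading a complete row now needs one decompression per field, which is the opposite of the row-chunk behaviour described above.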

Re: [Numpy-discussion] Designing a new storage format for numpy recarrays

2009-10-30 Thread Robert Kern
On Fri, Oct 30, 2009 at 08:18, Stephen Simmons wrote:
> Thoughts about a new format
>
> It seems that numpy could benefit from a new storage format.

While you may indeed need a new format, I'm not sure that numpy does. Lord knows I've gotten enough flak for inven

Re: [Numpy-discussion] Designing a new storage format for numpy recarrays

2009-10-30 Thread Francesc Alted
On Friday 30 October 2009 14:18:05 Stephen Simmons wrote:
> - Pytables (HDF using chunked storage for recarrays with LZO
> compression and shuffle filter)
> - can't extract individual field from a recarray

Er... Have you tried the ``cols`` accessor? http://www.pytables.org/docs/manual/ch04

Re: [Numpy-discussion] Designing a new storage format for numpy recarrays

2009-10-30 Thread Zachary Pincus
Unless I read your request or the documentation wrong, h5py already supports pulling specific fields out of "compound data types":
http://h5py.alfven.org/docs-1.1/guide/hl.html#id3

> For compound data, you can specify multiple field names alongside
> the numeric slices:
> >>> dset["FieldA"]
>

Re: [Numpy-discussion] Designing a new storage format for numpy recarrays

2009-10-30 Thread Dag Sverre Seljebotn
Stephen Simmons wrote:
> P.S. Maybe this will be too much work, and I'd be better off sticking
> with Pytables.

I can't judge that, but I want to share some thoughts (rant?):

- Are you ready to not only write the code, but maintain it over years to come, and work through nasty bugs, and thin

Re: [Numpy-discussion] Designing a new storage format for numpy recarrays

2009-10-30 Thread Dag Sverre Seljebotn
Dag Sverre Seljebotn:
> Hi,
>
> Is anyone working on alternative storage options for numpy arrays, and
> specifically recarrays? My main application involves processing series
> of large recarrays (say 1000 recarrays, each with 5M rows having 50
> fields). Existing options meet some but not all of