Re: [Numpy-discussion] Designing a new storage format for numpy recarrays

2009-10-30 Thread Anne Archibald
2009/10/30 Stephen Simmons:
> I should clarify what I meant..
>
> Suppose I have a recarray with 50 fields and want to read just one of
> those fields. PyTables/HDF will read in the compressed data for chunks
> of complete rows, decompress the full 50 fields, and then give me back
> the data f

Re: [Numpy-discussion] Designing a new storage format for numpy recarrays

2009-10-30 Thread Stephen Simmons
I should clarify what I meant.. Suppose I have a recarray with 50 fields and want to read just one of those fields. PyTables/HDF will read in the compressed data for chunks of complete rows, decompress the full 50 fields, and then give me back the data for just one field. I'm after a solut
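
The access pattern described above (read one field without decompressing the other 49) can be illustrated with a minimal column-wise sketch. It is an illustration only, not the format proposed in this thread: the helper names `save_columns` and `load_column`, the use of zlib, and the in-memory dict standing in for a file are all assumptions, and it handles only 1-D recarrays with scalar fields.

```python
import zlib

import numpy as np


def save_columns(rec):
    """Compress each field on its own (hypothetical helper).

    Returns {field_name: (dtype_string, compressed_bytes)}.
    """
    store = {}
    for name in rec.dtype.names:
        # Copy the strided field view into a contiguous buffer first.
        col = np.ascontiguousarray(rec[name])
        store[name] = (col.dtype.str, zlib.compress(col.tobytes()))
    return store


def load_column(store, name):
    """Decompress a single field; the other fields are never touched."""
    dtype_str, blob = store[name]
    return np.frombuffer(zlib.decompress(blob), dtype=dtype_str)


# Small stand-in for a 50-field recarray.
rec = np.zeros(5, dtype=[("price", "f8"), ("qty", "i4"), ("flag", "u1")])
rec["price"] = np.arange(5) * 1.5
rec["qty"] = np.arange(5)

store = save_columns(rec)
price = load_column(store, "price")  # only the "price" blob is decompressed
```

The trade-off in this layout is that reading a complete row now touches every field's blob, which is the opposite of the row-chunked behaviour described for PyTables/HDF.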

Re: [Numpy-discussion] Designing a new storage format for numpy recarrays

2009-10-30 Thread Robert Kern
On Fri, Oct 30, 2009 at 08:18, Stephen Simmons wrote:
> Thoughts about a new format
>
> It seems that numpy could benefit from a new storage format.

While you may indeed need a new format, I'm not sure that numpy does. Lord knows I've gotten enough flak for inven

Re: [Numpy-discussion] Designing a new storage format for numpy recarrays

2009-10-30 Thread Francesc Alted
On Friday 30 October 2009 14:18:05, Stephen Simmons wrote:
> - Pytables (HDF using chunked storage for recarrays with LZO
> compression and shuffle filter)
> - can't extract individual field from a recarray

Er... Have you tried the ``cols`` accessor? http://www.pytables.org/docs/manual/ch04

Re: [Numpy-discussion] Designing a new storage format for numpy recarrays

2009-10-30 Thread Zachary Pincus
Unless I read your request or the documentation wrong, h5py already supports pulling specific fields out of "compound data types": http://h5py.alfven.org/docs-1.1/guide/hl.html#id3

> For compound data, you can specify multiple field names alongside
> the numeric slices:
> >>> dset["FieldA"]
>

Re: [Numpy-discussion] Designing a new storage format for numpy recarrays

2009-10-30 Thread Dag Sverre Seljebotn
Stephen Simmons wrote:
> P.S. Maybe this will be too much work, and I'd be better off sticking
> with Pytables.

I can't judge that, but I want to share some thoughts (rant?):

- Are you ready to not only write the code, but maintain it over years to come, and work through nasty bugs, and thin

Re: [Numpy-discussion] Designing a new storage format for numpy recarrays

2009-10-30 Thread Dag Sverre Seljebotn
Dag Sverre Seljebotn:
> Hi,
>
> Is anyone working on alternative storage options for numpy arrays, and
> specifically recarrays? My main application involves processing series
> of large recarrays (say 1000 recarrays, each with 5M rows having 50
> fields). Existing options meet some but not all of

[Numpy-discussion] Designing a new storage format for numpy recarrays

2009-10-30 Thread Stephen Simmons
Hi,

Is anyone working on alternative storage options for numpy arrays, and specifically recarrays? My main application involves processing series of large recarrays (say 1000 recarrays, each with 5M rows having 50 fields). Existing options meet some but not all of my requirements.

Requirement