Francesc Alted <faltet <at> pytables.org> writes:
> > Question 2: Further on I only need to work with the last 20% or so of each
> > row. Is there an efficient way to slice from a row without having to load
> > it all from disk?
> >
> > for i in range(len(y)):
> >     yj = y[i][-2000:]  # not having to read y[i][:6500]
> >     ...
>
> I'm afraid you can't. The thing is that the VL types cannot be divided and
> the entire data element must be transferred. See:
>
> http://www.hdfgroup.org/HDF5/doc/UG/11_Datatypes.html
>
> section 4.3.2.3 for more info on this.

Ouch. This means I'll have to store my own "abstracts" of the data, and in many
cases it will be faster to recompute the details I need than to store them all
in HDF5 files. I already have something like this: a table with a field for the
VLArray row index. I'll need to expand it with more fields, but I vaguely
recall that it's not possible to alter a table description (add/drop fields)
once the table exists. I guess a new table aligned with the old one is the
easiest way out, or a manual copy loop:

# get description of old_table
# add new fields
# create new_table
newrow = new_table.row
for oldrow in old_table:
    for field in old_fields:
        newrow[field] = oldrow[field]
    newrow.append()  # new fields keep their default values
new_table.flush()

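
The copy loop above might be fleshed out roughly as follows. This is only a sketch under some assumptions: Python 3 with NumPy and a recent PyTables (the `open_file`/`create_table` spelling), a flat (non-nested) table, and made-up names (`copy_with_new_fields`, `index`, `index2`, `vl_row`, `length`, `tail_mean`) that do not come from the original post.

```python
import os
import tempfile

import numpy as np
import tables


def copy_with_new_fields(h5file, old_table, new_name, extra_fields):
    """Copy old_table into a new table that also has extra_fields.

    extra_fields is a list of (name, numpy type string) pairs, for
    example [("tail_mean", "f8")]. Assumes a flat (non-nested) table.
    """
    # Get the description of the old table and add the new fields.
    new_dtype = np.dtype(old_table.dtype.descr + list(extra_fields))
    # Create the new table in the same group as the old one.
    new_table = h5file.create_table(old_table._v_parent, new_name,
                                    description=new_dtype)
    # Append one new row per old row; new fields keep their defaults.
    newrow = new_table.row
    for oldrow in old_table:
        for field in old_table.colnames:
            newrow[field] = oldrow[field]
        newrow.append()
    new_table.flush()
    return new_table


def demo_extend():
    """Build a tiny index table, extend it, and return the new rows."""
    path = os.path.join(tempfile.mkdtemp(), "extend_demo.h5")
    with tables.open_file(path, mode="w") as h5file:
        # Hypothetical "abstract" table: VLArray row index plus row length.
        old_dtype = np.dtype([("vl_row", "i8"), ("length", "i8")])
        old_table = h5file.create_table("/", "index", description=old_dtype)
        old_table.append([(0, 8500), (1, 9100)])
        old_table.flush()
        new_table = copy_with_new_fields(h5file, old_table, "index2",
                                         [("tail_mean", "f8")])
        return new_table.read()
```

The old table is left untouched, so it can be removed once the new one has been checked. The extra fields start at their default values and can be filled in later.
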
I'm a little surprised that the design of HDF5 does not permit striding and
slicing of VLArray rows; I thought a VLArray mostly behaved like any other
array.
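
Since a VLArray row has to be read whole, one possible form of the "abstracts" mentioned above is a fixed-width array that holds only the tail of each row. The sketch below assumes Python 3 with NumPy and a recent PyTables, and that every row has at least `TAIL` elements. The names `build_tail_cache`, `tails` and `y` are invented for illustration.

```python
import os
import tempfile

import numpy as np
import tables

TAIL = 2000  # number of trailing values kept per row (assumed)


def build_tail_cache(h5file, vlarray, name="tails", tail=TAIL):
    """Store the last `tail` values of every VLArray row in an EArray."""
    tails = h5file.create_earray(vlarray._v_parent, name,
                                 atom=vlarray.atom, shape=(0, tail))
    # Each variable-length row is still read whole here, but only once.
    for row in vlarray:
        tails.append(row[-tail:].reshape(1, tail))
    tails.flush()
    return tails


def demo_tails():
    """Build a small VLArray, cache its tails, and read one tail back."""
    path = os.path.join(tempfile.mkdtemp(), "tails_demo.h5")
    with tables.open_file(path, mode="w") as h5file:
        y = h5file.create_vlarray("/", "y", atom=tables.Float64Atom())
        for n in (8500, 9000):
            y.append(np.arange(n, dtype="f8"))
        tails = build_tail_cache(h5file, y)
        # tails[i] reads from the fixed-width cache, not from y[i].
        return tuple(tails.shape), tails[1]
```

This costs extra disk space for the duplicated tails. Whether that beats recomputing them depends on the data.
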

Thank you for a very clarifying answer!

Best regards,
Jon Olav
_______________________________________________
Pytables-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/pytables-users