> I've been doing some benchmarks based on your requirements, and my
> conclusion is that the implementation of variable-length types in HDF5
> is not very efficient, especially with the HDF5 1.8.x series (see [1]).
> So you should avoid using VLArrays for saving small arrays: they fit
> better in table fields.
>
> With this, a possible solution is to distinguish between small and
> large strings (for this case). Small strings can be saved in a Table
> field, while larger ones go into a VLArray. Then you will have to add
> another field in the table specifying where the data is (for example,
> -1 could mean "in this table" and any non-negative value "the index in
> the VLArray"). You may want to experiment to find the optimal
> threshold that separates 'small' strings from 'large' ones, but
> anything between 128 and 1024 would work fine.
>
> I'm adding the script that I've been using for my own benchmarking.
> Notice that if your optimal break-point (threshold) is too large (say,
> more than 10000 bytes), then this partition is not going to work well,
> but chances are that your scenario would fit here easily. If not, one
> can think of a finer partition, but let's start with this one.
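
For reference, the small/large partition described in the quoted text can be modelled roughly as follows. This is only a sketch: plain Python lists stand in for the PyTables Table and VLArray, and the names HybridStringStore, THRESHOLD and IN_TABLE are made up for illustration. The default threshold of 256 is an assumption picked from the suggested 128 to 1024 range.

```python
# Illustrative model only: plain lists stand in for the PyTables Table
# and VLArray. All names and the default threshold are assumptions.
THRESHOLD = 256  # bytes; the quoted advice suggests 128..1024
IN_TABLE = -1    # location marker meaning "the string is in this table row"


class HybridStringStore:
    def __init__(self, threshold=THRESHOLD):
        self.threshold = threshold
        self.table = []    # rows of (small_string, location)
        self.vlarray = []  # large strings only

    def append(self, data: bytes) -> int:
        """Store `data` and return its row number in the table."""
        if len(data) <= self.threshold:
            # Small string: keep it in the table field itself.
            self.table.append((data, IN_TABLE))
        else:
            # Large string: store it in the VLArray and record its index.
            self.vlarray.append(data)
            self.table.append((b"", len(self.vlarray) - 1))
        return len(self.table) - 1

    def read(self, row: int) -> bytes:
        """Return the string for `row`, wherever it is stored."""
        small, location = self.table[row]
        if location == IN_TABLE:
            return small
        return self.vlarray[location]
```

Every string gets a table row, so the table alone is enough to locate any value; only the strings above the threshold touch the VLArray.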
Francesc,
thank you for your explanation and for your time. I'll try to use a
combined approach.
I have a very basic question.
Normally I have to update each string in the VLArray (and in the table
as well) by appending a character to the end.
For example:
vlarray = fileh.createVLArray(root, 'vlarray7', VLStringAtom(),
                              "Variable Length String")
vlarray.append("asd")
vlarray.append("aaana")
If I write:
vlarray.__setitem__(0, "acg")
the value at position 0 is correctly updated.
If instead I write:
vlarray.__setitem__(0, "acgt")
I get a ValueError, as expected from reading the manual.
The question is:
Is there a way to update a string in a VLArray when its length changes?
Thanks a lot,
Ernesto
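
One possible way around the same-length constraint, under the combined table-plus-VLArray layout discussed in this thread, is to never resize a VLArray row at all: write the longer string as a new row and repoint the table's index field at it. The sketch below is an assumption, not a PyTables feature. It uses plain lists in place of the Table and VLArray, and append_char, IN_TABLE and the threshold value are hypothetical names chosen for illustration.

```python
IN_TABLE = -1  # assumed marker: the string lives in the table field


def append_char(table, vlarray, row, ch, threshold=256):
    """Grow the string at `row` by `ch` without resizing any VLArray row.

    `table` is a list of (small_string, location) tuples and `vlarray`
    a list of large strings; both are stand-ins for the real objects.
    """
    small, location = table[row]
    old = small if location == IN_TABLE else vlarray[location]
    new = old + ch
    if len(new) <= threshold:
        # Still fits in the fixed-size table field: update it in place.
        table[row] = (new, IN_TABLE)
    else:
        # Too long for the table: write a fresh VLArray row and repoint.
        # The previous VLArray row, if any, is left behind as unused space.
        vlarray.append(new)
        table[row] = (b"", len(vlarray) - 1)
```

The cost of this approach is that superseded VLArray rows accumulate, so a file that sees many updates would need to be rewritten from time to time to reclaim the space.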