PyTables Users,
I've read the following thread in an attempt to better understand how to
organize a 2D EArray/CArray and retain the ability to efficiently select rows
or columns.
http://www.mail-archive.com/[email protected]/msg00723.html
In this thread it was suggested that access to the columns of an EArray that
was built by appending rows could be done efficiently if the appropriate
chunkshape is passed (at least by my reading). It was also suggested that a
second copy of the data be stored in a different orientation, but this statement
was a bit unclear. What I'm looking for is a clear example of how to
efficiently access the columns of an array built by appending rows. My data come
in as a series of rows, but I would like to be able to read the columns in a
reasonable amount of time.
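To make the cost concrete: HDF5 reads and decompresses whole chunks, so a rough way to compare chunkshapes is to count how many chunks a single-row read and a single-column read would each touch. The sketch below is plain arithmetic and does not call PyTables; the helper name `chunks_per_read` and the candidate chunkshapes are hypothetical, chosen only for illustration, and the array shape (350000 appended rows of 400 values) is taken from the snippet in this message.

```python
def ceil_div(a, b):
    """Integer ceiling of a / b without going through floats."""
    return -(-a // b)


def chunks_per_read(shape, chunkshape):
    """Return (chunks touched by one full row, chunks touched by one full column)."""
    n_rows, n_cols = shape
    c_rows, c_cols = chunkshape
    row_read = ceil_div(n_cols, c_cols)  # a row crosses every chunk along axis 1
    col_read = ceil_div(n_rows, c_rows)  # a column crosses every chunk along axis 0
    return row_read, col_read


shape = (350000, 400)  # 350000 appended rows, 400 values each

# Hypothetical candidates: one row per chunk, a tall one-column chunk,
# and a compromise between the two.
for chunkshape in [(1, 400), (8192, 1), (512, 16)]:
    row_read, col_read = chunks_per_read(shape, chunkshape)
    print(chunkshape, "row read:", row_read, "chunks;",
          "column read:", col_read, "chunks")
```

The trade-off is the point here: a chunk that is wide and short favors row reads, a chunk that is tall and narrow favors column reads, and a block-shaped chunk splits the difference. Actual timings also depend on compression and the chunk cache, which this count ignores.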
Below I have a code snippet that creates a fairly large EArray by appending
rows. Can anyone provide some insight on how to access these columns
efficiently and/or how to make a second copy of the data in the file using the
appropriate chunkshape? (What I'm unclear on is how the chunkshape size is
chosen.) Thanks for all your help.
Brian
#################Begin Snippet:
import time

import numpy as N
import tables as T

t1 = time.perf_counter()
hdf = T.open_file('test.h5', mode="w", title='')
atom = T.Int32Atom()
#shape = (?,?)
#chunkshape = (?,?)
rows = 400        # number of values in each appended row (array width)
columns = 350000  # number of rows appended (final array length)
filters = T.Filters(complevel=5, complib='zlib')
# Extendable along axis 0; expectedrows is the number of appends planned.
ea = hdf.create_earray(hdf.root, "EArray", atom, (0, rows), filters=filters,
                       expectedrows=columns)
for i in range(columns):
    arr = N.random.random(rows)*100
    ea.append(arr[N.newaxis,:])
    ea.flush()
    if i%10000 == 0:
        print(i)
#ea[:,1] is really slow, whereas
#ea[1] is fast. How can chunkshape be used to efficiently access columns
#when the array was built by rows?
hdf.close()
print("Done")
t2 = time.perf_counter()
print(t2-t1)
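One possible reading of the "second copy in a different orientation" suggestion is to keep the row-appended EArray and write a second CArray in the same file whose chunks are tall and one column wide. The following is a minimal sketch under that assumption, not a tuned recommendation: the helper `copy_with_column_chunks` and its parameters `rows_per_chunk` and `block` are hypothetical names, the value 8192 is arbitrary, and the code uses the snake_case PyTables 3 names (`open_file`, `create_earray`, `create_carray`).

```python
import os
import tempfile

import numpy as np
import tables


def copy_with_column_chunks(h5, source, name, rows_per_chunk=8192, block=8192):
    """Copy a 2D array into a CArray whose chunks are tall and one column wide."""
    n_rows, n_cols = source.shape
    chunkshape = (min(rows_per_chunk, n_rows), 1)
    dest = h5.create_carray(h5.root, name, atom=source.atom,
                            shape=(n_rows, n_cols), chunkshape=chunkshape,
                            filters=source.filters)
    # Copy in blocks of rows so the whole array never has to fit in memory.
    for start in range(0, n_rows, block):
        stop = min(start + block, n_rows)
        dest[start:stop, :] = source[start:stop, :]
    return dest


if __name__ == "__main__":
    # Small demo in a temporary file; sizes are far below the 350000 x 400 case.
    path = os.path.join(tempfile.mkdtemp(), "demo.h5")
    with tables.open_file(path, mode="w") as h5:
        ea = h5.create_earray(h5.root, "by_rows", atom=tables.Int32Atom(),
                              shape=(0, 40), expectedrows=1000)
        for _ in range(1000):
            ea.append(np.random.randint(0, 100, size=(1, 40), dtype=np.int32))
        by_cols = copy_with_column_chunks(h5, ea, "by_cols", rows_per_chunk=256)
        print("row-oriented chunkshape:   ", ea.chunkshape)
        print("column-oriented chunkshape:", by_cols.chunkshape)
        column = by_cols[:, 1]  # touches only the chunks holding column 1
        print(column.shape)
```

With this layout, `by_rows[i]` stays the array to use for row reads and `by_cols[:, j]` the one for column reads, at the cost of storing the data twice. Whether a chunk one column wide is the best choice, or a block-shaped chunk would serve both access patterns well enough, would need to be measured on the real data.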
_______________________________________________
Pytables-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/pytables-users