Hi Tom,

On 09/12/2011 14:12, Tom Diethe wrote:
> I have files stored using Matlab's sparse format (HDF5, csc I
> believe), and I'm trying to use Pytables to operate on them directly,
> but haven't succeeded yet. Using h5py I can do the following:
>
> # Method 1: uses h5py (WORKS)
> f1 = h5py.File(fname)
> data = f1['M']['data']
> ir = f1['M']['ir']
> jc = f1['M']['jc']
> M = scipy.sparse.csc_matrix( (data,ir,jc) )
>
> but if I try to do the equivalent in Pytables:
>
> # Method 2: uses pyTables (DOESN'T WORK)
> f2 = tables.openFile(fname)
> data = f2.root.M.data
> ir = f2.root.M.ir
> jc = f2.root.M.jc
> M = scipy.sparse.csc_matrix( (data,ir,jc) )
>

If my understanding is correct, you should actually read the data from
disk before calling csc_matrix:

    data = f2.root.M.data[...]
    ir = f2.root.M.ir[...]
    jc = f2.root.M.jc[...]

Please note that f2.root.M.data is a PyTables object, not a numpy array:

In [23]: f2.root.M.data
Out[23]:
/M/data (Array(20,)) ''
  atom := Float64Atom(shape=(), dflt=0.0)
  maindim := 0
  flavor := 'numpy'
  byteorder := 'little'
  chunkshape := None

> this fails (after a long wait) with the error:
>
> TypeError                                 Traceback (most recent call last)
>
> /home/tdiethe/BMJ/<ipython console> in <module>()
>
> /usr/lib/python2.6/dist-packages/scipy/sparse/compressed.pyc in
> __init__(self, arg1, shape, dtype, copy, dims, nzmax)
>      56                     self.indices = np.array(indices, copy=copy)
>      57                     self.indptr = np.array(indptr, copy=copy)
> ---> 58                     self.data = np.array(data, copy=copy,
> dtype=getdtype(dtype, data))
>      59                 else:
>      60                     raise ValueError, "unrecognized %s_matrix
> constructor usage" %\
>
> /usr/lib/python2.6/dist-packages/scipy/sparse/sputils.pyc in
> getdtype(dtype, a, default)
>      69                 canCast = False
>      70             else:
> ---> 71                 raise TypeError, "could not interpret data type"
>      72         else:
>      73             newdtype = np.dtype(dtype)
>
> TypeError: could not interpret data type
>
>
> I have two questions:
>
> - how do I load this file in
> - do I need to perform the conversion to a scipy sparse matrix in
> order to be able to perform operations on it, or can I perform
> operations directly on the disk files (matrix multiplication etc)?

I am not sure I understand the second question, but I guess the answer
is yes, you need the conversion, unless the operation is a trivial
element-by-element one.

Best regards

-- 
Antonio Valentino
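
P.S. For completeness, here is a minimal sketch of the whole round trip.
It is only an illustration under a few assumptions: it uses the PyTables
3.x spelling open_file (the openFile call quoted above is the older 2.x
name), it writes its own small demo file rather than a real Matlab one,
and the file name and the 3x3 test matrix are made up for the example.
The group layout (/M with data, ir and jc) follows what you described.

```python
import os
import tempfile

import numpy as np
import scipy.sparse
import tables  # assumes PyTables >= 3.0 (open_file, create_array)

# Small reference matrix, used only to produce data/ir/jc arrays to store.
dense = np.array([[1.0, 0.0, 2.0],
                  [0.0, 0.0, 3.0],
                  [4.0, 5.0, 0.0]])
ref = scipy.sparse.csc_matrix(dense)

# Hypothetical demo file that mimics the /M/{data,ir,jc} layout.
fname = os.path.join(tempfile.mkdtemp(), "sparse_demo.h5")
with tables.open_file(fname, mode="w") as f:
    grp = f.create_group("/", "M")
    f.create_array(grp, "data", ref.data)    # non-zero values
    f.create_array(grp, "ir", ref.indices)   # row index of each value
    f.create_array(grp, "jc", ref.indptr)    # column pointers

with tables.open_file(fname, mode="r") as f2:
    node = f2.root.M.data        # PyTables Array node, not a NumPy array
    # Indexing with [...] reads the node from disk into a NumPy array.
    data = f2.root.M.data[...]
    ir = f2.root.M.ir[...]
    jc = f2.root.M.jc[...]
    node_is_ndarray = isinstance(node, np.ndarray)

# The shape is passed explicitly here; without it scipy infers the number
# of rows from the largest row index, which may be too small if the last
# rows of the matrix are empty.
M = scipy.sparse.csc_matrix((data, ir, jc), shape=dense.shape)
```

Once the three arrays are in memory, M behaves like any other scipy
sparse matrix (M.dot(x), M.T and so on); the HDF5 file is no longer
involved in those operations.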

_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users