Robert Ferrell <ferrell <at> diablotech.com> writes:
> I'm trying to read rows from a table. I wrote the table with a
> dictionary descriptor, which looks like {'label':
> tables.StringCol(16), 'x': tables.Float64Col(), 'y':
> tables.Float64Col()}
>
> tbl = h5File.createTable(gp, tblName, tblDict, title='')
>
> If I read the whole table with tbl.read() I get an array of records,
> with the field names 'label', 'x', 'y', just as expected.
>
> However, I can't figure out how to read one row at a time and get a
> record array. I've tried:
>
> r = [row[:] for row in tbl]
>
> I get back a list with an item per row, but the rows are tuples, not
> record arrays. How can I read the rows and get record arrays?
If I understand the question correctly, this has confused me too. The short
answer may be:
a = tbl[:]
a = tbl[:].view(numpy.recarray)
or for single rows:
a = tbl[i]
a = tbl[i:i+1].view(numpy.recarray)
The various options all have the same a.dtype and allow named field access with
a["label"]. However, "dotting" (a.label) only works with recarrays
(numpy.recarray aka numpy.core.records.recarray). The biggest gotcha for me was
that casting tbl[i] to recarray has no effect: You need tbl[i:i+1] to get
dotted access to fields of a single record.
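The difference between tbl[i] and tbl[i:i+1] can be tried without a file on disk. The following is only a sketch: it assumes a plain NumPy structured array (called data here, a made-up name) as a stand-in for what tbl[:] returns, with the same three fields as in the question. On Python 3 the string field holds bytes.

```python
import numpy as np

# Stand-in for tbl[:] (assumption: plain NumPy data, not a real PyTables table).
fields = [("label", "S16"), ("x", "<f8"), ("y", "<f8")]
data = np.array([(b"a", 0.0, 0.0), (b"b", 2.5, 1.0), (b"c", 5.0, 2.0)],
                dtype=fields)

# Integer index: a numpy.void scalar, fields reachable by name only.
row = data[1]
print(type(row), row["label"])

# One-element slice viewed as a recarray: dotted access works.
one = data[1:2].view(np.recarray)
print(type(one), one.label, one.x)

# One single-row record array per table row, as asked in the question.
rows = [data[i:i + 1].view(np.recarray) for i in range(len(data))]
```

Each element of rows has shape (1,), so a field value is read as rows[k].x[0] rather than rows[k].x.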
It is somewhat confusing that
type(tbl[i]) is numpy.void
type(tbl[i:i+1]) is numpy.ndarray
and that none of them are numpy.recarray.
It is also confusing that the term "record array" is used both for
numpy.recarray and a numpy.ndarray with a compound dtype (i.e. nonempty
a.dtype.names).
http://docs.scipy.org/doc/numpy/user/basics.rec.html
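To tell the two meanings of "record array" apart in code, one can check the dtype and the class separately. This is a small sketch; the helper names is_structured and is_recarray are made up for illustration and are not part of NumPy or PyTables.

```python
import numpy as np

def is_structured(a):
    """True if the array has a compound dtype, i.e. named fields (hypothetical helper)."""
    return a.dtype.names is not None

def is_recarray(a):
    """True only for numpy.recarray, which adds attribute access (hypothetical helper)."""
    return isinstance(a, np.recarray)

plain = np.zeros(2, dtype=[("x", "<f8"), ("y", "<f8")])   # structured ndarray
rec = plain.view(np.recarray)                              # same data as a recarray

for name, arr in [("plain", plain), ("rec", rec)]:
    print(name, is_structured(arr), is_recarray(arr), hasattr(arr, "x"))
```

Both arrays are structured, but only the recarray view answers to rec.x.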
However, once you find the right combination of tricks it is very neat to
extract data from Pytables and index them with dotting 8-)
a = tbl[i:j].view(numpy.recarray)
sum(a.x)
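The same slice-and-sum pattern, sketched on a plain NumPy structured array standing in for the table (an assumption, so that it runs without PyTables; data, i and j are illustrative names). It also checks that .view() reinterprets the in-memory array rather than copying it.

```python
import numpy as np

# Stand-in for the table contents (assumption: not read from an HDF5 file).
data = np.array([(b"a", 0.0, 0.0), (b"b", 2.5, 1.0), (b"c", 5.0, 2.0)],
                dtype=[("label", "S16"), ("x", "<f8"), ("y", "<f8")])

i, j = 0, 2
a = data[i:j].view(np.recarray)   # rows i..j-1 with dotted field access

total = a.x.sum()                 # same values as data["x"][i:j].sum()
same_buffer = np.shares_memory(a, data)   # the view does not copy the array
print(total, same_buffer)
```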
Here's a lengthy exploration of what you can and cannot do with the various
incarnations of table rows as Numpy arrays.
Hope this helps,
Jon Olav
"""Getting a numpy.recarray from a tables.Table"""
import numpy, tables
f = tables.openFile("test.h5", "w")
description = {'label': tables.StringCol(16),
               'x': tables.Float64Col(), 'y': tables.Float64Col()}
t = f.createTable(f.root, "test", description)
r = t.row
for i in range(3):  # add some data
    r["label"] = chr(i + ord("a"))  # "a", "b", ...
    r["x"] = i * 2.5
    r["y"] = i
    r.append()
t.flush()
t
# /test (Table(3L,)) ''
# description := {
# "label": StringCol(itemsize=16, shape=(), dflt='', pos=0),
# "x": Float64Col(shape=(), dflt=0.0, pos=1),
# "y": Float64Col(shape=(), dflt=0.0, pos=2)}
# byteorder := 'little'
# chunkshape := (256,)
# t[:] gives a numpy.ndarray of all records, accessing fields as a["label"]
a = t[:]
a
# array([('a', 0.0, 0.0), ('b', 2.5, 1.0), ('c', 5.0, 2.0)],
# dtype=[('label', '|S16'), ('x', '<f8'), ('y', '<f8')])
a.label # AttributeError: 'numpy.ndarray' object has no attribute 'label'
a["label"] # array(['a', 'b', 'c'], dtype='|S16')
type(a) # <type 'numpy.ndarray'>
a.dtype # dtype([('label', '|S16'), ('x', '<f8'), ('y', '<f8')])
a.shape # (3,)
# numpy.core.records.recarray of all records, allowing a.label
a = t[:].view(numpy.recarray)
a
# recarray([('a', 0.0, 0.0), ('b', 2.5, 1.0), ('c', 5.0, 2.0)],
# dtype=[('label', '|S16'), ('x', '<f8'), ('y', '<f8')])
a.label # chararray(['a', 'b', 'c'], dtype='|S16')
a["label"] # recarray(['a', 'b', 'c'], dtype='|S16')
type(a) # <class 'numpy.core.records.recarray'>
a.dtype # dtype([('label', '|S16'), ('x', '<f8'), ('y', '<f8')])
a.shape # (3,)
# t[i] gives a numpy.void, i.e. a single structured scalar rather than an array
a = t[1]
a # ('b', 2.5, 1.0)
a.label # AttributeError: 'numpy.void' object has no attribute 'label'
a["label"] # 'b'
type(a) # <type 'numpy.void'>
type(a.view(numpy.recarray)) # no effect!: <type 'numpy.void'>
a.dtype # dtype([('label', '|S16'), ('x', '<f8'), ('y', '<f8')])
a.shape # ()
# t[i:i+1] is like t[:] but with only a single row
a = t[0:1]
a
# array([('a', 0.0, 0.0)],
# dtype=[('label', '|S16'), ('x', '<f8'), ('y', '<f8')])
a.label # AttributeError: 'numpy.ndarray' object has no attribute 'label'
a["label"] # array(['a'], dtype='|S16')
type(a) # <type 'numpy.ndarray'>
a.dtype # dtype([('label', '|S16'), ('x', '<f8'), ('y', '<f8')])
a.shape # (1,)
# t[i:i+1].view(numpy.recarray) is clunky but allows a.label
a = t[0:1].view(numpy.recarray)
a
# recarray([('a', 0.0, 0.0)],
# dtype=[('label', '|S16'), ('x', '<f8'), ('y', '<f8')])
a.label # chararray(['a'], dtype='|S16')
a["label"] # recarray(['a'], dtype='|S16')
type(a) # <class 'numpy.core.records.recarray'>
a.dtype # dtype([('label', '|S16'), ('x', '<f8'), ('y', '<f8')])
a.shape # (1,)