Hi,
I have a Python script that essentially converts a CSV file to HDF5 using 
PyTables, but I have a few problems that I haven't been able to solve.


1. Column Order - I don't know the columns that I need to write until 
runtime, so creating an extension of the IsDescription class was a non-starter. 
Therefore, to define my columns, I am passing a dictionary that maps each 
column name to a column object to the create_table method:

      import tables
      from collections import OrderedDict

      h5_file = tables.open_file(filename, mode='w', title='Test File')
      group = h5_file.create_group('/', 'data', 'Data Group')
      # Build the description in the order the columns should appear.
      column_dict = OrderedDict()
      for key in column_names:
          column_dict[key] = create_col(key)
      table = h5_file.create_table(group, 'table', column_dict, 'Table')

create_col is simply a function that returns Int32Col(), Float64Col(), etc., 
depending on some information about the column.  That is working fine.  
However, the columns in the created table are not in the order that I want.  
I used OrderedDict to ensure that the columns are in the dictionary in 
insertion order, but the table doesn't reflect this.  Any ideas on how to 
control the column order if I can't extend IsDescription to create my data type?
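
One direction that might address the ordering, sketched here under 
assumptions: the Col constructors accept a pos argument, and the sketch assumes 
that passing the insertion index as pos is enough to fix the stored order.  The 
column names, the two-argument create_col, and the temporary file path are 
hypothetical stand-ins for my real script.

```python
import os
import tempfile
from collections import OrderedDict

import tables

# Hypothetical column names, listed in the order they should be stored.
column_names = ['zeta', 'alpha', 'label']


def create_col(name, pos):
    """Hypothetical stand-in for my create_col; forwards pos to the column."""
    if name == 'label':
        return tables.StringCol(16, pos=pos)  # fixed-size string, 16 bytes
    if name == 'alpha':
        return tables.Float64Col(pos=pos)
    return tables.Int32Col(pos=pos)


# Give every column an explicit position equal to its insertion index.
column_dict = OrderedDict()
for pos, key in enumerate(column_names):
    column_dict[key] = create_col(key, pos)

# Write to a throwaway file in a temporary directory (assumption for the demo).
filename = os.path.join(tempfile.mkdtemp(), 'test.h5')
h5_file = tables.open_file(filename, mode='w', title='Test File')
group = h5_file.create_group('/', 'data', 'Data Group')
table = h5_file.create_table(group, 'table', column_dict, 'Table')

# Column names as the table reports them after creation.
stored_order = list(table.colnames)
h5_file.close()
print(stored_order)
```

If pos behaves as assumed, stored_order should match column_names instead of 
coming back sorted by name.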


2. Variable length strings - Strings work fine when I give them a maximum 
size.  This was fine to get something up and running, but the strings really 
need to be variable length.  Is there a way to have VLString columns within a 
table?  I see examples of VLStringAtom being passed as an atom to 
h5file.create_vlarray, but I don't see similar examples for table columns and 
there isn't a Col class for this type.  Any help is appreciated.


3. "Blanks" in my csv file - The csv files I'm converting contain null or 
blank values.  If you imagine loading the file in Excel or a similar program, 
some cells will be blank.  So, even if column X is an Int32Col, there may be 
blanks.  How would I handle this using PyTables?  I suppose I can substitute 
some value for blank cells, but I would like to avoid that if possible.
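
To make the fallback concrete, here is a minimal sketch of the substitution I 
have in mind, using only the standard library.  It assumes that the smallest 
Int32 value never occurs in the real data, and it keeps a parallel list of 
flags so that blanks stay distinguishable from real values.  The names 
INT32_BLANK and parse_int_column and the sample data are hypothetical.

```python
import csv
import io

# Assumed placeholder for a blank Int32 cell: the smallest 32-bit integer.
INT32_BLANK = -2**31

# Hypothetical sample data; the second row has a blank 'count' cell.
csv_text = "id,count\n1,10\n2,\n3,30\n"


def parse_int_column(rows, name, blank_value=INT32_BLANK):
    """Return (values, is_blank) for one integer column.

    Blank cells are replaced by blank_value and flagged in is_blank.
    """
    values = []
    is_blank = []
    for row in rows:
        cell = (row[name] or '').strip()
        blank = cell == ''
        values.append(blank_value if blank else int(cell))
        is_blank.append(blank)
    return values, is_blank


rows = list(csv.DictReader(io.StringIO(csv_text)))
counts, counts_blank = parse_int_column(rows, 'count')
print(counts)
print(counts_blank)
```

The flags could then be written to an extra BoolCol next to the Int32Col, at 
the cost of one additional column per nullable column.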


Help on any of these items is greatly appreciated.  I know that using h5py 
(would I have to use the low-level API?) instead of pytables would probably 
solve these problems, but am trying to avoid that since pytables has otherwise 
been so easy to use.

Thanks in advance,
Sarah


