Hi Nitin,

Yes, more rows can easily be appended to HDF5 files generated by pandas, using the HDFStore.append() method (as shown in the documentation and in my examples).
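A minimal sketch of the append step, for illustration only. The file name, the key "align", and the column names are made up here and are not taken from csv_demo.py or from your attached files. It assumes pandas with PyTables installed. Note that appended rows go at the end of the table, and that string columns get a fixed width when the table is first created, hence the `min_itemsize` hint.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical file and key names, chosen only for this sketch.
h5_path = os.path.join(tempfile.mkdtemp(), "demo.h5")

first = pd.DataFrame(
    {"x": np.arange(3, dtype="float64"), "label": ["a", "b", "c"]}
)
more = pd.DataFrame(
    {"x": np.arange(3, 5, dtype="float64"), "label": ["d", "e"]},
    index=[3, 4],
)

with pd.HDFStore(h5_path, mode="w") as store:
    # append() stores the data in pandas' "table" format, which is what
    # makes later appends possible. min_itemsize reserves room for
    # strings of up to 16 characters in the "label" column.
    store.append("align", first, min_itemsize={"label": 16})
    # A second call with the same key adds the new rows at the end.
    store.append("align", more)

with pd.HDFStore(h5_path, mode="r") as store:
    df = store.select("align")             # read the whole table
    tail = store.select("align", start=3)  # read only rows from position 3 on

print(len(df), len(tail))
```

Reading only part of the table with `start`/`stop` keeps memory use low when the stored data is large.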
Regarding visualizations, pandas uses its own format on top of HDF5 to store dataframes, which is why a standard HDF5 viewer (like HDFView) does not show the table (i.e. compound type) that you might expect. For this, it is better to use pandas itself to read the HDF5 dataset (or parts of it) and then visualize the resulting dataframe with one of the many existing tools that interact well with pandas:

http://pandas.pydata.org/pandas-docs/stable/ecosystem.html#visualization

Take your time to decide which tool works best for your case. Meanwhile, you can have a glance at the kind of plots that plotly can produce with HDF5 files written by pandas:

https://plot.ly/python/pytables

In general, if you want to proceed down the pandas path, you may want to ask on the pandas mailing list, where far more people will be ready to help you.

Francesc Alted

________________________________
From: Hdf-forum <[email protected]> on behalf of nitin chandra <[email protected]>
Sent: Wednesday, February 1, 2017 8:04:58 PM
To: HDF Users Discussion List
Subject: Re: [Hdf-forum] CSV data into HDF5 data structure and files

Hi Francesc,

I tried your example as it is; I could not find time to modify it or try something new.

I ran

$ python csv_demo.py

It created a CSV file with 10 columns, populated with random numbers. The demo.h5 file was created as well, and I used HDFView 2.9 to look at its contents. It shows a directory (table) and a data table (table).

The data table has 2 columns:

index   | value_block_0
empty   | no value
no data | but 10 commas

So that I can relate your guidance to my issue, please find attached 2 sample files.

Also, note the first row in the attached CSVs; it was created to initialise the start point of the data sequence. Would it be good practice to have it in the h5 tables as well? The last column has string values, and I need them.
ALIGN data goes into file1 and GRADE data into file2, so I am looking for a write function to write into the respective tables and a read function to read from them.

Once the data is in the H5 file, can I insert/add/append a new row between other rows or at the end of the file? Which editor or method should I use to do that?

Thank you,

Nitin

On 30 January 2017 at 23:01, nitin chandra <[email protected]> wrote:
> Thank you Francesc,
>
> Please give me 2-3 days to try your example ... do some reading and
> tests as per the link mentioned.
>
> I shall repost soon.
>
> Thank you
>
> Nitin
>
> On 30 January 2017 at 17:14, Francesc Altet <[email protected]> wrote:
>> Hi Nitin,
>>
>> I think before getting into details, you need to look into how to
>> efficiently read and write data from CSV files into HDF5 in Python. For
>> this, pandas is a great library to use. My advice is to have a look at the
>> excellent documentation on the pandas website:
>>
>> http://pandas.pydata.org/pandas-docs/stable/io.html
>>
>> In particular, you want to use `pandas.read_csv()`, which is one of the
>> fastest ways to read CSV files that I am aware of. Also, for storing the
>> data in HDF5, `pandas.HDFStore()` comes in handy because it can generate HDF5
>> files out of pandas Dataframes. In addition, in order to avoid loading all
>> the data into a Dataframe in memory, you want to use the `chunksize` keyword,
>> which allows you to read the CSV files in chunks before storing.
>>
>> I have prepared an example for you (attached) so that you can have a look at
>> how to use all of this (it is simpler than it may seem). Here is the
>> output on my machine:
>>
>> $ python csv_demo.py
>> CSV creation time: 1.491 (67.092 Krow/s)
>> CSV reading time: 0.134 (748.360 Krow/s)
>> HDF5 store time: 0.322 (310.228 Krow/s)
>> HDF5 read time: 0.006 (15622.990 Krow/s)
>>
>> so, once the data is stored in HDF5, the read times will be much faster than
>> using CSV (as expected).
>>
>> HTH,
>>
>> Francesc
>>
>> _______________________________________________
>> Hdf-forum is for HDF software users discussion.
>> [email protected]
>> http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
>> Twitter: https://twitter.com/hdf5
