On Sun, Oct 21, 2012 at 10:41 AM, Francesc Alted <fal...@pytables.org> wrote:

> Hi,
>
> I'm going to give a tutorial on PyTables next Thursday during the PyData
> conference in New York (http://nyc2012.pydata.org/) and I'd like to use
> some real life data files.  So, if you have some public repository with
> data generated with PyTables, please tell me.  I'm looking for files
> that are not very large (< 1GB), and that use the Table object
> significantly.  A small description of the data included will be more
> than welcome too!
>
> Thanks!
>
> --
> Francesc Alted



Hi Francesc.

I've been working on a library for accessing climatology data that
uses pytables to cache data from the USGS. It could easily be used to
create a sample dataset for some area of interest. File size is
determined by how much data gets queried.


The general layout is:

/usgs/sites
- the sites table contains information and metadata about a site


/usgs/values/<AGENCY>/<SITE_CODE>/<PARAMETER_CODE>
- a table containing all the timeseries data for each site and
parameter is created as data are queried
- parameter codes are a bit obscure, but a dict with descriptive
metadata is stashed at table.attrs.variable
- the datetime column has a CSIndex on it and is stored as a string
because some sites have data prior to the year 1901
- pretty inefficient in terms of disk space (lots of large-ish string
columns) because it handles a very general class of data types
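
As a side note on the string datetime column: the snippet below is only a
sketch of why that choice can work, not code from ulmo. It assumes the 1901
limit comes from signed 32-bit seconds since the Unix epoch, and it assumes
fixed-width ISO 8601 text; the actual column format in ulmo may differ, and
the timestamps are made up. It uses only the standard library.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MIN = -2**31

# Earliest instant representable as signed 32-bit seconds since the epoch.
earliest_int32 = EPOCH + timedelta(seconds=INT32_MIN)
print(earliest_int32.isoformat())

# Hypothetical timestamps (assumption: fixed-width ISO 8601 strings).
stamps = [
    "1901-12-14T00:00:00",
    "1899-07-04T12:00:00",
    "2012-10-21T10:41:00",
    "1900-01-01T00:00:00",
]

# Fixed-width ISO 8601 strings sort as text in chronological order,
# so a sorted index on the string column still orders rows by time.
by_text = sorted(stamps)
by_time = sorted(stamps, key=datetime.fromisoformat)
print(by_text == by_time)

# A time range query becomes a plain string comparison.
start, stop = "1899-01-01T00:00:00", "1901-01-01T00:00:00"
pre_1901 = [s for s in by_text if start <= s < stop]
print(pre_1901)
```

The trade-off is the one noted above: strings take more disk space than an
integer timestamp, but they have no lower bound on the year.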


Here's what the code would look like to download and create the hdf5
file for 10 random sites in New York:

import ulmo

# the default location for the hdf5 file is OS dependent, so provide
# the path you want to use
hdf5_file_path = './usgs_data.h5'

# get list of sites in NY
ulmo.usgs.pytables.update_site_list(state_code='NY', path=hdf5_file_path)
sites = ulmo.usgs.pytables.get_sites(path=hdf5_file_path)

# download data for a few random sites
for site in list(sites.keys())[:10]:
    ulmo.usgs.pytables.update_site_data(site, path=hdf5_file_path)
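
The loop above takes whichever ten site codes come first. If a truly random
selection is wanted, something like the following sketch should do. The
`sites` dict here is a made-up stand-in for the mapping returned by
get_sites(), and the seed is fixed only so the selection is repeatable.

```python
import random

# Stand-in for the mapping returned by get_sites(); codes and values
# are invented for illustration.
sites = {"%08d" % n: {"name": "site %d" % n} for n in range(1000, 1050)}

rng = random.Random(42)  # fixed seed only to make the choice repeatable
chosen = rng.sample(sorted(sites), 10)  # ten distinct site codes

for site in chosen:
    # In the real script this would be:
    # ulmo.usgs.pytables.update_site_data(site, path=hdf5_file_path)
    print(site)
```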



The project is on github: https://github.com/swtools/ulmo
and the code that does all the pytables stuff (including the table
descriptions) is here:
https://github.com/swtools/ulmo/blob/master/ulmo/usgs/pytables.py

-andy
