On Sun, Oct 21, 2012 at 10:41 AM, Francesc Alted <fal...@pytables.org> wrote:
> Hi,
>
> I'm going to give a tutorial on PyTables next Thursday during the PyData
> conference in New York (http://nyc2012.pydata.org/) and I'd like to use
> some real life data files. So, if you have some public repository with
> data generated with PyTables, please tell me. I'm looking for files
> that are not very large (< 1GB), and that use the Table object
> significantly. A small description of the data included will be more
> than welcome too!
>
> Thanks!
>
> --
> Francesc Alted

Hi Francesc. I've been working on a library for accessing climatology data
that uses pytables to cache data from the USGS. It could easily be used to
create a sample dataset for some area of interest. File size is determined
by how much data gets queried.

The general layout is:

/usgs/sites
  - the sites table contains information and metadata about a site

/usgs/values/<AGENCY>/<SITE_CODE>/<PARAMETER_CODE>
  - a table containing all the timeseries data for each site and parameter
    is created as data are queried
  - parameter codes are a bit obscure, but a dict with descriptive metadata
    is stashed at table.attrs.variable
  - the datetime column has a CSIndex on it and is stored as a string
    because some sites have data prior to the year 1901
  - pretty inefficient in terms of disk space (lots of large-ish string
    columns) because it handles a very general class of data types

Here's what the code would look like to download and create the hdf5 file
for 10 random sites in New York:

    import ulmo

    # the default location for the hdf5 file is OS dependent, so provide
    # the path you want to use
    hdf5_file_path = './usgs_data.h5'

    # get list of sites in NY
    ulmo.usgs.pytables.update_site_list(state_code='NY', path=hdf5_file_path)
    sites = ulmo.usgs.pytables.get_sites(path=hdf5_file_path)

    # download data for a few random sites
    # (list() so that slicing the keys also works on Python 3)
    for site in list(sites.keys())[:10]:
        ulmo.usgs.pytables.update_site_data(site, path=hdf5_file_path)

The project is on GitHub: https://github.com/swtools/ulmo and the code that does all the pytables
stuff (including the table descriptions) is here:
https://github.com/swtools/ulmo/blob/master/ulmo/usgs/pytables.py

-andy

_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users
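
A note on the string-typed datetime column mentioned in the layout: the message does not spell out why pre-1901 data forces a string, but one plausible reading is that a signed 32-bit count of seconds since 1970 cannot reach back that far. The sketch below, using only the standard library, shows that limit and checks that fixed-width ISO 8601 strings sort in chronological order, which is what would let a sorted index on such a column still serve range queries. The helper names and the string format are assumptions for illustration, not necessarily what ulmo uses.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)

# Earliest instant a signed 32-bit count of epoch seconds can represent.
INT32_MIN_INSTANT = EPOCH + timedelta(seconds=-2**31)


def fits_in_int32_seconds(dt):
    """Return True if dt can be stored as signed 32-bit seconds since 1970."""
    seconds = (dt - EPOCH).total_seconds()
    return -2**31 <= seconds <= 2**31 - 1


def to_sortable_string(dt):
    """Fixed-width ISO 8601 text (hypothetical format, assumed for this sketch)."""
    return dt.isoformat(timespec='seconds')


# Made-up timestamps, one of them older than the 32-bit limit.
readings = [
    datetime(2012, 10, 21, 10, 41),
    datetime(1895, 6, 1),
    datetime(1950, 1, 1),
]

as_text = [to_sortable_string(r) for r in readings]
chronological = [to_sortable_string(r) for r in sorted(readings)]

print(INT32_MIN_INSTANT)            # 1901-12-13 20:45:52
for r in readings:
    print(to_sortable_string(r), fits_in_int32_seconds(r))
print(sorted(as_text) == chronological)
```

Plain string comparison agrees with chronological order here only because every field is zero-padded to a fixed width; a format such as "6/1/1895" would not sort correctly.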