Re: [Pytables-users] PyTables data files for a tutorial

2012-10-21 Thread Anthony Scopatz
Hello Francesc,

I look forward to hearing how your PyData tutorial goes!

Here [1] is a file that stores some basic nuclear data that is freely
redistributable.  It stores atomic weights, bound neutron scattering
lengths, and pre-compiled neutron cross sections (xs) for 5 different
energy regimes.  Everything in here is a table.  The file is rather
small (at about 165 kB).  There are integer, float, and complex columns.

I hope that this helps!

Be Well
Anthony

1. https://s3.amazonaws.com/pyne/prebuilt_nuc_data.h5
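
For anyone who wants to check what is inside a file like this before using it, here is a minimal sketch that lists every Table with its row count and column dtypes. It assumes the file has already been downloaded locally and that a PyTables 3.x install is available (open_file and walk_nodes are the 3.x names). The helper name describe_tables is made up for illustration and is not part of PyTables or of the original message.

```python
import tables  # PyTables, using the 3.x function names


def describe_tables(path):
    """Map each Table's path in the file to its row count and column dtypes."""
    summary = {}
    with tables.open_file(path, mode="r") as h5:
        # walk_nodes with classname="Table" visits only Table leaves
        for table in h5.walk_nodes("/", classname="Table"):
            summary[table._v_pathname] = {
                "nrows": table.nrows,
                "columns": {
                    name: str(table.coldtypes[name]) for name in table.colnames
                },
            }
    return summary
```

Calling describe_tables("prebuilt_nuc_data.h5") on a local copy should show which columns are integer, float, or complex. The sketch only handles flat (non-nested) column layouts.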

On Sun, Oct 21, 2012 at 10:41 AM, Francesc Alted wrote:

> Hi,
>
> I'm going to give a tutorial on PyTables next Thursday during the PyData
> conference in New York (http://nyc2012.pydata.org/) and I'd like to use
> some real life data files.  So, if you have some public repository with
> data generated with PyTables, please tell me.  I'm looking for files
> that are not very large (< 1GB), and that use the Table object
> significantly.  A small description of the data included will be more
> than welcome too!
>
> Thanks!
>
> --
> Francesc Alted
>
>
>
> --
> Everyone hates slow websites. So do we.
> Make your web apps faster with AppDynamics
> Download AppDynamics Lite for free today:
> http://p.sf.net/sfu/appdyn_sfd2d_oct
> ___
> Pytables-users mailing list
> Pytables-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/pytables-users
>


Re: [Pytables-users] PyTables data files for a tutorial

2012-10-21 Thread Andy Wilson
Hi Francesc.

I've been working on a library for accessing climatology data that
uses pytables to cache data from the USGS. It could easily be used to
create a sample dataset for some area of interest. File size is
determined by how much data gets queried.


The general layout is:

/usgs/sites
- the sites table contains information and metadata about a site


/usgs/values///
- a table containing all the timeseries data for each site and
parameter is created as data are queried
- parameter codes are a bit obscure, but a dict with descriptive
metadata is stashed at table.attrs.variable
- the datetime column has a CSIndex on it and is stored as a string
because some sites have data prior to the year 1901
- pretty inefficient in terms of disk space (lots of large-ish string
columns) because it handles a very general class of data types


Here's what the code would look like to download and create the hdf5
file for 10 random sites in New York:

import ulmo

# the default location for the hdf5 file is OS dependent, so provide
# the path you want to use
hdf5_file_path = './usgs_data.h5'

# get list of sites in NY
ulmo.usgs.pytables.update_site_list(state_code='NY', path=hdf5_file_path)
sites = ulmo.usgs.pytables.get_sites(path=hdf5_file_path)

# download data for a few random sites
# (list() keeps the slice working on Python 3, where keys() returns a view)
for site in list(sites.keys())[:10]:
    ulmo.usgs.pytables.update_site_data(site, path=hdf5_file_path)



The project is on github: https://github.com/swtools/ulmo
and the code that does all the pytables stuff (including the table
descriptions) is here:
https://github.com/swtools/ulmo/blob/master/ulmo/usgs/pytables.py

-andy



Re: [Pytables-users] PyTables data files for a tutorial

2012-10-21 Thread Jason Moore
This is a PyTables generated file with data collected from vehicle
(bicycle) dynamics measurements. Meta data are in tables and time series
are stored in array objects.

http://mae.ucdavis.edu/~biosport/InstrumentedBicycleData/InstrumentedBicycleData.h5.bz2

It is about 308 MB compressed and 610 MB uncompressed.
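
Since the download is a .bz2 archive, here is a minimal sketch for unpacking it with the Python standard library in fixed-size chunks, so the roughly 610 MB of output never has to sit in memory at once. The function name decompress_bz2 and the local file names are assumptions for illustration.

```python
import bz2
import shutil


def decompress_bz2(src_path, dst_path, chunk_size=1024 * 1024):
    """Stream-decompress src_path into dst_path, chunk_size bytes at a time."""
    with bz2.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst, chunk_size)
    return dst_path
```

With the archive saved in the current directory, decompress_bz2("InstrumentedBicycleData.h5.bz2", "InstrumentedBicycleData.h5") would write the HDF5 file next to it.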

Jason




-- 
Jason K. Moore, Ph.D.
Personal Website 
Sports Biomechanics Lab, UC Davis
Davis Open Science 
Google Voice: +01 530-601-9791


Re: [Pytables-users] PyTables data files for a tutorial

2012-10-22 Thread Francesc Alted
Hey, thanks to everybody that contributed datasets!  I'll look into them 
and hope to be able to select something to show.

Francesc



-- 
Francesc Alted

