Re: [Numpy-discussion] Problem with importing csv into datetime64

2011-09-29 Thread Christopher Jordan-Squire
On Wed, Sep 28, 2011 at 9:15 AM, Grové grove.st...@gmail.com wrote:
 Hi,

 I am trying out the latest development version of numpy (2.0 dev):

 np.__version__
 Out[44]: '2.0.0.dev-aded70c'

 I am trying to import CSV data that looks like this:

 date,system,pumping,rgt,agt,sps,eskom_import,temperature,wind,pressure,weather
 2007-01-01 00:30,481.9,481.9,15,SW,1040,Fine
 2007-01-01 01:00,471.9,471.9,15,SW,1040,Fine
 2007-01-01 01:30,455.9,455.9
 etc.

 by using the following code:

 convertdict = {0: lambda s: np.datetime64(s, 'm'),
                1: lambda s: float(s or 0), 2: lambda s: float(s or 0),
                3: lambda s: float(s or 0), 4: lambda s: float(s or 0),
                5: lambda s: float(s or 0), 6: lambda s: float(s or 0),
                7: lambda s: float(s or 0), 8: str, 9: str, 10: str}
 dt = [('date', np.datetime64), ('system', float), ('pumping', float),
       ('rgt', float), ('agt', float), ('sps', float), ('eskom_import', float),
       ('temperature', float), ('wind', str), ('pressure', float),
       ('weather', str)]
 a = np.recfromcsv(fp, dtype=dt, converters=convertdict, usecols=range(0, 11),
                   names=True)

 The dtype it generates for a.date is 'object':

 array([2007-01-01T00:30+0200, 2007-01-01T01:00+0200, 2007-01-01T01:30+0200,
       ..., 2007-12-31T23:00+0200, 2007-12-31T23:30+0200,
       2008-01-01T00:00+0200], dtype=object)

 But I need it to be datetime64, like in this example (but including hours and
 minutes):

 array(['2011-07-11', '2011-07-12', '2011-07-13', '2011-07-14',
       '2011-07-15', '2011-07-16', '2011-07-17'], dtype='datetime64[D]')

 It seems that the CSV import creates an object data type for the 'date' column
 rather than a datetime64 data type. Any ideas on how to fix this?

 Grové


Not sure how big your file is, but you might take a look at the
loadtable branch on my numpy fork:
https://github.com/chrisjordansquire/numpy.

It has a function loadtable, with some docs and tests. It currently
only loads dates, but you could likely modify it to handle datetimes
as well without too much trouble. (Well, it should be pretty simple
once you kinda grok how the function works. Unfortunately it's
somewhat large and complicated, so it might not be what you want if
you just want to load your data quickly and be done with it.)
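
A minimal, untested sketch of that quicker route, assuming the 2.0.0.dev
datetime64 casting behaves like released NumPy: if the loader hands the date
column back as dtype=object (i.e. an array of datetime64 scalars), an explicit
cast to a unit-qualified datetime64 dtype may be all that's needed:

import numpy as np

# illustrative object-dtype column, shaped like what recfromcsv appears
# to return for the 'date' field
obj_dates = np.array([np.datetime64('2007-01-01 00:30', 'm'),
                      np.datetime64('2007-01-01 01:00', 'm')],
                     dtype=object)

# cast to an explicit, unit-qualified datetime64 dtype
dates = obj_dates.astype('datetime64[m]')
print(dates.dtype)   # expected: datetime64[m]

Spelling the column dtype with an explicit unit string, e.g. 'datetime64[m]'
rather than the bare np.datetime64 type, may also help, though I haven't tried
it against that dev build.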

-Chris JS



___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] datetimes with date vs time units, local time, and time zones

2011-09-29 Thread Grové

Hi Mark

Did you ever get to write:

date_as_datetime(datearray, hour, minute, second, microsecond,
                 timezone='local', unit=None, out=None)
and
datetime_as_date(datetimearray, timezone='local', out=None)
?

I am looking for an easy way of using datetime64[m] data to test for business
days and to do half-hourly comparisons.
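
In the meantime, here is a rough, untested sketch of one way I could get at
both, assuming np.is_busday from the business-day API is present in this dev
build and that datetime64 unit casting behaves as in released NumPy:

import numpy as np

# half-hourly timestamps at minute resolution (illustrative values)
stamps = np.array(['2007-01-01 00:30', '2007-01-01 01:00',
                   '2007-01-06 09:30'], dtype='datetime64[m]')

days = stamps.astype('datetime64[D]')   # drop the time-of-day part
business = np.is_busday(days)           # Monday-Friday by default

# index of the half-hour slot within each day (0..47)
slot = (stamps - days).astype('timedelta64[m]').astype(int) // 30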

I am using:

In [181]: np.__version__
Out[181]: '2.0.0.dev-aded70c'


Regards

Grové Steyn

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Comparing NumPy/IDL Performance

2011-09-29 Thread Zachary Pincus
I think the remaining delta between the integer and float boxcar smoothing is 
that the integer version (test 21) still uses median_filter(), while the float 
one (test 22) is using uniform_filter(), which is a boxcar.
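
For anyone comparing the two side by side, a quick illustrative snippet,
assuming the benchmark's median_filter()/uniform_filter() are the
scipy.ndimage ones:

import numpy as np
from scipy import ndimage

# illustrative test image; the benchmark presumably uses its own data
img = np.random.rand(512, 512).astype(np.float32)

# boxcar (moving-average) smoothing -- roughly what IDL's SMOOTH does
boxcar = ndimage.uniform_filter(img, size=5)

# median smoothing -- a rank-based filter, typically much slower
med = ndimage.median_filter(img, size=5)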

Other than that and the slow roll() implementation in numpy, things look pretty 
solid, yes?

Zach


On Sep 29, 2011, at 12:11 PM, Keith Hughitt wrote:

 Thank you all for the comments and suggestions.
 
 First off, I would like to say that I entirely agree with people's
 suggestions about the lack of objectivity in the test design, and the caveat
 about optimizing early. The main reason we put together the Python version of
 the benchmark was as a quick sanity check to make sure that there were no
 major show-stoppers before we began work on the library. We also wanted to
 put together something to show other people who are firmly in the IDL camp
 that this is a viable option.
 
 We did in fact put together another short test suite (test_testr.py &
 time_testr.pro) which consists of operations that are frequently used by us,
 but it too tests only a very small portion of the kinds of things our
 library will eventually do.
 
 That said, I made a few small changes to the original benchmark, based on 
 people's feedback, and put together a new plot.
 
 The changes made include:
 
 1. Using xrange instead of range
 2. Using uniform filter instead of median filter
 3. Fixed a typo for tests 2 & 3 which resulted in slower Python results
 
 Again, note that some of the tests are testing non-numpy functionality. 
 Several of the results still stand out, but overall the results are much
 more reasonable than before.
 
 Cheers,
 Keith
 [Attachment: time_test3_idl_vs_python.png]

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Comparing NumPy/IDL Performance

2011-09-29 Thread Keith Hughitt
Ah. Thanks for catching that!

Otherwise though I think everything looks pretty good.

Thanks all,
Keith

On Thu, Sep 29, 2011 at 12:18 PM, Zachary Pincus zachary.pin...@yale.edu wrote:

 I think the remaining delta between the integer and float boxcar
 smoothing is that the integer version (test 21) still uses median_filter(),
 while the float one (test 22) is using uniform_filter(), which is a boxcar.

 Other than that and the slow roll() implementation in numpy, things look
 pretty solid, yes?

 Zach


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion