Re: [Numpy-discussion] Problem with importing csv into datetime64
On Wed, Sep 28, 2011 at 9:15 AM, Grové grove.st...@gmail.com wrote:

Hi,

I am trying out the latest development version of numpy 2.0:

    np.__version__
    Out[44]: '2.0.0.dev-aded70c'

I am trying to import CSV data that looks like this:

    date,system,pumping,rgt,agt,sps,eskom_import,temperature,wind,pressure,weather
    2007-01-01 00:30,481.9,481.9,15,SW,1040,Fine
    2007-01-01 01:00,471.9,471.9,15,SW,1040,Fine
    2007-01-01 01:30,455.9,455.9
    etc.

by using the following code:

    convertdict = {0: lambda s: np.datetime64(s, 'm'),
                   1: lambda s: float(s or 0),
                   2: lambda s: float(s or 0),
                   3: lambda s: float(s or 0),
                   4: lambda s: float(s or 0),
                   5: lambda s: float(s or 0),
                   6: lambda s: float(s or 0),
                   7: lambda s: float(s or 0),
                   8: str, 9: str, 10: str}
    dt = [('date', np.datetime64), ('system', float), ('pumping', float),
          ('rgt', float), ('agt', float), ('sps', float),
          ('eskom_import', float), ('temperature', float),
          ('wind', str), ('pressure', float), ('weather', str)]
    a = np.recfromcsv(fp, dtype=dt, converters=convertdict,
                      usecols=range(0, 11), names=True)

The dtype it generates for a.date is 'object':

    array([2007-01-01T00:30+0200, 2007-01-01T01:00+0200,
           2007-01-01T01:30+0200, ...,
           2007-12-31T23:00+0200, 2007-12-31T23:30+0200,
           2008-01-01T00:00+0200], dtype=object)

But I need it to be datetime64, like in this example (but including hours and minutes):

    array(['2011-07-11', '2011-07-12', '2011-07-13', '2011-07-14',
           '2011-07-15', '2011-07-16', '2011-07-17'],
          dtype='datetime64[D]')

It seems that the CSV import stores the dates as embedded Python objects rather than as a datetime64 array. Any ideas on how to fix this?

Grové

---

Not sure how big your file is, but you might take a look at the loadtable branch on my numpy fork: https://github.com/chrisjordansquire/numpy. It has a function loadtable, with some docs and tests. It currently only loads dates, but you could likely modify it to handle datetimes as well without too much trouble.
(Well, it should be pretty simple once you grok how the function works. Unfortunately it's somewhat large and complicated, so it might not be what you want if you just want to load your data quickly and be done with it.)

-Chris JS

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
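[Editor's note: as a quick workaround for the problem above, not taken from the thread, here is a minimal sketch against a recent NumPy. It sidesteps recfromcsv entirely: load the date column as plain strings with np.genfromtxt, then cast to datetime64[m]; NumPy's string-to-datetime cast accepts the 'YYYY-MM-DD hh:mm' form directly. The sample CSV text below is an assumption modelled on the data in the post.]

```python
import io
import numpy as np

# A tiny stand-in for the real file (hypothetical sample data).
csv_text = """date,system,pumping
2007-01-01 00:30,481.9,481.9
2007-01-01 01:00,471.9,471.9
"""

# Load just the first column as fixed-width strings...
raw = np.genfromtxt(io.StringIO(csv_text), delimiter=',',
                    skip_header=1, usecols=(0,), dtype='U16')

# ...then cast the strings to a true datetime64[m] array.
dates = raw.astype('datetime64[m]')

print(dates.dtype)   # datetime64[m], not object
print(dates)
```

The float columns can then be loaded in a second genfromtxt pass (or the same pass with a structured dtype) and kept alongside the dates; the key point is that the datetime column ends up as a native datetime64 array rather than an object array of scalars.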
Re: [Numpy-discussion] datetimes with date vs time units, local time, and time zones
Hi Mark

Did you ever get around to writing:

    date_as_datetime(datearray, hour, minute, second, microsecond,
                     timezone='local', unit=None, out=None)

and

    datetime_as_date(datetimearray, timezone='local', out=None)

? I am looking for an easy way of using datetime64[m] data to test for business days and to do half-hourly comparisons.

I am using:

    In [181]: np.__version__
    Out[181]: '2.0.0.dev-aded70c'

Regards
Grové Steyn
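[Editor's note: the business-day machinery being discussed here eventually shipped as np.is_busday and related functions in NumPy 1.7. Assuming a NumPy with those functions, the half-hourly use case in the question can be sketched like this; the specific dates are illustrative.]

```python
import numpy as np

# Half-hourly timestamps over one week, as datetime64[m].
# 2007-01-01 was a Monday.
times = np.arange('2007-01-01T00:00', '2007-01-08T00:00',
                  np.timedelta64(30, 'm'), dtype='datetime64[m]')

# Truncate to whole days and test for business days (Mon-Fri
# by default; weekmask/holidays arguments can refine this).
days = times.astype('datetime64[D]')
busmask = np.is_busday(days)

# Keep only the half-hourly samples that fall on business days.
weekday_times = times[busmask]
```

Half-hourly comparisons then reduce to ordinary datetime64 arithmetic, e.g. `times - times.astype('datetime64[D]')` gives each sample's timedelta64 offset within its day, which can be compared across days directly.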
Re: [Numpy-discussion] Comparing NumPy/IDL Performance
I think the remaining delta between the integer and float boxcar smoothing is that the integer version (test 21) still uses median_filter(), while the float one (test 22) uses uniform_filter(), which is a boxcar. Other than that and the slow roll() implementation in numpy, things look pretty solid, yes?

Zach

On Sep 29, 2011, at 12:11 PM, Keith Hughitt wrote:

Thank you all for the comments and suggestions. First off, I would like to say that I entirely agree with people's suggestions about the lack of objectiveness in the test design, and the caveat about optimizing early. The main reason we put together the Python version of the benchmark was as a quick sanity check to make sure that there are no major show-stoppers before we began work on the library. We also wanted to put together something to show other people who are firmly in the IDL camp that this is a viable option.

We did in fact put together another short test-suite (test_testr.py, time_testr.pro) which consists of operations that are frequently used by us, but it also tests only a very small portion of the kinds of things our library will eventually do.

That said, I made a few small changes to the original benchmark, based on people's feedback, and put together a new plot. The changes made include:

1. Using xrange instead of range
2. Using a uniform filter instead of a median filter
3. Fixed a typo for tests 2 and 3 which resulted in slower Python results

Again, note that some of the tests are testing non-numpy functionality. Several of the results still stand out, but overall the results are much more reasonable than before.

Cheers,
Keith

[attachment: time_test3_idl_vs_python.png]
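[Editor's note: to make the median-vs-boxcar distinction above concrete, here is a pure-NumPy sketch (the benchmark itself used the ndimage filter functions). A width-3 boxcar is a plain moving average and smears a spike into its neighbours, while a width-3 median filter rejects the spike entirely, which is also why the two have different costs.]

```python
import numpy as np

# A flat signal with a single spike.
x = np.array([0., 0., 1., 10., 1., 0., 0.])

# Width-3 boxcar (uniform) filter: a moving average.
boxcar = np.convolve(x, np.ones(3) / 3.0, mode='same')

# Width-3 median filter (naive loop; edge windows are shorter).
median = np.array([np.median(x[max(i - 1, 0):i + 2])
                   for i in range(len(x))])

# The boxcar averages the spike down (10 -> 4 at its centre),
# while the median filter removes it (10 -> 1).
print(boxcar[3], median[3])
```

The averaging in the boxcar is separable and cheap, whereas a median requires a per-window selection, which is one reason mixing median_filter() into one arm of the benchmark and uniform_filter() into the other skews the integer/float comparison.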
Re: [Numpy-discussion] Comparing NumPy/IDL Performance
Ah. Thanks for catching that! Otherwise, though, I think everything looks pretty good.

Thanks all,
Keith

On Thu, Sep 29, 2011 at 12:18 PM, Zachary Pincus zachary.pin...@yale.edu wrote:

I think the remaining delta between the integer and float boxcar smoothing is that the integer version (test 21) still uses median_filter(), while the float one (test 22) uses uniform_filter(), which is a boxcar. Other than that and the slow roll() implementation in numpy, things look pretty solid, yes?

Zach