On Monday 14 July 2008, Pierre GM wrote:
> On Monday 14 July 2008 09:07:47 Francesc Alted wrote:
> > The advantage of this abstraction is that the user can easily
> > choose the scale of resolution that better fits his need. I'm
> > thinking in providing the next resolutions:
> >
> > ["femtosec", "picosec", "nanosec", "microsec", "millisec", "sec",
> > "min", "hour", "month", "year"]
>
> In TimeSeries, we don't have anything less than a second, but we
> have 'daily', 'business daily', 'weekly' and 'quarterly' resolutions.

Yes, I forgot the "day" resolution. I suppose that "weekly" and
"quarterly" could be added too. However, if we adopt a new way to
specify the resolution (see below), these can be stated as '7d' and
'3m' respectively. Mmh, I'm not sure about "business daily"; it may be
useful in time series, but I can't find a reasonable meaning for it as
a 'time resolution' (which is a different concept from 'time
frequency'). So I'd leave it out.

> A very useful point that Matt Knox had coded is the possibility to
> specify starting points for switching from one resolution to another.
> For example, you can have a series with a 'ANN_MAR' frequency, that
> corresponds to 1 point a year, the year starting in April. When
> switching back to a monthly resolution, the points from January to
> March of the first year will be masked.

Ok. Ann was also suggesting that the origin of time should be
configurable, but here you are talking about *masking* values. Mmm, I
don't think we should try to incorporate masking capabilities in the
NumPy date/time types. At any rate, I haven't thought about the
possibility of having an origin defined by the user, but if we can add
the 'resolution' metainfo, I don't see why we couldn't do the same with
the 'origin' metainfo too.

> Another useful point would be allow the user to define his/her own
> resolution (every 15min, every 12h...). Right now it's a bit clunky
> in TimeSeries, we have to use the lowest resolution of the series
> (min, hour) and leave a lot of blanks (TimeSeries don't have to be
> regularly spaced, but it helps...)

Ok. I see the use case for this, but for implementation purposes we
should come up with a more complete way to specify the resolution than
I had realized before. Hmm, what about the following:

[N]timeunit

where ``timeunit`` can take the values in:

['y', 'm', 'd', 'h', 'm', 's', 'ms', 'us', 'ns', 'fs']

so, for example, '14d' means a resolution of 14 days, and '10ms' means
a resolution of one hundredth of a second. Sounds good to me.

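
For illustration, here is a minimal sketch of how a '[N]timeunit'
string could be split into its parts. The names ``parse_resolution``
and ``TIME_UNITS`` are hypothetical, chosen only for this example;
nothing here is an existing NumPy API. Note that 'm' appears twice in
the list above (month and minute); the sketch accepts it as a single
code and does not try to disambiguate it.

```python
import re

# Unit codes as listed in the proposal. 'm' is listed there for both
# month and minute; this sketch keeps it as one code (an assumption).
TIME_UNITS = {"y", "m", "d", "h", "s", "ms", "us", "ns", "fs"}

# Optional decimal multiplier followed by a lowercase unit code.
_RESOLUTION_RE = re.compile(r"(\d*)([a-z]+)")


def parse_resolution(spec):
    """Split a '[N]timeunit' string into (count, unit); N defaults to 1."""
    match = _RESOLUTION_RE.fullmatch(spec)
    if match is None:
        raise ValueError(f"malformed resolution: {spec!r}")
    digits, unit = match.groups()
    if unit not in TIME_UNITS:
        raise ValueError(f"unknown time unit: {unit!r}")
    count = int(digits) if digits else 1
    if count < 1:
        raise ValueError("the multiplier must be at least 1")
    return count, unit
```

With this, ``parse_resolution('14d')`` gives ``(14, 'd')`` and a bare
``'ms'`` is read as ``(1, 'ms')``.
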
What do other people think?

> > Now, it comes the tricky part: how to integrate the notion
> > of 'resolution' with the 'dtype' data type factory of NumPy?
>
> In TimeSeries, the frequency is stored as an integer. For example, a
> daily frequency is stored as 6000, an annual frequency as 1000, a
> 'ANN_MAR' frequency as 1003...

Well, I initially planned to keep the resolution as an enumerated value
(int8 would be enough), but if the new way to specify resolutions goes
ahead, I'm afraid that we may need a full int64 to store it. Apart from
that, this should not be a problem (in general, the metainfo is a very
tiny part of the space taken by a dataset).

Cheers,

--
Francesc Alted