On 06/10/2011 09:18 AM, Mark Wiebe wrote:
On Fri, Jun 10, 2011 at 12:56 AM, Ralf Gommers <ralf.gomm...@googlemail.com> wrote:



    On Fri, Jun 10, 2011 at 1:54 AM, Mark Wiebe <mwwi...@gmail.com> wrote:

        On Thu, Jun 9, 2011 at 5:21 PM, Ralf Gommers
        <ralf.gomm...@googlemail.com> wrote:



            On Thu, Jun 9, 2011 at 11:54 PM, Mark Wiebe
            <mwwi...@gmail.com> wrote:

                On Thu, Jun 9, 2011 at 4:27 PM, Ralf Gommers
                <ralf.gomm...@googlemail.com> wrote:



                    On Thu, Jun 9, 2011 at 10:58 PM, Mark Wiebe
                    <mwwi...@gmail.com> wrote:

                        On Thu, Jun 9, 2011 at 3:41 PM, Christopher
                        Barker <chris.bar...@noaa.gov> wrote:

                    Your branch works fine for me (OS X, py2.6), no
                    failures. Only a few deprecation warnings like:
                    
/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/unittest.py:336:
                    DeprecationWarning: DType strings 'O4' and 'O8'
                    are deprecated because they are platform specific.
                    Use 'O' instead
                      callableObj(*args, **kwargs)


                It looks like there are some '|O4' dtypes in
                'lib/tests/test_format.py', testing the .npy file
                format. I'm not sure why I'm not getting this warning
                though.

                            Mark Wiebe wrote:
                            > Because of the nature of datetime and
                            timedelta, arange has to be
                            > slightly different than with all the
                            other types. In particular, for
                            > datetime the primary signature is
                            np.arange(datetime, datetime, timedelta).
                            >
                            > I've implemented a simple extension
                            which allows for another way to
                            > specify a date range, as
                            np.arange(datetime, timedelta, timedelta).

                    Did you think about how to document which of these
                    basic functions work with datetime? I don't think
                    that belongs in the docstrings, but it may then be
                    hard for the user to figure out which functions
                    accept datetimes. And there will be no usage
                    examples in the docstrings.


                 I think documenting it in a 'datetime' section of the
                arange documentation would be reasonable. The main
                datetime documentation page would also mention the
                functions that are most useful.

                    Besides docs, I am not sure about your choice to
                    modify functions like arange instead of writing a
                    module of wrapper functions for them that know
                    what to do with the dtype. If you have a module
                    you can group all relevant functions, so they're
                    easy to find. Plus it's more future-proof - if at
                    some point numpy grows another new dtype, just
                    create a new module with wrapper funcs for that dtype.


                The facts that datetime and timedelta are related in a
                particular way different from other data types, and
                that they are parameterized types, both contribute to
                them not fitting naturally the current structure of
                NumPy. I'm not sure I understand the module idea,


            Basically, use np.datetime.arange, which understands the
            dtype, then calls np.arange under the hood. Or it is just its
            own function, like the dtrange() Robert just suggested.
            It's pretty much the same as for the ma module, which
            reimplements or wraps many numpy functions that do not
            understand masked arrays.


        I'm not a big fan of the way the ma module works; it doesn't
        integrate naturally and orthogonally with all the other
        features of NumPy. It's also an array subtype, quite different
        from a dtype. We don't have np.bool.arange, np.int8.arange,
        etc, and the abstraction used by arange built into the custom
        data type mechanism is too weak to support the needs of datetime.

        I'd like to use the requirements of datetime as a guide
        to molding the future design of the data type system, and if
        we make datetime a second-class citizen because it doesn't
        behave like a float, we're not going to be able to discover
        the possibilities.

                I would rather think that since it's a built-in NumPy
                data type, it should work with the regular NumPy
                functions wherever that makes sense.


            That doesn't make sense to me. Being a dtype that happens
            to be shipped with numpy doesn't make it more special than
            other dtypes.


        This isn't making it more special, it's just conforming the
        natural NumPy way to how datetime/timedelta operates.

    Maybe I'm misunderstanding this, and once you make a function work
    for datetime it would also work for other new dtypes. But my
    impression is that that's not the case. Let's say I make a new
    dtype with distance instead of time attached. Would I be able to
    use it with arange, or would I have to go in and change the arange
    implementation again to support it?


Ok, I think I understand the point you're driving at now. We need NumPy to have the flexibility so that external plugins defining custom data types can do the same thing that datetime does; having datetime be special compared to those is undesirable. This I wholeheartedly agree with, and the way I'm coding datetime is driving in that direction.

The state of the NumPy codebase, however, prevents jumping straight to such a solution, since there are already several mechanisms and layers of such abstraction which themselves do not satisfy the needs of datetime and other similar types. Also, arange is just one of a large number of functions which could be extended individually for different types. For example, if one wanted to make a unit quaternion data type for manipulating rotations, its needs would be significantly different. Because I can't see the big picture from within the world of datetime, and because such generalization takes a great amount of effort, I'm instead making these changes minimally invasive on the current codebase.

This approach is along the lines of "lifting" in generic programming. First you write your algorithm in the specific domain, and work out how it behaves there. Then, you determine the minimal requirements of the types involved, and abstract the algorithm appropriately. Jumping to the second step directly is generally too difficult; an incremental path to the final goal must be used.

http://www.generic-programming.org/about/intro/lifting.php
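
As a rough illustration of the two steps, here is a pure-Python sketch. It is not NumPy's arange implementation; the names int_range and lifted_range are hypothetical, and only positive steps are handled. The first function leans on integer arithmetic to compute its length; the second keeps only the two requirements that survive lifting, namely ordering (a < b) and addition of a step.

```python
from datetime import date, timedelta


def int_range(start, stop, step):
    """Specific version: integers only, positive step.

    The length comes from integer ceiling division, which ties this
    code to int arithmetic.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    count = max(0, (stop - start + step - 1) // step)
    return [start + i * step for i in range(count)]


def lifted_range(start, stop, step):
    """Lifted version: needs only `a < b` and `value + step`.

    `step` may have a different type from `start`, as with
    date + timedelta.
    """
    if not (start < start + step):
        raise ValueError("step must move forward")
    values = []
    current = start
    while current < stop:
        values.append(current)
        current = current + step
    return values


ints = lifted_range(0, 6, 2)
days = lifted_range(date(2011, 6, 9), date(2011, 6, 12), timedelta(days=1))
```

The lifted version accepts both integers and dates without knowing about either, which is the property a future custom data type mechanism would need.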

Cheers,
Mark


    Ralf



    _______________________________________________
    NumPy-Discussion mailing list
    NumPy-Discussion@scipy.org <mailto:NumPy-Discussion@scipy.org>
    http://mail.scipy.org/mailman/listinfo/numpy-discussion



I have been following the multiple date/time discussions with some interest, as it is clear there is not 'one way' (perhaps it's Dutch). But I do keep coming back to Chris's concepts of time as a strict unit of measure and time as a calendar. So I do think that these types of changes are rather premature without defining some base measurement of time, probably something like Unix time or International Atomic Time (TAI), but not UTC, due to leap seconds (http://en.wikipedia.org/wiki/Leap_second).

Leap seconds make using UTC rather problematic for a couple of reasons:
1) It's essentially only historical. A range of the seconds in December 2011 computed 'now' in June 2011 using UTC might be different from a range calculated in a couple of weeks if a leap second is added to December 2011.
2) There is also the issue that 23:59:60 December 31, 2008 UTC is a valid time, but the same time of day is not valid in other years like 2009 and 2010. It also means that you have to be careful when doing experiments that require accuracy of a second or less, because a 1 second gap could be recorded as a 2 second gap.
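
To make the off-by-one-second risk concrete, here is a small sketch using only Python's standard datetime module, which knows nothing about leap seconds. The one-entry table LEAP_SECOND_ENDS and the helpers naive_gap and elapsed_gap are assumptions for illustration; a real table would have many more entries. It shows the case where two seconds of elapsed time are measured as one.

```python
from datetime import datetime

# Assumption for illustration: a table holding only the leap second
# mentioned above, inserted at the end of 31 December 2008 (UTC).
# Each entry is the first instant after the inserted second.
LEAP_SECOND_ENDS = [datetime(2009, 1, 1, 0, 0, 0)]


def naive_gap(t0, t1):
    """Seconds between two UTC labels, ignoring leap seconds."""
    return (t1 - t0).total_seconds()


def elapsed_gap(t0, t1):
    """Seconds between two UTC labels, counting inserted leap seconds."""
    leaps = sum(1 for boundary in LEAP_SECOND_ENDS if t0 < boundary <= t1)
    return naive_gap(t0, t1) + leaps


before = datetime(2008, 12, 31, 23, 59, 59)
after = datetime(2009, 1, 1, 0, 0, 0)

# The leap second itself cannot even be written down with this type.
try:
    datetime(2008, 12, 31, 23, 59, 60)
    leap_representable = True
except ValueError:
    leap_representable = False
```

Here naive_gap(before, after) is 1.0 while elapsed_gap(before, after) is 2.0, so the two clocks disagree by exactly the inserted second.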

The other issue is how do you define the np.arange step argument, since that can be in different scales such as months, years, seconds? Can a user specify days and get half-days (like 1.5 days), or must these be 'integer' days?
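
One possible answer, sketched in plain Python under the assumption that a step is stored as an integer count of a fixed-length unit: 1.5 days is then rejected in days but can be written as 36 hours. The names UNIT_SECONDS, make_step and date_range are hypothetical and this is not the NumPy API. Months and years are left out on purpose, because they are not a fixed number of seconds.

```python
from datetime import datetime, timedelta

# Hypothetical unit table; only fixed-length units are included.
UNIT_SECONDS = {"D": 86400, "h": 3600, "m": 60, "s": 1}


def make_step(count, unit):
    """Build a step from an integer count of `unit`; reject fractions."""
    if isinstance(count, bool) or not isinstance(count, int):
        raise TypeError("count must be an integer; use a finer unit instead")
    return timedelta(seconds=count * UNIT_SECONDS[unit])


def date_range(start, stop, step):
    """Half-open range [start, stop) advancing by a positive step."""
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    values = []
    current = start
    while current < stop:
        values.append(current)
        current += step
    return values


# "1.5 days" expressed as an integer number of hours.
half_days = date_range(
    datetime(2011, 6, 10), datetime(2011, 6, 13), make_step(36, "h")
)
```

With this choice the user picks the unit fine enough for the step they want, and the range itself never has to deal with fractional counts.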

Bruce
