Re: [Python-Dev] email package status in 3.X

2010-06-21 Thread Jess Austin
On Mon, Jun 22, 2010 at 7:27:31 PM, Steven D'Aprano  wrote:
> On Tue, 22 Jun 2010 08:03:58 am Nick Coghlan wrote:
>> So, to boil down the ebytes idea, it is basically a request for a
>> second string type that holds an octet stream plus an encoding name,
>> rather than a Unicode character stream.
>
> Do any other languages have any equivalent to this ebytes type?

Ruby seems to do this:

http://yokolet.blogspot.com/2009/07/design-and-implementation-of-ruby-m17n.html

I don't use ruby myself, and I'm probably missing some subtle flaws,
but the exposition at that link makes sense to me.
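
To make the "octet stream plus an encoding name" idea concrete, here is a rough sketch of what such a value could look like in Python. The class name EncodedBytes, its methods, and the refuse-to-mix policy are all invented for illustration; this is neither the ebytes proposal itself nor Ruby's actual API.

```python
class EncodedBytes:
    """Hypothetical 'ebytes'-style value: raw octets plus the name of their encoding."""

    def __init__(self, data: bytes, encoding: str):
        self.data = data
        self.encoding = encoding

    def to_text(self) -> str:
        # Decoding to Unicode happens only on request.
        return self.data.decode(self.encoding)

    def transcode(self, new_encoding: str) -> "EncodedBytes":
        # Re-encode by going through Unicode text.
        return EncodedBytes(self.to_text().encode(new_encoding), new_encoding)

    def __add__(self, other):
        if not isinstance(other, EncodedBytes):
            return NotImplemented
        # One possible policy (an assumption): never mix encodings silently.
        if other.encoding != self.encoding:
            raise ValueError("encodings differ; transcode one operand first")
        return EncodedBytes(self.data + other.data, self.encoding)


greeting = EncodedBytes("café".encode("latin-1"), "latin-1")
as_utf8 = greeting.transcode("utf-8")
```

The point of the sketch is only that the octets stay untouched until someone asks for text or for a different encoding.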

cheers,
Jess
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Issue5434: datetime.monthdelta

2009-04-22 Thread Jess Austin
On Thu, Apr 16, 2009 at 8:01 PM, Jess Austin  wrote:
> These operations are useful in particular contexts.  What I've
> submitted is also useful, and currently isn't easy in core,
> batteries-included python.  While I would consider the foregoing
> interpretation of the Zen to be backwards (this doesn't add another
> way to do something that's already possible, it makes possible
> something that currently encourages one to pull her hair out), I
> suppose it doesn't matter.  If adding a class and a function to a
> module will require extended advocacy on -ideas and c.l.p, I'm
> probably not the person for the job.
>
> If, on the other hand, one of the committers wants to toss this in at
> some point, whether now or 3 versions down the road, the patch is up
> at bugs.python.org (and I'm happy to make any suggested
> modifications).  I'm glad to have written this; I learned a bit about
> CPython internals and scraped a layer of rust off my C skills.  I will
> go ahead and backport the python-coded version to 2.3.  I'll continue
> this conversation with whomever for however long, but I suspect this
> topic will soon have worn out its welcome on python-dev.


I've uploaded the backported python version source distribution to
PyPI, http://pypi.python.org/pypi?name=MonthDelta&:action=display with
better-formatted documentation at
http://packages.python.org/MonthDelta/

"easy_install MonthDelta" works too.

cheers,
Jess


Re: [Python-Dev] Issue5434: datetime.monthdelta

2009-04-16 Thread Jess Austin
On Thu, Apr 16, 2009 at 7:18 PM,   wrote:
>
>    >> I have this funny feeling that arithmetic using monthdelta wouldn't
>    >> always be intuitive.
>
>    Jess> I think that's true, especially since these calculations are not
>    Jess> necessarily invertible:
>
>    >>> date(2008, 1, 30) + monthdelta(1)
>    datetime.date(2008, 2, 29)
>    >>> date(2008, 2, 29) - monthdelta(1)
>    datetime.date(2008, 1, 29)
>
>    Jess> It could be that non-intuitivity is inherent in the problem of
>    Jess> dealing with dates and months.
>
> To which I would respond:
>
>    >>> import this
>    The Zen of Python, by Tim Peters
>
>    ...
>    In the face of ambiguity, refuse the temptation to guess.
>    There should be one-- and preferably only one --obvious way to do it.
>    Although that way may not be obvious at first unless you're Dutch.
>    ...
>
> From the discussion I've seen so far, it's not clear that there is one
> obvious way to do it, and the ambiguity of the problem forces people to
> guess.
>
> My recommendations after letting it roll around in the back of my brain for
> the day:
>
>    * I think it would be best to leave the definition of monthdelta up to
>      individual users.  That is, add nothing to the datetime module and let
>      them write a function which does what they want it to do.
>
>    * The idea/implementation probably needs to bake on the python-ideas
>      list and perhaps comp.lang.python for a bit to see if some consensus
>      can be reached on reasonable functionality.

So far, all the other solutions to the problem that have been
mentioned are easily supported in current python.

Raise an exception when a calculation results in an invalid date:

>>> dt = date(2008, 1, 31)
>>> dt.replace(month=dt.month + 1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: day is out of range for month


Add exactly 30 days to a date:

>>> dt + timedelta(30)
datetime.date(2008, 3, 1)


These operations are useful in particular contexts.  What I've
submitted is also useful, and currently isn't easy in core,
batteries-included python.  While I would consider the foregoing
interpretation of the Zen to be backwards (this doesn't add another
way to do something that's already possible, it makes possible
something that currently encourages one to pull her hair out), I
suppose it doesn't matter.  If adding a class and a function to a
module will require extended advocacy on -ideas and c.l.p, I'm
probably not the person for the job.
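
For readers without the patch, the "last valid day of the month" behaviour can be approximated with the standard library alone. This is a minimal sketch, not the patch's C implementation; the function name add_months is made up here.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Shift d by a whole number of months, clamping to the last valid day.

    Hypothetical helper: approximates `d + monthdelta(months)` from the patch.
    """
    # Count months from year 0 so that divmod also handles negative offsets.
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return d.replace(year=year, month=month, day=min(d.day, last_day))


print(add_months(date(2008, 1, 31), 1))    # 2008-02-29
print(add_months(date(2008, 2, 29), -1))   # 2008-01-29
```

Because it goes through replace(), the same helper should also accept datetime objects.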

If, on the other hand, one of the committers wants to toss this in at
some point, whether now or 3 versions down the road, the patch is up
at bugs.python.org (and I'm happy to make any suggested
modifications).  I'm glad to have written this; I learned a bit about
CPython internals and scraped a layer of rust off my C skills.  I will
go ahead and backport the python-coded version to 2.3.  I'll continue
this conversation with whomever for however long, but I suspect this
topic will soon have worn out its welcome on python-dev.

cheers,
Jess


Re: [Python-Dev] Issue5434: datetime.monthdelta

2009-04-16 Thread Jess Austin
Antoine Pitrou  wrote:
> Jess Austin writes:
>>
>> What other behavior options besides "last-valid-day-of-the-month"
>> would you like to see?
>
> IMHO, the question is rather what the use case is for the behaviour you are
> proposing. In which kind of situation is it acceptable to turn 31/2 silently
> into 29/2?

I have worked in utility/telecom billing, and needed to examine large
numbers of invoice dates, fulfillment dates, disconnection dates,
payment dates, collection event dates, etc.  There would often be
particular rules for the relationships among these dates, and since
many companies generate invoices every day of the month, you couldn't
rely on rules like "this always happens on the 5th".  Here is an
example (modified) from the doc page.  We want to find missing
invoices:

>>> invoices = {123: [date(2008, 1, 31),
...                   date(2008, 2, 29),
...                   date(2008, 3, 31),
...                   date(2008, 4, 30),
...                   date(2008, 5, 31),
...                   date(2008, 6, 30),
...                   date(2008, 7, 31),
...                   date(2008, 12, 31)],
...             456: [date(2008, 1, 1),
...                   date(2008, 5, 1),
...                   date(2008, 6, 1),
...                   date(2008, 7, 1),
...                   date(2008, 8, 1),
...                   date(2008, 11, 1),
...                   date(2008, 12, 1)]}
>>> for account, dates in invoices.items():
...     a = dates[0]
...     for b in dates[1:]:
...         if b - monthdelta(1) > a:
...             print('account', account, 'missing between', a, 'and', b)
...         a = b
...
account 456 missing between 2008-01-01 and 2008-05-01
account 456 missing between 2008-08-01 and 2008-11-01
account 123 missing between 2008-07-31 and 2008-12-31


In general, sometimes we care more about the number of months that
separate dates than we do about the exact dates themselves.  This is
perhaps not the most common situation for date calculations, but it
does come up for some of us.  I tired of writing one-off solutions
that would fail in unexpected corner cases, so I created this patch.
Paul Moore has also described his favorite use-case for this
functionality.

cheers,
Jess


Re: [Python-Dev] Issue5434: datetime.monthdelta

2009-04-16 Thread Jess Austin
Jon Ribbens  wrote:
> On Thu, Apr 16, 2009 at 12:10:36PM +0400, Oleg Broytmann wrote:
>> > This patch adds a "monthdelta" class and a "monthmod" function to the
>> > datetime module.  The monthdelta class is much like the existing
>> > timedelta class, except that it represents months offset from a date,
>> > rather than an exact period offset from a date.
>>
>>    I'd rather see the code merged with timedelta: timedelta(months=n).
>
> Unfortunately, that's simply impossible. A timedelta is a fixed number
> of seconds, and the time between one month and the next varies.

I agree.


> I am very much in favour of there being the ability to add months to
> dates though. Obviously there is the question of what to do when you
> move forward 1 month from the 31st January; in my opinion an optional
> argument to specify different behaviours would be nice.

Others have suggested raising an exception when a month calculation
lands on an invalid date.  Python already has that; it's spelled like
this:

>>> dt = date(2008, 1, 31)
>>> dt.replace(month=dt.month + 1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: day is out of range for month

What other behavior options besides "last-valid-day-of-the-month"
would you like to see?

cheers,
Jess


Re: [Python-Dev] Python-Dev Digest, Vol 69, Issue 143

2009-04-16 Thread Jess Austin
Jared Grubb  wrote:
> On 16 Apr 2009, at 11:42, Paul Moore wrote:
>> The key thing missing (I believe) from dateutil is any equivalent of
>> monthmod.
>
>
> I agree with that. It's well-defined and it makes a lot of sense. +1
>
> But, I don't think monthdelta can be made to work... what should the
> following be?

>>> print(date(2008,1,30) + monthdelta(1))
2008-02-29
>>> print(date(2008,1,30) + monthdelta(2))
2008-03-30
>>> print(date(2008,1,30) + monthdelta(1) + monthdelta(1))
2008-03-29

This is a perceptive observation: in the absence of parentheses to
dictate a different order of operations, the third quantity will
differ from the second.  Furthermore, this won't _always_ be true,
just for dates near the end of the month, which is nonintuitive.
(Incidentally, this is another reason why this functionality should
not just be lumped into timedelta; guarantees that have long existed
for operations with timedelta would no longer hold if it tried to deal
with months.)

I find that date calculations involving months involve a certain
amount of inherent confusion.  I've tried to reduce this by
introducing well-specified functionality that will allow accurate
reasoning, as part of the core's included batteries.  I think that one
who uses these objects will develop an intuition and write accurate
code quickly.  It is nonintuitive that order of operation matters for
addition of months, just as it matters for subtraction and division of
all objects, but with the right tools we can deal with this.  An
interesting consequence is that if I want to determine if date b is
more than a month after date a, sometimes I should use:

b - monthdelta(1) > a

rather than

a + monthdelta(1) < b

[Consider a list of run dates for a process that should run the last
day of every month: "a" might be date(2008, 2, 29) while "b" is
date(2008, 3, 31). In this case the two expressions would have
different values.]
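
The difference between the two comparisons can be checked without the patch. The sketch below assumes a hypothetical helper shift_months that mimics adding a monthdelta with last-valid-day clamping; it is not the patch's code.

```python
import calendar
from datetime import date


def shift_months(d, months):
    # Hypothetical stand-in for `d + monthdelta(months)`.
    year, month0 = divmod(d.year * 12 + d.month - 1 + months, 12)
    day = min(d.day, calendar.monthrange(year, month0 + 1)[1])
    return d.replace(year=year, month=month0 + 1, day=day)


a = date(2008, 2, 29)   # last day of February
b = date(2008, 3, 31)   # last day of March

backward = shift_months(b, -1) > a   # b - monthdelta(1) > a
forward = shift_months(a, 1) < b     # a + monthdelta(1) < b

print(backward, forward)   # False True
```

Stepping back from b lands on 2008-02-29, so the first test reports that the runs are exactly one month apart. Stepping forward from a lands on 2008-03-29, so the second test wrongly reports a gap of more than a month.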

cheers,
Jess


Re: [Python-Dev] Issue5434: datetime.monthdelta

2009-04-16 Thread Jess Austin
Thanks for everyone's comments!

On Thu, Apr 16, 2009 at 9:54 AM, Paul Moore  wrote:
> I like the idea in principle. In practice, of course, month
> calculations are inherently ill-defined, so you need to be very
> specific in documenting all of the edge cases, and you should have
> strong use cases to ensure that the behaviour implemented matches user
> requirements. (I haven't yet had time to read the patch - you may well
> already have these points covered, certainly your comments above
> indicate that you appreciate the subtleties involved).
>
> I agree that ultimately it would be useful in the core. However, I'd
> suggest that you release the functionality as an independent module in
> the first instance, to establish it outside of the core. Once it has
> matured somewhat as a 3rd party module, it would then be ready for
> integration in the core. This also has the benefit that it makes the
> functionality available to users of Python 2.6 (and possibly earlier)
> rather than just in 2.7/3.1 onwards.

I have uploaded a python-coded version of this functionality to the
bug page.  I should backport it through 2.3 and post that to pypi, but
I haven't done that yet.  The current effort was focused on the C
module since that's how the rest of datetime is implemented, and also
I wanted to learn a bit about CPython internals.  To the latter point,
I would _really_ appreciate it if someone could leave a few comments
on Rietveld.

>> Please let me know what you think of the idea and/or its execution.
>
> I hope the above comments help. Ultimately, I'd like to see this added
> to the core. It's tricky enough that having a "standard"
> implementation is a definite benefit in itself. But equally, I'd give
> it time to iron out the corner cases on a faster development cycle
> than the core offers before "freezing" it as part of the stdlib.


I understand these concerns.  I think I was too brief in my initial
message.  Here are the docstrings:

>>> print(monthdelta.__doc__)
Months offset from a date or datetime.

monthdeltas allow date calculation without regard to the different lengths
of different months. A monthdelta value added to a date produces another
date that has the same day-of-the-month, regardless of the lengths of the
intervening months. If the resulting date is in too short a month, the
last day in that month will result:

date(2008,1,30) + monthdelta(1) -> date(2008,2,29)

monthdeltas may be added, subtracted, multiplied, and floor-divided
similarly to timedeltas. They may not be added to timedeltas directly, as
both classes are intended to be used directly with dates and datetimes.
Only ints may be passed to the constructor, the default argument of which
is 1 (one). monthdeltas are immutable.

NOTE: in calculations involving the 29th, 30th, and 31st days of the
month, monthdeltas are not necessarily invertible [i.e., the result above
would NOT imply that date(2008,2,29) - monthdelta(1) -> date(2008,1,30)].

>>> print(monthmod.__doc__)
monthmod(start, end) -> (monthdelta, timedelta)

Distribute the interim between start and end dates into monthdelta and
timedelta portions. If and only if start is after end, returned monthdelta
will be negative. Returned timedelta is never negative, and is always
smaller than the month in which end occurs.

Invariant: dt + monthmod(dt, dt+td)[0] + monthmod(dt, dt+td)[1] = dt + td


There is better-looking documentation in html/library/datetime.html
and html/c-api/datetime.html in the patch.  By all means, if you're
curious, download the patch and try it out yourself!

cheers,
Jess


Re: [Python-Dev] Issue5434: datetime.monthdelta

2009-04-16 Thread Jess Austin
On Thu, Apr 16, 2009 at 5:16 AM, Dirkjan Ochtman  wrote:
> On Thu, Apr 16, 2009 at 11:54, Amaury Forgeot d'Arc wrote:
>> In my opinion:
>> arithmetic with months is a mess. There is no such "month interval" or
>> "year interval" with a precise definition.
>> If we adopt some kind of month manipulation, it should be a function
>> or a method, like you would do for features like last_day_of_month(d),
>> or following_weekday(d, 'monday').
>>
>>    date(2008, 1, 30).add_months(1) == date(2008, 2, 29)
>
> I concur. Trying to shoehorn month arithmetic into timedelta is a
> PITA, precisely because it's somewhat inexact. It's better to have
> some separate behavior that has well-defined behavior in edge cases.

This is my experience also, and including a distinct and well-defined
behavior in the core is exactly my intention with this patch.


Re: [Python-Dev] Issue5434: datetime.monthdelta

2009-04-16 Thread Jess Austin
On Thu, Apr 16, 2009 at 4:54 AM, Amaury Forgeot d'Arc wrote:
> FWIW, the Oracle database has two methods for adding months:
> 1- the add_months() function
>    add_months(to_date('31-jan-2005'), 1)
> 2- the ANSI interval:
>    to_date('31-jan-2005') + interval '1' month
>
> "add_months" is calendar sensitive, "interval" is not.
> "interval" raises an exception if the day is not valid for the target
> month (which is the case in my example)
>
> "add_months" is similar to the proposed monthdelta(),
> except that it has a special case for the last day of the month:
> """
> If date is the last day of the month or if the resulting month has
> fewer days than the day
> component of date, then the result is the last day of the resulting month.
> Otherwise, the result has the same day component as date.
> """
> indeed:
>    add_months(to_date('28-feb-2005'), 1) == to_date('31-mar-2005')


My proposal has the "calendar sensitive" semantics you describe.  It
will not raise an exception in this case.
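
The two rules can be put side by side in a short sketch. Both function names are invented for illustration: add_months_clamped follows the rule proposed for monthdelta, and add_months_sticky follows the quoted Oracle description, in which a month-end date stays a month-end date.

```python
import calendar
from datetime import date


def last_day(year, month):
    return calendar.monthrange(year, month)[1]


def add_months_clamped(d, months):
    """Keep the day number; clamp only if the target month is too short."""
    year, month0 = divmod(d.year * 12 + d.month - 1 + months, 12)
    month = month0 + 1
    return date(year, month, min(d.day, last_day(year, month)))


def add_months_sticky(d, months):
    """Rule from the quoted Oracle text: month-end maps to month-end."""
    year, month0 = divmod(d.year * 12 + d.month - 1 + months, 12)
    month = month0 + 1
    if d.day == last_day(d.year, d.month):
        return date(year, month, last_day(year, month))
    return date(year, month, min(d.day, last_day(year, month)))


print(add_months_clamped(date(2005, 2, 28), 1))   # 2005-03-28
print(add_months_sticky(date(2005, 2, 28), 1))    # 2005-03-31
```

The two agree whenever the starting date is not the last day of its month.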


> In my opinion:
> arithmetic with months is a mess. There is no such "month interval" or
> "year interval" with a precise definition.
> If we adopt some kind of month manipulation, it should be a function
> or a method, like you would do for features like last_day_of_month(d),
> or following_weekday(d, 'monday').
>
>    date(2008, 1, 30).add_months(1) == date(2008, 2, 29)


I disagree with this point, in that I really like the pythonic date
calculations we have with timedelta.  It is easier to reason about
adding and subtracting objects than it is to reason about method
invocations.  Also, you can store a monthdelta in a variable, which is
sometimes convenient, and which is difficult to emulate with function
calls.
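
As an illustration of the operator style, here is a minimal pure-Python sketch. The class MonthDelta below is a hypothetical stand-in, not the patch's C type, and it supports only a small part of the proposed behaviour. It relies on date's arithmetic methods returning NotImplemented for unknown operand types, which lets Python fall back to the reflected methods.

```python
import calendar
from datetime import date


class MonthDelta:
    """Hypothetical stand-in for the proposed monthdelta (sketch only)."""

    def __init__(self, months=1):
        self.months = months

    def __neg__(self):
        return MonthDelta(-self.months)

    def __radd__(self, other):
        # Reached for `some_date + MonthDelta(n)`.
        if not isinstance(other, date):
            return NotImplemented
        year, month0 = divmod(other.year * 12 + other.month - 1 + self.months, 12)
        day = min(other.day, calendar.monthrange(year, month0 + 1)[1])
        return other.replace(year=year, month=month0 + 1, day=day)

    def __rsub__(self, other):
        # `some_date - MonthDelta(n)` is treated as `some_date + MonthDelta(-n)`.
        return (-self).__radd__(other)


quarter = MonthDelta(3)   # stored once, reused below
due_dates = [date(2008, 11, 30) + quarter, date(2008, 1, 31) + quarter]
print(due_dates)   # [datetime.date(2009, 2, 28), datetime.date(2008, 4, 30)]
```

The offset travels as an ordinary value, so it can be passed to functions or kept in a configuration table.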

Except in certain particular cases, I'm not fond of last_day_of_month,
following_weekday, etc. functions.  Much in the way that timezone
considerations have been factored out of the core through the use of
tzinfo, I think these problems are more effectively addressed at the
level of detail one finds at the application level.  On the other
hand, it seems like effective month calculations could be useful in
the core.

cheers,
Jess


Re: [Python-Dev] Issue5434: datetime.monthdelta

2009-04-16 Thread Jess Austin
On Thu, Apr 16, 2009 at 3:45 AM,   wrote:
>    >>> date(2008, 1, 30) + monthdelta(1)
>    datetime.date(2008, 2, 29)
>
> What would this loop print?
>
>    for d in range(1, 32):
>        print date(2008, 1, d) + monthdelta(1)


>>> for d in range(1, 32):
...     print(date(2008, 1, d) + monthdelta(1))
...
2008-02-01
2008-02-02
2008-02-03
2008-02-04
2008-02-05
2008-02-06
2008-02-07
2008-02-08
2008-02-09
2008-02-10
2008-02-11
2008-02-12
2008-02-13
2008-02-14
2008-02-15
2008-02-16
2008-02-17
2008-02-18
2008-02-19
2008-02-20
2008-02-21
2008-02-22
2008-02-23
2008-02-24
2008-02-25
2008-02-26
2008-02-27
2008-02-28
2008-02-29
2008-02-29
2008-02-29


> I have this funny feeling that arithmetic using monthdelta wouldn't always
> be intuitive.

I think that's true, especially since these calculations are not
necessarily invertible:


>>> date(2008, 1, 30) + monthdelta(1)
datetime.date(2008, 2, 29)
>>> date(2008, 2, 29) - monthdelta(1)
datetime.date(2008, 1, 29)


It could be that non-intuitivity is inherent in the problem of dealing
with dates and months.  I've aimed for a good compromise between the
needs of the problem and the pythonic example of timedelta.  I would
submit that timedelta itself isn't intuitive at first blush,
especially if one was weaned on the arcana of RDBMS date functions,
but after one uses timedelta for just a bit it makes total sense.  I
hope the same may be said of monthdelta.

cheers,
Jess


[Python-Dev] Issue5434: datetime.monthdelta

2009-04-15 Thread Jess Austin
hi,

I'm new to python core development, and I've been advised to write to
python-dev concerning a feature/patch I've placed at
http://bugs.python.org/issue5434, with Rietveld at
http://codereview.appspot.com/25079.

This patch adds a "monthdelta" class and a "monthmod" function to the
datetime module.  The monthdelta class is much like the existing
timedelta class, except that it represents months offset from a date,
rather than an exact period offset from a date.  This allows us to
easily say, e.g. "3 months from now" without worrying about the number
of days in the intervening months.

>>> date(2008, 1, 30) + monthdelta(1)
datetime.date(2008, 2, 29)
>>> date(2008, 1, 30) + monthdelta(2)
datetime.date(2008, 3, 30)

The monthmod function, named in (imperfect) analogy to divmod, allows
us to round-trip by returning the interim between two dates
represented as a (monthdelta, timedelta) tuple:

>>> monthmod(date(2008, 1, 14), date(2009, 4, 2))
(datetime.monthdelta(14), datetime.timedelta(19))

Invariant: dt + monthmod(dt, dt+td)[0] + monthmod(dt, dt+td)[1] == dt + td
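
A rough pure-Python sketch of the same decomposition, for anyone who wants to experiment without applying the patch. It assumes a hypothetical clamping helper shift_months and returns the month count as a plain int rather than a monthdelta object.

```python
import calendar
from datetime import date, timedelta


def shift_months(d, months):
    # Hypothetical stand-in for `d + monthdelta(months)`.
    year, month0 = divmod(d.year * 12 + d.month - 1 + months, 12)
    day = min(d.day, calendar.monthrange(year, month0 + 1)[1])
    return d.replace(year=year, month=month0 + 1, day=day)


def monthmod(start, end):
    """Return (months, remainder) such that
    shift_months(start, months) + remainder == end, with remainder >= 0."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    # The estimate lands in end's month; step back once if it overshoots.
    if shift_months(start, months) > end:
        months -= 1
    return months, end - shift_months(start, months)


print(monthmod(date(2008, 1, 14), date(2009, 4, 2)))
# (14, datetime.timedelta(days=19))
```

The remainder is never negative because the month count is always rounded toward the start of the interval, including when start is after end.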

These also work with datetimes!  There are more details in the
documentation included in the patch.  In addition to the C module
file, I've updated the datetime CAPI, the documentation, and tests.

I feel this would be a good addition to core python.  In my work, I've
often ended up writing annoying one-off "add-a-month" or similar
functions.  I think since months work differently than most other time
periods, a new object is justified rather than trying to shoe-horn
something like this into timedelta.  I also think that the round-trip
functionality provided by monthmod is important to ensure that
monthdeltas are "first-class" objects.

Please let me know what you think of the idea and/or its execution.

thanks,
Jess Austin


Re: [Python-Dev] Adding PEP consistent aliases for names that don't currently conform

2009-03-24 Thread Jess Austin
On Tue, Mar 24, 2009 at 3:14 PM, Guido van Rossum  wrote:
> Please don't do this. We need stable APIs. Trying to switch the entire
> community to use CapWord APIs for something as commonly used as
> datetime sounds like wasting a lot of cycles with no reason except the
> mythical "PEP 8 conformance". As I said, it's a pity we didn't change
> this at the 3.0 point, but I think going forward we should try to be
> more committed to slow change. Additions of new functionality are of
> course fine. But renamings (even if the old names remain available)
> are just noise.

OK, I had misunderstood your earlier message.  Sorry for the confusion.

thanks,
Jess


Re: [Python-Dev] Adding PEP consistent aliases for names that don't currently conform

2009-03-23 Thread Jess Austin
On Tue, Mar 3 at 19:25:21 Guido van Rossum  wrote:
> On Tue, Mar 3, 2009 at 5:15 PM, Brett Cannon  wrote:
>>
>>
>> On Tue, Mar 3, 2009 at 05:13,  wrote:
>>>
>>> On Tue, 3 Mar 2009 at 06:01, Ivan Krstić wrote:
>>>>
>>>> On Mar 2, 2009, at 7:08 PM, Steve Holden wrote:
>>>>>
>>>>> > > PS.: so is datetime.datetime a builtin then? :)
>>>>> > > Another historic accident. Like socket.socket. :-(
>>>>> >
>>>>> A pity this stuff wasn't addressed for 3.0. Way too late now, though.
>
> A pity indeed.
>
>>>> It may be too late to rename the existing accidents, but why not add
>>>> consistently-named aliases (socket.Socket, datetime.DateTime, etc) and
>>>> strongly encourage their use in new code?
>>
>> Or make the old names aliases for the new names and start a
>> PendingDeprecationWarning on the old names so they can be switched in the
>> distant future?
>
> +1, if it's not done in a rush and only for high-visibility modules --
> let's start with socket and datetime.
>
> We need a really long lead time before we can remove these. I
> recommend starting with a *silent* deprecation in 3.1 combined with a
> PR offensive for the new names.

I've uploaded a patch for the datetime module with respect to this
issue at http://bugs.python.org/issue5530 . I would appreciate it if
experienced developers could take a look at it and provide some
feedback.  Since I've only been hacking on CPython for about a month,
please be kind!  I'm happy to make changes to this.

As it stands now, the patch changes the current objects to have
CapWords names, and subclasses these objects to provide objects with
the old names. Use of methods (including __new__) of the derived
objects causes PendingDeprecations (if one has warning filters
appropriately set).
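
The general shape of that approach, shown with invented names, looks roughly like this in pure Python. Widget and widget are placeholders and this is not the code in the patch, which works on the C implementation of datetime.

```python
import warnings


class Widget:
    """Hypothetical class carrying the new CapWords name."""

    def __init__(self, label):
        self.label = label


class widget(Widget):
    """Old lowercase name, kept as a subclass that warns when constructed."""

    def __new__(cls, *args, **kwargs):
        warnings.warn("widget is a pending-deprecated alias; use Widget",
                      PendingDeprecationWarning, stacklevel=2)
        return super().__new__(cls)


# PendingDeprecationWarning is ignored by default, so opt in to see it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_style = widget("a")
    new_style = Widget("b")
```

Instances made through the old name still pass isinstance checks against the new one, which is what lets existing code keep working.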

A warning: this patch requires the patch to the test refactoring at
Issue 5520 to completely apply.  It will fail one test without the
patch at Issue 5516.  Both of these are (inexpertly) linked from the
roundup page for this issue.

I hope this will be helpful.

cheers,
Jess Austin


Re: [Python-Dev] tally (and other accumulators)

2006-04-04 Thread Jess Austin
Alex wrote:
> On Apr 4, 2006, at 10:53 PM, Jess Austin wrote:
> > Alex wrote:
> >> import collections
> >> def tally(seq):
> >>     d = collections.defaultdict(int)
> >>     for item in seq:
> >>         d[item] += 1
> >>     return dict(d)
> >
> > I'll stop lurking and submit the following:
> >
> > def tally(seq):
> >     return dict((group[0], len(tuple(group[1])))
> >                 for group in itertools.groupby(sorted(seq)))
> >
> > In general itertools.groupby() seems like a very clean way to do this
> > sort of thing, whether you want to end up with a dict or not.  I'll go
> > so far as to suggest that the existence of groupby() obviates the
> > proposed tally().  Maybe I've just coded too much SQL and it has  
> > warped my brain...
> >
> > OTOH the latter definition of tally won't cope with iterables, and it
> > seems like O(nlogn) rather than O(n).
> 
> It will cope with any iterable just fine (not sure why you think  
> otherwise), but the major big-O impact seems to me to rule it out as  
> a general solution.

You're right in that it won't raise an exception on an iterator, but the
sorted() means that it's much less memory efficient than your version
for iterators.  Another reason to avoid sorted() for this application,
besides the big-O.  Anyway, I still like groupby() for this sort of
thing, with the aforementioned caveats.  Functional code seems a little
clearer to me, although I realize that preference is not held
universally.
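
For comparison, both versions can be run side by side. The function names tally_dict and tally_groupby are made up here to tell them apart; the bodies follow the two definitions quoted above, with the groupby version counting each group without building a tuple.

```python
import collections
import itertools


def tally_dict(seq):
    # One pass; needs hashable items and O(distinct items) extra memory.
    counts = collections.defaultdict(int)
    for item in seq:
        counts[item] += 1
    return dict(counts)


def tally_groupby(seq):
    # sorted() copies the whole input into a list and needs orderable items;
    # groupby() then yields one group per run of equal items.
    return {key: sum(1 for _ in group)
            for key, group in itertools.groupby(sorted(seq))}


letters = "abracadabra"
print(tally_dict(iter(letters)) == tally_groupby(iter(letters)))   # True
```

Both accept an iterator; the groupby version just has to hold all of its items in memory in order to sort them.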

cheers,
Jess


Re: [Python-Dev] tally (and other accumulators)

2006-04-04 Thread Jess Austin
Alex wrote: 
> import collections
> def tally(seq):
>     d = collections.defaultdict(int)
>     for item in seq:
>         d[item] += 1
>     return dict(d)

I'll stop lurking and submit the following:

def tally(seq):
    return dict((group[0], len(tuple(group[1])))
                for group in itertools.groupby(sorted(seq)))

In general itertools.groupby() seems like a very clean way to do this
sort of thing, whether you want to end up with a dict or not.  I'll go
so far as to suggest that the existence of groupby() obviates the
proposed tally().  Maybe I've just coded too much SQL and it has warped
my brain...

OTOH the latter definition of tally won't cope with iterables, and it
seems like O(nlogn) rather than O(n).

cheers,
Jess