At 03:53 PM 6/27/2010 +1000, Nick Coghlan wrote:
> We could talk about this even longer, but the most effective way
> forward is going to be a patch that improves the URL parsing
> situation.

Certainly, it's the only practical solution for the immediate problems in 3.2.
I only mentioned that I "hate the idea" because I'd be more
comfortable if it were explicitly declared to be a temporary hack to
work around the absence of a string coercion protocol, an absence
that is itself due to the moratorium on language changes.
But, since the moratorium *is* in effect, I'll try to make this my
last post on string protocols for a while... and maybe wait until
I've looked at the code (str/bytes C implementations) in more detail
and can make a more concrete proposal for what the protocol would be
and how it would work. (Not to mention closer to the end of the moratorium.)

> There are a *very small* number of APIs where it is appropriate to
> be polymorphic

This is only true if you focus exclusively on bytes vs. unicode,
rather than the general issue that it's currently impractical to pass
*any* sort of user-defined string type through code that you don't
directly control (stdlib or third-party).

> The virtues of a separate poly_str type are that:
> 1. It can be simple and implemented in Python, dispatching to str or
> bytes as appropriate (probably in the strings module)
> 2. No chance of impacting the performance of the core interpreter (as
> builtins are not affected)

Note that adding a string coercion protocol isn't going to change
core performance for existing cases, since any place where the
protocol would be invoked would be a code branch that either throws
an error or *already* falls back to some other protocol (e.g. the
buffer protocol).

> 3. Lower impact if it turns out to have been a bad idea

How many protocols have been added that turned out to be bad
ideas? The only ones that have been removed in 3.x, IIRC, are
three-way compare, slice-specific operations, and __coerce__... and
I'm going to miss __cmp__. ;-)
However, IIUC, the reason these protocols were dropped isn't because
they were "bad ideas". Rather, they're things that can be
implemented in terms of a finer-grained protocol. That is, if you want
__cmp__ or __getslice__ or __coerce__, you can always implement them
via a mixin that converts the newer fine-grained protocols into
invocations of the older protocol. (As I plan to do for __cmp__ in
the handful of places I use it.)
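
For illustration, here is a minimal sketch of such a mixin for __cmp__,
assuming Python 3. The names CmpMixin and Version are hypothetical
(nothing like them exists in the stdlib), and this is only one way to
do it, not necessarily how any existing code does:

```python
class CmpMixin:
    """Hypothetical mixin: derive the six rich comparisons from __cmp__.

    Subclasses supply __cmp__(other), returning a negative, zero, or
    positive integer, or NotImplemented for unsupported operands.
    """

    def _compare(self, other, test):
        result = self.__cmp__(other)
        if result is NotImplemented:
            # Let Python try the reflected operation on the other operand.
            return NotImplemented
        return test(result)

    def __eq__(self, other):
        return self._compare(other, lambda c: c == 0)

    def __ne__(self, other):
        return self._compare(other, lambda c: c != 0)

    def __lt__(self, other):
        return self._compare(other, lambda c: c < 0)

    def __le__(self, other):
        return self._compare(other, lambda c: c <= 0)

    def __gt__(self, other):
        return self._compare(other, lambda c: c > 0)

    def __ge__(self, other):
        return self._compare(other, lambda c: c >= 0)


class Version(CmpMixin):
    """Example user type that only knows how to do three-way compare."""

    def __init__(self, major, minor):
        self.key = (major, minor)

    def __cmp__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        # Classic replacement for the removed cmp() builtin.
        return (self.key > other.key) - (self.key < other.key)
```

Note that defining __eq__ without __hash__ makes instances unhashable
in Python 3, so a real version of this would probably want to deal
with __hash__ as well.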
At the moment, however, this isn't possible for multi-string
operations outside of __add__/__radd__ and comparison -- the coercion
rules are hard-wired and can't be overridden by user-defined types.
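
To make the asymmetry concrete, here is a small sketch, assuming
Python 3. Markup is a hypothetical user-defined string-like type (not
a str subclass), and try_call is just a helper for this example:

```python
class Markup:
    """Hypothetical string-like type that is not a str subclass."""

    def __init__(self, text):
        self.text = text

    def __str__(self):
        return self.text

    def __add__(self, other):
        return Markup(self.text + str(other))

    def __radd__(self, other):
        # Reached for `"literal" + Markup(...)`, since str + Markup
        # is not something str itself knows how to do.
        return Markup(str(other) + self.text)

    def __eq__(self, other):
        return str(self) == str(other)


def try_call(func):
    """Return func()'s result, or the TypeError it raised."""
    try:
        return func()
    except TypeError as exc:
        return exc


# Concatenation and comparison can be overridden by the user type...
combined = "<b>" + Markup("x")
same = ("x" == Markup("x"))

# ...but other multi-string operations on str reject it outright.
joined = try_call(lambda: ", ".join([Markup("x")]))
replaced = try_call(lambda: "a-b".replace("-", Markup("+")))
prefixed = try_call(lambda: "abc".startswith(Markup("a")))
```

Here combined ends up as a Markup and same is True, while joined,
replaced, and prefixed are all TypeError instances: str never gives
Markup a chance to participate in those operations.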