Re: [Python-Dev] PEP 3333: wsgi_string() function

2011-01-10 Thread Ian Bicking
On Sun, Jan 9, 2011 at 1:47 AM, Stephen J. Turnbull step...@xemacs.org wrote:

 Robert Brewer writes:

   Python 3.1 was released June 27th, 2009. We're coming up faster on the
   two-year period than we seem to be on a revised WSGI spec. Maybe we
   should shoot for a "bytes of a known encoding" type first.

 You have one.  It's called ISO 2022: Information processing -- ISO
 7-bit and 8-bit coded character sets -- Code extension techniques.
 The popularity of that standard speaks for itself.


The kind of object PJE was referring to is more like Ruby's strings, which
do not embed the encoding inside the bytes themselves but have the encoding
as a kind of annotation on the bytes, and do lazy transcoding when combining
strings of different encodings.  The goal with respect to WSGI is that you
could annotate bytes with an encoding but also change or fix that encoding
if other out-of-band information implied that you got the encoding wrong
(e.g., some data is submitted with the encoding of the page the browser was
on, and so nothing inside the request itself will indicate the encoding of
the data).  Latin1 is kind of the poor man's version of this -- it's a good
guess at an encoding, that at worst requires transcoding that can be done in
a predictable way.  (Personally I think Latin1 gets us 99% of the way there,
and so bytes-of-a-known-encoding are not really that important to the WSGI
case.)
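The Latin-1 trick works because Latin-1 maps every byte value 0-255 to a code
point, so a provisional decode is lossless and can be redone once out-of-band
information reveals the real encoding. A minimal sketch (illustrative only):

```python
# Sketch of "Latin-1 as the poor man's bytes-with-encoding": decode raw
# bytes as latin-1 (always succeeds, fully reversible), then transcode
# later when out-of-band info reveals the real encoding.
raw = b'caf\xc3\xa9'                # actually UTF-8 bytes for 'café'

as_latin1 = raw.decode('latin-1')   # provisional str: 'cafÃ©'
assert as_latin1.encode('latin-1') == raw   # round-trip is lossless

# Later, page/form context tells us the data was really UTF-8:
fixed = as_latin1.encode('latin-1').decode('utf-8')
assert fixed == 'café'
```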

  Ian
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Continuing 2.x

2010-10-29 Thread Ian Bicking
On Fri, Oct 29, 2010 at 12:21 PM, Barry Warsaw ba...@python.org wrote:

 On Oct 29, 2010, at 12:43 PM, Casey Duncan wrote:

 I like Python 3, I am using it for my latest projects, but I am also
 keeping Python 2 compatibility. This incurs some overhead, and basically
 means I am still really only using Python 2 features. So in some respects,
 my Python 3.x support is only tacit, it works as well as for Python 2, but
 it's not taking advantage of Python 3 really. I haven't run into a
 situation yet where I really want to or have to use Python 3 exclusive
 features, but then again I'm not really learning to use Python 3 either,
 short of the new C api.

 One thing that *might* be interesting to explore for Python 3.3 would be
 something like `python3 --1` or some such switch that would help Python 2
 code run more easily in Python 3.  This might be a hook to 2to3 or other
 internal changes that help some of the trickier bits of writing
 cross-compatible code.


More useful IMHO would be things like from __past__ import
print_statement, still requiring some annotation of code to make it run,
but less invasive than translating code itself.  There's still major things
you can't handle like that, but if something is syntactically acceptable in
both Python 2 and 3 then it's a lot easier to apply simple conditionals
around semantics.  This would remove the need, for example, for people to
use sys.exc_info() to avoid using except Exception as e.
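For illustration, this is the sys.exc_info() workaround referred to:
`except Exception as e` is a syntax error on Python <= 2.5 and
`except Exception, e` is one on Python 3, so cross-compatible code sidesteps
both. The helper name here is hypothetical:

```python
import sys

def parse_int(s):
    try:
        return int(s)
    except ValueError:
        # 'except ValueError as e' is invalid on Python <= 2.5, and
        # 'except ValueError, e' is invalid on Python 3, so code that
        # must run on both reaches for sys.exc_info() instead.
        e = sys.exc_info()[1]
        return 'failed: %s' % e
```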

-- 
Ian Bicking  |  http://blog.ianbicking.org


Re: [Python-Dev] Continuing 2.x

2010-10-28 Thread Ian Bicking
On Thu, Oct 28, 2010 at 9:04 AM, Barry Warsaw ba...@python.org wrote:

 Who is the target audience for a Python 2.8?  What exactly would a Python
 2.8 accomplish?

 If Python 2.8 doesn't include new features, well, then what's the point?
 Python 2.7 will be bug fix maintained for a long time, longer in fact than
 previous Python 2 versions.  So a no-feature Python 2.8 can't be about
 improving Python 2 stability over time (i.e. just fix the bug in Python
 2.7).

 If Python 2.8 is about adding new features, then it has to be about
 backporting those features from Python 3.  Adding new features only to a
 Python 2.8 *isn't* Python, it's a fork of Python.


Thinking about language features and core type this seems reasonable, but
with the standard library this seems less reasonable -- there's lots of
conservative changes to the standard library which aren't bug fixes, and the
more the standard library is out of sync between Python 2 and 3 the harder
maintaining software that works across those versions becomes.

Though one opportunity is to distribute modules from the standard library
under new names (e.g., unittest2); at least in Python 2 you don't have to do
anything fancy or worry about the standard library catching up with the
forked module.
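The usual idiom for consuming such a renamed standalone module, sketched here
with unittest2 (the example from the text):

```python
# Common idiom for a stdlib module distributed separately under a new
# name (unittest2 backports the Python 2.7 unittest API): prefer the
# standalone package, fall back to the stdlib version.
try:
    import unittest2 as unittest  # the forked/backported module, if installed
except ImportError:
    import unittest               # stdlib fallback

class ExampleTest(unittest.TestCase):
    def test_truth(self):
        self.assertTrue(True)
```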

Library installers seem particularly apropos to this discussion, as everyone
seems excited to get them into the standard library and distributed with
Python, but with the current plan that will never happen with Python 2.

-- 
Ian Bicking  |  http://blog.ianbicking.org


Re: [Python-Dev] Supporting raw bytes data in urllib.parse.* (was Re: Polymorphic best practices)

2010-09-21 Thread Ian Bicking
On Mon, Sep 20, 2010 at 6:19 PM, Nick Coghlan ncogh...@gmail.com wrote:

   What are the cases you believe will cause new mojibake?

 Calling operations like urlsplit on byte sequences in non-ASCII
 compatible encodings and operations like urljoin on byte sequences
 that are encoded with different encodings. These errors differ from
 the URL escaping errors you cite, since they can produce true mojibake
 (i.e. a byte sequence without a single consistent encoding), rather
 than merely non-compliant URLs. However, if someone has let their
 encodings get that badly out of whack in URL manipulation they're
 probably doomed anyway...


FWIW, while I understand the problems non-ASCII-compatible encodings can
create, I've never encountered them, perhaps because ASCII-compatible
encodings are so dominant.

There are ways you can get a URL (HTTP specifically) where there is no
notion of Unicode.  I think the use case everyone has in mind here is where
you get a URL from one of these sources, and you want to handle it.  I have
a hard time imagining the sequence of events that would lead to mojibake.
Naive parsing of a document in bytes couldn't do it, because if you have a
non-ASCII-compatible document your ASCII-based parsing will also fail (e.g.,
looking for b'href="(.*?)"').  I suppose if you did
urlparse.urlsplit(user_input.encode(sys.getdefaultencoding())) you could end
up with the problem.

All this is unrelated to the question, though -- a separate byte-oriented
function won't help any case I can think of.  If the programmer is
implementing something like
urlparse.urlsplit(user_input.encode(sys.getdefaultencoding())), it's because
they *want* to get bytes out.  So if it's named urlparse.urlsplit_bytes()
they'll just use that, with the same corruption.  Since bytes and text don't
interact well, the choice of bytes in and bytes out will be a deliberate
one.  *Or*, bytes will unintentionally come through, but that will just
delay the error a while when the bytes out don't work (e.g.,
urlparse.urljoin(text_url, urlparse.urlsplit(byte_url).path)).  Delaying the
error is a little annoying, but a delayed error doesn't lead to mojibake.
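A sketch of the delayed-error behavior described, using the polymorphic
urllib.parse that Python 3 ended up with (bytes in, bytes out; mixing raises
TypeError rather than producing mojibake):

```python
from urllib.parse import urljoin, urlsplit

text_url = 'http://example.com/base/'
byte_url = b'http://example.com/a/b?q=1'

# bytes in -> bytes out: the path component is bytes
path = urlsplit(byte_url).path
assert path == b'/a/b'

# Mixing that bytes path with a text base URL fails loudly (a delayed
# TypeError) instead of silently producing a mixed-encoding string.
try:
    urljoin(text_url, path)
    raised = False
except TypeError:
    raised = True
assert raised
```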

Mojibake is caused by allowing bytes and text to intermix, and the
polymorphic functions as proposed don't add new dangers in that regard.

-- 
Ian Bicking  |  http://blog.ianbicking.org


Re: [Python-Dev] [Web-SIG] Backup plan: WSGI 1 Addenda and wsgiref update for Py3

2010-09-21 Thread Ian Bicking
On Tue, Sep 21, 2010 at 12:47 PM, Chris McDonough chr...@plope.com wrote:

 On Tue, 2010-09-21 at 12:09 -0400, P.J. Eby wrote:
  While the Web-SIG is trying to hash out PEP 444, I thought it would
  be a good idea to have a backup plan that would allow the Python 3
  stdlib to move forward, without needing a major new spec to settle
  out implementation questions.

 If a WSGI-1-compatible protocol seems more sensible to folks, I'm
 personally happy to defer discussion on PEP 444 or any other
 backwards-incompatible proposal.


I think both make sense, making WSGI 1 sensible for Python 3 (as well as
other small errata like the size hint) doesn't detract from PEP 444 at all,
IMHO.

-- 
Ian Bicking  |  http://blog.ianbicking.org


Re: [Python-Dev] [Web-SIG] Backup plan: WSGI 1 Addenda and wsgiref update for Py3

2010-09-21 Thread Ian Bicking
On Tue, Sep 21, 2010 at 12:09 PM, P.J. Eby p...@telecommunity.com wrote:

 The Python 3 specific changes are to use:

 * ``bytes`` for I/O streams in both directions
 * ``str`` for environ keys and values
 * ``bytes`` for arguments to start_response() and write()


This is the only thing that seems odd to me -- it seems like the response
should be symmetric with the request, and the request in this case uses str
for headers (status being header-like), and bytes for the body.

Otherwise this seems good to me, the only other major errata I can think of
are all listed in the links you included.

* text stream for wsgi.errors

 In other words, strings in, bytes out for headers, bytes for bodies.

 In general, only changes that don't break Python 2 WSGI implementations are
 allowed.  The changes should also not break mod_wsgi on Python 3, but may
 make some Python 3 wsgi applications non-compliant, despite continuing to
 function on mod_wsgi.

 This is because mod_wsgi allows applications to output string headers and
 bodies, but I am ruling that option out because it forces every piece of
 middleware to have to be tested with arbitrary combinations of strings and
 bytes in order to test compliance.  If you want your application to output
 strings rather than bytes, you can always use a decorator to do that.  (And
 a sample one could be provided in wsgiref.)


I agree allowing both is not ideal.
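A minimal sketch of the decorator idea under the proposal quoted above (the
application emits str; the server sees bytes for status, headers, and body).
All names are hypothetical; this is not wsgiref's actual API, and the final
PEP 3333 settled on different types:

```python
# Hypothetical sketch: let an application emit str status/headers/body,
# and encode on the way out so middleware and servers only ever see one
# set of types.  Latin-1 is the conventional WSGI byte<->str bridge for
# headers; UTF-8 is assumed here for body text.
def encode_response(app):
    def wrapper(environ, start_response):
        def encoding_start_response(status, headers, exc_info=None):
            if isinstance(status, str):
                status = status.encode('latin-1')
            headers = [(k.encode('latin-1') if isinstance(k, str) else k,
                        v.encode('latin-1') if isinstance(v, str) else v)
                       for k, v in headers]
            return start_response(status, headers, exc_info)
        for chunk in app(environ, encoding_start_response):
            yield chunk.encode('utf-8') if isinstance(chunk, str) else chunk
    return wrapper
```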


-- 
Ian Bicking  |  http://blog.ianbicking.org


Re: [Python-Dev] Polymorphic best practices [was: (Not) delaying the 3.2 release]

2010-09-17 Thread Ian Bicking
On Fri, Sep 17, 2010 at 3:25 PM, Michael Foord fuzzy...@voidspace.org.uk wrote:

  On 16/09/2010 23:05, Antoine Pitrou wrote:

 On Thu, 16 Sep 2010 16:51:58 -0400
 R. David Murray rdmur...@bitdance.com wrote:

 What do we store in the model?  We could say that the model is always
 text.  But then we lose information about the original bytes message,
 and we can't reproduce it.  For various reasons (mailman being a big
 one),
 this is not acceptable.  So we could say that the model is always bytes.
 But we want access to (for example) the header values as text, so header
 lookup should take string keys and return string values[2].

 Why can't you have both in a single class? If you create the class
 using a bytes source (a raw message sent by SMTP, for example), the
 class automatically parses and decodes it to unicode strings; if you
 create the class using an unicode source (the text body of the e-mail
 message and the list of recipients, for example), the class
 automatically creates the bytes representation.

  I think something like this would be great for WSGI. Rather than focus on
 whether bytes *or* text should be used, use a higher level object that
 provides a bytes view, and (where possible/appropriate) a unicode view too.


This is what WebOb does; e.g., there is only a bytes version of a POST body,
and a view on that body that does decoding and encoding.  If you don't touch
something, it is never decoded or encoded.  I only vaguely understand the
specifics here, and I suspect the specifics matter, but this seems
applicable in this case too -- if you have an incoming email with a
smattering of bytes, inline (2047) encoding, other encoding declarations,
and then orthogonal systems like quoted-printable, you don't want to touch
that stuff if you don't need to as handling unicode objects implies you are
normalizing the content, and that might have subtle impacts you don't know
about, or don't want to know about, or maybe just don't fit into the unicode
model (like a string with two character sets).

Note that WebOb does not have two views, it has only one view -- unicode
viewing bytes.  I'm not sure I could keep two views straight.  I *think*
Antoine is describing two possible canonical data types (unicode or bytes)
and two views.  That sounds hard.
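A sketch of the one-view idea (canonical bytes, a lazily decoded text view);
the class and attribute names here are hypothetical, not WebOb's actual API:

```python
# Sketch of the single-view design: the canonical data is bytes, and a
# text view is computed (and cached) only when someone asks for it, so
# untouched content is never decoded or normalized.
class Body:
    def __init__(self, raw, encoding='utf-8'):
        self.raw = raw            # canonical bytes, never touched
        self.encoding = encoding  # an annotation on the bytes
        self._text = None

    @property
    def text(self):
        if self._text is None:    # decode lazily, on first access
            self._text = self.raw.decode(self.encoding)
        return self._text

    @text.setter
    def text(self, value):        # writing through the view re-encodes
        self.raw = value.encode(self.encoding)
        self._text = value
```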

-- 
Ian Bicking  |  http://blog.ianbicking.org


Re: [Python-Dev] PEP 376 proposed changes for basic plugins support

2010-08-02 Thread Ian Bicking
Just to add a general opinion in here:

Having worked with Setuptools' entry points, and a little with some Zope
pluginish systems (Products.*, which I don't think anyone liked much, and
some ways ZCML is used is pluginish), I'm not very excited about these.  The
plugin system that causes the least confusion and yet seems to accomplish
everything it needs is just listing objects in configuration -- nothing gets
activated implicitly with installation, and names are Python package/object
names without indirection.  The only thing I'd want to add is the ability to
also point to files, as a common use for plugins is adding ad hoc
functionality to an application, and the overhead of package creation isn't
always called for.  hg for example seems both simple and general enough, and
it doesn't use anything fancy.
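A minimal sketch of the "plugins are just names in configuration" approach:
resolve 'module:attr' strings listed in a config file to objects, with
nothing activated implicitly on installation. The helper is hypothetical,
not an existing API:

```python
# Resolve 'package.module:attr' strings to objects -- no registration,
# no entry-point metadata, just dotted names listed in configuration.
import importlib

def resolve(name):
    module_name, sep, attr_path = name.partition(':')
    obj = importlib.import_module(module_name)
    if sep:
        for attr in attr_path.split('.'):
            obj = getattr(obj, attr)
    return obj

# e.g. plugins an application might list in its config file:
plugins = ['json:dumps', 'os.path:join']
loaded = [resolve(n) for n in plugins]
```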

Purely for the purpose of discovery and documentation it might be helpful to
have APIs, then some tool could show available plugins (especially if PyPI
had a query interface for this), or at least installed plugins, with the
necessary code to invoke them.

*Maybe* it would make sense to generalize the discovery of plugin types, so
that you can simply refer to an object and the application can determine
what kind of plugin it is.  But having described this, it actually doesn't
seem like a useful thing to generalize.

-- 
Ian Bicking  |  http://blog.ianbicking.org


Re: [Python-Dev] Thoughts fresh after EuroPython

2010-07-26 Thread Ian Bicking
On Mon, Jul 26, 2010 at 9:06 AM, Barry Warsaw ba...@python.org wrote:

 On Jul 24, 2010, at 07:08 AM, Guido van Rossum wrote:
 privileges enough. So, my recommendation (which surely is a
 turn-around of my *own* attitude in the past) is to give out more
 commit privileges sooner.

 +1, though I'll observe that IME, actual commit privileges become much
 less of a special badge once a dvcs-based workflow is put in place.  In
 the absence of that, I agree that we have enough checks and balances in
 place to allow more folks to commit changes.


Even with DVCS in place, commit privileges allow the person who cares about
a change to move it forward, including the more mechanical aspects.  E.g. if
there are positive reviews of a person's changes in their fork, they can
push those changes in.  Or more generally, there's a lot of ways of getting
approval, but limited commit privileges means all approval must ultimately
be funneled through someone with commit.  Also different parts of the
codebase should have different levels of review and conservatism; e.g.,
adding clarifications to the docs requires a different level of review than
changing stuff in the core.  We could try to build that into the tools, but
it's a lot easier to make the tools permissive and build these distinctions
into social structures.

-- 
Ian Bicking  |  http://blog.ianbicking.org


Re: [Python-Dev] Python Language Summit EuroPython 2010

2010-07-21 Thread Ian Bicking
On Wed, Jul 21, 2010 at 10:11 AM, Tim Golden m...@timgolden.me.uk wrote:

 A discussion on the Cheeseshop / Package Index highlighted the fact that
 the packaging infrastructure has become increasingly important especially
 since setuptools, buildout and pip all download from it. Richard produced
 graphs showing the increase in package downloads over time, and attributed
 the recent slight tail-off to the fact that the toolchains are now becoming
 more canny with respect to cacheing and mirroring.

 Martin & Richard confirmed that mirrors are now in place and Marc Andre
 confirmed that he would be putting together a proposal to have PyPI hosted
 in the cloud. Guido pointed out that if an AppEngine implementation were
 desirable, he was sure that the AppEngine team would support it with
 resources as needed.  Martin didn't feel that there was a problem with
 loading on the box in question; it's the uptime that's behind people's
 concern as it's now so essential to installing and deploying Python
 applications.


From what I've been able to tell from afar, I strongly suspect PyPI's
downtimes would be greatly reduced with a move to mod_wsgi (currently it is
using mod_fcgi, and most downtime is solved with an Apache restart --
mod_wsgi generally recovers from these problems without intervention).
Martin attempted this at one time but ran into some installation problems.
It seems like the team of people managing PyPI could benefit from the
addition of someone with more of a sysadmin background (e.g., to help with
installing a monitor on the server).

-- 
Ian Bicking  |  http://blog.ianbicking.org


Re: [Python-Dev] Removing IDLE from the standard library

2010-07-12 Thread Ian Bicking
On Sun, Jul 11, 2010 at 3:38 PM, Ron Adam r...@ronadam.com wrote:

 There might be another alternative.

 Both idle and pydoc are applications (are there others?) that are in the
 standard library.  As such, they or parts of them, are possibly importable
 to other projects.  That restricts changes because a committer needs to
 consider the chances that a change may break something else.

 I suggest they be moved out of the lib directory, but still be included
 with python.  (Possibly in the tools directory.)  That removes some of the
 backward compatibility restrictions or at least makes it clear there isn't a
 need for backward compatibility.


I also like this idea.  This means Python comes with an IDE out of the box
but without the overhead of a management and release process that is built
for something very different than a GUI program (the standard library).
This would mean that IDLE would be in site-packages, could easily be
upgraded using normal tools, and maybe most importantly it could have its
own community tools and development process that is more casual (and can
more easily integrate new contributors) and higher velocity of changes and
releases.  Python releases would then ship the most recent stable release of
IDLE.

-- 
Ian Bicking  |  http://blog.ianbicking.org


Re: [Python-Dev] bytes / unicode

2010-06-25 Thread Ian Bicking
On Fri, Jun 25, 2010 at 2:05 AM, Stephen J. Turnbull step...@xemacs.org wrote:

  But join('x', 'y') -> 'x/y' and join(b'x', b'y') -> b'x/y' make
   sense to me.
  
   So, actually, I *don't* understand what you mean by needing LBYL.

 Consider docutils.  Some folks assert that URIs *are* bytes and should
 be manipulated as such.  So base URIs should be bytes.


I don't get what you are arguing against.  Are you worried that if we make
URL code polymorphic that this will mean some code will treat URLs as bytes,
and that code will be incompatible with URLs as text?  No one is arguing we
remove text support from any of these functions, only that we allow bytes.


-- 
Ian Bicking  |  http://blog.ianbicking.org


Re: [Python-Dev] thoughts on the bytes/string discussion

2010-06-25 Thread Ian Bicking
On Fri, Jun 25, 2010 at 5:06 AM, Stephen J. Turnbull step...@xemacs.org wrote:

  So with this idea in mind it makes more sense to me that *specific
  pieces of text* can be reasonably treated as both bytes and text.  All
  the string literals in urllib.parse.urlunsplit() for example.

  The semantics I imagine are that special('/')+b'x'==b'/x' (i.e., it
  does not become special('/x')) and special('/')+'x'=='/x' (again it
  becomes str).  This avoids some of the cases of unicode or str
  infecting a system as they did in Python 2 (where you might pass in
  unicode and everything works fine until some non-ASCII is introduced).

 I think you need to give explicit examples where this actually helps
 in terms of type contagion.  I expect that it doesn't help at all,
 especially not for the people whose native language for URIs is bytes.
 These specials are still going to flip to unicode as soon as it comes
 in, and that will be incompatible with the bytes they'll need later.
 So they're still going to need to filter out unicode on input.

 It looks like it would be useful for programmers of polymorphic
 functions, though.


I'm proposing these specials would be used in polymorphic functions, like
the functions in urllib.parse.  I would not personally use them in my own
code (unless of course I was writing my own polymorphic functions).

This also makes it less important that the objects be a full stand-in for
text, as their use should be isolated to specific functions, they aren't
objects that should be passed around much.  So you can easily identify and
quickly detect if you use unsupported operations on those text-like
objects.  (This is all a very different use case from bytes+encoding, I
think)

-- 
Ian Bicking  |  http://blog.ianbicking.org


Re: [Python-Dev] thoughts on the bytes/string discussion

2010-06-25 Thread Ian Bicking
On Fri, Jun 25, 2010 at 11:30 AM, Stephen J. Turnbull step...@xemacs.org wrote:

 Ian Bicking writes:

   I'm proposing these specials would be used in polymorphic functions,
   like the functions in urllib.parse.  I would not personally use them
   in my own code (unless of course I was writing my own polymorphic
   functions).

   This also makes it less important that the objects be a full stand-in
   for text, as their use should be isolated to specific functions, they
   aren't objects that should be passed around much.  So you can easily
   identify and quickly detect if you use unsupported operations on those
   text-like objects.

 OK.  That sounds reasonable to me, but I don't see any need for
 a builtin type for it.  Inclusion in the stdlib is not quite a
 no-brainer, but given Guido's endorsement of polymorphism, I can't
 bring myself to go lower than +0.9 wink.


Agreed on a builtin; I think it would be fine to put something in the
strings module, and then in these examples code that used '/' would instead
use strings.ascii('/') (not sure so sure of what the name should be though).


-- 
Ian Bicking  |  http://blog.ianbicking.org


Re: [Python-Dev] thoughts on the bytes/string discussion

2010-06-24 Thread Ian Bicking
On Thu, Jun 24, 2010 at 12:38 PM, Bill Janssen jans...@parc.com wrote:

 Here are a couple of ideas I'm taking away from the bytes/string
 discussion.

 First, it would probably be a good idea to have a String ABC.

 Secondly, maybe the string situation in 2.x wasn't as broken as we
 thought it was.  In particular, those who deal with lots of encoded
 strings seemed to find it handy, and miss it in 3.x.  Perhaps strings
 are more like numbers than we think.  We have separate types for int,
 float, Decimal, etc.  But they're all numbers, and they all
 cross-operate.  In 2.x, it seems there were two missing features: no
 encoding attribute on str, which should have been there and should have
 been required, and the default encoding being ASCII (I can't tell you
 how many times I've had to fix that issue when a non-ASCII encoded str
 was passed to some output function).


I've started to form a conceptual notion that I think fits these cases.

We've setup a system where we think of text as natively unicode, with
encodings to put that unicode into a byte form.  This is certainly
appropriate in a lot of cases.  But there's a significant class of problems
where bytes are the native structure.  Network protocols are what we've been
discussing, and are a notable case of that.  That is, b'/' is the most
native sense of a path separator in a URL, or b':' is the most native sense
of what separates a header name from a header value in HTTP.  To disallow
unicode URLs or unicode HTTP headers would be rather anti-social, especially
because unicode is now the native string type in Python 3 (as an aside for
the WSGI spec we've been talking about using native strings in some
positions like dictionary keys, meaning Python 2 str and Python 3 str, while
being more exacting in other areas such as a response body which would
always be bytes).

The HTTP spec and other network protocols seem a little fuzzy on this,
because they were written before unicode even existed, and even later activity
happened at a point when unicode and text weren't widely considered the
same thing like they are now.  But I think the original intention is
revealed in a more modern specification like WebSockets, where they are very
explicit that ':' is just shorthand for a particular byte, it is not text
in our new modern notion of the term.

So with this idea in mind it makes more sense to me that *specific pieces of
text* can be reasonably treated as both bytes and text.  All the string
literals in urllib.parse.urlunsplit() for example.

The semantics I imagine are that special('/')+b'x'==b'/x' (i.e., it does not
become special('/x')) and special('/')+'x'=='/x' (again the result is str).
This avoids some of the cases of unicode or str infecting a system as they
did in Python 2 (where you might pass in unicode and everything works fine
until some non-ASCII is introduced).
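One way to sketch these semantics (the `special` type is hypothetical, not
an existing API): a str subclass whose concatenation with bytes yields bytes
via ASCII encoding, and whose concatenation with str degrades to plain str.

```python
# Hypothetical sketch of the proposed semantics: special('/') + b'x'
# gives b'/x', and special('/') + 'x' gives the plain str '/x'.  The
# literal itself must be ASCII, so encoding it is always safe.
class special(str):
    def __new__(cls, value):
        value.encode('ascii')      # only ASCII literals make sense here
        return super().__new__(cls, value)

    def __add__(self, other):
        if isinstance(other, bytes):
            return self.encode('ascii') + other
        return str(self) + other   # result degrades to plain str

    def __radd__(self, other):
        if isinstance(other, bytes):
            return other + self.encode('ascii')
        return other + str(self)

assert special('/') + b'x' == b'/x'
assert special('/') + 'x' == '/x'
assert b'http://example.com/foo' + special('?') == b'http://example.com/foo?'
```

Note that the result of combining a special with an ordinary str is a plain
str, not another special, so the special-ness does not spread through a
system the way unicode did in Python 2.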

The one place where this might be tricky is if you have an encoding that is
not ASCII compatible.  But we can't guard against every possibility.  So it
would be entirely wrong to take a string encoded with UTF-16 and start to
use b'/' with it.  But there are other nonsensical combinations already
possible, especially with polymorphic functions, we can't guard against all
of them.  Also I'm unsure if something like UTF-16 is in any way compatible
with the kind of legacy systems that use bytes.  Can you encode your
filesystem with UTF-16?  I don't think you could encode a cookie with it.

So maybe having a second string type in 3.x that consists of an encoded
 sequence of bytes plus the encoding, call it estr, wouldn't have been
 a bad idea.  It would probably have made sense to have estr cooperate
 with the str type, in the same way that two different kinds of numbers
 cooperate, promoting the result of an operation only when necessary.
 This would automatically achieve the kind of polymorphic functionality
 that Guido is suggesting, but without losing the ability to do

  x = e(ASCII)"bar"
  a = ''.join("foo", x)

 (or whatever the syntax for such an encoded string literal would be --
 I'm not claiming this is a good one) which presumably would bind a to a
 Unicode string "foobar" -- have to work out what gets promoted to what.


I would be entirely happy without a literal syntax.  But as Phillip has
noted, this can't be implemented *entirely* in a library as there are some
constraints with the current str/bytes implementations.  Reading PEP 3003
I'm not clear if such changes are part of the moratorium?  They seem like
they would be (sadly), but it doesn't seem clearly noted.

I think there's a *different* use case for things like
bytes-in-a-utf8-encoding (e.g., to allow XML data to be decoded lazily), but
that could be yet another class, and maybe shouldn't be polymorphicly usable
as bytes (i.e., treat it as an optimized str representation that is
otherwise semantically equivalent).  A String ABC would formalize these
things.

-- 
Ian Bicking  |  http://blog.ianbicking.org

Re: [Python-Dev] thoughts on the bytes/string discussion

2010-06-24 Thread Ian Bicking
On Thu, Jun 24, 2010 at 3:59 PM, Guido van Rossum gu...@python.org wrote:

 The protocol specs typically go out of their way to specify what byte
 values they use for syntactically significant positions (e.g. ':' in
 headers, or '/' in URLs), while hand-waving about the meaning of what
 goes in between since it is all typically treated as not of
 syntactic significance. So you can write a parser that looks at bytes
 exclusively, and looks for a bunch of ASCII punctuation characters
 (e.g. '<', '>', '/', '&'), and doesn't know or care whether the stuff
 in between is encoded in Latin-15, MacRoman or UTF-8 -- it never looks
 inside stretches of characters between the special characters and
 just copies them. (Sometimes there may be *some* sections that are
 required to be ASCII and where equivalence of a-z and A-Z is well
 defined.)


Yes, these are the specific characters that I think we can handle
specially.  For instance, the list of all string literals used by urlsplit
and urlunsplit:
'//'
'/'
':'
'?'
'#'
''
'http'
A list of all valid scheme characters (a-z etc)
Some lists for scheme-specific parsing (which all contain valid scheme
characters)

All of these are constrained to ASCII, and must be constrained to ASCII, and
everything else in a URL is treated as basically opaque.

So if we turned these characters into byte-or-str objects I think we'd
basically be true to the intent of the specs, and in a practical sense we'd
be able to make these functions polymorphic.  I suspect this same pattern
will be present most places where people want polymorphic behavior.

For now we could do something incomplete and just avoid using operators we
can't overload (is it possible to at least make them produce a readable
exception?)

I think we'll avoid a lot of the confusion that was present with Python 2 by
not making the coercions transitive.  For instance, here's something that
would work in Python 2:

  urlunsplit(('http', 'example.com', '/foo', u'bar=baz', ''))

And you'd get out a unicode string, except that would break the first time
that query string (u'bar=baz') was not ASCII (but not until then!)

Here's the urlunsplit code:

def urlunsplit(components):
    scheme, netloc, url, query, fragment = components
    if netloc or (scheme and scheme in uses_netloc and url[:2] != '//'):
        if url and url[:1] != '/': url = '/' + url
        url = '//' + (netloc or '') + url
    if scheme:
        url = scheme + ':' + url
    if query:
        url = url + '?' + query
    if fragment:
        url = url + '#' + fragment
    return url

If all those literals were this new special kind of string, if you call:

  urlunsplit((b'http', b'example.com', b'/foo', 'bar=baz', b''))

You'd end up constructing the URL b'http://example.com/foo' and then
running:

url = url + special('?') + query

And that would fail because b'http://example.com/foo' + special('?') would
be b'http://example.com/foo?' and you cannot add that to the str 'bar=baz'.
So we'd be avoiding the Python 2 craziness.
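A minimal sketch of what such a `special` byte-or-str literal could look like (the class name and implementation details are invented here, not from any actual proposal):

```python
class Special:
    """An ASCII literal that coerces to whichever of str or bytes it is
    combined with, but never bridges the two types (a sketch of the
    byte-or-str literals discussed above)."""

    def __init__(self, text):
        self._s = text
        self._b = text.encode('ascii')  # literals must be pure ASCII

    def __add__(self, other):
        if isinstance(other, bytes):
            return self._b + other
        if isinstance(other, str):
            return self._s + other
        if isinstance(other, Special):
            return Special(self._s + other._s)
        return NotImplemented

    def __radd__(self, other):
        if isinstance(other, bytes):
            return other + self._b
        if isinstance(other, str):
            return other + self._s
        return NotImplemented


url = b'http://example.com/foo' + Special('?') + b'bar=baz'
print(url)  # b'http://example.com/foo?bar=baz'

# Mixing bytes and str still fails, avoiding the Python 2 craziness:
try:
    b'http://example.com/foo' + Special('?') + 'bar=baz'
except TypeError as exc:
    print('refused:', exc)
```

Here `b'...' + Special('?')` yields bytes via `__radd__`, so the subsequent `bytes + str` addition raises TypeError exactly as described, and the coercion never becomes transitive.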

-- 
Ian Bicking  |  http://blog.ianbicking.org
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] bytes / unicode

2010-06-23 Thread Ian Bicking
On Wed, Jun 23, 2010 at 10:30 AM, Tres Seaver tsea...@palladion.com wrote:

  Stephen J. Turnbull wrote:

  We do need str-based implementations of modules like urllib.


 Why would that be?  URLs aren't text, and never will be.  The fact that
 to the eye they may seem to be text-ish doesn't make them text.  This
 *is* a case where "don't make me think" is a losing proposition:
 programmers who work with URLs in any non-opaque way as text are
 eventually going to be bitten by this issue no matter how hard we wave
 our hands.


HTML is text, and URLs are embedded in that text, so it's easy to get a URL
that is text.  Though, with a little testing, I notice that text alone can't
tell you what the right URL really is (at least the intended URL when unsafe
characters are embedded in HTML).

To test I created two pages, one in Latin-1 another in UTF-8, and put in the
link:

  ./test.html?param=Réunion

On a Latin-1 page it created a link to test.html?param=R%E9union and on a
UTF-8 page it created a link to test.html?param=R%C3%A9union (the second
link displays in the URL bar as test.html?param=Réunion but copies with
percent encoding).  Though if you link to ./Réunion.html then both pages
create UTF-8 links.  And both pages also link http://Réunion.com to
http://xn--runion-bva.com/.  So really neither bytes nor text works
completely; query strings receive the encoding of the page, which would be
handled transparently if you worked on the page's bytes.  Path and domain
are consistently encoded with UTF-8 and punycode respectively and so would
be handled best when treated as text.  And of course if you are a page with
a non-ASCII-compatible encoding you really must handle encodings before the
URL is sensible.

Another issue here is that there's no encoding for turning a URL into
bytes if the URL is not already ASCII.  A proper way to encode a URL would
be:

(Totally as an aside, as I remind myself of new module names I notice it's
not easy to google specifically for Python 3 docs, e.g. python 3 urlsplit
gives me 2.6 docs)

from urllib.parse import urlsplit, urlunsplit
import encodings.idna

def encode_http_url(url, page_encoding='ASCII', errors='strict'):
    scheme, netloc, path, query, fragment = urlsplit(url)
    scheme = scheme.encode('ASCII', errors)
    auth = port = None
    if '@' in netloc:
        auth, netloc = netloc.split('@', 1)
    if ':' in netloc:
        netloc, port = netloc.split(':', 1)
    netloc = encodings.idna.ToASCII(netloc)
    if port:
        netloc = netloc + b':' + port.encode('ASCII', errors)
    if auth:
        netloc = auth.encode('UTF-8', errors) + b'@' + netloc
    path = path.encode('UTF-8', errors)
    query = query.encode(page_encoding, errors)
    fragment = fragment.encode('UTF-8', errors)
    return urlunsplit_bytes((scheme, netloc, path, query, fragment))

Where urlunsplit_bytes handles bytes (urlunsplit does not).  It's helpful
for me at least to look at that code specifically:

def urlunsplit(components):
    scheme, netloc, url, query, fragment = components
    if netloc or (scheme and scheme in uses_netloc and url[:2] != '//'):
        if url and url[:1] != '/': url = '/' + url
        url = '//' + (netloc or '') + url
    if scheme:
        url = scheme + ':' + url
    if query:
        url = url + '?' + query
    if fragment:
        url = url + '#' + fragment
    return url

In this case it really would be best to have Python 2's system where things
are coerced to ASCII implicitly.  Or, more specifically, if all those string
literals in that routine could be implicitly converted to bytes using
ASCII.  Conceptually I think this is reasonable, as for URLs (at least with
HTTP, but in practice I think this applies to all URLs) the ASCII bytes
really do have meaning.  That is, '/' (*in the context of urlunsplit*)
really is \x2f specifically.  Or another example, making a GET request
really means sending the bytes \x47\x45\x54 and there is no other set of
bytes that has that meaning.  The WebSockets specification for instance
defines things like colon:
http://tools.ietf.org/html/draft-hixie-thewebsocketprotocol-76#page-5 -- in
an earlier version they even used bytes to describe HTTP (
http://tools.ietf.org/html/draft-hixie-thewebsocketprotocol-54#page-13),
though this annoyed many people.

-- 
Ian Bicking  |  http://blog.ianbicking.org


Re: [Python-Dev] bytes / unicode

2010-06-23 Thread Ian Bicking
Oops, I forgot some important quoting (important for the algorithm,
maybe not actually for the discussion)...

from urllib.parse import urlsplit, urlunsplit
import encodings.idna

# urllib.parse.quote both always returns str, and is not as
# conservative in quoting as required here...
def quote_unsafe_bytes(b):
    result = []
    for c in b:
        if c < 0x20 or c >= 0x80:
            result.extend(('%%%02X' % c).encode('ASCII'))
        else:
            result.append(c)
    return bytes(result)

def encode_http_url(url, page_encoding='ASCII', errors='strict'):
    scheme, netloc, path, query, fragment = urlsplit(url)
    scheme = scheme.encode('ASCII', errors)
    auth = port = None
    if '@' in netloc:
        auth, netloc = netloc.split('@', 1)
    if ':' in netloc:
        netloc, port = netloc.split(':', 1)
    netloc = encodings.idna.ToASCII(netloc)
    if port:
        netloc = netloc + b':' + port.encode('ASCII', errors)
    if auth:
        netloc = quote_unsafe_bytes(auth.encode('UTF-8', errors)) + b'@' + netloc
    path = quote_unsafe_bytes(path.encode('UTF-8', errors))
    query = quote_unsafe_bytes(query.encode(page_encoding, errors))
    fragment = quote_unsafe_bytes(fragment.encode('UTF-8', errors))
    return urlunsplit_bytes((scheme, netloc, path, query, fragment))



--
Ian Bicking  |  http://blog.ianbicking.org


Re: [Python-Dev] bytes / unicode

2010-06-22 Thread Ian Bicking
On Tue, Jun 22, 2010 at 6:31 AM, Stephen J. Turnbull step...@xemacs.org wrote:

 Toshio Kuratomi writes:

   I'll definitely buy that.  Would urljoin(b_base, b_subdir) -> bytes and
   urljoin(u_base, u_subdir) -> unicode be acceptable though?

 Probably.

 But it doesn't matter what I say, since Guido has defined that as
 polymorphism and approved it in principle.

   (I think, given other options, I'd rather see two separate
   functions, though.

 Yes.

   If you want to deal with things like this::
 http://host/café http://host/caf%C3%A9

 Yes.


Just for perspective, I don't know if I've ever wanted to deal with a URL
like that.  I know how it is supposed to work, and I know what a browser
does with that, but so many tools will clean that URL up *or* won't be able
to deal with it at all that it's not something I'll be passing around.  So
from a practical point of view this really doesn't come up, and if it did it
would be in a situation where you could easily do something ad hoc (though
there is not currently a routine to quote unsafe characters in a URL... that
would be helpful, though maybe urllib.quote(url.encode('utf8'), '%/:') would
do it).
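That ad hoc quoting suggestion can be checked directly against Python 3's urllib.parse (quote accepts bytes and returns str):

```python
from urllib.parse import quote

url = 'http://host/café'
# Percent-encode the UTF-8 bytes, keeping the URL delimiters intact:
print(quote(url.encode('utf8'), safe='%/:'))  # http://host/caf%C3%A9
```

Passing `safe='%/:'` keeps the scheme colon, slashes, and any existing percent-escapes untouched, which is exactly the normalization described here.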

Also while it is problematic to treat the URL-unquoted value as text
(because it has an unknown encoding, no encoding, or regularly a mixture of
encodings), the URL-quoted value is pretty easy to pass around, and
normalization (in this case to http://host/caf%C3%A9) is generally fine.

While it's nice to be correct about encodings, sometimes it is impractical.
And it is far nicer to avoid the situation entirely.  That is, decoding
content you don't care about isn't just inefficient, it's complicated and
can introduce errors.  The encoding of the underlying bytes of a %-decoded
URL is largely uninteresting.  Browsers (whose behavior drives a lot of
convention) don't touch any of that encoding except lately occasionally to
*display* some data in a more friendly way.  But it's only display, and
errors just make it revert to the old encoded display.

Similarly I'd expect (from experience) that a programmer using Python to
want to take the same approach, sticking with unencoded data in nearly all
situations.


-- 
Ian Bicking  |  http://blog.ianbicking.org


Re: [Python-Dev] bytes / unicode

2010-06-22 Thread Ian Bicking
On Tue, Jun 22, 2010 at 1:07 PM, James Y Knight f...@fuhm.net wrote:

 The surrogateescape method is a nice workaround for this, but I can't help
 thinking that it might've been better to just treat stuff as
 possibly-invalid-but-probably-utf8 byte-strings from input, through
 processing, to output. It seems kinda too late for that, though: next time
 someone designs a language, they can try that. :)


surrogateescape does help a lot, my only problem with it is that it's
out-of-band information.  That is, if you have data that went through
data.decode('utf8', 'surrogateescape') you can restore it to bytes or
transcode it to another encoding, but you have to know that it was decoded
specifically that way.  And of course if you did have to transcode it (e.g.,
text.encode('utf8', 'surrogateescape').decode('latin1')) then if you had
actually handled the text in any way you may have broken it; you don't
*really* have valid text.  A lazier solution feels like it would be easier
and more transparent to work with.
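A small demonstration of both points — the exact round-trip, and the out-of-band nature of the knowledge (nothing marks the string itself as surrogateescape-decoded):

```python
raw = b'caf\xe9'                     # Latin-1 bytes; not valid UTF-8
text = raw.decode('utf8', 'surrogateescape')
print(ascii(text))                   # 'caf\udce9' -- a lone surrogate, not real text

# Round-trips exactly, but only if you know to pass surrogateescape again:
assert text.encode('utf8', 'surrogateescape') == raw

try:
    text.encode('utf8')              # without the out-of-band knowledge it breaks
except UnicodeEncodeError as exc:
    print('plain encode fails:', exc.reason)
```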

But... I also don't see any major language constraint to having another kind
of string that is bytes+encoding.  I think PJE brought up a problem with a
couple coercion aspects.

-- 
Ian Bicking  |  http://blog.ianbicking.org


Re: [Python-Dev] bytes / unicode

2010-06-22 Thread Ian Bicking
On Tue, Jun 22, 2010 at 11:17 AM, Guido van Rossum gu...@python.org wrote:

 (2) Data sources.

 These can be functions that produce new data from non-string data,
 e.g. str(int), read it from a named file, etc. An example is read()
 vs. write(): it's easy to create a (hypothetical) polymorphic stream
 object that accepts both f.write('booh') and f.write(b'booh'); but you
 need some other hack to make read() return something that matches a
 desired return type. I don't have a generic suggestion for a solution;
 for streams in particular, the existing distinction between binary and
 text streams works, of course, but there are other situations where
 this doesn't generalize (I think some XML interfaces have this
 awkwardness in their API for converting a tree to a string).


This reminds me of the optimization ElementTree and lxml made in Python 2
(not sure what they do in Python 3?) where they use str when a string is
ASCII to avoid the memory and performance overhead of unicode.  Also at
least lxml is also dealing with the divide between the internal libxml2
string representation and the Python representation.  This is a place where
bytes+encoding might also have some benefit.  XML is someplace where you
might load a bunch of data but only touch a little bit of it, and the amount
of data is frequently large enough that the efficiencies are important.

-- 
Ian Bicking  |  http://blog.ianbicking.org


Re: [Python-Dev] Python 2.7b1 and argparse's version action

2010-04-18 Thread Ian Bicking
On Sun, Apr 18, 2010 at 6:24 PM, Steven Bethard steven.beth...@gmail.com wrote:

 On Sun, Apr 18, 2010 at 3:52 PM, Antoine Pitrou solip...@pitrou.net
 wrote:
  Steven Bethard steven.bethard at gmail.com writes:
  Note
  that even though I agree with you that -v/--version is probably not
  the best choice, in the poll[2] 11% of people still wanted this.
 
  This strikes me as a small minority.

 Agreed, but it's also the current behavior, and has been since the
 beginning of argparse. Note that no one complained about it until
 Tobias filed the issue in Nov 06, 2009.


I encountered this problem within minutes of first using argparse.  Of
course I'm very familiar with optparse and the standard optparse
instantiation flies off my fingers without thinking.  But then there's going
to be a lot more people with that background using argparse once it is in
the standard library -- people who don't really care about argparse or
optparse but just want to use the standard thing.  I don't see any reason
why argparse can't simply do exactly what optparse did.  There's nothing
wrong with it.  It's what many people expect.  We should just defer to
tradition when the choice isn't important (it's getting to be a very bike
shed thread).
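For reference, the optparse-style behavior is straightforward to request explicitly via argparse's built-in 'version' action (program name and version string invented here):

```python
import argparse

parser = argparse.ArgumentParser(prog='tool')
# Prints the version string and exits, just as optparse's version= did:
parser.add_argument('--version', action='version', version='%(prog)s 1.0')
```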

Somewhat relatedly, what is the plan for past and future argparse releases?
Michael Foord for instance is releasing unittest improvements in parallel
under the name unittest2.  I believe there is strong disfavor with releasing
packages that overlap with the standard library, so continuing to release
argparse under the name argparse will cause problems.  I would hate to see
release complications or confusions keep argparse from seeing future
development.

-- 
Ian Bicking  |  http://blog.ianbicking.org


Re: [Python-Dev] Proposing PEP 376

2010-04-07 Thread Ian Bicking
On Wed, Apr 7, 2010 at 9:40 AM, Tarek Ziadé ziade.ta...@gmail.com wrote:

 so for the PEP :

 - sys.prefix - the installation prefix provided by --prefix at
 installation time
 - site-packages - the installation libdir, provided by --install-lib
 at installation time


How do you actually calculate site-packages?  Would you store the directory
name somewhere?  Would you import the module and look at
os.path.dirname(os.path.dirname(module.__file__))?  Or just scan to see
where the module would be?

If you store the directory name somewhere then you have another absolute
path.  This is why, for simplicity, I thought it should be relative to the
directory where the record file is (lots of extraneous ../, but the most
obvious meaning of a relative filename).
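Using the /usr/local layout discussed elsewhere in this thread (paths hypothetical), the relative filename falls straight out of relpath against the libdir:

```python
import posixpath

# Hypothetical locations matching the examples in this thread:
libdir = '/usr/local/lib/python2.6/site-packages'
script = '/usr/local/bin/rst2html.py'

# Recorded relative to the libdir, the script becomes:
print(posixpath.relpath(script, libdir))   # ../../../bin/rst2html.py
```

The same relative record stays valid if the whole prefix is relocated, which is the point of preferring it over an absolute path.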

-- 
Ian Bicking  |  http://blog.ianbicking.org


Re: [Python-Dev] Proposing PEP 376

2010-04-07 Thread Ian Bicking
On Wed, Apr 7, 2010 at 12:45 PM, P.J. Eby p...@telecommunity.com wrote:

  Examples under debian:

docutils/__init__.py - located in /usr/local/lib/python2.6/site-packages/
../../../bin/rst2html.py - located in /usr/local/bin
/etc/whatever - located in /etc


 I'm wondering if there's really any benefit to having
 ../../../bin/rst2html.py vs. /usr/local/bin/rst2html.py.  Was there a use
 case for that, or should we just go with relative paths ONLY for children of
 the libdir?

 (I only suggested this setup in order to preserve as much of the
 prefix-relativity proposal as possible, but I wasn't the one who proposed
 prefix-relativity so I don't recall what the use case is, and I don't even
 remember who proposed it.  I only ever had a usecase for libdir-relativity
 personally.)


Yes, in a virtualenv environment there will be ../../../bin/rst2html.py that
will still be under the (virtual) sys.prefix, and the whole bundle can be
usefully moved around.

-- 
Ian Bicking  |  http://blog.ianbicking.org


Re: [Python-Dev] Bootstrap script for package management tool in Python 2.7 (Was: Re: [Distutils] At least one package management tool for 2.7)

2010-03-29 Thread Ian Bicking
On Mon, Mar 29, 2010 at 11:26 AM, Larry Hastings la...@hastings.org wrote:

 anatoly techtonik wrote:

 So, there won't be any package management tool shipped with Python 2.7
 and users will have to download and install `setuptools` manually as
 before:

  search -> download -> unzip -> cmd -> cd -> python setup.py install


 Therefore I still propose shipping bootstrap package that instruct
 user how to download and install an actual package  management tool
 when users tries to use it.


 For what it's worth, Guido prototyped something similar in March of 2008,
 but his was an actual bootstrapping tool for package management:

   http://mail.python.org/pipermail/python-dev/2008-March/077837.html

 His tool knew how to download a tar file, untar it, and run python
 setup.py install on it.  No version numbers, no dependency management,
 simple enough that it should be easy to get right.  Only appropriate for
 bootstrapping into a real package management tool.

 The thread ends with him saying I don't have time to deal with this
 further this week, and I dunno, maybe it just fell off the radar?  I'd been
 thinking about resurrecting the discussion but I didn't have time either.


I would consider this bootstrap to be quite workable, though I would add
that any extra option to the bootstrap script should be passed to setup.py
install, and the download should be cached (so you can do -h and not have to
re-download the package once you figure out the extra options -- at least a
--user option is reasonable here for people without root).  Specifically
targeting this bootstrap for tools like pip and virtualenv is no problem.

I think looking around PyPI etc is kind of more than I'd bother with.  Those
things change, this bootstrap code won't change, it could cause unnecessary
future pain.  Maybe (*maybe*) it could look in
http://pypi.python.org/well-known-packages/PACKAGE_NAME and so we can have
it install a certain small number of things quickly that way -- if the URL
it looks to is targeted only for the bootstrap script itself then we don't
have to worry about compatibility problems as much.

Oh... then i can think of a half dozen other options it could take, and then
it becomes an installer.  Blech.  OK, I'd be willing to cut off the options
at --user (which I think is a minimum... maybe --prefix too), and maybe some
simple package detection so people could write python -m bootstrap
Setuptools --user -- entirely based on some well-known URL baked into
bootstrap.py, where the URL is independent of any other service (and so is
least likely to cause future problems or ambiguities).

An advantage to this kind of bootstrapper is that as future packaging
systems are developed there's a clear way to get started with them, without
prematurely baking anything in to Python.
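A rough sketch of such a bootstrapper — download a tarball, unpack it, run setup.py install — with the helper names invented here (a real version would also cache the download, as suggested above):

```python
import os, subprocess, sys, tarfile, tempfile, urllib.request

def top_level_dir(tf):
    """Top-level directory a well-formed sdist tarball unpacks into."""
    return tf.getnames()[0].split('/')[0]

def bootstrap(url, *install_args):
    """Fetch a source tarball and run `python setup.py install <args>`.

    No version handling, no dependency management -- just enough to
    install a real package management tool.  (Sketch only.)
    """
    with tempfile.TemporaryDirectory() as tmp:
        archive = os.path.join(tmp, 'pkg.tar.gz')
        urllib.request.urlretrieve(url, archive)
        with tarfile.open(archive) as tf:
            tf.extractall(tmp)
            srcdir = os.path.join(tmp, top_level_dir(tf))
        # Extra options (e.g. --user) are passed straight through:
        subprocess.check_call(
            [sys.executable, 'setup.py', 'install'] + list(install_args),
            cwd=srcdir)
```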

-- 
Ian Bicking  |  http://blog.ianbicking.org  |  http://twitter.com/ianbicking


Re: [Python-Dev] __file__

2010-02-26 Thread Ian Bicking
The one issue I thought would be resolved by not easily allowing
.pyc-only distributions is the case when you rename a file (say
module.py to newmodule.py) and there is a module.pyc laying around,
and you don't get the ImportError you would expect from import
module -- and to make it worse everything basically works, except
there's two versions of the module that slowly become different.  This
regularly causes problems for me, and those problems would get more
common and obscure if the pyc files were stashed away in a more
invisible location.

I can't even tell what the current proposal is; maybe this is
resolved?  If distributing bytecode required renaming pyc files to .py
as Glenn suggested that would resolve the problem quite nicely from my
perspective.  (Frankly I find the whole use case for distributing
bytecodes a bit specious, but whatever.)
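The rename hazard described here is at least easy to detect mechanically; a sketch assuming the pre-PEP-3147 layout where module.pyc sits beside module.py:

```python
import os

def orphaned_pycs(root):
    """Yield .pyc files whose source .py no longer exists
    (pre-PEP-3147 layout: module.pyc next to module.py)."""
    for dirpath, dirnames, filenames in os.walk(root):
        names = set(filenames)
        for name in filenames:
            # 'module.pyc'[:-1] == 'module.py'
            if name.endswith('.pyc') and name[:-1] not in names:
                yield os.path.join(dirpath, name)
```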

-- 
Ian Bicking  |  http://blog.ianbicking.org  |  http://twitter.com/ianbicking


Re: [Python-Dev] Proposal for virtualenv functionality in Python

2010-02-21 Thread Ian Bicking
 be possible
to move these environments around without breaking things.  That would be
compelling.


 I'm one of those folks who'd like to see this be stackable.  If we tweak
 the semantics just a bit I think it works:

   * pythonv should inspect its --prefix arguments, as well as passing
 them on to the child python process it runs.


With a config file I'd just expect a list of prefixes being allowed;
directly nesting feels unnecessarily awkward.  You could use a : (or
Windows-semicolon) list just like with PYTHONPATH.


   * When pythonv wants to run the next python process in line, it
 scans the path looking for the pythonX.X interpreter but /ignores/
 all the interpreters that are in in a --prefix bin directory it's
 already seen.
   * python handles multiple --prefix options, and later ones take
 precedence over earlier ones.
   * What should sys.interpreter be?  Explicit is better than implicit:
 the first pythonv to run also adds a --interpreter argv[0] to
 the front of the command-line.  Or they could all add it and
 python only uses the last one.  This is one area where python vs
 python3.2 makes things a little complicated.


Ah, yes, the same problem I note above.  It should definitely be the thing
the person actually typed, or what is in the #! line.



 I'm at PyCon and would be interested in debating / sprinting on this if
 there's interest.


Yeah, if you see me around, please catch me!


-- 
Ian Bicking  |  http://blog.ianbicking.org  |  http://twitter.com/ianbicking


Re: [Python-Dev] Proposal for virtualenv functionality in Python

2010-02-20 Thread Ian Bicking
On Fri, Feb 19, 2010 at 10:39 PM, Glenn Linderman
v+pyt...@g.nevcal.com wrote:

 On approximately 2/19/2010 1:18 PM, came the following characters from the
 keyboard of P.J. Eby:

  At 01:49 PM 2/19/2010 -0500, Ian Bicking wrote:

 I'm not sure how this should best work on Windows (without symlinks,
 and where things generally work differently), but I would hope if
 this idea is more visible that someone more opinionated than I would
 propose the appropriate analog on Windows.


 You'd probably have to just copy pythonv.exe to an appropriate
 directory, and have it use the configuration file to find the real
 prefix.  At least, that'd be a relatively obvious way to do it, and it
 would have the advantage of being symmetrical across platforms: just
 copy or symlink pythonv, and make sure the real prefix is in your
 config file.

 (Windows does have shortcuts but I don't think that there's any way
 for a linked program to know *which* shortcut it was launched from.)


 No automatic way, but shortcuts can include parameters, not just the
 program name.  So a parameter could be --prefix as was suggested in another
 response, but for a different reason.

 Windows also has hard-links for files.

 A lot of Windows tools are completely ignorant of both of those linking
 concepts... resulting in disks that look to be over capacity when they are
 not, for example.


Virtualenv uses copies when it can't use symlinks.  A copy (or hard link)
seems appropriate on systems that do not have symlinks.  It would seem
reasonable that on Windows it might look in the registry to find the actual
location where Python was installed.  Or... whatever technique Windows
people think is best; it's simply necessary that the interpreter know its
location (the isolated environment) and also know where Python is installed.
 All this needs to be calculated in C, as the standard library needs to be
on the path very early (so os.symlink wouldn't help, but any C-level
function to determine this would be helpful).
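One plausible shape for the config-file lookup (the `home =` key and file layout are invented here; essentially this scheme is what PEP 405's pyvenv.cfg later standardized, done in C at startup):

```python
def read_real_prefix(cfg_path):
    """Return the 'home' value from a small `key = value` config file
    sitting next to the copied interpreter, pointing at the real
    Python installation.  (Hypothetical format; sketch only.)"""
    with open(cfg_path) as f:
        for line in f:
            key, sep, value = line.partition('=')
            if sep and key.strip() == 'home':
                return value.strip()
    return None
```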

(It's maybe a bit lame of me that I'm dropping this in the middle of PyCon,
as I'm not online frequently during the conference; sorry about that)

-- 
Ian Bicking  |  http://blog.ianbicking.org  |  http://twitter.com/ianbicking


[Python-Dev] Proposal for virtualenv functionality in Python

2010-02-19 Thread Ian Bicking
 need to be aware of this to
compile extensions properly (we can be somewhat aware of these cases by
looking at places where virtualenv already has problems compiling
extensions).

Some people have argued for something like sys.prefixes, a list of locations
you might look at, which would allow a kind of nesting of these environments
(where sys.prefixes[-1] == sys.prefix; or maybe reversed).  Personally this
seems like it would be hard to keep mental track of this, but I can
understand the purpose -- you could for instance create a kind of template
prefix that has *most* of what you want installed in it, then create
sub-environments that contain for instance an actual application, or a
checkout (to test just one new piece of code).

I'm not sure how this should best work on Windows (without symlinks, and
where things generally work differently), but I would hope if this idea is
more visible that someone more opinionated than I would propose the
appropriate analog on Windows.


-- 
Ian Bicking  |  http://blog.ianbicking.org  |  http://twitter.com/ianbicking


Re: [Python-Dev] Improved Traceback Module

2010-01-28 Thread Ian Bicking
On Thu, Jan 28, 2010 at 11:01 AM, s...@pobox.com wrote:


pje> If you look for a local variable in each frame containing a format
pje> string, let's say __trace__, you could apply that format string to
pje> a locals+globals dictionary for the frame, in place of dumping all
pje> the locals by default

 I commented on the blog post before noticing all the replies here.  I'll
 embellish that suggestion by suggesting that instance attributes can be as
 valuable when debugging instance methods.  Perhaps __trace_self__ (or
 similar) could be fed from self.__dict__ if it exists?


It seems reasonable to special case the variable named self.  You might
also want other hooks.  For instance in weberror we take a convention from
Zope of looking or __traceback_supplement__, which is a factory for an
object that informs the traceback (a factory so you don't have to actually
instantiate it until there's an error).  I then extended its protocol a bit,
and use it for putting request information into the traceback.  I can
imagine two lighter ways to do this.  One is something like:

__traceback_inspect__ = ['self', 'request']

which indicates those two local variables should be inspected.  Another
might be some magic method on the request object.  Of course if
repr(request) is sufficient then you are golden.  But it almost certainly
isn't sufficient.  There's usually key objects that deserve special
attention in the case of an error, but which you don't want to flood the
output just because you happen to print their repr.  (With WebOb actually
str(request) would be quite good, while repr(request) would be too brief.)

To echo Guido, in my own traceback extensions I have at least a couple
levels of try:except: around anything fancy.  repr() definitely fails.
 Unicode errors happen at a lot of different levels (repr() returning
unicode, for example).  And everything you do may break simply by an error
in the code, and you still shouldn't lose at least the old traceback, so
putting one big try:except:traceback.print_exc() around your code is also
appropriate.  Well... not quite appropriate because that would show the
exception in the traceback machinery.  Instead you should save exc_info and
show both tracebacks.
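The defensive repr() recommended here might look like this sketch (names and limits invented):

```python
def safe_repr(obj, maxlen=200):
    """repr() that never raises and never floods the output."""
    try:
        r = repr(obj)
    except Exception as exc:
        try:
            r = '<unrepresentable %s: %s>' % (type(obj).__name__, exc)
        except Exception:
            # even formatting the failure can fail
            r = '<unrepresentable object>'
    if len(r) > maxlen:
        r = r[:maxlen] + '...'
    return r
```

The same wrap-everything pattern applies one level up: the whole traceback formatter runs inside its own try/except, with the original exc_info saved so neither traceback is lost.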

Given the amount of data involved you also don't want the traceback to
become too hard to read for simple bugs.  What is really useful for an
unattended process that occasionally fails with unexpected input, may be
excessive for development; either it has to be easy to switch on and off, or
there needs to be some compromise.  In HTML it's easy to make a compromise
(put in a little Javascript to hide the extended detail until asked for, for
instance).  Of course, in some contexts (an email, a web page) the stuff at
the top is most visible, and a great place for an abbreviated view, while in
other contexts (mostly at a console) the bottom is easiest.

Oh, and you even should consider: will you get a unicode error on output?
 I'd actually suggest returning a unicode subclass that won't ever emit
UnicodeEncodeError when it is converted to a str or bytes.  Correctness at
that stage is not as important as not losing the exception.

So... a few suggestions.

-- 
Ian Bicking  |  http://blog.ianbicking.org


Re: [Python-Dev] Executing zipfiles and directories (was Re: PyCon Keynote)

2010-01-26 Thread Ian Bicking
On Tue, Jan 26, 2010 at 1:40 PM, Paul Moore p.f.mo...@gmail.com wrote:

 2010/1/26 Nick Coghlan ncogh...@gmail.com:
  Glenn Linderman wrote:
  That would seem to go a long ways toward making the facility user
  friendly, at least on Windows, which is where your complaint about icons
  was based, and the only change to Python would be to recognize that if a
  .py contains a .zip signature,
 
  That should work today - the zipfile/directory support shouldn't care
  about the filename at all (although the test suite doesn't currently
  cover any extensions other than .zip, so I could be wrong about that...).

 You're right, it works:

 type __main__.py
 print "Hello from a zip file"

 zip mz.py __main__.py
  adding: __main__.py (172 bytes security) (stored 0%)

 mz.py
 Hello from a zip file


Sadly you can't then do:

  chmod +x mz.py
  ./mz.py

because it doesn't have #!/usr/bin/env python like typical executable
Python scripts have.  You can put the shebang line at the beginning of the
zip file, and zip will complain about it but will still unpack the file, but
it won't be runnable as Python won't recognize it as a zip anymore.  Now if
you could, say, put in #!/usr/bin/env pythonz (and then implement a
pythonz command that could do useful stuff) then that might work.  Though
generally #! is so broken that it's really hard to come up with a reasonable
option for these cases.
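For what it's worth, the shebang-prepend trick can be scripted from Python itself; a sketch (assuming a new-enough Python, which locates the zip's central directory at the end of the file and so tolerates the leading shebang bytes):

```python
import os
import stat
import zipfile

# Sketch: write a shebang line, then the zip archive, into one file.
# Python reads the zip directory from the *end* of the file, so the
# leading shebang bytes don't break zipimport.
with open("mz.py", "wb") as f:
    f.write(b"#!/usr/bin/env python\n")
    with zipfile.ZipFile(f, "w") as zf:
        zf.writestr("__main__.py", "print('Hello from a zip file')\n")

# Mark the result executable so ./mz.py works from a shell.
os.chmod("mz.py", os.stat("mz.py").st_mode | stat.S_IEXEC)
```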

-- 
Ian Bicking  |  http://blog.ianbicking.org


Re: [Python-Dev] Executing zipfiles and directories (was Re: PyCon Keynote)

2010-01-26 Thread Ian Bicking
On Tue, Jan 26, 2010 at 2:44 PM, Glyph Lefkowitz gl...@twistedmatrix.comwrote:


 On Jan 26, 2010, at 3:20 PM, Ian Bicking wrote:

 Sadly you can't then do:

   chmod +x mz.py
   ./mz.py


 Unless I missed some subtlety earlier in the conversation, yes you can :).


You are entirely correct; I accidentally was using Python 2.5 in my test.

-- 
Ian Bicking  |  http://blog.ianbicking.org


Re: [Python-Dev] Suggestion: new 3 release with backwards compatibility

2010-01-05 Thread Ian Bicking
On Tue, Jan 5, 2010 at 11:21 AM, Brian Curtin brian.cur...@gmail.comwrote:

 On Tue, Jan 5, 2010 at 10:10, Juan Fernando Herrera J. 
 juan...@gmail.comwrote:

 How about a new python 3 release with (possibly partial) backwards
 compatibility with 2.6? I'm a big 3 fan, but I'm dismayed at the way major
 software hasn't been ported to it. I'm eager to use 3, but paradoxically,
 the 3 release makes me rather stuck with 2.6. Excuse me if this has been
 suggested in the past.


 The proper route to take, in my opinion, is to see what 2.x libraries you
 are using that are not 3.x compatible, run 2to3 on them, then run their test
 suite, and see where you get. Submit a patch or two to the library and see
 what happens -- it at least gets the wheels in motion.


It's not even that easy -- libraries can't apply patches for Python 3
compatibility as they usually break Python 2 compatibility.  Potentially
libraries could apply patches that make a codebase 2to3-ready, but from what
I've seen that's more black magic than straightforward updating, as such
patches have to trick 2to3 into producing the desired output.

The only workable workflow I've seen people propose for maintaining a single
codebase with compatibility across both 2 and 3 is to use such tricks, with
aliases to suppress some 2to3 updates when they are inappropriate, so that
you can run 2to3 on install and have a single canonical Python 2 source.
 Python 2.7 won't help much (even though it is trying) as unambiguous
constructions like b'' literals aren't compatible with previous
versions of Python and so can't be used in many libraries (support at least
back to Python 2.5 is the norm for most libraries, I think).

Also, running 2to3 on installation is kind of annoying, as you get source
that isn't itself the canonical source, so to fix bugs you have to look at
the installed source and trace it back to the bug in the original source.

I suspect a reasonable workflow might be possible with hg and maybe patch
queues, but I don't feel familiar enough with those tools to map that out.

-- 
Ian Bicking  |  http://blog.ianbicking.org  |
http://topplabs.org/civichacker


Re: [Python-Dev] Suggestion: new 3 release with backwards compatibility

2010-01-05 Thread Ian Bicking
On Tue, Jan 5, 2010 at 3:07 PM, Martin v. Löwis mar...@v.loewis.de wrote:

  It's not even that easy -- libraries can't apply patches for Python 3
  compatibility as they usually break Python 2 compatibility.  Potentially
  libraries could apply patches that make a codebase 2to3 ready, but from
  what I've seen that's more black magic than straight forward updating,
  as such patches have to trick 2to3 producing the output that is desired.

 I wouldn't qualify it in that way. It may be necessary occasionally to
 trick 2to3, but that's really a bug in 2to3 which you should report, so
 that trickery is then a work-around for a bug - something that you may
 have to do with other API, as well.

 The black magic is really more in the parts that 2to3 doesn't touch
 at all (because they are inherently not syntactic); these are the
 problem areas Guido refers to. The black magic then is to make the
 same code work unmodified for both 2.x and 3.x.

Just to clarify, the black magic I'm referring to is things like:

try:
    unicode_ = unicode
except NameError:
    unicode_ = str

and some other aliases like this that are unambiguous and which 2to3
won't touch (if you write them correctly).  If the porting guide noted
all these tricks (of which several have been developed, and I'm only
vaguely aware of a few) that would be helpful.  It's not a lot of
tricks, but the tricks are not obvious and 2to3 gets the translation
wrong pretty often without them.  For instance, when I say str in
Python 2 I often mean bytes, unsurprisingly, but 2to3 translates both
str and unicode to str.  That *nothing* translates to bytes by default
(AFAIK) means that people must either be living in a bytes-free world
(which sure, lots of code does) or they are using tricks not included
in 2to3 itself.
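For illustration, a few more aliases in the same style -- my own names, not from any porting guide -- showing how bytes intent can be made explicit where 2to3 can't infer it:

```python
import sys

# Aliases in the same unambiguous spirit; written so 2to3 leaves them alone.
if sys.version_info[0] >= 3:
    text_type = str
    binary_type = bytes
else:  # Python 2 branch
    text_type = unicode       # noqa: F821 -- only defined on Python 2
    binary_type = str

def ensure_binary(s, encoding="utf-8"):
    """Say explicitly 'this really is bytes' -- intent 2to3 cannot infer."""
    return s.encode(encoding) if isinstance(s, text_type) else s

print(ensure_binary("abc"))
```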


Also replying to Glyph:
  Also, running 2to3 on installation is kind of annoying, as you get source 
  that isn't itself the canonical source, so to fix bugs you have to look at 
  the installed source and trace it back to the bug in the original source.

 Given the way tracebacks are built, i.e. from filenames stored in .pycs 
 rather than based on where the code was actually loaded in
the filesystem, couldn't 2to3 could do .pyc rewriting to point at the
original source?  Sort of like our own version of the #line directive?
:)

 Seriously though, I find it hard to believe that this is a big problem.  The 
 3.x source looks pretty similar to the 2.x source, and it's good to look at 
 both if you're dealing with a 3.x issue.

Since 2to3 maintains line numbers yes, it wouldn't be that bad.  But
then I don't currently develop any code that is installed, I only
develop code that is directly from a source code checkout, and where
the checkout is put on the path.  I guess I could have something that
automatically builds the code on every edit, and that's not
infeasible.  It's just not fun.  So long as I have to support Python 2
(which is like forever) then adding Python 3 only makes development
that much more complicated and much less fun, with no concrete
benefits.  Which is terribly crotchety of me.  Sorry.

--
Ian Bicking  |  http://blog.ianbicking.org  |  http://topplabs.org/civichacker


Re: [Python-Dev] Pronouncement on PEP 389: argparse?

2009-12-14 Thread Ian Bicking
On Mon, Dec 14, 2009 at 12:04 PM, Steven Bethard
steven.beth...@gmail.com wrote:
 So there wasn't really any more feedback on the last post of the
 argparse PEP other than a typo fix and another +1.

I just converted a script over to argparse.  It seems nice enough, I
was doing a two-level command, and it was quite handy for that.

One concern I had is that the naming seems at times trivially
different than optparse, just because opt or option is replaced by
arg or argument.  So .add_option becomes .add_argument, and
OptionParser becomes ArgumentParser.  This seems unnecessary to me,
and it makes converting the application harder than it has to be.  It
wasn't hard, but it could have been really easy.  There are a couple
other details like this that I think are worth resolving if argparse
really is supposed to replace optparse.

I'd change this language:
The optparse module is deprecated, and has been replaced by the
argparse module.
To:
The optparse module is deprecated and will not be developed further;
development will continue with the argparse module

There's a lot of scripts using optparse, and if they are successfully
using it there's no reason to stop using it.  The proposed language
seems to imply it is wrong to keep using optparse, which I don't think
is the case.  And people can pick up on this kind of language and get
all excitable about it.
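To make the overlap concrete, here is a minimal sketch of the converted style, with the optparse spellings noted in comments:

```python
import argparse

# optparse:                          argparse:
#   parser = OptionParser()            parser = ArgumentParser()
#   parser.add_option(...)             parser.add_argument(...)
#   options, args = parser.parse_args()   args = parser.parse_args()
parser = argparse.ArgumentParser(prog="mytool")
parser.add_argument("-n", "--num", type=int, default=1)  # optparse: type="int"
args = parser.parse_args(["-n", "3"])
print(args.num)
```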

-- 
Ian Bicking  |  http://blog.ianbicking.org  |  http://topplabs.org/civichacker


Re: [Python-Dev] Pronouncement on PEP 389: argparse?

2009-12-14 Thread Ian Bicking
On Mon, Dec 14, 2009 at 12:43 PM, Steven Bethard
steven.beth...@gmail.com wrote:
 On Mon, Dec 14, 2009 at 10:22 AM, Ian Bicking i...@colorstudy.com wrote:
 On Mon, Dec 14, 2009 at 12:04 PM, Steven Bethard
 steven.beth...@gmail.com wrote:
 So there wasn't really any more feedback on the last post of the
 argparse PEP other than a typo fix and another +1.

 I just converted a script over to argparse.  It seems nice enough, I
 was doing a two-level command, and it was quite handy for that.

 One concern I had is that the naming seems at times trivially
 different than optparse, just because opt or option is replaced by
 arg or argument.  So .add_option becomes .add_argument, and
 OptionParser becomes ArgumentParser.  This seems unnecessary to me,
 and it make converting the application harder than it had to be.  It
 wasn't hard, but it could have been really easy.  There are a couple
 other details like this that I think are worth resolving if argparse
 really is supposed to replace optparse.

 Thanks for the feedback. Could you comment further on exactly what
 would be sufficient? It would be easy, for example, to add a subclass
 of ArgumentParser called OptionParser that has an add_option method.
 Do you also need the following things to work?

Well, to argue against myself: having another class like OptionParser
also feels like backward compatibility cruft.  argparse is close
enough to optparse (which is good) that I just wish it was a bit
closer.

 * options, args = parser.parse_args() # options and args aren't
 separate in argparse

This is a substantive enough difference that I don't really mind it,
though if OptionParser really was a different class then maybe
parse_args should act the same as optparse.OptionParser.  What happens
if you have positional arguments, but haven't declared any such
arguments with .add_argument?  Does it just result in an error?  I
suppose it must.

 * type='int', etc. # string type names aren't used in argparse

This seems simple to support and unambiguous, so yeah.

 * action='store_false' default value is None # it's True in argparse

I don't personally care about this; I agree the None default in
optparse is sometimes peculiar (also for action='count' and
action='append', where 0 and [] are the sensible defaults).

Also I'd like %prog and %default supported, which should be fairly
simple; heck, you could just do something like usage.replace('%prog',
'%(prog)s') before substitution.  Since %prog isn't otherwise valid
(unless it was %%prog, which seems unlikely?) this seems easy.
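The %prog shim really is as small as suggested; a sketch:

```python
import argparse

# Sketch: translate optparse's %prog into argparse's %(prog)s before use.
usage = "%prog [options] FILE"          # optparse-style usage string
parser = argparse.ArgumentParser(
    prog="mytool",
    usage=usage.replace("%prog", "%(prog)s"))
print(parser.format_usage())
```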


Ideally I really wish ArgumentParser was just named OptionParser, and
that .add_argument was .add_option, and that argparse's current
parse_args was named something different, so both the optparse
parse_args (which returns (options, args)) and argparse's different
parse_args return value could coexist.  Also generally if the common
small bits of optparse (like type='int' and %prog) just worked, even
if they weren't really extensible in the same manner as optparse.

Another thing I just noticed is that argparse uses -v for version
where optparse does not (it only adds --version); most of my scripts
use -v to mean --verbose, which causes problems.  Since this is a poll
question on the argparse site I assume this is an outstanding question
for argparse, but just generally I think that doing things the same
way as optparse should be preferred when at all reasonable.


-- 
Ian Bicking  |  http://blog.ianbicking.org  |  http://topplabs.org/civichacker


Re: [Python-Dev] Pronouncement on PEP 389: argparse?

2009-12-14 Thread Ian Bicking
On Mon, Dec 14, 2009 at 6:34 PM, sstein...@gmail.com
sstein...@gmail.com wrote:
 Although I am one of the people who think working modules shouldn't be 
 deprecated, I
 also don't think adding compatibility aliases is a good idea. They only make 
 the
 APIs more bloated and maintenance more tedious. Let's keep the new APIs 
 clean of
 any unnecessary baggage.

 Agreed.  If you want to make an adapter to do things like convert 'int' to 
 int, then call the new API then fine, but don't start crufting up a new API 
 to make it 'easier' to convert.

 All crufting it up does is make it _less_ clear how to use the new API by 
 bring along things that don't belong in it.

The new API is almost exactly like the old optparse API.  It's not
like it's some shining jewel of perfection that would be tainted by
somehow being similar to optparse when it's almost exactly like
optparse already.

If it wasn't like optparse, then fine, whatever; but it *is* like
optparse, so these differences feel unnecessary.  Converting 'int' to
int internally in argparse is hardly difficult or unclear.

If argparse doesn't do this, then I think at least it should give good
error messages for all cases where these optparse-isms remain.  For
instance, now if you include %prog in your usage you get: ValueError:
unsupported format character 'p' (0x70) at index 1 -- that's simply a
bad error message.  Giving a proper error message takes about as much
code as making %prog work.  I don't feel strongly that one is better
than the other, but at least one of those should be done.


-- 
Ian Bicking  |  http://blog.ianbicking.org  |  http://topplabs.org/civichacker


Re: [Python-Dev] Unittest/doctest formatting differences in 2.7a1?

2009-12-09 Thread Ian Bicking
On Wed, Dec 9, 2009 at 11:23 AM, Lennart Regebro lrege...@jarn.com wrote:

   Evolving the tests to avoid depending on these sorts of implementation
  details is reasonable, IMO, and cuold even be considered a bugfix by
  the Zope community.

 Evolving doctest.py so it can handle this by itself would be
 considered a bugfix by me. :)


It's about time doctest got another run of development anyway.  I can
imagine a couple features that might help:

* Already in there, but sometimes hard to enable, is ellipsis.  Can you
already do this?


>>> throw_an_exception()
Traceback (most recent call last):
...
DesiredException: ...

I'd like to see doctests be able to enable the ELLIPSIS option internally
and globally (currently it can only be enabled outside the doctest, or for a
single line).

* Another option might be something version-specific, like:

>>> throw_an_exception() # +python<2.7
... old exception ...
>>> throw_an_exception() # +python>=2.7
... new exception ...

* Maybe slightly more general, would be the ability to extend OutputCheckers
more easily than currently.  Maybe for instance # py_version(less=2.7)
would enable the py_version output checker, which would always succeed if
the version was greater than or equal to 2.7 (effectively ignoring the
output).  Or, maybe checkers could be extended so they could actually
suppress the execution of code (avoiding throw_an_exception() from being
called twice).

* Or, something more explicit than ELLIPSIS but able also be more flexible
than currently possible, like:

>>> throw_an_exception()
Traceback (most recent call last):
...
DesiredException: [[2.6 error message | 2.7 error message]]
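For reference, the closest thing today to enabling ELLIPSIS "internally and globally" is to do it programmatically, from outside the doctest; a sketch:

```python
import doctest

sample = """
>>> raise ValueError("some long, version-specific message")
Traceback (most recent call last):
    ...
ValueError: ...
"""

# Enable ELLIPSIS for the whole doctest from outside it -- currently the
# only global option, short of a per-example "# doctest: +ELLIPSIS".
test = doctest.DocTestParser().get_doctest(sample, {}, "sample", None, 0)
runner = doctest.DocTestRunner(optionflags=doctest.ELLIPSIS, verbose=False)
runner.run(test)
print(runner.failures)  # 0 -- the "..." matched the exception detail
```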

-- 
Ian Bicking  |  http://blog.ianbicking.org  |
http://topplabs.org/civichacker


Re: [Python-Dev] Unittest/doctest formatting differences in 2.7a1?

2009-12-09 Thread Ian Bicking
On Wed, Dec 9, 2009 at 5:47 PM, Paul Moore p.f.mo...@gmail.com wrote:
 2009/12/9 Lennart Regebro lrege...@jarn.com:
 On Wed, Dec 9, 2009 at 18:45, Ian Bicking i...@colorstudy.com wrote:
 It's about time doctest got another run of development anyway.  I can
 imagine a couple features that might help:
 * Already in there, but sometimes hard to enable, is ellipsis.  Can you
 already do this?

      >>> throw_an_exception()
     Traceback (most recent call last):
         ...
     DesiredException: ...

 I think so, but what you need is:

      >>> throw_an_exception()
     Traceback (most recent call last):
         ...
    ...DesiredException: ...

 No you don't. From the manual:

 
 When the IGNORE_EXCEPTION_DETAIL doctest option is specified,
 everything following the leftmost colon is ignored.
 

 So just use #doctest: +IGNORE_EXCEPTION_DETAIL

Maybe that could be extended to also ignore everything up to a period
(i.e., ignore the module name that seems to show up in 2.7 exception
names, but not in previous versions).


-- 
Ian Bicking  |  http://blog.ianbicking.org  |  http://topplabs.org/civichacker


Re: [Python-Dev] PyPI front page

2009-11-12 Thread Ian Bicking
On Thu, Nov 12, 2009 at 7:52 PM, Antoine Pitrou solip...@pitrou.net wrote:

 Ben Finney ben+python at benfinney.id.au writes:
 
  There's a problem with the poll's placement: on the front page of the
  PyPI website.

 Speaking of which, why is it that http://pypi.python.org/pypi and
 http://pypi.python.org/pypi/ (note the ending slash) return different
 contents
 (the latter being very voluminous)? I always mistake one for the other when
 entering the URL directly.


easy_install relied on the behavior of /pypi/ (it uses the long list to do
case-insensitive searches).  Someone changed it, easy_install broke, and a
compromise was to keep /pypi/ the way it was (but not /pypi).

Probably this could be removed, as the /simple/ index is already
case-insensitive, so easy_install shouldn't have to hit /pypi/ at all.

-- 
Ian Bicking  |  http://blog.ianbicking.org  |
http://topplabs.org/civichacker


Re: [Python-Dev] Distutils and Distribute roadmap (and some words on Virtualenv, Pip)

2009-10-09 Thread Ian Bicking
On Fri, Oct 9, 2009 at 3:54 AM, kiorky kio...@cryptelium.net wrote:
 If I had my way, buildout would use virtualenv and throw away its
 funny script generation.  If virtualenv had existed before buildout

 Which one, the one provided to generate scripts from entry points with the 
 *.egg
 recipes or the bin/buildout auto regeneration?

Well, if multi-versioned installs were deprecated, it would not be
necessary to use Setuptools' style of script generation.  Instead you
could simply dereference the entry point, calling the underlying
function directly in the script.  This detail is probably more of a
distutils-sig question, and I don't have a strong opinion.

But I was thinking specifically of the egg activation buildout puts at
the top of scripts.

 began development, probably things would have gone this way.  I think
 it would make the environment more pleasant for buildout users.  Also

 * I don't think so, buildout is the only tool atm that permit to have really
 reproducible and isolated environments. Even, if you use the pip freezing
 machinery, it is not equivalent to buildout, Control!

I believe that to fully insulate buildout you still need virtualenv
--no-site-packages.  But I'm not arguing that virtualenv/pip makes
buildout obsolete, only that they have overlapping functionality, and
I think buildout would benefit from making use of that overlap.

 * Buildout can have single part to construct required eggs, at a specific
 version and let you control that. Pip will just search for this version, see
 that it's not available and fail. You have even recipes (like
 minitage.recipe.egg that permit to construct eggs with special version when 
 you
 apply patches onto, thus, you can have the same egg in different flavors in 
 the
 same eggs cache available for different projects. Those projects will just 
 have
 to pin the right version to use, Control!.

In my own work I use multiple virtualenv environments for this use
case, to similar effect.  pip of course is not a generalized build
tool, but then minitage.recipe.egg is not the main egg installer
either.

 * Another thing is the funny script generation, you have not one global
 site-packages for your project, but one global cache. But from this global
 cache, your scripts will only have available the eggs you declared, see 
 Control!
 * Moreover buildout is not only a python packages manager, it's some of its
 recipes that permit to use it as. Buildout is just a great deployment tool 
 that
 allow to script and manage your project in a funny and flexible way, 
 Control!

Sure; I'm just advocating that buildout more explicitly use some of
the functionality of virtualenv/pip (which may require some more
features in those tools, but I'm open to that).  But specific
discussion of this would probably be more appropriate on
distutils-sig.

-- 
Ian Bicking  |  http://blog.ianbicking.org  |  http://topplabs.org/civichacker


Re: [Python-Dev] Distutils and Distribute roadmap (and some words on Virtualenv, Pip)

2009-10-09 Thread Ian Bicking
On Fri, Oct 9, 2009 at 7:32 AM, Paul Moore p.f.mo...@gmail.com wrote:
 2009/10/9 Antoine Pitrou solip...@pitrou.net:
 Ian Bicking ianb at colorstudy.com writes:

 Someone mentioned that easy_install provided some things pip didn't;
 outside of multi-versioned installs (which I'm not very enthusiastic
 about) I'm not sure what this is?

 http://pip.openplans.org/#differences-from-easy-install

 If it's obsolete the website should be updated...

 Specifically, combine "only installs from source" with "might not work
 on Windows" and the result is pretty certainly unusable for C
 extensions on Windows. You can pretty much guarantee that the average
 user on Windows won't have a C compiler[1], and even if they do, they
 won't be able to carefully line up all the 3rd party C libraries
 needed to build some extensions.

 Binary packages are essential on Windows.

I'll admit I have some blindness when it comes to Windows.  I agree
binary installation on Windows is important.  (I don't think it's very
important on other platforms, or at least not very effective in
easy_install so it wouldn't be a regression.)

I note some other differences in that document:

 It cannot install from eggs. It only installs from source. (Maybe this will 
 be changed sometime, but it’s low priority.)

Outside of binaries on Windows, I'm still unsure if installing eggs
serves a useful purpose.  I'm not sure if eggs are any better than
wininst binaries either...?

 It doesn’t understand Setuptools extras (like package[test]). This should be 
 added eventually.

I haven't really seen Setuptools' extras used effectively, so I'm
unsure if it's a useful feature.  I understand the motivation for
extras, but motivated features aren't necessarily useful features.

 It is incompatible with some packages that customize distutils or setuptools 
 in their setup.py files.

I don't have a solution for this, and generally easy_install does not
perform much better than pip in these cases.  Work in Distribute
hopefully will apply to this issue.

-- 
Ian Bicking  |  http://blog.ianbicking.org  |  http://topplabs.org/civichacker


Re: [Python-Dev] Distutils and Distribute roadmap (and some words on Virtualenv, Pip)

2009-10-09 Thread Ian Bicking
Probably all these discussions are better on distutils-sig (just
copying python-dev to note the movement of the discussion)

On Fri, Oct 9, 2009 at 11:49 AM, Michael Foord
fuzzy...@voidspace.org.uk wrote:
 Outside of binaries on Windows, I'm still unsure if installing eggs
 serves a useful purpose.  I'm not sure if eggs are any better than
 wininst binaries either...?

 Many Windows users would be quite happy if the standard mechanism for
 installing non-source distributions on Windows was via the wininst binaries.

 I wonder if it is going to be possible to make this compatible with the
 upcoming distutils package management 'stuff' (querying for installed
 packages, uninstallation etc) since installation/uninstallation goes through
 the Windows system package management feature.  I guess it would be
 eminently possible but require some reasonably high level Windows-fu to do.

As for how pip works: it unpacks a package and runs python setup.py
install (plus some options that aren't that interesting, but are
provided specifically by setuptools).  Well, it's slightly more
complicated, but more to the point it doesn't install in-process or
dictate how setup.py works, except that it takes some specific
options.  Running a Windows installer in the same way would be fine,
in that sense.  Alternately pip could unpack the wininst zip file and
install it directly; I'm not sure if that would be better or worse?
If wininst uses the central package manager of the OS then certain
features (like virtualenv, PYTHONHOME, --prefix, etc) would not be
possible.

For Distribute (or Setuptools or by association pip) to see that a
package is installed, it must have the appropriate metadata.  For
Setuptools (and Distribute 0.6) this is a directory or file, on
sys.path, Package.egg-info (or in Package-X.Y.egg/EGG-INFO).  If a
file, it should be a PKG-INFO file, if a directory it should contain a
PKG-INFO file.  So however the package gets installed, if that
metadata is installed then it can be queried.  I don't think querying
the Windows system package management would be necessary or desirable.
 Nobody is trying that with deb/rpm either.
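A simplified sketch of that discovery (the real Setuptools/Distribute lookup handles more layouts, e.g. .egg directories and zip files, and the function name here is mine):

```python
import os
import sys

def find_egg_info(paths=None):
    """Scan path entries for *.egg-info files/directories -- the metadata
    described above; each one marks an installed distribution."""
    found = []
    for entry in (sys.path if paths is None else paths):
        if not os.path.isdir(entry):
            continue
        for name in sorted(os.listdir(entry)):
            if name.endswith(".egg-info"):
                found.append(os.path.join(entry, name))
    return found
```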

-- 
Ian Bicking  |  http://blog.ianbicking.org  |  http://topplabs.org/civichacker


Re: [Python-Dev] Distutils and Distribute roadmap (and some words on Virtualenv, Pip)

2009-10-08 Thread Ian Bicking
.  Also virtualenv
offers more system isolation.

If I had my way, buildout would use virtualenv and throw away its
funny script generation.  If virtualenv had existed before buildout
began development, probably things would have gone this way.  I think
it would make the environment more pleasant for buildout users.  Also
I wish it used pip instead of its own installation procedure (based on
easy_install).  I don't think the philosophical differences are that
great, and that it's more a matter of history -- because the code is
written, there's not much incentive for buildout to remove that code
and rely on other libraries (virtualenv and pip).

--
Ian Bicking  |  http://blog.ianbicking.org  |  http://topplabs.org/civichacker


Re: [Python-Dev] [Web-SIG] Adding wsgiref to stdlib

2006-05-24 Thread Ian Bicking
Ian Bicking wrote:
 Phillip J. Eby wrote:
 
At 02:32 PM 4/28/2006 -0500, Ian Bicking wrote:

I'd like to include paste.lint with that as well (as wsgiref.lint or
whatever).  Since the last discussion I enumerated in the docstring all
the checks it does.  There's still some outstanding issues, mostly where
I'm not sure if it is too restrictive (marked with @@ in the source).
It's at:

   http://svn.pythonpaste.org/Paste/trunk/paste/lint.py

Ian, I see this is under the MIT license.  Do you also have a PSF 
contributor agreement (to license under AFL/ASF)?  If not, can you place 
a copy of this under a compatible license so that I can add this to the 
version of wsgiref that gets checked into the stdlib?
 
 
 I don't have a contributor agreement.  I can change the license in 
 place, or sign an agreement, or whatever; someone should just tell me 
 what to do.

I faxed in a contributor agreement, and added this to the comment 
header of the file:


# Also licenced under the Apache License, 2.0: 
http://opensource.org/licenses/apache2.0.php
# Licensed to PSF under a Contributor Agreement


Re: [Python-Dev] [Web-SIG] Adding wsgiref to stdlib

2006-05-22 Thread Ian Bicking
Phillip J. Eby wrote:
 At 02:32 PM 4/28/2006 -0500, Ian Bicking wrote:
 I'd like to include paste.lint with that as well (as wsgiref.lint or
 whatever).  Since the last discussion I enumerated in the docstring all
 the checks it does.  There's still some outstanding issues, mostly where
 I'm not sure if it is too restrictive (marked with @@ in the source).
 It's at:

http://svn.pythonpaste.org/Paste/trunk/paste/lint.py
 
 Ian, I see this is under the MIT license.  Do you also have a PSF 
 contributor agreement (to license under AFL/ASF)?  If not, can you place 
 a copy of this under a compatible license so that I can add this to the 
 version of wsgiref that gets checked into the stdlib?

I don't have a contributor agreement.  I can change the license in 
place, or sign an agreement, or whatever; someone should just tell me 
what to do.


-- 
Ian Bicking  |  [EMAIL PROTECTED]  |  http://blog.ianbicking.org


Re: [Python-Dev] [Web-SIG] Adding wsgiref to stdlib

2006-04-28 Thread Ian Bicking
Guido van Rossum wrote:
 PEP 333 specifies WSGI, the Python Web Server Gateway Interface v1.0;
 it's written by Phillip Eby who put a lot of effort in it to make it
 acceptable to very diverse web frameworks. The PEP has been well
 received by web framework makers and users.
 
 As a supplement to the PEP, Phillip has written a reference
 implementation, wsgiref. I don't know how many people have used
 wsgiref; I'm using it myself for an intranet webserver and am very
 happy with it. (I'm asking Phillip to post the URL for the current
 source; searching for it produces multiple repositories.)
 
 I believe that it would be a good idea to add wsgiref to the stdlib,
 after some minor cleanups such as removing the extra blank lines that
 Phillip puts in his code. Having standard library support will remove
 the last reason web framework developers might have to resist adopting
 WSGI, and the resulting standardization will help web framework users.

I'd like to include paste.lint with that as well (as wsgiref.lint or 
whatever).  Since the last discussion I enumerated in the docstring all 
the checks it does.  There's still some outstanding issues, mostly where 
I'm not sure if it is too restrictive (marked with @@ in the source). 
It's at:

   http://svn.pythonpaste.org/Paste/trunk/paste/lint.py

I think another useful addition would be some prefix-based dispatcher, 
similar to paste.urlmap (but probably a bit simpler): 
http://svn.pythonpaste.org/Paste/trunk/paste/urlmap.py

The motivation there is to give people the basic tools to simple 
multi-application hosting, and in the process implicitly suggest how 
other dispatching can be done.  I think this is something that doesn't 
occur to people naturally, and they see it as a flaw in the server (that 
the server doesn't have a dispatching feature), and the result is either 
frustration, griping, or bad kludges.  By including a basic 
implementation of WSGI-based dispatching the standard library can lead 
people in the right direction for more sophisticated dispatching.

And prefix dispatching is also quite useful on its own, it's not just 
educational.

 Last time this was brought up there were feature requests and
 discussion on how industrial strength the webserver in wsgiref ought
 to be but nothing like the flamefest that setuptools caused (no
 comments please).

No one disagreed with the basic premise though, just some questions 
about the particulars of the server.  I think there were at least a 
couple small suggestions for the wsgiref server; in particular maybe a 
slight refactoring to make it easier to use with https.


-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] [Web-SIG] Adding wsgiref to stdlib

2006-04-28 Thread Ian Bicking
Guido van Rossum wrote:
 I think another useful addition would be some prefix-based dispatcher,
 similar to paste.urlmap (but probably a bit simpler):
 http://svn.pythonpaste.org/Paste/trunk/paste/urlmap.py
 
 
 IMO this is getting into framework design. Perhaps something like this
 could be added in 2.6?

I don't think it's frameworky.  It could be used to build a very 
primitive framework, but even then it's not a particularly useful 
starting point.

In Paste this would generally be used below any framework (or above I 
guess, depending on which side is up).  You'd pass /blog to a blog 
app, /cms to a cms app, etc.  WSGI already is very specific about what 
needs to be done when doing this dispatching (adjusting SCRIPT_NAME and 
PATH_INFO), and that's all that the dispatching needs to do.

The applications themselves are written in some framework with internal 
notions of URL dispatching, but this doesn't infringe upon those. 
(Unless the framework doesn't respect SCRIPT_NAME and PATH_INFO; but 
that's their problem, as the dispatcher is just using what's already 
allowed for in the WSGI spec.)  It also doesn't overlap with frameworks, 
as prefix-based dispatching isn't really that useful in a framework.

The basic implementation is:

class PrefixDispatch(object):
     def __init__(self):
         self.applications = {}
     def add_application(self, prefix, app):
         self.applications[prefix] = app
     def __call__(self, environ, start_response):
         apps = sorted(self.applications.items(),
                       key=lambda x: -len(x[0]))
         path_info = environ.get('PATH_INFO', '')
         for prefix, app in apps:
             if not path_info.startswith(prefix):
                 continue
             environ['SCRIPT_NAME'] = environ.get('SCRIPT_NAME', '')+prefix
             environ['PATH_INFO'] = environ.get('PATH_INFO', '')[len(prefix):]
             return app(environ, start_response)
         start_response('404 Not Found', [('Content-type', 'text/html')])
         return ['<html><body><h1>Not Found</h1></body></html>']


There's a bunch of checks that should take place (most related to /'s), 
and the not found response should be configurable (probably as an 
application that can be passed in as an argument).  But that's most of 
what it should do.
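
Fleshed out along those lines, the result might look like the sketch below. This is my own guess at the shape of those checks, not code from Paste: the `not_found_app` constructor argument and the trailing-slash normalization are assumptions.

```python
class PrefixDispatch(object):
    """Prefix dispatcher with a pluggable not-found application.

    A hedged sketch: ``not_found_app`` and the slash handling are one
    plausible reading of the checks described above, not Paste's code.
    """

    def __init__(self, not_found_app=None):
        self.applications = {}
        self.not_found_app = not_found_app or self._default_not_found

    def add_application(self, prefix, app):
        # Store prefixes without a trailing slash so '/blog' and
        # '/blog/...' requests both match the same entry.
        self.applications[prefix.rstrip('/')] = app

    def __call__(self, environ, start_response):
        path_info = environ.get('PATH_INFO', '')
        # Longest prefix wins, so '/blog/admin' beats '/blog'.
        for prefix in sorted(self.applications, key=len, reverse=True):
            if (path_info == prefix
                    or path_info.startswith(prefix + '/')
                    or not prefix):
                # The adjustment WSGI requires of any dispatcher:
                environ['SCRIPT_NAME'] = environ.get('SCRIPT_NAME', '') + prefix
                environ['PATH_INFO'] = path_info[len(prefix):]
                return self.applications[prefix](environ, start_response)
        return self.not_found_app(environ, start_response)

    def _default_not_found(self, environ, start_response):
        start_response('404 Not Found', [('Content-type', 'text/html')])
        return ['<html><body><h1>Not Found</h1></body></html>']
```

A root application can still be mounted by registering it under the empty prefix `''`, which matches anything the longer prefixes miss.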


-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] [Web-SIG] Adding wsgiref to stdlib

2006-04-28 Thread Ian Bicking
Phillip J. Eby wrote:
 I'd like to include paste.lint with that as well (as wsgiref.lint or
 whatever).  Since the last discussion I enumerated in the docstring all
 the checks it does.  There's still some outstanding issues, mostly where
 I'm not sure if it is too restrictive (marked with @@ in the source).
 It's at:

http://svn.pythonpaste.org/Paste/trunk/paste/lint.py
 
 
 +1, but lose the unused 'global_conf' parameter and 'make_middleware' 
 functions.

Yeah, those are just related to Paste Deploy and wouldn't go in.

 I think another useful addition would be some prefix-based dispatcher,
 similar to paste.urlmap (but probably a bit simpler):
 http://svn.pythonpaste.org/Paste/trunk/paste/urlmap.py
 
 
 I'd rather see something a *lot* simpler - something that just takes a 
 dictionary mapping names to application objects, and parses path 
 segments using wsgiref functions.  That way, its usefulness as an 
 example wouldn't be obscured by having too many features.  Such a thing 
 would still be quite useful, and would illustrate how to do more 
 sophisticated dispatching.  Something more or less like:
 
 from wsgiref.util import shift_path_info
 
 # usage:
 #main_app = AppMap(foo=part_one, bar=part_two, ...)
 
  class AppMap:
      def __init__(self, **apps):
          self.apps = apps

      def __call__(self, environ, start_response):
          name = shift_path_info(environ)
          if name is None:
              return self.default(environ, start_response)
          elif name in self.apps:
              return self.apps[name](environ, start_response)
          return self.not_found(environ, start_response)

      def default(self, environ, start_response):
          return self.not_found(environ, start_response)

      def not_found(self, environ, start_response):
          # code to generate a 404 response here
 
 This should be short enough to highlight the concept, while still 
 providing a few hooks for subclassing.

That's mostly what I was thinking, though using a full prefix (instead 
of just a single path segment), and the default is the application at 
'', like in my other email.

paste.urlmap has several features I wouldn't propose (like domain and 
port matching, more Paste Deploy stuff, and a proxy object that I should 
probably just delete); I probably should have been more specific. 
URLMap's dictionary interface isn't that useful either.

Another feature that the example in my other email doesn't have is / 
handling, specifically redirecting /something-that-matches to 
/something-that-matches/ (something Apache's Alias doesn't do but should).
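
That redirect might be sketched as a small wrapper like the one below. The function name and structure are my own for illustration, not paste.urlmap's actual code:

```python
def redirect_on_prefix_match(prefix, app):
    """Wrap ``app`` so a request for the bare prefix (no trailing slash)
    is redirected to ``prefix + '/'`` -- the behavior described above
    that Apache's Alias lacks.  A sketch, not paste.urlmap's code."""
    def wrapper(environ, start_response):
        if environ.get('PATH_INFO', '') == prefix:
            location = environ.get('SCRIPT_NAME', '') + prefix + '/'
            start_response('301 Moved Permanently',
                           [('Location', location),
                            ('Content-type', 'text/plain')])
            return ['Moved to %s' % location]
        return app(environ, start_response)
    return wrapper
```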

Host and port matching is pretty easy to do at the same time, and in my 
experience can be useful to do at the same time, but I don't really care 
if that feature goes in.

-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] [Web-SIG] Adding wsgiref to stdlib

2006-04-28 Thread Ian Bicking
Phillip J. Eby wrote:
 At 01:19 PM 4/28/2006 -0700, Guido van Rossum wrote:
 
 It still looks like an application of WSGI, not part of a reference
 implementation. Multiple apps looks like an advanced topic to me; more
 something that the infrastructure (Apache server or whatever) ought to
 take care of.
 
 
 I'm fine with a super-simple implementation that emphasizes the concept, 
 not feature-richness.  A simple dict-based implementation showcases both 
 the wsgiref function for path shifting, and the idea of composing an 
 application out of mini-applications.  (The point is to demonstrate how 
 people can compose WSGI applications *without* needing a framework.)
 
 But I don't think that this demo should be a prefix mapper; people doing 
 more sophisticated routing can use Paste or Routes.

I don't see why not to use prefix matching.  It is more consistent with 
the handling of the default application ('', instead of a method that 
needs to be overridden), and more general, and the algorithm is only 
barely more complex and not what I'd call sophisticated.  The default 
application handling in particular means that AppMap isn't really useful 
without subclassing or assigning to .default.

Prefix matching wouldn't show off anything else in wsgiref, because 
there's nothing else to use; paste.urlmap doesn't use any other part of 
Paste either (except one unimportant exception) because there's just no 
need.


-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] [Web-SIG] Adding wsgiref to stdlib

2006-04-28 Thread Ian Bicking
Phillip J. Eby wrote:
 At 04:04 PM 4/28/2006 -0500, Ian Bicking wrote:
 
 I don't see why not to use prefix matching.  It is more consistent with
 the handling of the default application ('', instead of a method that
 needs to be overridden), and more general, and the algorithm is only
 barely more complex and not what I'd call sophisticated.  The default
 application handling in particular means that AppMap isn't really useful
 without subclassing or assigning to .default.

 Prefix matching wouldn't show off anything else in wsgiref,
 
 
 Right, that would be taking away one of the main reasons to include it.

That's putting the cart in front of the horse, using a matching 
algorithm because that's what shift_path_info does, not because it's the 
most natural or useful way to do the match.

I suggest prefix matching not because it shows how the current functions 
in wsgiref work, but because it shows a pattern of dispatching WSGI 
applications on a level that is typically (but for WSGI, unnecessarily) 
built into the server.  The educational value is in the pattern, not in 
the implementation.

If you want to show how the functions in wsgiref work, then that belongs 
in documentation.  Which would be good too, people like examples, and 
the more examples in the wsgiref docs the better.  People are much less 
likely to see examples in the code itself.

 To make the real dispatcher, I'd flesh out what I wrote a little bit, to 
 handle the default method in a more meaningful way, including the 
 redirect.  All that should only add a few lines, however.

It will still be only a couple lines less than prefix matching.

Another issue with your implementation is the use of keyword arguments 
for the path mappings, even though path mappings have no association 
with keyword arguments or valid Python identifiers.

-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] [Web-SIG] Adding wsgiref to stdlib

2006-04-28 Thread Ian Bicking
Phillip J. Eby wrote:
 At 05:47 PM 4/28/2006 -0500, Ian Bicking wrote:
 It will still be only a couple lines less than prefix matching.
 
 That's beside the point.  Prefix matching is inherently a more complex 
 concept, and more likely to be confusing, without introducing much in 
 the way of new features.  

I just don't understand this.  It's not more complex.  Prefix matching 
works like:

   get the prefixes
   order them longest first
   check each one against PATH_INFO
   use the matched app
   or call the not found handler

Name matching works like:

   get the mapping
   get the next chunk
   get the app associated with that chunk
   use that app
   or call the not found handler

One is not more complex than the other.
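
The name-matching steps above, written against wsgiref's real `shift_path_info`, come out to only a few lines (a sketch with minimal error handling; the function name is mine):

```python
from wsgiref.util import shift_path_info

def dispatch_by_name(apps, environ, start_response, not_found):
    """Single-segment dispatch: pop the next path chunk with
    shift_path_info() -- which moves it from PATH_INFO to SCRIPT_NAME --
    and look it up in the ``apps`` mapping."""
    name = shift_path_info(environ)
    app = apps.get(name)
    if app is None:
        return not_found(environ, start_response)
    return app(environ, start_response)
```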


 If I want to dispatch /foo/bar, why not just use:
 
 AppMap(foo=AppMap(bar=whatever))

You create an intermediate application with no particular purpose.  You 
get two default handlers, two not found handlers, and you create an 
object tree that is distracting because it is artificial.  Paths are 
strings, not trees or objects.  When you confuse strings for objects you 
are moving into framework territory.

 If I was going to include a more sophisticated dispatcher, I'd add an 
 ordered regular expression dispatcher, since that would support use 
 cases that the simple or prefix dispatchers would not, but it would also 
 support the prefix cases without nesting.

That is significantly more complex, because SCRIPT_NAME/PATH_INFO cannot 
be used to express what the regular expression matched.  It also 
overlaps with frameworks.  WSGI doesn't offer any standard mechanism to 
do that sort of thing.  It could (e.g., a wsgi.path_vars key), but it 
doesn't.  Or you do something that looks like mod_rewrite, but no one 
wants that.
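
For concreteness, an ordered regex dispatcher might look like the sketch below. The `wsgiapp.path_vars` environ key is invented for this sketch; as the paragraph above notes, WSGI standardizes no such mechanism:

```python
import re

class RegexDispatch(object):
    """Ordered regular-expression dispatcher.  A sketch illustrating the
    problem described above: the match groups cannot be expressed via
    SCRIPT_NAME/PATH_INFO, so they go into an invented environ key."""

    def __init__(self, not_found):
        self.patterns = []   # list of (compiled regex, app), checked in order
        self.not_found = not_found

    def add(self, pattern, app):
        self.patterns.append((re.compile(pattern), app))

    def __call__(self, environ, start_response):
        path_info = environ.get('PATH_INFO', '')
        for regex, app in self.patterns:
            match = regex.match(path_info)
            if match:
                # Nonstandard: expose the named groups out-of-band.
                environ['wsgiapp.path_vars'] = match.groupdict()
                return app(environ, start_response)
        return self.not_found(environ, start_response)
```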

Prefix based routing represents a real cusp -- more than that, and you 
have to invent conventions not already present in the WSGI spec, and you 
overlap with frameworks.  Less than that... well, you can't do a whole 
lot less than that.

-- 
Ian Bicking  |  [EMAIL PROTECTED]  |  http://blog.ianbicking.org


Re: [Python-Dev] Dropping __init__.py requirement for subpackages

2006-04-26 Thread Ian Bicking
Joe Smith wrote:
 It seems to me that the right way to fix this is to simply make a small 
 change to the error message.
 On a failed import, have the code check if there is a directory that would 
 have been  the requested package if
it had contained an __init__ module. If there is then append a message like 
"You might be missing an __init__.py file."

+1.  It's not that putting an __init__.py file in is hard, it's that 
people have a hard time realizing when they've forgotten to do it.
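
The proposed check is simple to state in code. A sketch of the logic only (the function name is mine; real interpreter import machinery is of course more involved):

```python
import os

def missing_init_hint(package_dir):
    """Return the suggested hint if ``package_dir`` looks like a package
    directory that is merely missing its __init__ module, else None."""
    if not os.path.isdir(package_dir):
        return None
    for init in ('__init__.py', '__init__.pyc'):
        if os.path.exists(os.path.join(package_dir, init)):
            return None  # already a proper package; no hint needed
    return 'You might be missing an __init__.py file.'
```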

-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] setuptools in 2.5.

2006-04-21 Thread Ian Bicking
Paul Moore wrote:
 And no, I don't want to install the 2 versions side-by-side. Ian
 Bicking complained recently about the uncertainty of multiple
 directories on sys.path meaning you can't be sure which version of a
 module you get. Well, having 2 versions of a module installed and
 knowing that which one is in use depends on require calls which get
 issued at runtime worries me far more.

These are valid concerns.  From my own experience, I don't think 
setuptools makes it any worse than the status quo, but it certainly 
doesn't magically solve these issues.  And though these issues are 
intrinsically hard, I think Python makes it harder than it should.  For 
instance, if you really want to be confident about how your libraries 
are laid out, this script is the most reliable way: 
http://peak.telecommunity.com/dist/virtual-python.py

It basically copies all of Python to a new directory.  That this is 
required to get a self-consistent and well-encapsulated Python setup 
is... well, not good.  Maybe this could be fixed for Python 2.5 as well 
-- to at least make this isolation easier to apply.


-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] setuptools in 2.5. (summary)

2006-04-20 Thread Ian Bicking
M.-A. Lemburg wrote:
 Anthony Baxter wrote:
 
In an attempt to help this thread reach some sort of resolution, 
here's a collection of arguments against and in favour of setuptools 
in 2.5. My conclusions are at the end.
 
 
 Thanks for the summary. I'd like to add some important aspects
 (for me at least) that are missing:
 
 - setuptools should not change the standard distutils install
   command to install everything as eggs
 
   Eggs are just one distribution format out of many. They do
   serve their purpose, just like RPMs, DEBs or Windows installers do.

I think Eggs can be a bit confusing.  They really serve two purposes, 
but using the same format.  They are a distribution mechanism, which is 
probably one of the less important aspects, and there's the installation 
format.

So you don't have to use them as a distribution format to still use them 
as an installation format.  As an installation format they overlap with 
OS-level metadata, but that OS-level metadata has always been completely 
unavailable to Python programs so a little duplication has to be put up 
with.  And anyway, the packaging systems can manage the system integrity 
well enough to keep that information in sync.  Even though eggs overlap, 
they don't have to compete.

   However, when running python setup.py install you are in fact
   installing from source, so there's no need to wrap things up
   again.
 
   The distutils default of actually installing things in the
   standard Python is good, has worked for years and should continue
   to do so.
 
   The extra information needed by the dependency checking can
   easily be added to the package directory of the installed package
   or stored elsewhere in a repository of installed packages or as
   separate egg-info directory if we want to stick with setuptools'
   way of using the path name for getting meta-information on a
   package.

Phillip can clarify this more, but I believe he's planning on Python 2.5 
setuptools to install similar to distutils, but with a sibling .egg-info 
directory.  There's already an option to do this, it's just a matter of 
whether it will be the default.

A package with a sibling .egg-info directory is a real egg, but that 
it's a real egg probably highlights that eggness can be a bit confusing.

   Placing the egg-files into the system as ZIP files should
   be an option, e.g. as separate install_egg command,
   not the default.

I would prefer this too; even though Phillip has fixed the traceback 
problems for 2.5 I personally just prefer files I can view in other 
tools as well (my text editor doesn't like zip files, for instance).  I 
typically make this change in distutils.cfg for my own systems.


-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] setuptools in 2.5.

2006-04-20 Thread Ian Bicking
And now for a little pushback the other way -- as of this January 
TurboGears has served up 100,000 egg files (I'm not sure what the window 
for all those downloads is, but it hasn't been very long).  Has it 
occurred to you that they know something you don't about distribution? 
ElementTree would be among those egg files, so you should also consider 
how many people *haven't* asked you about problems related to the 
installation process.

Really, I just shouldn't have made this argument; the discussion was 
going back towards a calmer and more constructive discussion and I 
pushed it the other way.  Sorry.  Please ignore.



-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] setuptools in 2.5.

2006-04-20 Thread Ian Bicking
Paul Moore wrote:
 2. Distributors will supply .egg files rather than bdist_wininst
 installers (this is already happening).

Really people should at least be uploading source packages in addition 
to eggs; it's certainly not hard to do so.

Perhaps a distributor quick intro needs to be written for the standard 
library.  Something short; both distutils and setuptools documentation 
are pretty long, and for someone who just has some simple Python code to 
get out it's overwhelming.

Fredrik also asked for a document, but I don't think it is this 
document; it wasn't clear to me what exactly he wanted documented.


-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] PEP 359: The make Statement

2006-04-17 Thread Ian Bicking
Steven Bethard wrote:
 This PEP proposes a generalization of the class-declaration syntax,
 the ``make`` statement.  The proposed syntax and semantics parallel
 the syntax for class definition, and so::
 
     make <callable> <name> <tuple>:
         <block>

I can't really see any use case for tuple.  In particular, you could 
always choose to implement this:

   make Foo someobj(stuff): ...

like:

   make Foo(stuff) someobj: ...

I don't think I'd naturally use the tuple position for anything, and so 
it's an arbitrary and usually empty position in the call, just to 
support type() which already has its own syntax.

So maybe it makes less sense to copy the class/metaclass arguments so 
closely, and so moving to this might feel a bit better:

   make someobj Foo(stuff): ...

And actually it reminds me more of class statements, which are in the 
form keyword name(things_you_build_from).  Which then obviously leads 
to more parenthesis:

   make someobj(Foo(stuff)): ...

Except I don't know what make someobj(A, B) would mean, so maybe the 
parenthesis are uncalled for.  I prefer the look of the statement 
without parenthesis anyway.

Really, to me this syntax feels like support for a more prototype-based 
construct.  And many of the class-abusing metaclasses I've used have 
really looked similar to prototypes.  The class statement is caught up 
in a bunch of very class-like semantics, and a more explicit/manual 
technique of creating objects opens up lots of potential.

With that in mind, I think __call__ might be the wrong method to call on 
the builder.  For instance, if you were actually going to implement 
prototypes on this, you wouldn't want to steal all uses of __call__ just 
for the cloning machinery.  So __make__ would be nicer.  Personally this 
would also let people using older constructs (like a plain 
__call__(**kw)) to keep that in addition to supporting this new construct.


-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] PEP 359: The make Statement

2006-04-14 Thread Ian Bicking
BJörn Lindqvist wrote:
 [nice way to declare properties with make]
 
 
Of course, properties are only one of the many possible uses of the
make statement.  The make statement is useful in essentially any
situation where a name is associated with a namespace.  So, for
 
 
 So far, in this thread that is the only useful use of the make
 statement that has been presented. I'd like to see more examples.

In SQLObject I would prefer:

class Foo(SQLObject):
    make IntCol bar:
        notNull = True

In FormEncode I would prefer:

make Schema registration:
    make String name:
        max_length = 100
        not_empty = True
    make PostalCode postal_code:
        not_empty = True
    make Int age:
        min = 18

In another thread on the python-3000 list I suggested:

class Point(object):
    make setonce x:
        """x coordinate"""
    make setonce y:
        """y coordinate"""

For a read-only x and y property (setonce because they have to be set to 
*something*, but just never re-set).
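
Independent of the make statement, the setonce descriptor itself is easy to sketch. This is my own minimal rendering of the semantics just described (assign once, never re-set), not the code from Ian's recipe; note the name must be passed in by hand:

```python
class setonce(object):
    """Set-once descriptor: the attribute may be assigned exactly once;
    any later assignment raises AttributeError.  A sketch -- the storage
    name is given explicitly because a plain descriptor has no way to
    learn the attribute name it was bound to."""

    def __init__(self, name, doc=None):
        self.name = '_setonce_' + name
        self.__doc__ = doc

    def __get__(self, obj, type=None):
        if obj is None:
            return self
        return getattr(obj, self.name)

    def __set__(self, obj, value):
        if hasattr(obj, self.name):
            raise AttributeError('attribute already set')
        setattr(obj, self.name, value)

class Point(object):
    x = setonce('x', 'x coordinate')
    y = setonce('y', 'y coordinate')
```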

Interfaces are nice:

make interface IValidator:
    make attribute if_empty:
        """If this attribute is not NoDefault, then this value
        will be used in lieu of an empty value"""
        default = NoDefault
    def to_python(value, state): ...


Another descriptor, stricttype 
(http://svn.colorstudy.com/home/ianb/recipes/stricttype.py):

class Pixel(object):
    make stricttype x:
        type = int
    make stricttype y:
        type = int

(Both this descriptor and setonce need to know their name if they are 
going to store their value in the object in a stable location)


 It would be really cool if you could go through the standard library,
 and replace code there with code using the make statement. I think a
 patch showing how much nicer good Python code would be with the make
 statement would be a very convincing argument.

I don't know if the standard library will have a whole lot; make is 
really only useful when frameworks are written to use it, and there's 
just not a lot of framework in the standard library.  Maybe:

make OptionParser myparser:
    make Option verbose:
        short = '-v'
        help = ...



-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] PEP 359: The make Statement

2006-04-13 Thread Ian Bicking
Steven Bethard wrote:
 On 4/13/06, Martin v. Löwis [EMAIL PROTECTED] wrote:
 
Steven Bethard wrote:

I know 2.5's not out yet, but since I now have a PEP number, I'm going
to go ahead and post this for discussion.  Currently, the target
version is Python 2.6.  You can also see the PEP at:
http://www.python.org/dev/peps/pep-0359/

Thanks in advance for the feedback!

 [snip]
 
Would it be possible/useful to have a pre-block hook to the callable,
which would provide the dictionary; this dictionary might not be
a proper dictionary (but only some mapping), or it might be pre-initialized.
 
 
 Yeah, something along these lines came up in dicussing using the make
 statement for XML generation.  You might want to write something like:
 
  make Element html:
      make Element head:
          ...
      make Element body:
          ...
 
 however, this doesn't work with the current semantics because:
 
 (1) dict's are unordered
 (2) dict's can't have the same name (key) twice

Is the body of the make statement going to work like the body of a class 
statement?  I would assume so, in which case (2) would be a given.  That 
is, if you can do:

    make Element html:
        title_text = 'foo'
        make Element title:
            content = title_text
        del title_text

Then you really can't have multiple keys with the same name unless you 
give up the ability to refer in the body of the make statement to things 
defined earlier in that same body.  Unless items that were rebound were 
hidden, but still somehow accessible to Element.

 and so you can only generate XML/HTML where the order of elements
 doesn't matter and you never have repeated elements.  That's not
 really XML/HTML anymore.
 
 You could probably solve this if you could supply a different type of
 dict-like object for the block to be executed in.  Then we'd have to
 have a translation from something like::
 
    make <callable> <name> <tuple> in <mapping>:
        <block>
 
 to something like::
 
    <name> = <callable>(<name>, <tuple>, <namespace>)
 
  where <namespace> is created by executing the statements of <block> in
  the <mapping> object.  Skipping the syntax discussion for the moment,
  I guess I have a few problems with this:
 
 (1) It complicates the statement semantics pretty substantially
 (2) It breaks the parallel with the class statement since you can't
 supply an alternate mapping type for class bodies to be executed in
 (3) It adds some degree of coupling between the mapping type and the
 callable.  For the example above, I expect I'd have to do something
 like::
 
 make Element html in ElementDict():
 make Element head in ElementDict():
 ...
 make Element body in ElementDict():
 ...

Maybe Element.__make_dict__ could be ElementDict.  This doesn't feel 
that unclean if you are also using Element.__make__ instead of 
Element.__call__; though there is a hidden cleverness factor (maybe in a 
bad way).

-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] tally (and other accumulators)

2006-04-04 Thread Ian Bicking
Alex Martelli wrote:
 It's a bit late for 2.5, of course, but, I thought I'd propose it  
 anyway -- I noticed it on c.l.py.
 
 In 2.3/2.4 we have many ways to generate and process iterators but  
 few accumulators -- functions that accept an iterable and produce  
 some kind of summary result from it.  sum, min, max, for example.  
 And any, all in 2.5.
 
 The proposed function tally accepts an iterable whose items are  
 hashable and returns a dict mapping each item to its count (number of  
 times it appears).
 
 This is quite general and simple at the same time: for example, it  
 was proposed originally to answer some complaint about any and all  
 giving no indication of the count of true/false items:
 
 tally(bool(x) for x in seq)
 
 would give a dict with two entries, counts of true and false items.
 
 Just like the other accumulators mentioned above, tally is simple to  
 implement, especially with the new collections.defaultdict:
 
  import collections
  def tally(seq):
      d = collections.defaultdict(int)
      for item in seq:
          d[item] += 1
      return dict(d)

Or:

   import collections
   bag = collections.Bag([1, 2, 3, 2, 1])
   assert bag.count(1) == 2
   assert bag.count(0) == 0
   assert 3 in bag
   # etc...
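
There is no `collections.Bag` -- the usage above imagines one. A minimal version of what such a class might look like (a sketch of the proposal; in today's Python, `collections.Counter` fills this role):

```python
import collections

class Bag(object):
    """Minimal multiset along the lines suggested above.  Hypothetical:
    not a real collections member."""

    def __init__(self, iterable=()):
        self._counts = collections.defaultdict(int)
        for item in iterable:
            self._counts[item] += 1

    def count(self, item):
        return self._counts.get(item, 0)

    def add(self, item):
        self._counts[item] += 1

    def __contains__(self, item):
        return self._counts.get(item, 0) > 0

    def __len__(self):
        # Total number of items, counting multiplicity.
        return sum(self._counts.values())
```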


-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] Class decorators

2006-03-30 Thread Ian Bicking
Fred L. Drake, Jr. wrote:
   It's too bad this syntax is ambiguous:
  
class Foo:
Docstring here, blah blah blah

@implements(IFoo)
  
   As this achieves a desirable highlighting of the specialness, without
   forcing the decorator outside the class.  Oh well.
 
 Agreed, but... guess we can't have everything.  On the other hand, something 
 like:
 
 class Foo:
 Documentation is good.
 
 @class implements(IFoo)
 
 is not ambiguous.  Hmm.  It even says what it means.  :-)

This is quite reminiscent of Ruby to me, where:

   class Foo:
   implements(IFoo)

basically means:

   class Foo:
   pass
   Foo.implements(IFoo)

For a variety of reasons that doesn't work for Python, but what you 
propose accomplishes the same basic thing.

I'm coming in a little late on all this, but I find moving the decorator 
inside the class statement to be a substantial improvement, even if it 
is also a trivial improvement ;)  Anytime I've done thought experiments 
about using class decorators, the results is very hard to read.  That 
classes are inherently declarative and open, while functions are 
imperative and closed, makes the constructs very different.


-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] decorator module patch

2006-03-12 Thread Ian Bicking
Georg Brandl wrote:
 Also, I thought we were trying to move away from modules that shared a name 
 with one of their public functions or classes. As it is, I'm not even sure 
 that a name like decorator gives the right emphasis.
 
 I thought about decorators too, that would make decorators.decorator. Hm.

I personally like pluralized modules for exactly the reason that they 
don't clash as much with members or likely local variables. 
datetime.datetime frequently leads me to make mistakes.

 In general, decorators belong in the appropriate domain-specific module 
 (similar to context managers). In this case, though, the domain is the 
 manipulation of Python functions - maybe the module should be called 
 metafunctions or functools to reflect its application domain, rather 
 than 
 the coincidental fact that its first member happens to be a decorator.
 
 Depends on what else will end up there. If it's memoize or deprecated then
 the name functools doesn't sound too good either.

memoize seems to fit into functools fairly well, though deprecated not 
so much.  functools is similarly named to itertools, another module that 
is kind of vague in scope (though functools is much more vague). 
partial would make just as much sense in functools as in functional.

-- 
Ian Bicking  |  [EMAIL PROTECTED]  |  http://blog.ianbicking.org


[Python-Dev] multidict API

2006-03-10 Thread Ian Bicking
I'm not really making any actionable proposal here, so maybe this is 
off-topic; if so, sorry.

Back during the defaultdict discussion I proposed a multidict object 
(http://mail.python.org/pipermail/python-dev/2006-February/061264.html) 
-- right now I need to implement one to represent web form submissions. 
  It would also be ordered in that case.

The question then is what the API should look like for such an object -- 
an ordered, multi-value dictionary.  I would really like if this object 
was in the collections module, but I'm too lazy to try to pursue that 
now.  But if it did show up, I'd like the class I write to look the 
same.  There's some open questions I see:

* Does __getitem__ return a list of all matching keys (never a KeyError, 
though possibly returning []), or does it return the first matching key?

* Either way, I assume there will be another method, like getfirst or 
getall, that will present the other choice.  What would it be named? 
Should it have a default?

* Should there be a method to get a single value, that implicitly 
asserts that there is only one matching key?

* Should the default for .get() be None, or something else?

* Does __setitem__ overwrite any or all values with matching keys?

* If so, there should be another method like .add(key, value) which does 
not overwrite.  Or, if __setitem__ does not overwrite, then there should 
be a method that does.

* Does __delitem__ raise a KeyError if the key is not found?

* Does .keys() return all unique keys, or all keys in order (meaning a 
key may show up more than once in the list)?

I really could go either way on all of these questions, though I think 
there's constraints -- answer one of the questions and another becomes 
obvious.  But you can answer them in whatever order you want.
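One possible set of answers, sketched as a minimal class (the names `getall`/`getfirst` and the specific choices below are just one resolution of the questions above, not a settled design):

```python
class MultiDict:
    # Minimal sketch: an ordered, multi-valued mapping.
    def __init__(self):
        self._pairs = []  # (key, value) tuples, in insertion order

    def add(self, key, value):
        self._pairs.append((key, value))

    def getall(self, key):
        # Never raises; no matches means an empty list.
        return [v for k, v in self._pairs if k == key]

    def getfirst(self, key, default=None):
        for k, v in self._pairs:
            if k == key:
                return v
        return default

    def __getitem__(self, key):
        # Choice: __getitem__ returns the first value, KeyError if absent.
        for k, v in self._pairs:
            if k == key:
                return v
        raise KeyError(key)

    def keys(self):
        # Choice: unique keys, in order of first appearance.
        seen = []
        for k, v in self._pairs:
            if k not in seen:
                seen.append(k)
        return seen
```

Under these choices `__getitem__` answers "first matching value" and `getall` answers "all of them"; the opposite assignment is equally defensible.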

-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] multidict API

2006-03-10 Thread Ian Bicking
Raymond Hettinger wrote:
 [Ian Bicking]
 
The question then is what the API should look like for such an object -- 
an ordered, multi-value dictionary.
 
 
 May I suggest that multidict begin it's life as a cookbook recipe so that its 
 API can mature.

There's already quite a few recipes out there.  But I should probably 
collect them as well.

http://www.voidspace.org.uk/python/odict.html
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/107747
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/438823
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/173072
http://urchin.earth.li/~twic/odict.py
http://www.astro.washington.edu/owen/ROPython.html
http://home.arcor.de/wolfgang.grafen/Python/Modules/Modules.html
email.Message.Message
http://cvs.eby-sarna.com/wsgiref/src/wsgiref/headers.py?view=markup

Well, there's a few, mostly ordered, some multivalue.  A comparison 
would be helpful, but maybe a little later.  odict is probably the most 
filled-out, though it is probably more listish than I really would like.

-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] quit() on the prompt

2006-03-08 Thread Ian Bicking
Neil Schemenauer wrote:
Bad idea, as several pointed out -- quit() should return a 0 exit
to the shell.
 
 
 I like the idea of making quit callable.  One small concern I have
 is that people will use it in scripts to exit (rather than one of
 the other existing ways to exit).  OTOH, maybe that's a feature.

I actually thought it was only defined for interactive sessions, but a 
brief test shows I was wrong.  It doesn't bother me, but it does make me 
think that exit(1) should exit with a code of one.

-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


[Python-Dev] quit() on the prompt

2006-03-07 Thread Ian Bicking
Frederick suggested a change to quit/exit a while ago, so it wasn't just 
a string with slight instructional purpose, but actually useful.  The 
discussion was surprisingly involved, despite the change really trully 
not being that big.  And everyone drifted off, too tired from the 
discussion to make a change.  I suppose it didn't help that the original 
proposal struck some people as too magic, while there were some more 
substantive problems brought up as well, and when you mix aesthetic with 
technical concerns everyone gets all distracted and worked up.  Anyway, 
I would like to re-propose one of the ideas that came up (originally 
from Ping?):

class Quitter(object):
    def __init__(self, name):
        self.name = name
    def __repr__(self):
        return 'Use %s() to exit' % self.name
    def __call__(self):
        raise SystemExit()
quit = Quitter('quit')
exit = Quitter('exit')

This is not very magical, but I think is more helpful than the current 
behavior.  It does not satisfy the just do what I said argument for 
not requiring the call (quit() not quit), but eh -- I guess it seemed 
like everything that didn't require a call had some scary corner case 
where the interpreter would abruptly exit.
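A quick sanity check of the behavior being proposed (repeating the class so the snippet stands alone): echoing `quit` at the prompt shows the hint via `__repr__`, while `quit()` raises SystemExit.

```python
class Quitter(object):
    def __init__(self, name):
        self.name = name
    def __repr__(self):
        # What the interactive prompt prints when you type just `quit`.
        return 'Use %s() to exit' % self.name
    def __call__(self):
        raise SystemExit()

quit = Quitter('quit')

assert repr(quit) == 'Use quit() to exit'
try:
    quit()
except SystemExit:
    pass  # calling it actually exits, as intended
else:
    raise AssertionError('expected SystemExit')
```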

-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] quit() on the prompt

2006-03-07 Thread Ian Bicking
BJörn Lindqvist wrote:
 do {
     cmd = readline();
     do_stuff_with_cmd(cmd);
 } while (strcmp(cmd, "quit") != 0);
 printf("Bye!");
 exit(0);
 
 KISS?

I believe there were concerns that rebinding quit would cause strange 
behavior.  E.g.:

 >>> quit = False
 >>> while not quit: ...
 >>> quit
 $

Or:

 >>> if raw_input('quit?') == 'yes':
 ...     quit

will that work?  Should it?  Functions are pretty predictable in 
comparison to these other options.  So, at least to me, quit() == KISS


-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] collections.idset and collections.iddict?

2006-03-06 Thread Ian Bicking
Guido van Rossum wrote:
 On 3/6/06, Raymond Hettinger [EMAIL PROTECTED] wrote:
 [Neil Schemenauer]
 I occasionally need dictionaries or sets that use object identity
 rather than __hash__ to store items.  Would it be appropriate to add
 these to the collections module?
 Why not decorate the objects with a class adding a method:
    def __hash__(self):
        return id(self)

 That would seem to be more Pythonic than creating custom variants of other
 containers.
 
 I hate to second-guess the OP, but you'd have to override __eq__ too,
 and probably __ne__ and __cmp__ just to be sure. And probably that
 wouldn't do -- since the default __hash__ and __eq__ have the desired
 behavior, the OP is apparently talking about objects that override
 these operations to do something meaningful; overriding them back
 presumably breaks other functionality.
 
 I wonder if this use case and the frequently requested
 case-insensitive dict don't have some kind of generalization in common
 -- perhaps a dict that takes a key function a la list.sort()?

That's what occurred to me as soon as I read Neil's post as well.  I 
think it would have the added benefit that it would be case insensitive 
while still preserving case.  Here's a rough idea of the semantics:

from UserDict import DictMixin

class KeyedDict(DictMixin):

    def __init__(self, keyfunc):
        self.keyfunc = keyfunc
        self.data = {}

    def __getitem__(self, key):
        return self.data[self.keyfunc(key)][1]

    def __setitem__(self, key, value):
        self.data[self.keyfunc(key)] = (key, value)

    def __delitem__(self, key):
        del self.data[self.keyfunc(key)]

    def keys(self):
        return [v[0] for v in self.data.values()]


I definitely like this more than a key-normalizing dictionary -- the 
normalized key is never actually exposed anywhere.  I didn't follow the 
defaultdict thing through to the end, so I didn't catch what the 
constructor was going to look like for that; but I assume those choices 
will apply here as well.
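For instance, a case-insensitive but case-preserving dictionary falls out directly from the key-function idea; here is a self-contained sketch of the same semantics (without DictMixin, just enough methods to show the point), using `str.lower` as the key function:

```python
class KeyedDict(object):
    # Sketch: a dict that normalizes keys through keyfunc for lookup,
    # but stores (original_key, value) so the original key survives.
    def __init__(self, keyfunc):
        self.keyfunc = keyfunc
        self.data = {}

    def __setitem__(self, key, value):
        self.data[self.keyfunc(key)] = (key, value)

    def __getitem__(self, key):
        return self.data[self.keyfunc(key)][1]

    def __contains__(self, key):
        return self.keyfunc(key) in self.data

    def keys(self):
        return [k for k, v in self.data.values()]

d = KeyedDict(str.lower)
d['Content-Type'] = 'text/html'
assert d['content-type'] == 'text/html'   # lookup ignores case...
assert 'CONTENT-TYPE' in d
assert d.keys() == ['Content-Type']       # ...but original case is preserved
```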

-- 
Ian Bicking  |  [EMAIL PROTECTED]  |  http://blog.ianbicking.org


Re: [Python-Dev] operator.is*Type

2006-02-22 Thread Ian Bicking
Raymond Hettinger wrote:
>>> from operator import isSequenceType, isMappingType
>>> class anything(object):
...     def __getitem__(self, index):
...         pass
...
>>> something = anything()
>>> isMappingType(something)
True
>>> isSequenceType(something)
True

I suggest we either deprecate these functions as worthless, *or* we
define the protocols slightly more clearly for user defined classes.
 
 
 They are not worthless.  They do a damned good job of differentiating 
 anything 
 that CAN be differentiated.

But they are just identical...?  They seem terribly pointless to me. 
Deprecation is one option, of course.  I think Michael's suggestion also 
makes sense.  *If* we distinguish between sequences and mapping types 
with two functions, *then* those two functions should be distinct.  It 
seems kind of obvious, doesn't it?

I think hasattr(obj, 'keys') is the simplest distinction of the two 
kinds of collections.
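A quick illustration of that distinction (plain `hasattr`, no operator module involved; `is_mapping_like` is just an illustrative helper name):

```python
def is_mapping_like(obj):
    # Mappings grow a keys() method; sequences and strings don't.
    return hasattr(obj, 'keys')

assert is_mapping_like({}) is True
assert is_mapping_like([]) is False
assert is_mapping_like('abc') is False
```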

 Your example simply highlights the consequences of one of Python's most 
 basic, 
 original design choices (using getitem for both sequences and mappings).  
 That 
 choice is now so fundamental to the language that it cannot possibly change. 
 Get used to it.
 
 In your example, the results are correct.  The anything class can be viewed 
 as 
 either a sequence or a mapping.
 
 In this and other posts, you seem to be focusing your design around notions 
 of 
 strong typing and mandatory interfaces.  I would suggest that that approach 
 is 
 futile unless you control all of the code being run.

I think you are reading too much into it.  If the functions exist, they 
should be useful.  That's all I see in Michael's suggestion.


-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] PEP for Better Control of Nested Lexical Scopes

2006-02-21 Thread Ian Bicking
Mark Russell wrote:
 On 21 Feb 2006, at 19:25, Jeremy Hylton wrote:
 
If I recall the discussion correctly, Guido said he was open to a
version of nested scopes that allowed rebinding.
 
 
 PEP 227 mentions using := as a rebinding operator, but rejects the  
 idea as it would encourage the use of closures.  But to me it seems  
 more elegant than some special keyword, especially is it could also  
 replace the global keyword.  It doesn't handle things like x += y  
 but I think you could deal with that by just writing x := x + y.

By rebinding operator, does that mean it is actually an operator?  I.e.:

    # Required assignment to declare?:
    chunk = None
    while chunk := f.read(1000):
        ...


-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] defaultdict proposal round three

2006-02-20 Thread Ian Bicking
Alex Martelli wrote:
I prefer this approach over subclassing.  The mental load from an  
additional
method is less than the load from a separate type (even a  
subclass).   Also,
avoidance of invariant issues is a big plus.  Besides, if this allows
setdefault() to be deprecated, it becomes an all-around win.
 
 
 I'd love to remove setdefault in 3.0 -- but I don't think it can be  
 done before that: default_factory won't cover the occasional use  
 cases where setdefault is called with different defaults at different  
 locations, and, rare as those cases may be, any 2.* should not break  
 any existing code that uses that approach.

Would it be deprecated in 2.*, or start deprecating in 3.0?

Also, is default_factory=list threadsafe in the same way .setdefault is? 
  That is, you can safely do this from multiple threads:

   d.setdefault(key, []).append(value)

I believe this is safe with very few caveats -- setdefault itself is 
atomic (or else I'm writing some bad code ;).  My impression is that 
default_factory will not generally be threadsafe in the way setdefault 
is.  For instance:

   def make_list(): return []
   d = dict()
   d.default_factory = make_list
   # from multiple threads:
   d.getdef(key).append(value)

This would not be correct (a value can be lost if two threads 
concurrently enter make_list for the same key).  In the case of 
default_factory=list (using the list builtin) is the story different? 
Will this work on Jython, IronPython, or PyPy?  Will this be a 
documented guarantee?  Or alternately, are we just creating a new way to 
punish people who use threads?  And if we push threadsafety up to user 
code, are we trading a very small speed issue (creating lists that are 
thrown away) for a much larger speed issue (acquiring a lock)?

I tried to make a test for this threadsafety, actually -- using a 
technique besides setdefault which I knew was bad (try:except 
KeyError:).  And (except using time.sleep(), which is cheating), I 
wasn't actually able to trigger the bug.  Which is frustrating, because 
I know the bug is there.  So apparently threadsafety is hard to test in 
this case.  (If anyone is interested in trying it, I can email what I have.)

Note that multidict -- among other possible concrete collection patterns 
(like Bag, OrderedDict, or others) -- can be readily implemented with 
threading guarantees.
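The setdefault pattern being relied on looks like this (a sketch; its atomicity rests on CPython performing the whole `setdefault` call as one C-level dict operation under the GIL, which is an implementation detail rather than a language guarantee, and may not hold on Jython or IronPython):

```python
import threading

d = {}

def record(key, value):
    # setdefault either inserts the new list or returns the existing one
    # in a single operation, so no appended value can be lost.
    d.setdefault(key, []).append(value)

threads = [threading.Thread(target=record, args=('k', i)) for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert sorted(d['k']) == list(range(10))
```

The try/except-KeyError version of the same code has a window between the failed lookup and the insertion where another thread can insert its own list, which is exactly the race a default factory would reintroduce.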

-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] defaultdict proposal round three

2006-02-20 Thread Ian Bicking
Steven Bethard wrote:
Alternative A: add a new method to the dict type with the semantics of
__getattr__ from the last proposal, using default_factory if not None
(except on_missing is inlined).
 
 
 I'm not certain I understood this right but (after
 s/__getattr__/__getitem__) this seems to suggest that for keeping a
 dict of counts the code wouldn't really improve much:
 
 dd = {}
 dd.default_factory = int
 for item in items:
     # I want to do ``dd[item] += 1`` but with a regular method instead
     # of __getitem__, this is not possible
     dd[item] = dd.somenewmethod(item) + 1

This would be better done with a bag (a set that can contain multiple 
instances of the same item):

dd = collections.Bag()
for item in items:
   dd.add(item)

Then to see how many there are of an item, perhaps something like:
   dd.count(item)

No collections.Bag exists, but of course one should.  It has nice 
properties -- inclusion is done with __contains__ (with dicts it 
probably has to be done with get), you can't accidentally go below zero, 
the methods express intent, and presumably it will implement only a 
meaningful set of methods.
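A minimal sketch of such a Bag (no such class existed in the stdlib at the time; `collections.Counter`, added later in 2.7/3.1, now covers much of this ground):

```python
class Bag(object):
    # Sketch: a multiset backed by an {item: count} dict, but exposing
    # set-like intent rather than dict mechanics.
    def __init__(self, items=()):
        self._counts = {}
        for item in items:
            self.add(item)

    def add(self, item):
        self._counts[item] = self._counts.get(item, 0) + 1

    def remove(self, item):
        # Can't accidentally go below zero: the entry just disappears.
        n = self._counts.get(item, 0)
        if n <= 1:
            self._counts.pop(item, None)
        else:
            self._counts[item] = n - 1

    def count(self, item):
        return self._counts.get(item, 0)

    def __contains__(self, item):
        return item in self._counts

bag = Bag([1, 1, 3])
bag.add(3)
assert bag.count(1) == 2
assert bag.count(0) == 0
assert 3 in bag
```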


-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] defaultdict proposal round three

2006-02-20 Thread Ian Bicking
Guido van Rossum wrote:
 Why are you so keen on using a dictionary to share data between
 threads that  may both modify it? IMO this is asking for trouble --
 the advice about sharing data between threads is always to use the
 Queue module.

I use them often for a shared caches.  But yeah, it's harder than I 
thought at first -- I think the actual cases I'm using work, since they 
use simple keys (ints, strings), but yeah, thread guarantees are too 
difficult to handle in general.  Damn threads.

-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] Proposal: defaultdict

2006-02-19 Thread Ian Bicking
Michael Urman wrote:
 On 2/19/06, Josiah Carlson [EMAIL PROTECTED] wrote:
 
My post probably hasn't convinced you, but much of the confusion, I
believe, is based on Martin's original belief that 'k in dd' should
always return true if there is a default.  One can argue that way, but
then you end up on the circular train of thought that gets you to you
can't do anything useful if that is the case, .popitem() doesn't work,
len() is undefined, ...  Keep it simple, keep it sane.
 
 
 A default factory implementation fundamentally modifies the behavior
 of the mapping. There is no single answer to the question what is the
 right behavior for contains, len, popitem as that depends on what the
 code that consumes the mapping is written like, what it is attempting
 to do, and what you are attempting to override it to do. Or, simply,
 on why you are providing a default value. Resisting the temptation to
 guess the why and just leaving the methods as is seems  the best
 choice; overriding __contains__ to return true is much easier than
 reversing that behavior would be.

I agree that there is simply no universally correct answer for the 
various uses of default_factory.  I think ambiguity on points like this 
is a sign that something is overly general.

In many of the concrete cases it is fairly clear how these methods 
should work.  In the most obvious case (default_factory=list) what seems 
to me to be the correct implementation is one that no one is proposing, 
that is, x in d means d.get(x).  But that uses the fact that the 
return value of default_factory() is a false value, which we cannot 
assume in general.  And it affects .keys() -- which I would propose 
overriding for multidict (so it only returns keys with non-empty lists 
for values), but I don't see how it could be made correct for 
default_factory.

I just don't see why we should cram all these potential features into 
dict by using a vague feature like default_factory.  Why can't we just 
add a half-dozen new types of collections (to the module of the same 
name)?  Each one will get its own page of documentation, a name, a 
proper __repr__, and well defined meaning for all of these methods that 
it shares with dict only insofar as it makes sense to share.

Note that even if we use defaultdict or autodict or something besides 
changing dict itself, we still won't get a good __contains__, a good 
repr, or any of the other features that specific collection 
implementations will give us.

Isn't there anyone else who sees the various dict-like objects being 
passed around as recipes, and thinks that maybe that's a sign they 
should go in the stdlib?  The best of those recipes aren't 
all-encompassing, they just do one kind of container well.

-- 
Ian Bicking  |  [EMAIL PROTECTED]  |  http://blog.ianbicking.org


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Ian Bicking
Raymond Hettinger wrote:
Over lunch with Alex Martelli, he proposed that a subclass of dict
with this behavior (but implemented in C) would be a good addition to
the language
 
 
 I would like to add something like this to the collections module, but a PEP 
 is 
 probably needed to deal with issues like:
 
 * implications of a __getitem__ succeeding while get(value, x) returns x 
 (possibly different from the overall default)
 * implications of a __getitem__ succeeding while __contains__ would fail
 * whether to add this to the collections module (I would say yes)
 * whether to allow default functions as well as default values (so you could 
 instantiate a new default list)
 * comparing all the existing recipes and third-party modules that have 
 already 
 done this
 * evaluating its fitness for common use cases (i.e. bags and dict of lists).

It doesn't seem that useful for bags, assuming we're talking about an 
{object: count} implementation of bags; bags should really have a more 
set-like interface than a dict-like interface.

A dict of lists typically means a multi-valued dict.  In that case it 
seems like x[key_not_found] should return the empty list, as that means 
zero values; even though zero values also means that 
x.has_key(key_not_found) should return False as well.  *but* getting 
x[key_not_found] does not (for a multi-valued dict) mean that suddently 
has_key should return true.  I find the side-effect nature of 
__getitem__ as proposed in default_dict to be rather confusing, and when 
reading code it will very much break my expectations.  I assume that 
attribute access and [] access will not have side effects.  Coming at it 
from that direction, I'm -1, though I'm +1 on dealing with the specific 
use case that started this (x.setdefault(key, []).append(value)).

An implementation targetted specifically at multi-valued dictionaries 
seems like it would be better.  Incidentally, on Web-SIG we've discussed 
wsgiref, and it includes a multi-valued, ordered, case-insensitive 
dictionary.  Such a dictionary(ish) object has clear applicability for 
HTTP headers, but certainly it is something I've used many times 
elsewhere.  In a case-sensitive form it applies to URL variables. 
Really there's several combinations of features, each with different uses.

So we have now...

dicts: unordered, key:value (associative), single-value
sets: unordered, not key:value, single-value
lists: ordered, not key:value, multi-value

We don't have...

bags: unordered, not key:value, multi-value
multi-dict: unordered, key:value, multi-value
ordered-dict: ordered, key:value, single-value
ordered-multi-dict: ordered, key:value, multi-value

For all key:value collections, normalized keys can be useful.  (Though 
notably the wsgiref Headers object does not have normalized keys, but 
instead does case-insensitive comparisons.)  I don't know where 
dict-of-dict best fits in here.



-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] The decorator(s) module

2006-02-17 Thread Ian Bicking
Georg Brandl wrote:
 Hi,
 
 it has been proposed before, but there was no conclusive answer last time:
 is there any chance for 2.5 to include commonly used decorators in a module?

One peculiar aspect is that decorators are a programming technique, not 
a particular kind of functionality.  So the module seems kind of funny 
as a result.

 Of course not everything that jumps around should go in, only pretty basic
 stuff that can be widely used.
 
 Candidates are:
  - @decorator. This properly wraps up a decorator function to change the
signature of the new function according to the decorated one's.

Yes, I like this, and it is purely related to decorators not anything 
else.  Without this, decorators really hurt introspectability.

  - @contextmanager, see PEP 343.

This is abstract enough that it doesn't belong anywhere in particular.

  - @synchronized/@locked/whatever, for thread safety.

Seems better in the threading module.  Plus contexts and with make it 
much less important as a decorator.

  - @memoize

Also abstract, so I suppose it would make sense.

  - Others from wiki:PythonDecoratorLibrary and Michele Simionato's decorator
module at http://www.phyast.pitt.edu/~micheles/python/documentation.html.

redirecting_stdout is better implemented using contexts/with.  @threaded 
(which runs the decorated function in a thread) seems strange to me. 
@blocking seems like it is going into async directions that don't really 
fit in with decorators (as a general concept).

I like @tracing, though it doesn't seem like it is really implemented 
there, it's just an example?

 Unfortunately, a @property decorator is impossible...

It already works!  But only if you want a read-only property.  Which is 
actually about 50%+ of the properties I create.  So the status quo is 
not really that bad.
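The status quo in question: decorating a lone method with `@property` already defines the getter, and only the getter, so attempts to assign fail.

```python
class Circle(object):
    def __init__(self, radius):
        self._radius = radius

    @property
    def area(self):
        # Read-only: @property on a single function defines just the getter.
        return 3.14159 * self._radius ** 2

c = Circle(2)
assert abs(c.area - 12.56636) < 1e-6
try:
    c.area = 0  # no setter was defined
except AttributeError:
    pass
else:
    raise AssertionError('expected AttributeError')
```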


-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


[Python-Dev] Counter proposal: multidict (was: Proposal: defaultdict)

2006-02-17 Thread Ian Bicking
I really don't like that defaultdict (or a dict extension) means that 
x[not_found] will have noticeable side effects.  This all seems to be a 
roundabout way to address one important use case of a dictionary with 
multiple values for each key, and in the process breaking an important 
quality of good Python code, that attribute and getitem access not have 
noticeable side effects.

So, here's a proposed interface for a new multidict object, borrowing 
some methods from Set but mostly from dict.  Some things that seemed 
particularly questionable to me are marked with ??.

class multidict:

    def __init__([mapping], [**kwargs]):
        """
        Create a multidict:

        multidict() -> new empty multidict
        multidict(mapping) -> equivalent to:
            ob = multidict()
            ob.update(mapping)
        multidict(**kwargs) -> equivalent to:
            ob = multidict()
            ob.update(kwargs)
        """

    def __contains__(key):
        """
        True if ``self[key]`` is true
        """

    def __getitem__(key):
        """
        Returns a list of items associated with the given key.  If
        nothing, then the empty list.

        ??: Is the list mutable, and to what effect?
        """

    def __delitem__(key):
        """
        Removes any instances of key from the dictionary.  Does
        not raise an error if there are no values associated.

        ??: Should this raise a KeyError sometimes?
        """

    def __setitem__(key, value):
        """
        Same as:

            del self[key]
            self.add(key, value)
        """

    def get(key, default=[]):
        """
        Returns a list of items associated with the given key,
        or if that list would be empty it returns default
        """

    def getfirst(key, default=None):
        """
        Equivalent to:

            if key in self:
                return self[key][0]
            else:
                return default
        """

    def add(key, value):
        """
        Adds the value with the given key, so that
        self[key][-1] == value
        """

    def remove(key, value):
        """
        Remove (key, value) from the mapping (raising KeyError if not
        present).
        """

    def discard(key, value):
        """
        Remove like self.remove(key, value), except do not raise
        KeyError if missing.
        """

    def pop(key):
        """
        Removes key and returns the value; returns [] and does nothing
        if the key is not found.
        """

    def keys():
        """
        Returns all the keys which have some associated value.
        """

    def items():
        """
        Returns [(key, value)] for every key/value pair.  Keys that
        have multiple values will be returned as multiple (key, value)
        tuples.
        """

    def __len__():
        """
        Equivalent to len(self.items())

        ??: Not len(self.keys())?
        """

    def update(E, **kwargs):
        """
        if E has iteritems then::

            for k, v in E.iteritems():
                self.add(k, v)

        elif E has keys:

            for k in E:
                self.add(k, E[k])

        else:

            for k, v in E:
                self.add(k, v)

        ??: Should **kwargs be allowed?  If so, should the values
        be sequences?
        """

    # iteritems, iterkeys, iter, has_key, copy, popitem, values, clear
    # with obvious implementations
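To make the intended semantics concrete, a toy implementation of the core subset (just enough to exercise the proposed choices; not the full interface, and the internal dict-of-lists representation is only one option):

```python
class multidict(object):
    def __init__(self):
        self._data = {}  # key -> list of values (never an empty list)

    def __contains__(self, key):
        # True only if the key has at least one value.
        return bool(self._data.get(key))

    def __getitem__(self, key):
        # Returns a fresh list; no values means [], never a KeyError.
        return list(self._data.get(key, []))

    def __setitem__(self, key, value):
        # Per the spec: del self[key]; self.add(key, value)
        self._data.pop(key, None)
        self.add(key, value)

    def add(self, key, value):
        self._data.setdefault(key, []).append(value)

    def getfirst(self, key, default=None):
        values = self._data.get(key)
        return values[0] if values else default

    def items(self):
        # One (key, value) pair per stored value.
        return [(k, v) for k in self._data for v in self._data[k]]

    def __len__(self):
        return len(self.items())

d = multidict()
d.add('a', 1)
d.add('a', 2)
assert d['a'] == [1, 2] and d['missing'] == []
assert 'a' in d and 'missing' not in d
assert d.getfirst('a') == 1
assert len(d) == 2
```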


Re: [Python-Dev] Counter proposal: multidict

2006-02-17 Thread Ian Bicking
Guido van Rossum wrote:
 On 2/17/06, Ian Bicking [EMAIL PROTECTED] wrote:
 
I really don't like that defaultdict (or a dict extension) means that
x[not_found] will have noticeable side effects.  This all seems to be a
roundabout way to address one important use case of a dictionary with
multiple values for each key, and in the process breaking an important
quality of good Python code, that attribute and getitem access not have
noticeable side effects.

So, here's a proposed interface for a new multidict object, borrowing
some methods from Set but mostly from dict.  Some things that seemed
particularly questionable to me are marked with ??.
 
 
 Have you seen my revised proposal (which is indeed an addition to the
 standard dict rather than a subclass)?

Yes, and though it is more general it has the same issue of side
effects.  Doesn't it seem strange that getting an item will change the
values of .keys(), .items(), and .has_key()?

 Your multidict addresses only one use case for the proposed behavior;
 what's so special about dicts of lists that they should have special
 support? What about dicts of dicts, dicts of sets, dicts of
 user-defined objects?

What's so special?  95% (probably more!) of current use of .setdefault()
is .setdefault(key, []).append(value).

Also, since when do features have to address all possible cases? 
Certainly there are other cases, and I think they can be answered with 
other classes.  Here are some current options:

.setdefault() -- works with any subtype; slightly less efficient than 
what you propose.  Awkward to read; doesn't communicate intent very well.

UserDict -- works for a few cases where you want to make dict-like 
objects.  Messes up the concept of identity and containment -- resulting 
objects both are dictionaries, and contain a dictionary (obj.data).

DictMixin -- does anything you can possibly want, requiring only the 
overriding of a couple methods.

dict subclassing -- does anything you want as well, but you typically 
have to override many more methods than with DictMixin (and if you don't 
have to override every method, that's not documented in any way).  Isn't 
written with subclassing in mind.  Really, you are proposing that one 
specific kind of override be made feasible, either with subclassing or 
injecting a method.


That said, I'm not saying that several kinds of behavior shouldn't be 
supported.  I just don't see why dict should support them all (or 
multidict).  And I also think dict will support them poorly.

multidict implements one behavior *well*.  In a documented way, with a 
name people can refer to.  I can say "multidict"; I can't say "a dict 
where I set default_factory to list" (well, I can say that, but that 
just opens up yet more questions and clarifications).

Some ways multidict differs from default_factory=list:

* __contains__ works (you have to use .get() with default_factory to get 
a meaningful result)
* Barring exceptional cases, x[key] and x.get(key) return the same value 
for multidict; with default_factory one returns [] and the other returns 
None when the key isn't found.  Worse, once you evaluate x[key], a 
subsequent x.get(key) starts returning [] too.
* You can't use __setitem__ to put non-list items into a multidict; with 
multidict you don't have to guard against non-sequence values.
* [] is meaningful not just as the default value, but as a null value; 
the multidict implementation respects both aspects.
* Specific method x.add(key, value) that indicates intent in a way that 
x[key].append(value) does not.
* items and iteritems return values meaningful to the context (a list of 
(key, single_value) -- this is usually what I want, and avoids a nested 
for loop).  __len__ also usefully different than in dict.
* .update() handles iteritems sensibly, and updates from dictionaries 
sensibly -- if you mix a default_factory=list dict with a normal 
(single-value) dictionary you'll get an effectively corrupted dictionary 
(where some values are lists and some are not)
* x.getfirst(key) is useful
* I think this will be much easier to reason about in situations with 
threads -- dict acts very predictably with threads, and people rely upon 
that
* multidict can be written either with subclassing intended, or with an 
abstract superclass, so that other kinds of specializations of this 
superset of the dict interface can be made more easily (if DictMixin 
itself isn't already sufficient)

So, I'm saying: multidict handles one very common collection need that 
dict handles awkwardly now.  multidict is a meaningful and useful class 
with its own identity/name and meaning separate from dict, and has 
methods that represent both the intersection and the difference between 
the two classes.  multidict does not in any way preclude other 
collection objects for other situations; it is entirely unfair to expect 
a new class to solve all issues.  multidict suggests an interface that 
other related classes can use (e.g., an ordered version).  multidict

Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Ian Bicking
Guido van Rossum wrote:
 d = {}
 d.default_factory = set
 ...
 d[key].add(value)

Another option would be:

   d = {}
   d.default_factory = set
   d.get_default(key).add(value)

Unlike .setdefault, this would use a factory associated with the 
dictionary, and no default value would get passed in.  Unlike the 
proposal, this would not override __getitem__ (not overriding 
__getitem__ is really the only difference with the proposal).  It would 
be clear reading the code that you were not implicitly asserting that 
key in d was true.

get_default isn't the best name, but another name isn't jumping out at 
me at the moment.  Of course, it is not a Pythonic argument to say that 
an existing method should be overridden, or functionality made nameless 
simply because we can't think of a name (looking to anonymous functions 
of course ;)
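A sketch of how the get_default idea could look as a dict subclass (the 
class name FactoryDict is made up here; the semantics follow the 
message: plain d[key] still raises KeyError, and only get_default() 
consults the factory):

```python
class FactoryDict(dict):
    """Sketch: a dict with an explicit get_default() method instead of
    an overridden __getitem__.  Reading d[key] still raises KeyError;
    only get_default() invokes the factory and stores the result."""

    def __init__(self, default_factory=None, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.default_factory = default_factory

    def get_default(self, key):
        try:
            return self[key]
        except KeyError:
            if self.default_factory is None:
                raise
            value = self.default_factory()
            self[key] = value  # modification is explicit in the name
            return value
```

So `d = FactoryDict(set); d.get_default(key).add(value)` works, while 
`d[missing_key]` keeps raising KeyError.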

-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Ian Bicking
Guido van Rossum wrote:
 On 2/17/06, Adam Olsen [EMAIL PROTECTED] wrote:
 
It's also makes it harder to read code.  You may expect d[key] to
raise an exception, but it won't because of a single line up several
pages (or in another file entirely!)
 
 
 Such are the joys of writing polymorphic code. I don't really see how
 you can avoid this kind of confusion -- I could have given you some
 other mapping object that does weird stuff.

The way you avoid confusion is by not working with code or programmers 
who write bad code.  Python and polymorphic code in general pushes the 
responsibility for many errors from the language structure onto the 
programmer -- it is the programmers' responsibility to write good code. 
  Python has never kept people from writing obscenely horrible code.  We 
ought to have an obfuscated Python contest just to prove that point -- 
it is through practice and convention that readable Python code happens, 
not through the restrictions of the language.  (Honestly, I think such a 
contest would be a good idea.)

I know *I* at least don't like code that mixes up access and 
modification.  Maybe not everyone does (or maybe not everyone thinks of 
getitem as access, but that's unlikely).  I will assert that it is 
Pythonic to keep access and modification separate, which is why methods 
and attributes are different things, and why assignment is not an 
expression, and why functions with side effects typically return None, 
or have names that are very explicit about the side effect, with names 
containing command verbs like update or set.  All of these 
distinguish access from modification.

Note that all of what I'm saying *only* applies to the overriding of 
__getitem__, not the addition of any new method.  I think multidict is 
better for the places it applies, but I see no problem at all with a new 
method on dictionaries that calls on_missing.

-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-17 Thread Ian Bicking
Martin v. Löwis wrote:
 Users do
 
 >>> "Martin v. Löwis".encode("utf-8")
 Traceback (most recent call last):
   File "<stdin>", line 1, in ?
 UnicodeDecodeError: 'ascii' codec can't decode byte 0xf6 in position 11:
 ordinal not in range(128)
 
 because they want to convert the string to Unicode, and they have
 found a text telling them that .encode(utf-8) is a reasonable
 method.
 
 What it *should* tell them is
 
 >>> "Martin v. Löwis".encode("utf-8")
 Traceback (most recent call last):
   File "<stdin>", line 1, in ?
 AttributeError: 'str' object has no attribute 'encode'

I think it would be even better if they got "ValueError: utf-8 can only 
encode unicode objects".  AttributeError is not much clearer than the 
UnicodeDecodeError.

That str.encode(unicode_encoding) implicitly decodes strings seems like 
a flaw in the unicode encodings, quite separate from the existence of 
str.encode.  I for one really like s.encode('zlib').encode('base64') -- 
and if the zlib encoding raised an error when it was passed a unicode 
object (instead of implicitly encoding the string with the ascii 
encoding) that would be fine.

The pipe-like nature of .encode and .decode works very nicely for 
certain transformations, applicable to both unicode and byte objects. 
Let's not throw the baby out with the bath water.


-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Ian Bicking
Martin v. Löwis wrote:
I know *I* at least don't like code that mixes up access and 
modification.  Maybe not everyone does (or maybe not everyone thinks of 
getitem as access, but that's unlikely).  I will assert that it is 
Pythonic to keep access and modification separate, which is why methods 
and attributes are different things, and why assignment is not an 
expression, and why functions with side effects typically return None, 
or have names that are very explicit about the side effect, with names 
containing command verbs like update or set.  All of these 
distinguish access from modification.
 
 
 Do you never write
 
  d[some_key].append(some_value)
 
 This is modification and access, all in a single statement, and all
 without assignment operator.

(d[some_key]) is access.  (...).append(some_value) is modification. 
Expressions are compound; of course you can mix both access and 
modification in a single expression.  d[some_key] is access that returns 
something, and .append(some_value) modifies that something, it doesn't 
modify d.

 I don't see the setting of the default value as a modification.
 The default value has been there, all the time. It only is incarnated
 lazily.

It is lazily incarnated for multidict, because there is no *noticeable* 
side effect -- if there is any internal side effects that is an 
implementation detail.  However for default_factory=list, the result of 
.keys(), .has_key(), and .items() changes when you do d[some_key].
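Today's collections.defaultdict, which descends from this proposal, 
demonstrates exactly the noticeable side effect being described -- mere 
"access" changes the result of .keys():

```python
from collections import defaultdict

d = defaultdict(list)
assert list(d.keys()) == []   # no keys yet
d['x']                        # mere "access"...
assert list(d.keys()) == ['x']  # ...has changed keys(), items(), len()
```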

-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] Proposal: defaultdict

2006-02-17 Thread Ian Bicking
Adam Olsen wrote:
 The latter is even the prefered form, since it only invokes a single
 dict lookup:
 
 On 2/16/06, Delaney, Timothy (Tim) [EMAIL PROTECTED] wrote:
 
try:
v = d[key]
except:
v = d[key] = value
 
 
 Obviously this example could be changed to use default_factory, but I
 find it hard to believe the only use of that pattern is to set default
 keys.

I'd go further -- I doubt many cases where try:except KeyError: is used 
could be refactored to use default_factory.  default_factory only works 
when the default can be determined around the time the dictionary is 
created, does not depend on the context in which the key is fetched, and 
will not cause unintended side effects if the dictionary leaks out of 
the code where it was initially used (e.g., if the dictionary is 
returned to someone).  Any default factory is more often an algorithmic 
detail than truly part of the nature of the dictionary itself.

For instance, here is something I do often:

try:
 value = cache[key]
except KeyError:
 ... calculate value ...
 cache[key] = value

Realistically, factoring ... calculate value ... into a factory that 
calculates the value would be difficult, produce highly unreadable code, 
perform worse, and have more bugs.  For simple factories like list and 
dict the factory works okay.  For immutable values like 0 and None, 
a factory (lambda: 0 and lambda: None, respectively) is a wasteful way 
to create a 
default value (because storing the value in the dictionary is 
unnecessary).  For non-trivial factories the whole thing falls apart, 
and one can just hope that no one will try to use this feature and will 
instead stick with the try:except KeyError: technique.
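The caching pattern above, spelled out runnably (expensive() is a 
stand-in for "... calculate value ..." -- note that it takes the key, 
which a zero-argument default_factory never receives):

```python
def expensive(key):
    # stands in for "... calculate value ..." -- the result depends on
    # the key (and potentially on surrounding context), which a
    # zero-argument default_factory cannot see
    return key * 2

cache = {}

def lookup(key):
    try:
        return cache[key]
    except KeyError:
        value = expensive(key)
        cache[key] = value  # only computed values enter the cache
        return value
```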


-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-17 Thread Ian Bicking
Josiah Carlson wrote:
If some users
can't understand this (passing different arguments to a function may
produce different output),

It's worse than that. The return *type* depends on the *value* of
the argument. I think there is little precedent for that: normally,
the return values depend on the argument values, and, in a polymorphic
function, the return type might depend on the argument types (e.g.
the arithmetic operations). Also, the return type may depend on the
number of arguments (e.g. by requesting a return type in a keyword
argument).
 
 
 You only need to look to dictionaries where different values passed into
 a function call may very well return results of different types, yet
 there have been no restrictions on mapping to and from single types per
 dictionary.
 
 Many dict-like interfaces for configuration files do this, things like
 config.get('remote_host') and config.get('autoconnect') not being
 uncommon.

I think there is *some* justification, if you don't understand up front 
that the codec you refer to (using a string) is just a way of avoiding 
an import (thankfully -- dynamically importing unicode codecs is 
obviously infeasible).  Now, if you understand the argument refers to 
some algorithm, it's not so bad.

The other aspect is that there should be something consistent about the 
return types -- the Python type is not what we generally rely on, 
though.  In this case they are all data.  Unicode and bytes are both 
data, and you could probably argue lists of ints is data too (but an 
arbitrary list definitely isn't data).  On the outer end of data might 
be an ElementTree structure (but that's getting fishy).  An open file 
object is not data.  A tuple probably isn't data.

-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-17 Thread Ian Bicking
Martin v. Löwis wrote:
 Ian Bicking wrote:
 
That str.encode(unicode_encoding) implicitly decodes strings seems like
a flaw in the unicode encodings, quite separate from the existence of
str.encode.  I for one really like s.encode('zlib').encode('base64') --
and if the zlib encoding raised an error when it was passed a unicode
object (instead of implicitly encoding the string with the ascii
encoding) that would be fine.

The pipe-like nature of .encode and .decode works very nicely for
certain transformations, applicable to both unicode and byte objects.
Let's not throw the baby out with the bath water.
 
 
 The way you use it, it's a matter of notation only: why
 is
 
 zlib(base64(s))
 
 any worse? I think it's better: it doesn't use string literals to
 denote function names.

Maybe it isn't worse, but the real alternative is:

   import zlib
   import base64

   base64.b64encode(zlib.compress(s))

Encodings cover up eclectic interfaces, where those interfaces fit a 
basic pattern -- data in, data out.
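Spelled out with the modules, that data-in/data-out pipeline round-trips 
like this (a small sketch, not part of anyone's proposal):

```python
import base64
import zlib

def pack(data: bytes) -> bytes:
    # data in, data out: compress, then wrap in a text-safe encoding
    return base64.b64encode(zlib.compress(data))

def unpack(blob: bytes) -> bytes:
    # reverse the pipeline in the opposite order
    return zlib.decompress(base64.b64decode(blob))
```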

-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] The decorator(s) module

2006-02-17 Thread Ian Bicking
Alex Martelli wrote:
 Maybe we could fix that by having property(getfunc) use
 getfunc.__doc__ as the __doc__ of the resulting property object
 (easily overridable in more normal property usage by the doc=
 argument, which, I feel, should almost invariably be there).

+1

-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

2006-02-17 Thread Ian Bicking
Martin v. Löwis wrote:
Maybe it isn't worse, but the real alternative is:

  import zlib
  import base64

  base64.b64encode(zlib.compress(s))

Encodings cover up eclectic interfaces, where those interfaces fit a
basic pattern -- data in, data out.
 
 
 So should I write
 
 3.1415.encode(sin)
 
 or would that be
 
 3.1415.decode(sin)

The ambiguity shows that sin is clearly not an encoding.  Doesn't read 
right anyway.

[0.3, 0.35, ...].encode('fourier') would be sensible though.  Except of 
course lists don't have an encode method; but that's just a convenience 
of strings and unicode because those objects are always data, where 
lists are only sometimes data.  If extended indefinitely, the namespace 
issue is notable.  But it's not going to be extended indefinitely, so 
that's just a theoretical problem.

 What about
 
 "http://www.python.org".decode("URL")

you mean 'a%20b'.decode('url') == 'a b'?  That's not what you meant, but 
nevertheless that would be an excellent encoding ;)


-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] Extension to ConfigParser

2006-01-31 Thread Ian Bicking
Sorry, I didn't follow up here like I should have, and I haven't 
followed the rest of this conversation, so apologies if I am being 
redundant...

Fuzzyman wrote:
While ConfigParser is okay for simple configuration, it is (IMHO) not a 
very good basis for anyone who wants to build better systems, like 
config files that can be changed programmatically, or error messages 
that point to file and line numbers.  Those aren't necessarily features 
we need to expose in the standard library, but it'd be nice if you could 
implement that kind of feature without having to ignore the standard 
library entirely.

  
 
 Can you elaborate on what kinds of programmatic changes you envisage?
 I'm just wondering if there are classes of usage not covered by
 ConfigObj. Of course you can pretty much do anything to a ConfigObj
 instance programmatically, but even so...

ConfigObj does fine, my criticism was simply of ConfigParser in this 
case.  Just yesterday I was doing (with ConfigParser):

  conf.save('app:main', '## Uncomment this next line to enable '
            'authentication:\n#filter-with', 'openid')

This is clearly lame ;)

That said, I'm not particularly enthused about a highly featureful 
config file *format* in the standard library, even if I would like a 
much more robust implementation.

  
 
 I don't see how you can easily separate the format from the parser -
 unless you just leave raw values. (As I said in the other email, I don't
 think I fully understand you.)
 
 If accessing raw values suits your purposes, why not subclass
 ConfigParser and do magic in the get* methods ?

I guess I haven't really looked closely at the implementation of 
ConfigParser, so I don't know how serious the subclassing would have to 
be.  But, for example, if you wanted to do nested sections this is not 
infeasible with the current syntax, you just have to overload the 
meaning of the section names.  E.g., [foo.bar] (a section literally 
named "foo.bar") could mean that this is a subsection of foo.  Or, if 
the parser allows you to see the order of sections, you could use 
[[bar]] (a section literally named "[bar]") to imply a subsection, not 
unlike what you have already, except without the indentation.

I think there's lots of other kinds of things you can do with the INI 
syntax as-is, but providing a different interface to it.  If you allow 
an easy-to-reuse parser, you can even check that syntax at read time. 
(Or if you keep enough information, check the syntax later and still be 
able to signal errors with filenames and line numbers)
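As a sketch of the dotted-section-name convention suggested above, using 
today's configparser module (the function name and the nested-dict 
representation are invented here for illustration):

```python
from configparser import ConfigParser

def nested_sections(text):
    """Sketch: treat dotted section names like [foo.bar] as
    subsections of [foo], layered over the plain INI syntax."""
    parser = ConfigParser()
    parser.read_string(text)
    tree = {}
    for name in parser.sections():
        node = tree
        for part in name.split('.'):
            node = node.setdefault(part, {})
        node.update(parser[name])  # copy this section's options in
    return tree
```

So `[foo.bar]` ends up nested under `foo` without any change to the 
parser itself -- only the interpretation of section names changes.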

An example of a parser that doesn't imply much of anything about the 
object being produced is one that I wrote here: 
http://svn.colorstudy.com/INITools/trunk/initools/iniparser.py

On top of that I was able to build some other fancy things without much 
problem (which ended up being too fancy, but that's a different issue ;)

 From my light reading on ConfigObj, it looks like it satisfies my 
personal goals (though I haven't used it), but maybe has too many 
features, like nested sections.  And it seems like maybe the API can be 
  
 
 I personally think nested sections are very useful and would be sad to
 not see them included. Grouping additional configuration options as a
 sub-section can be *very* handy.

Using .'s in names can also do grouping, or section naming conventions.

-- 
Ian Bicking  |  [EMAIL PROTECTED]  |  http://blog.ianbicking.org


Re: [Python-Dev] Extension to ConfigParser

2006-01-30 Thread Ian Bicking
Fuzzyman wrote:
 The resolution I'm suggesting means that people can continue to use 
 ConfigParser, with major feature enhancements. *Or* they can migrate to 
 a slightly different API that is easier to use - without needing to 
 switch between incompatible modules.

I don't think enhancing ConfigParser significantly is a good way 
forward.  Because of ConfigParser's problems people have made all sorts 
of workarounds, and so I don't think there's any public interface that 
we can maintain while changing the internals without breaking lots of 
code.  In practice, everything is a public interface.  So I think the 
implementation as it stands should stay in place, and if anything it 
should be deprecated instead of being enhanced in-place.

Another class or module could be added that fulfills the documented 
interface to ConfigParser.  This would provide an easy upgrade path, 
without calling it a backward-compatible interface.  I personally would 
like if any new config system included a parser, and then an interface 
to the configuration that was read (ConfigParser is only the latter). 
Then people who want to do their own thing can work with just the 
parser, without crudely extending and working around the configuration 
interface.

-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] Extension to ConfigParser

2006-01-30 Thread Ian Bicking
Guido van Rossum wrote:
I don't think enhancing ConfigParser significantly is a good way
forward.  Because of ConfigParser's problems people have made all sorts
of workarounds, and so I don't think there's any public interface that
we can maintain while changing the internals without breaking lots of
code.  In practice, everything is a public interface.  So I think the
implementation as it stands should stay in place, and if anything it
should be deprecated instead of being enhanced in-place.
 
 
 Somehow that's not my experience. What's so bad about ConfigParser?
 What would break if we rewrote the save functionality to produce a
 predictable order?

That's a fairly minor improvement, and I can't see how that would break 
anything.  But Michael (aka Fuzzyman -- sorry, I just can't refer to you 
as Fuzzyman without feeling absurd ;) was proposing ConfigObj 
specifically (http://www.voidspace.org.uk/python/configobj.html).  I 
assume the internals of ConfigObj bear no particular resemblance to 
ConfigParser, even if ConfigObj can parse the same syntax (plus some, 
and with different failure cases) and provide the same public API.

While ConfigParser is okay for simple configuration, it is (IMHO) not a 
very good basis for anyone who wants to build better systems, like 
config files that can be changed programmatically, or error messages 
that point to file and line numbers.  Those aren't necessarily features 
we need to expose in the standard library, but it'd be nice if you could 
implement that kind of feature without having to ignore the standard 
library entirely.

That said, I'm not particularly enthused about a highly featureful 
config file *format* in the standard library, even if I would like a 
much more robust implementation.

 From my light reading on ConfigObj, it looks like it satisfies my 
personal goals (though I haven't used it), but maybe has too many 
features, like nested sections.  And it seems like maybe the API can be 
reduced in size with a little high-level refactoring -- APIs generally 
grow over time so as to preserve backward compatibility, but I think if 
it was introduced into the standard library that might be an opportunity 
to trim the API back again before it enters the long-term API freeze 
that the standard library demands.

-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] Path inherits from string

2006-01-26 Thread Ian Bicking
Fredrik Lundh wrote:
However, I might be wrong because according to [1] it should work. And
having to wrap the Path object in str() (open(str(somepath))) each and
every time the called function expects a string is not a practical
solution.
 
 
 in Python, the usual way to access an attribute of an object is to
 access the attribute; e.g.
 
 f = open(p.name)

You mean f = open(Path(p).name), because it is likely that people will 
also have to accept strings for the near-term (and probably long-term) 
future.  And the error message without it will be inscrutable (and will 
still be inscrutable in many cases when you try to access other methods, 
sadly).  And currently .name is taken for something else in the API. 
And the string path is not really an attribute, because the string path 
*is* the object; it is not *part* of the object.

OTOH, str(path) will break unicode filenames.  And unicode() breaks 
anything that simply desires to pass data through without affecting its 
encoding.

An open method on paths simplifies many of these issues, but doesn't do 
anything for passing a path to legacy code.  Changing open() and all the 
functions that Path replaces (e.g., os.path.join) to accept Path objects 
may resolve issues with a substantial portion of code.  But any code 
that does a typecheck on arguments will be broken -- which in the case 
of paths is quite common since many functions take both filename and 
file object arguments.

-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] The path module PEP

2006-01-25 Thread Ian Bicking
BJörn Lindqvist wrote:
 * match() and matchcase() wraps the fnmatch.fnmatch() and
   fnmatch.fnmatchcase() functions. I believe that the renaming is
   uncontroversial and that the introduction of matchcase() makes it so
   the whole fnmatch module can be deprecated.

The renaming is fine with me.  I generally use the fnmatch module for 
wildcard matching, not necessarily against path names.  Path.match 
doesn't replace that functionality.  Though fnmatch.translate isn't even 
publicly documented, which is the function I actually tend to use.
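For reference, fnmatch.translate turns a wildcard pattern into a regular 
expression that can be compiled once and reused:

```python
import fnmatch
import re

# translate() compiles the wildcard once -- handy when matching many
# names against the same pattern, independent of any path semantics
pattern = re.compile(fnmatch.translate('*.txt'))

names = ['notes.txt', 'image.png', 'README.TXT']
matches = [n for n in names if pattern.match(n)]
```

Unlike fnmatch.fnmatch, this stays case-sensitive on every platform, 
which is the distinction fnmatchcase exists for.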

Though it seems a little confusing to me that glob treats separators 
specially, and that's not implemented at the fnmatch level.  So 
Path('/a/b/c/d').match('a/*/d') is true, but Path('/').walk('a/*/d') 
won't return Path('/a/b/c/d').  I think .match() should be fixed.  But I 
don't think fnmatch should be changed.

I'm actually finding myself a little confused by the glob arguments (if 
the glob contains '/'), now that I really think about them.

-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] The path module PEP

2006-01-25 Thread Ian Bicking
John J Lee wrote:
 On Tue, 24 Jan 2006, Ian Bicking wrote:
 [...]
 
Losing .open() would make it much harder for anyone wanting to write, 
say, a URI library that implements the Path API.
 
 [...]
 
 Why?  Could you expand a bit?
 
 What's wrong with urlopen(filesystem_path_instance) ?

My example shows this more clearly I think:

   def read_config(path):
   text = path.open().read()
   ... do something ...

If I implement a URI object with an .open() method, then I can use it 
with this function, even though read_config() was written with file 
paths in mind.  But without it that won't work:

   def read_config(path):
   text = open(path).read()
   ...

-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] The path module PEP

2006-01-25 Thread Ian Bicking
Tony Meyer wrote:
Remove __div__ (Ian, Jason, Michael, Oleg)

This is one of those where everyone (me too) says I don't care either
way. If that is so, then I see no reason to change it unless someone
can show a scenario in which it hurts readability. Plus, a few people
have said that they like the shortcut.

* http://mail.python.org/pipermail/python-list/2005-July/292251.html
* http://mail.python.org/pipermail/python-dev/2005-June/054496.html
* http://mail.python.org/pipermail/python-list/2005-July/291628.html
* http://mail.python.org/pipermail/python-list/2005-July/291621.html
 
 
 Well, if you include the much larger discussion on python-list,  
 people (including me) have said that removing __div__ is a good  
 idea.  If it's included in the PEP, please at least include a  
 justification and cover the problems with it.  The vast majority of  
 people (at least at the time) were either +0 or -0, not +1.  +0's are  
 not justification for including something.

If it were possible to use .join() for joining paths, I think I wouldn't 
mind so much.  But reusing a string method for something very different 
seems like a bad idea.  So we're left with .joinpath().  Still better 
than os.path.join() I guess, but only a little.  I guess that's why I'm 
+1 on /.

 Against it:
 
   * Zen: Beautiful is better than ugly. Explicit is better than  
 implicit. Readability counts. There should be one-- and preferably  
 only one --obvious way to do it.

I think / is pretty.  I think it reads well.  There's already some 
inevitable redundancy in this interface.  I use os.path.join so much 
that I know anything I use will feel readable quickly, but I also think 
I'll find / more appealing.

   * Not every platform that Python supports has '/' as the path  
 separator.  Windows, a pretty major one, has '\'.  I have no idea  
 what various portable devices use, but there's a reasonable chance  
 it's not '/'.

I believe all platforms support /; at least Windows and Mac do, in 
addition to their native separators.  I assume any platform that 
supports filesystem access will support / in Python.

If anything, a good shortcut for .joinpath() will at least encourage 
people to use it, thus discouraging hardcoding of path separators.  I 
expect it would encourage portable paths.

Note that Path('/foo') / '/bar' == Path('/bar'), which is *not* 
intuitive, though in the context of join it's not as surprising.  So 
that is a problem.  If / meant "under this path" then that could be a 
useful operator (in that I'd really like such an operator or method). 
Either paths would be forced to be under the original path, or it would 
be an error if they somehow escaped.  Currently there's no 
quick-and-easy way to ensure this, except to join the paths, do 
abspath(), then confirm that the new path starts with the old path.
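
That join-then-abspath-then-startswith dance can be sketched as a small 
helper.  join_under is a hypothetical name, and I've used posixpath 
(which is os.path on POSIX) so the behavior doesn't vary by platform:

```python
import posixpath

def join_under(base, *parts):
    # Hypothetical helper sketching the "under this path" check described
    # above: join, normalize with abspath, then confirm the result is
    # still at or below the base path.
    base = posixpath.abspath(base)
    candidate = posixpath.abspath(posixpath.join(base, *parts))
    if candidate != base and not candidate.startswith(base + '/'):
        raise ValueError('%s escapes %s' % (candidate, base))
    return candidate
```

So join_under('/srv/app', 'static', 'logo.png') gives 
'/srv/app/static/logo.png', while join_under('/srv/app', '../etc/passwd') 
raises ValueError because the normalized result escapes the base.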

   * It's being used to mean join, which is the exact opposite  
 of /'s other meaning (divide).
 
   * Python's not Perl.  We like using functions and not symbols.

A little too heavy on the truisms.  Python isn't the anti-Perl.

Renaming methods because of PEP 8 (Gustavo, Ian, Jason)

I'm personally not keen on that. I like most of the names as they
are. abspath(), joinpath(), realpath() and splitall() look so much
better than abs_path(), join_path(), real_path() and split_all() in my
eyes. If someone likes the underscores I'll add it to Open Issues.
 
 
 +1 to following PEP 8.  These aren't built-ins, it's a library  
 module.  In addition to the PEP, underscores make it much easier to  
 read, especially for those for whom English is not their first language.

I don't find abs_path() much easier to read than abspath() -- neither is 
a full name.  absolute_path() perhaps, but that is somewhat redundant; 
absolute()...?  Eh.

Precedence in naming means something, and in this case all the names 
have existed for a very long time (as long as Python?)  PEP 8 encourages 
following naming precedence.  While I don't see a need to match every 
existing function with a method, to the degree they do match I see no 
reason why we shouldn't keep the names.  And I see reasons why the names 
shouldn't be changed.


-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] The path module PEP

2006-01-25 Thread Ian Bicking
BJörn Lindqvist wrote:
 Remove __div__ (Ian, Jason, Michael, Oleg)
 
 This is one of those where everyone (me too) says I don't care either
 way. If that is so, then I see no reason to change it unless someone
 can show a scenario in which it hurts readability. Plus, a few people
 have said that they like the shortcut.
 
 * http://mail.python.org/pipermail/python-list/2005-July/292251.html
 * http://mail.python.org/pipermail/python-dev/2005-June/054496.html
 * http://mail.python.org/pipermail/python-list/2005-July/291628.html
 * http://mail.python.org/pipermail/python-list/2005-July/291621.html

Curious how often I use os.path.join and division, I searched a project 
of mine, and in 12k lines there were 34 uses of join, and 1 use of 
division.  In smaller scripts os.path.join tends to show up a lot more 
(per line).  I'm sure there's people who use division far more than I, 
and os.path.join less, but I'm guessing the majority of users are more 
like me.

That's not necessarily a justification of / for paths, but at least this 
use for / wouldn't be obscure or mysterious after you get a little 
experience seeing code that uses it.


-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] The path module PEP

2006-01-25 Thread Ian Bicking
Tony Meyer wrote:
 [Ian Bicking]
 
 If it were possible to use .join() for joining paths, I think I 
 wouldn't mind so much.  But reusing a string method for something 
 very different seems like a bad idea.  So we're left with 
 .joinpath().  Still better than os.path.join() I guess, but only a 
 little.  I guess that's why I'm +1 on /.
 
 
 Why does reusing a string method for something very different seem like 
 a bad idea, but reusing a mathematical operator for something very 
 different seem like a good idea?  Paths aren't strings, so join() 
 seems the logical choice.  (There are also alternatives to joinpath 
 if the name is the thing: add(), for example).

Paths are strings, that's in the PEP.

As an aside, I think it should be specified what (if any) string methods 
won't be inherited by Path (or will be specifically disabled by making 
them throw some exception).  I think .join() and __iter__ at least 
should be disabled.
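
A minimal sketch of what disabling those two inherited methods could 
look like in a str-derived Path (illustrative only, not the PEP's 
actual class):

```python
class Path(str):
    # Sketch of a str subclass that disables inherited behavior which
    # is misleading for paths, per the suggestion above.

    def join(self, *args):
        # str.join would interleave this *path* between the arguments,
        # which has nothing to do with os.path.join.
        raise TypeError("use joinpath() to combine paths, not str.join()")

    def __iter__(self):
        # Iterating a path character by character is almost always a bug.
        raise TypeError("Path objects are not iterable")
```

With this, list(Path('/tmp')) raises TypeError instead of silently 
producing ['/', 't', 'm', 'p'], and the miscognate join() fails loudly.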

 Precedence in naming means something, and in this case all the names 
 have existed for a very long time (as long as Python?)  PEP 8 
 encourages following naming precedence.  While I don't see a need to 
 match every existing function with a method, to the degree they do 
 match I see no reason why we shouldn't keep the names.  And I see 
 reasons why the names shouldn't be changed.
 
 
 PEP 8 encourages following naming precedence within a module, doesn't 
 it?  Guido has said that he'd like to have the standard library tidied 
 up, at least somewhat (e.g. StringIO.StringIO -> stringio.StringIO) for 
 Python 3000.  It would make it less painful if new additions already 
 followed the plan.

I think the use of underscores or squished words isn't as big a deal as 
the case of modules.  It's often rather ambiguous what a word really 
is.  At least in English word combinations slowly and ambiguously float 
towards being combined.  So abspath and abs_path both feel sufficiently 
inside the scope of PEP 8 that precedence is worth maintaining. 
rfc822's getallmatchingheaders method was going too far, but a little 
squishing doesn't bother me, if it is consistent (and it's actually 
easier to be consistent about squishing than underscores).

-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] / as path join operator

2006-01-25 Thread Ian Bicking
Steven Bethard wrote:
 My only fear with the / operator is that we'll end up with the same
 problems we have for using % in string formatting -- the order of
 operations might not be what users expect.  Since join is conceptually
 an addition-like operator, I would expect:
 
 Path('home') / 'a' * 5
 
 to give me:
 
 home/a
 
 If I understand it right, it would actually give me something like:
 
 home/ahome/ahome/ahome/ahome/a

Both of these examples are rather silly, of course ;)  There's two 
operators currently used commonly with strings (that I assume Path would 
inherit): + and %.  Both actually make sense with paths too.

   filename_template = '%(USER)s.conf'
   p = Path('/conf') / filename_template % os.environ
which means:
   p = (Path('/conf') / filename_template) % os.environ

But probably the opposite is intended.  Still, it will usually be 
harmless.  Which is sometimes worse than usually harmful.

+ seems completely innocuous, though:

   ext = '.jpg'
   name = fields['name']
   image = Path('/images') / name + ext

It doesn't really matter what order it happens in there.  Assuming 
concatenation results in a new Path object, not a str.
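
Since /, % and * all sit at the same precedence level in Python and 
group left to right, both cases are easy to demonstrate with a toy Path. 
The two-line __truediv__ here is just enough for the demo, not a real 
implementation (in the 2.x of this thread the spelling would be __div__):

```python
class Path(str):
    def __truediv__(self, other):
        # Toy join for the precedence demo only: always insert one '/'.
        return Path(self.rstrip('/') + '/' + other)

# Left-associativity: parsed as (Path('home') / 'a') * 5, and str.__mul__
# then repeats the whole joined string -- Steven's surprising result:
print(Path('home') / 'a' * 5)    # home/ahome/ahome/ahome/ahome/a

# The % case parses as (Path('/conf') / template) % mapping, which in
# this instance happens to be exactly what was wanted:
print(Path('/conf') / '%(USER)s.conf' % {'USER': 'bob'})    # /conf/bob.conf
```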

-- 
Ian Bicking  |  [EMAIL PROTECTED]  |  http://blog.ianbicking.org


Re: [Python-Dev] The path module PEP

2006-01-25 Thread Ian Bicking
Barry Warsaw wrote:
 On Wed, 2006-01-25 at 18:10 -0600, Ian Bicking wrote:
 
 
Paths are strings, that's in the PEP.

As an aside, I think it should be specified what (if any) string methods 
won't be inherited by Path (or will be specifically disabled by making 
them throw some exception).  I think .join() and __iter__ at least 
should be disabled.
 
 
 Whenever I see derived classes deliberately disabling base class
 methods, I see red flags that something in the design of the hierarchy
 isn't right.

IMHO the hierarchy problem is a misdesign of strings; iterating over 
strings is usually a bug, not a deliberately used feature.  And it's a 
particularly annoying bug, leading to weird results.

In this case a Path is not a container for characters.  Strings aren't 
containers for characters either -- apparently they are containers for 
smaller strings, which in turn contain themselves.  Paths might be seen 
as a container for other subpaths, but I think everyone agrees this is 
too ambiguous and implicit.  So there's nothing sensible that __iter__ 
can do, and having it do something not sensible (just to fill it in with 
something) does not seem very Pythonic.

join is also a funny method that most people wouldn't expect on strings 
anyway.  But putting that aside, the real issue I see is that it is a 
miscognate for os.path.join, to which it has no relation.  And I can't 
possibly imagine what you'd use it for in the context of a path.


-- 
Ian Bicking  |  [EMAIL PROTECTED]  |  http://blog.ianbicking.org


Re: [Python-Dev] The path module PEP

2006-01-25 Thread Ian Bicking
Gustavo J. A. M. Carneiro wrote:
   On a slightly different subject, regarding path / path, I think it
 feels much more natural path + path.  Path.join is really just a string
 concatenation, except that it adds a path separator in the middle if
 necessary, if I'm not mistaken.

No, it isn't, which maybe is why / is bad.  os.path.join(a, b) basically 
returns the path as though b is interpreted to be relative to a.  I.e., 
os.path.join('/foo', '/bar') == '/bar'.  Not much like concatenation at 
all.  Plus string concatenation is quite useful with paths, e.g., to add 
an extension.
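
The "relative to" semantics are easy to check (posixpath is os.path on 
POSIX, used here so the results don't vary by platform):

```python
import posixpath

# b is interpreted as relative to a...
assert posixpath.join('/foo', 'bar') == '/foo/bar'
# ...so an absolute b simply replaces a -- not concatenation at all:
assert posixpath.join('/foo', '/bar') == '/bar'
# Plain string concatenation stays useful for things like extensions:
assert '/foo/pic' + '.jpg' == '/foo/pic.jpg'
```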

If a URI class implemented the same methods, it would be something of a 
question whether uri.joinpath('/foo/bar', 'baz') would return '/foo/baz' 
(and urlparse.urljoin would) or '/foo/bar/baz' (as os.path.join does). 
I assume it would be the latter, and urljoin would be a different 
method, maybe something novel like urljoin.
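
The two join semantics side by side (urljoin now lives in urllib.parse; 
in the Python of this thread it was urlparse.urljoin):

```python
import posixpath
from urllib.parse import urljoin

# os.path.join-style: 'baz' goes underneath the whole base path.
assert posixpath.join('/foo/bar', 'baz') == '/foo/bar/baz'

# urljoin-style: 'baz' replaces the last segment of the base,
# the way a relative href resolves in a browser.
assert urljoin('http://host/foo/bar', 'baz') == 'http://host/foo/baz'
```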


-- 
Ian Bicking  |  [EMAIL PROTECTED]  |  http://blog.ianbicking.org

