[Python-Dev] UTF-8 Decoder

2009-04-13 Thread Jeroen Ruigrok van der Werven
[Note: I haven't looked thoroughly at our own handling yet, hence the
question.]

This was posted on the Unicode list. Does it seem interesting for Python
itself? The UTF-8 to UTF-16 transcoding might be.

http://bjoern.hoehrmann.de/utf-8/decoder/dfa/

-- 
Jeroen Ruigrok van der Werven asmodai(-at-)in-nomine.org / asmodai
イェルーン ラウフロック ヴァン デル ウェルヴェン
http://www.in-nomine.org/ | http://www.rangaku.org/ | GPG: 2EAC625B
Whenever you meet difficult situations dash forward bravely and joyfully...
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7)

2009-04-13 Thread Mart Sõmermaa
On Mon, Apr 13, 2009 at 12:56 AM, Antoine Pitrou solip...@pitrou.net wrote:

 Mart Sõmermaa mrts.pydev at gmail.com writes:

  Proposal: add add_query_params() for appending query parameters to an URL
  to urllib.parse and urlparse.

 Is there anything to /remove/ a query parameter?


I'd say this is outside the scope of add_query_params().

As for the duplicate handling, I've implemented a threefold strategy that
should address all use cases raised before:

 def add_query_params(*args, **kwargs):
     """
     add_query_params(url, [allow_dups, [args_dict, [separator]]], **kwargs)

     Appends query parameters to an URL and returns the result.

     :param url: the URL to update, a string.
     :param allow_dups: if
         * True: plainly append new parameters, allowing all duplicates
           (default),
         * False: disallow duplicates in values and regroup keys so that
           different values for the same key are adjacent,
         * None: disallow duplicates in keys -- each key can have a single
           value and later values override the value (like dict.update()).
     :param args_dict: optional dictionary of parameters, default is {}.
     :param separator: either ';' or '&', the separator between key-value
         pairs, default is '&'.
     :param kwargs: parameters as keyword arguments.

     :return: original URL with updated query parameters or the original URL
         unchanged if no parameters given.
     """


The commit is

http://github.com/mrts/qparams/blob/b9bdbec46bf919d142ff63e6b2b822b5d57b6f89/qparams.py

An extensive description of the behaviour is in the doctests.
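
A minimal sketch of the three duplicate-handling modes described in the docstring above, assuming Python 3's urllib.parse. The name add_query_params_sketch and its parsing details are assumptions made for illustration; this is not the code in the linked commit, whose doctests remain the authoritative description.

```python
from urllib.parse import parse_qsl, quote_plus, urlsplit, urlunsplit


def add_query_params_sketch(url, allow_dups=True, args_dict=None,
                            separator="&", **kwargs):
    """Hypothetical re-implementation of the behaviour described above."""
    new_pairs = [(k, str(v)) for k, v in (args_dict or {}).items()]
    new_pairs += [(k, str(v)) for k, v in kwargs.items()]
    if not new_pairs:
        return url  # no parameters given: return the URL unchanged

    parts = urlsplit(url)
    # Accept both ';' and '&' in the existing query by normalising to '&'.
    old_pairs = parse_qsl(parts.query.replace(";", "&"),
                          keep_blank_values=True)
    pairs = old_pairs + new_pairs

    if allow_dups is True:
        # Plain append: every pair is kept, duplicates included.
        merged = pairs
    elif allow_dups is False:
        # Drop repeated (key, value) pairs and keep values of a key adjacent.
        grouped = {}
        for key, value in pairs:
            values = grouped.setdefault(key, [])
            if value not in values:
                values.append(value)
        merged = [(k, v) for k, values in grouped.items() for v in values]
    else:
        # allow_dups is None: one value per key, later values win.
        merged = list(dict(pairs).items())

    query = separator.join(
        "{}={}".format(quote_plus(k), quote_plus(v)) for k, v in merged
    )
    return urlunsplit(parts._replace(query=query))
```

With these assumptions, adding a=2 to http://example.com/?a=1 keeps both values in the default mode, while allow_dups=None replaces the old value.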


Re: [Python-Dev] [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7)

2009-04-13 Thread Antoine Pitrou
Mart Sõmermaa mrts.pydev at gmail.com writes:
 
 On Mon, Apr 13, 2009 at 12:56 AM, Antoine Pitrou solipsis at pitrou.net wrote:

  Mart Sõmermaa mrts.pydev at gmail.com writes:

   Proposal: add add_query_params() for appending query parameters to an URL
   to urllib.parse and urlparse.

  Is there anything to /remove/ a query parameter?

 I'd say this is outside the scope of add_query_params().

Given the name of the proposed function, sure. But it sounds a bit weird to
have a function dedicated to adding parameters and nothing to remove them.

You could e.g. rename the function to update_query_params() and decide that
every parameter whose specified value is None must actually be removed from
the URL.

Regards

Antoine.
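
The update_query_params() idea above could look roughly like the following sketch. The signature and the choice to collapse repeated keys into a single value are assumptions made for illustration; nothing like this was part of the proposal's code.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def update_query_params(url, **params):
    """Hypothetical helper: set parameters, or remove them when the value is None."""
    parts = urlsplit(url)
    # dict() keeps only the last value of a repeated key (an assumption here).
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    for key, value in params.items():
        if value is None:
            query.pop(key, None)   # None means "remove this parameter"
        else:
            query[key] = str(value)
    return urlunsplit(parts._replace(query=urlencode(query)))
```

For instance, dropping a sort filter while moving to another page would be update_query_params(url, sort=None, page=3).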




Re: [Python-Dev] [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7)

2009-04-13 Thread Michael Foord

Antoine Pitrou wrote:

 Mart Sõmermaa mrts.pydev at gmail.com writes:

  On Mon, Apr 13, 2009 at 12:56 AM, Antoine Pitrou solipsis at pitrou.net wrote:

   Mart Sõmermaa mrts.pydev at gmail.com writes:

    Proposal: add add_query_params() for appending query parameters to an URL
    to urllib.parse and urlparse.

   Is there anything to /remove/ a query parameter?

  I'd say this is outside the scope of add_query_params().

 Given the name of the proposed function, sure. But it sounds a bit weird to
 have a function dedicated to adding parameters and nothing to remove them.

Weird or not, is there actually a *need* to remove query parameters?

Michael

 You could e.g. rename the function to update_query_params() and decide that
 every parameter whose specified value is None must actually be removed from
 the URL.

 Regards

 Antoine.


--
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog




Re: [Python-Dev] [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7)

2009-04-13 Thread Antoine Pitrou
Michael Foord fuzzyman at voidspace.org.uk writes:
 
 Weird or not, is there actually a *need* to remove query parameters?

Say you are filtering or sorting data based on some URL parameters. If the user
wants to remove one of those filters, you have to remove the corresponding query
parameter.

Regards

Antoine.




Re: [Python-Dev] [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7)

2009-04-13 Thread Senthil Kumaran
On Mon, Apr 13, 2009 at 5:31 PM, Antoine Pitrou solip...@pitrou.net wrote:
 Say you are filtering or sorting data based on some URL parameters. If the
 user wants to remove one of those filters, you have to remove the
 corresponding query parameter.

This is a use case, and possibly a hypothetical one, that a programmer
might hit in special situations.
There are lots of such use cases for which urllib.parse or urlparse
has been used.

But my question about this proposal is: do we have a good RFC
specification to implement this against?
If not, and if we go just by practical needs, then eventually
we will end up with bugs or feature requests that will take a
lot of discussion and time to get fixed.

Someone suggested reading the HTML 5.0 spec instead of an RFC for this
request. I have yet to do that, but my opinion on additions
to the url* modules is that backing them with RFCs would be the best
way to go and to maintain them.

-- 
Senthil


Re: [Python-Dev] [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7)

2009-04-13 Thread Tino Wildenhain

Hi,

Senthil Kumaran wrote:

 On Mon, Apr 13, 2009 at 5:31 PM, Antoine Pitrou solip...@pitrou.net wrote:

  Say you are filtering or sorting data based on some URL parameters. If the
  user wants to remove one of those filters, you have to remove the
  corresponding query parameter.

 This is a use case, and possibly a hypothetical one, that a programmer
 might hit in special situations.
 There are lots of such use cases for which urllib.parse or urlparse
 has been used.

 But my question about this proposal is: do we have a good RFC
 specification to implement this against?
 If not, and if we go just by practical needs, then eventually
 we will end up with bugs or feature requests that will take a
 lot of discussion and time to get fixed.

 Someone suggested reading the HTML 5.0 spec instead of an RFC for this
 request. I have yet to do that, but my opinion on additions
 to the url* modules is that backing them with RFCs would be the best
 way to go and to maintain them.



I'd rather see an ordered-dict-like object returned by urlparse
for parameters; this would make extra methods superfluous.


Also note that you might need to specify the encoding
of the data somewhere (most of the time it's UTF-8, but it depends on the
encoding used in the form page).
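
To make the ordered-dict idea and the encoding remark concrete, here is a small sketch built on Python 3's parse_qsl(), which accepts an encoding argument. The helper name query_as_ordered_dict is hypothetical; urlparse itself returns no such object.

```python
from collections import OrderedDict
from urllib.parse import parse_qsl, urlencode, urlsplit


def query_as_ordered_dict(url, encoding="utf-8"):
    """Hypothetical helper: map each key to its list of values, in URL order."""
    pairs = parse_qsl(urlsplit(url).query, keep_blank_values=True,
                      encoding=encoding)
    result = OrderedDict()
    for key, value in pairs:
        result.setdefault(key, []).append(value)
    return result


params = query_as_ordered_dict("http://example.com/?b=2&a=1&b=3")
params["a"] = ["9"]            # update a parameter
del params["b"]                # remove a parameter
new_query = urlencode(params, doseq=True)
```

Adding, updating, and removing parameters then become ordinary mapping operations, and urlencode(..., doseq=True) turns the mapping back into a query string.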

A nice add-on would actually be a template form object which holds all
the expected items and their types (and whether each is optional) with little
wrappers for common types (int, float, string, list, ...) which
raise nice exceptions when an item is used but not filled in and has no
default, or holds the wrong data for its type.

OTOH, this might go a bit too far in the direction of a web app framework.

Regards
Tino



Re: [Python-Dev] Python 2.6.2 final

2009-04-13 Thread Barry Warsaw

On Apr 11, 2009, at 8:20 AM, Mark Dickinson wrote:

 On Fri, Apr 10, 2009 at 2:31 PM, Barry Warsaw ba...@python.org wrote:

  bugs.python.org is apparently down right now, but I set issue 5724 to
  release blocker for 2.6.2.  This is waiting for input from Mark Dickinson,
  and it relates to test_cmath failing on Solaris 10.

 I'd prefer to leave this alone for 2.6.2.  There's a fix posted to the issue
 tracker, but it's not entirely trivial and I think the risk of accidental
 breakage outweighs the niceness of seeing 'all tests passed' on
 Solaris.

Agreed.  I've knocked this back to 'high' priority and accepted it for
2.6.3.  Mark, feel free to apply it after 2.6.2 is tagged (which
should be in about 8 hours or 2200 UTC today).


-Barry





Re: [Python-Dev] Dropping bytes support in json

2009-04-13 Thread Barry Warsaw

On Apr 10, 2009, at 11:08 AM, James Y Knight wrote:

 Until you write a parser for every header, you simply cannot decode
 to unicode. The only sane choices are:

 1) raw bytes
 2) parsed structured data

The email package does not need a parser for every header, but it
should provide a framework that applications (or third party
libraries) can use to extend the built-in header parsers.  A bare
minimum for functionality requires a Content-Type parser.  I think the
email package should also include an address header (Originator,
Destination) parser, and a Message-ID header parser.  Possibly
others.  The default would probably be some unstructured parser for
headers like Subject.


-Barry





Re: [Python-Dev] [Email-SIG] Dropping bytes support in json

2009-04-13 Thread Barry Warsaw

On Apr 10, 2009, at 2:00 PM, Glenn Linderman wrote:

 If one name has to be longer than the other, it should be the bytes
 version.  Real user code is more likely to want to use the text
 version, and hopefully there will be more of that type of code than
 implementations using bytes.

 Of course, one could use message.header and message.bythdr and
 they'd be the same length.


Actually, thinking about this over the weekend, it's much better for  
message['subject'] to return a Header instance in all cases.  Use  
bytes(header) to get the raw bytes.


A good API for getting the parsed and decoded header values needs to  
take into account that it won't always be a string.  For unstructured  
headers like Subject, str(header) would work just fine.  For an  
Originator or Destination address, what does str(header) return?  And  
what would be the API for getting the set of realname/addresses out of  
the header?
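
One way to picture the "always return a Header instance" idea for an unstructured header is the sketch below. SketchHeader is a hypothetical class, not part of the email package; it only assumes that bytes() should give back the wire form and str() a decoded, displayable form. The decoding uses the existing email.header helpers.

```python
from email.header import decode_header, make_header


class SketchHeader:
    """Hypothetical header value; not a class from the email package."""

    def __init__(self, raw):
        self.raw = bytes(raw)  # the value exactly as it arrived on the wire

    def __bytes__(self):
        # The raw bytes, untouched.
        return self.raw

    def __str__(self):
        # Decode RFC 2047 encoded-words for display.
        # Assumes an unstructured header such as Subject.
        text = self.raw.decode("ascii", errors="replace")
        return str(make_header(decode_header(text)))


subject = SketchHeader(b"=?utf-8?q?Testing-=CE=B2-?=")
```

For a structured header such as an address list, str() alone would not be enough, which is exactly the open question raised above.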


-Barry





Re: [Python-Dev] headers api for email package

2009-04-13 Thread Barry Warsaw

On Apr 11, 2009, at 8:39 AM, Chris Withers wrote:

 Barry Warsaw wrote:

  message['Subject']
  The raw bytes or the decoded unicode?

 A header object.

Yep.  You got there before I did. :)

  Okay, so you've picked one.  Now how do you spell the other way?

 str(message['Subject'])

Yes for unstructured headers like Subject.  For structured headers...
hmm.

 bytes(message['Subject'])

Yes.

  Now, setting headers.  Sometimes you have some unicode thing and
  sometimes you have some bytes.  You need to end up with bytes in
  the ASCII range and you'd like to leave the header value unencoded
  if so.  But in both cases, you might have bytes or characters
  outside that range, so you need an explicit encoding, defaulting to
  utf-8 probably.

  Message.set_header('Subject', 'Some text', encoding='utf-8')
  Message.set_header('Subject', b'Some bytes')

 Where you just want a damned valid email and stop making my life
 hard!:

 Message['Subject']='Some text'

Yes.  In which case I propose we guess the encoding as 1) ascii, 2)
utf-8, 3) wtf?

 Where you care about what encoding is used:

 Message['Subject']=Header('Some text',encoding='utf-8')

Yes.

 If you have bytes, for whatever reason:

 Message['Subject']=b'some bytes'.decode('utf-8')

 ...because only you know what encoding those bytes use!

So you're saying that __setitem__() should not accept raw bytes?

-Barry





Re: [Python-Dev] Contributor Agreements for Patches - was [Jython-dev] Jython on Google AppEngine!

2009-04-13 Thread Martin v. Löwis
 * What is the scope of a patch that requires a contributor
   agreement?

Van's advice is as follows:

There is no definite ruling on what constitutes work that is
copyright-protected; estimates vary between 10 and 50 lines.
Establishing a rule based on line limits is not supported by
law. Formally, to be on the safe side, paperwork would be needed
for any contribution (no matter how small); this is tedious and
probably unnecessary, as the risk of somebody suing is small.
Also, in that case, there would be a strong case for an implied
license.

So his recommendation is to put the words

  "By submitting a patch or bug report, you agree to license it under the
  Apache Software License, v. 2.0, and further agree that it may be
  relicensed as necessary for inclusion in Python or other downstream
  projects."

into the tracker; this should be sufficient for most cases. For
committers, we should continue to require contributor forms.

Contributor forms can be electronic, but they need to name the
parties, include a signature (including electronic), and include
a company contribution agreement as necessary.

Regards,
Martin

P.S. I'm sure Van will jump in if I misunderstood parts of this.


Re: [Python-Dev] Contributor Agreements for Patches - was [Jython-dev] Jython on Google AppEngine!

2009-04-13 Thread Tobias Ivarsson
On Mon, Apr 13, 2009 at 4:44 PM, Martin v. Löwis mar...@v.loewis.de wrote:

  * What is the scope of a patch that requires a contributor
    agreement?

 Van's advice is as follows:

 There is no definite ruling on what constitutes work that is
 copyright-protected; estimates vary between 10 and 50 lines.
 Establishing a rule based on line limits is not supported by
 law. Formally, to be on the safe side, paperwork would be needed
 for any contribution (no matter how small); this is tedious and
 probably unnecessary, as the risk of somebody suing is small.
 Also, in that case, there would be a strong case for an implied
 license.

 So his recommendation is to put the words

 By submitting a patch or bug report, you agree to license it under the
 Apache Software License, v. 2.0, and further agree that it may be
 relicensed as necessary for inclusion in Python or other downstream
 projects.

 into the tracker; this should be sufficient for most cases. For
 committers, we should continue to require contributor forms.


Sounds great to me.

Cheers,
Tobias


Re: [Python-Dev] headers api for email package

2009-04-13 Thread R. David Murray

On Mon, 13 Apr 2009 at 10:28, Barry Warsaw wrote:

On Apr 11, 2009, at 8:39 AM, Chris Withers wrote:


Barry Warsaw wrote:
 message['Subject']
 The raw bytes or the decoded unicode?

A header object.


Yep.  You got there before I did. :)


+1


 Okay, so you've picked one.  Now how do you spell the other way?

str(message['Subject'])


Yes for unstructured headers like Subject.  For structured headers... hmm.


Some reasonable printable interpretation that has no semantic meaning?


bytes(message['Subject'])


Yes.

 Now, setting headers.  Sometimes you have some unicode thing and 
 sometimes you have some bytes.  You need to end up with bytes in the 
 ASCII range and you'd like to leave the header value unencoded if so. 
 But in both cases, you might have bytes or characters outside that range, 
 so you need an explicit encoding, defaulting to utf-8 probably.

 Message.set_header('Subject', 'Some text', encoding='utf-8')
 Message.set_header('Subject', b'Some bytes')

Where you just want a damned valid email and stop making my life hard!:

Message['Subject']='Some text'


Yes.  In which case I propose we guess the encoding as 1) ascii, 2) utf-8, 3) 
wtf?


Given some usenet postings I've just dealt with, (3) appears to
sometimes be spelled 'x-unknown' and sometimes (in the most recent case)
'unknown-8bit'.  A quick google turns up a hit on RFC1428 for the latter,
and a bunch of trouble tickets for the former...so I think 'wtf' is
correctly spelled 'unknown-8bit'.

However, it's not supposed to be used by mail composers, who are
expected to know the encoding.  It's for mail gateways that are
transforming something and don't know the encoding.  I'm not
sure what this means for the email module, which certainly
will be used in mail gateways. Maybe it's the responsibility
of the application code to explicitly say 'unknown encoding'?


Where you care about what encoding is used:

Message['Subject']=Header('Some text',encoding='utf-8')


Yes.


If you have bytes, for whatever reason:

Message['Subject']=b'some bytes'.decode('utf-8')

...because only you know what encoding those bytes use!


So you're saying that __setitem__() should not accept raw bytes?


If I'm understanding things correctly, if it did accept bytes the
person using that interface would need to do whatever encoding (eg:
encoded-word) was needed, so the interface should check that the byte
string is 8 bit clean.  But having some sort of 'setraw' method on Header
might be better for that case.

--David


Re: [Python-Dev] Dropping bytes support in json

2009-04-13 Thread Daniel Stutzbach
On Fri, Apr 10, 2009 at 10:06 PM, Martin v. Löwis mar...@v.loewis.de wrote:

 However, I really think that this question cannot be answered by
 reading the RFC. It should be answered by verifying how people use
 the json library in 2.x.


I use the json module in 2.6 to communicate with a C# JSON library and a
JavaScript JSON library.  The C# and JavaScript libraries produce and
consume the equivalent of str, not the equivalent of bytes.

Yes, the data eventually has to go over a socket as bytes, but that's often
handled by a different layer of code.

For JavaScript, data is typically received via XMLHttpRequest(), which
automatically figures out the encoding from the HTTP headers and/or other
information (defaulting to UTF-8) and returns a str-like object that I pass
to the JavaScript JSON library.

For C#, I wrap the socket in a StreamReader object, which decodes the byte
stream into a string stream (similar to Python's new TextIOWrapper class).
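
On the Python side, the same layering can be sketched with io.TextIOWrapper: one layer turns bytes into text, and json only ever sees str. The BytesIO object below is a stand-in (an assumption for the example) for a binary stream such as the one a socket's makefile('rb') would give.

```python
import io
import json

# Stand-in for a binary network stream carrying UTF-8 encoded JSON.
raw_stream = io.BytesIO('{"name": "S\u00f5mermaa", "id": 7}'.encode("utf-8"))

# The decoding layer: bytes in, text out.
text_stream = io.TextIOWrapper(raw_stream, encoding="utf-8")

# The JSON layer works purely on text.
data = json.load(text_stream)
```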

Hope that helps,

--
Daniel Stutzbach, Ph.D.
President, Stutzbach Enterprises, LLC http://stutzbachenterprises.com


Re: [Python-Dev] Google Summer of Code/core Python projects - RFC

2009-04-13 Thread Walter Dörwald
C. Titus Brown wrote:

 [...]
 I have had a hard time getting a good sense of what core code is well
 tested and what is not well tested, across various platforms.  While
 Walter's C/Python integrated code coverage site is nice, it would be
 even nicer to have a way to generate all that information within any
 particular checkout on a real-time basis.

This might have to be done incrementally. Creating the output for
http://coverage.livinglogic.de/ takes about 90 minutes. This breaks down
like this:

Downloading: 2sec
Unpacking: 3sec
Configuring: 30sec
Compiling: 1min
Running the test suite: 1hour
Reading coverage files: 8sec
Generating HTML files: 30min
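
As a quick check of the figures above (a tally only, using the step durations exactly as listed), the steps add up to roughly 92 minutes, with the test suite accounting for about 65% and HTML generation for about 33% of the total:

```python
# Step durations from the list above, in seconds.
timings = {
    "Downloading": 2,
    "Unpacking": 3,
    "Configuring": 30,
    "Compiling": 60,
    "Running the test suite": 3600,
    "Reading coverage files": 8,
    "Generating HTML files": 1800,
}

total = sum(timings.values())
shares = {step: 100 * seconds / total for step, seconds in timings.items()}

print("total: {:.1f} min".format(total / 60))
for step, share in sorted(shares.items(), key=lambda item: -item[1]):
    print("{:<24} {:5.1f}%".format(step, share))
```

So any incremental approach would mainly need to avoid rerunning the full test suite and regenerating all HTML files.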

 Doing so in the context of
 Snakebite would be icing... and I think it's worth supporting in core,
 especially if it can be done without any changes *to* core.

The only thing we'd probably need in core is a way to configure Python
to run with code coverage. The coverage script does this by patching the
makefile.

Running the code coverage script on Snakebite would be awesome. The
script is available from here:

http://pypi.python.org/pypi/pycoco

 - Another small nit is that they should address Python 2.x, too.
 
 I asked that they focus on EITHER 2.x or 3.x, since too broad is an
 equally valid criticism.  Certainly 3.x is the future so I thought
 focusing on increasing code coverage, and especially C code coverage,
 could best be applied to 3.x.

Servus,
   Walter


Re: [Python-Dev] [Email-SIG] headers api for email package

2009-04-13 Thread Stephen J. Turnbull
Barry Warsaw writes:
  On Apr 11, 2009, at 8:39 AM, Chris Withers wrote:
  
   Barry Warsaw wrote:
message['Subject']
   The raw bytes or the decoded unicode?
  
   A header object.
  
  Yep.  You got there before I did. :)
  
   Okay, so you've picked one.  Now how do you spell the other way?
  
   str(message['Subject'])
  
  Yes for unstructured headers like Subject.  For structured headers...  
  hmm.

Well, suppose we get really radical here.  *People* see email as
(rich-)text.  So ... message['Subject'] returns an object, partly to
be consistent with more complex headers' APIs, but partly to remind us
that nothing in email is as simple as it seems.  Now,
str(message['Subject']) is really for presentation to the user, right?
OK, so let's make it a presentation function!  Decode the MIME-words,
optionally unfold folded lines, optionally compress spaces, etc.  This
by default returns the subject field as a single, possibly quite long,
line.  Then a higher-level API can rewrap it, add fonts etc, for fancy
presentation.  This also suggests that we don't want the field tag (ie,
Subject) to be part of this value.

Of course a *really* smart higher-level API would access structured
headers based on their structure, not on the one-size-fits-all str()
conversion.

Then MTAs see email as a string of octets.  So guess what:

   bytes(message['Subject'])

gives wire format.  Yow!  I think I'm just joking.  Right?

   Now, setting headers.  Sometimes you have some unicode thing and  
   sometimes you have some bytes.  You need to end up with bytes in  
   the ASCII range and you'd like to leave the header value unencoded  
   if so.  But in both cases, you might have bytes or characters  
   outside that range, so you need an explicit encoding, defaulting to  
   utf-8 probably.
Message.set_header('Subject', 'Some text', encoding='utf-8')
Message.set_header('Subject', b'Some bytes')
  
   Where you just want a damned valid email and stop making my life  
   hard!:

-1  I mean, yeah, Brother, I feel your pain but it just isn't that
easy.  If that were feasible, it would be *criminal* to have a
.set_header() method at all!  In fact,

   Message['Subject']='Some text'

is going to (a) need to take *only* unicodes, or (b) raise Exceptions
at the slightest provocation when handed bytes.

And things only get worse if you try to provide this interface for say
From (let alone Content-Type).  Is it really worth doing the
mapping interface if it's only usable with free-form headers (ie, only
Subject among the commonly used headers)?

  Yes.  In which case I propose we guess the encoding as 1) ascii, 2)  
  utf-8, 3) wtf?

Uh, what guessing?  If you don't know what you have but you believe it
to be a valid header field, then presumably you got it off the wire
and it's still in bytes and you just spit it out on the wire without
trying to decode or encode it.  But as I already said, I think that's
a bad idea.  Otherwise, you should have a unicode, and you simply look
at the range of the string.  If it fits in ASCII, Bob's your uncle.
If not, Bob's your aunt (and you use UTF-8).
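
The "ASCII if it fits, otherwise UTF-8" rule is easy to sketch with the existing email.header.Header class. The function name encode_unstructured is an assumption for the example, and it only covers free-form headers such as Subject.

```python
from email.header import Header, decode_header, make_header


def encode_unstructured(text):
    """Hypothetical helper: pick the charset by looking at the string's range."""
    try:
        text.encode("ascii")
        charset = "us-ascii"      # fits in ASCII: Bob's your uncle
    except UnicodeEncodeError:
        charset = "utf-8"         # otherwise use UTF-8
    return Header(text, charset).encode()


plain = encode_unstructured("Some text")
encoded = encode_unstructured("Testing-\u03b2-")
# Decoding the encoded-word should give the original text back.
round_trip = str(make_header(decode_header(encoded)))
```

No guessing is involved here: the input is already a str, so the only decision is which charset to label the encoded-words with.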

   Where you care about what encoding is used:
  
   Message['Subject']=Header('Some text',encoding='utf-8')
  
  Yes.
  
   If you have bytes, for whatever reason:
  
   Message['Subject']=b'some bytes'.decode('utf-8')
  
   ...because only you know what encoding those bytes use!
  
  So you're saying that __setitem__() should not accept raw bytes?

How do you distinguish raw bytes from encoded bytes?
__setitem__() shouldn't accept bytes at all.  There should be an API
which sets a .formatted_for_the_wire member, and it should have a
validate option (ie, when true the API attempts to parse the header
and raises an exception if it fails to do so; when false, it assumes
you know what you're doing and will send out the bytes verbatim).
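
A rough sketch of that split follows. SketchMessage and set_wire_bytes are hypothetical names, and the validation shown (ASCII only, no line breaks) is merely a stand-in for "attempts to parse the header"; a real implementation would run the appropriate header parser.

```python
class SketchMessage:
    """Hypothetical container; not the email package's Message class."""

    def __init__(self):
        self.text_headers = {}   # values set as str via the mapping interface
        self.wire_headers = {}   # values supplied already formatted for the wire

    def __setitem__(self, name, value):
        # The mapping interface takes text only.
        if isinstance(value, (bytes, bytearray)):
            raise TypeError("bytes not accepted here; use set_wire_bytes()")
        self.text_headers[name] = str(value)

    def set_wire_bytes(self, name, raw, validate=True):
        if validate:
            # Stand-in check: a real version would parse the header instead.
            if b"\r" in raw or b"\n" in raw:
                raise ValueError("unexpected line break in header value")
            try:
                raw.decode("ascii")
            except UnicodeDecodeError:
                raise ValueError("header value is not 7-bit clean") from None
        # With validate=False the caller's bytes are stored verbatim.
        self.wire_headers[name] = bytes(raw)
```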



Re: [Python-Dev] Dropping bytes support in json

2009-04-13 Thread Martin v. Löwis
 I use the json module in 2.6 to communicate with a C# JSON library and a
 JavaScript JSON library.  The C# and JavaScript libraries produce and
 consume the equivalent of str, not the equivalent of bytes.

I assume there is a TCP connection between the json module and the
C#/JavaScript libraries?

If so, it doesn't matter what representation these implementations chose
to use.

 Hope that helps,

Maybe I misunderstood, and you are *not* communicating over the wire.
In this case, I'm puzzled how you get the data from Python to the C#
JSON library, or to the JavaScript library.

Regards,
Martin


Re: [Python-Dev] [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7)

2009-04-13 Thread Steven Bethard
On Mon, Apr 13, 2009 at 2:29 AM, Mart Sõmermaa mrts.py...@gmail.com wrote:


 On Mon, Apr 13, 2009 at 12:56 AM, Antoine Pitrou solip...@pitrou.net
 wrote:

 Mart Sõmermaa mrts.pydev at gmail.com writes:
 
  Proposal: add add_query_params() for appending query parameters to an
  URL to
 urllib.parse and urlparse.

 Is there anything to /remove/ a query parameter?

 I'd say this is outside the scope of add_query_params().

 As for the duplicate handling, I've implemented a threefold strategy that
 should address all use cases raised before:

  def add_query_params(*args, **kwargs):
     
      add_query_params(url, [allow_dups, [args_dict, [separator]]], **kwargs)

     Appends query parameters to an URL and returns the result.

     :param url: the URL to update, a string.
     :param allow_dups: if
     * True: plainly append new parameters, allowing all duplicates
   (default),
     * False: disallow duplicates in values and regroup keys so that
   different values for the same key are adjacent,
     * None: disallow duplicates in keys -- each key can have a single
   value and later values override the value (like dict.update()).

Unnamed flag parameters are unfriendly to the reader. If I see something like:

  add_query_params(url, True, dict(a=b, c=d))

I can pretty much guess what the first and third arguments are, but I
have no clue for the second. Even if I have read the documentation
before, I may not remember whether the middle argument is allow_dups
or keep_dups.
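
In Python 3, one way to address this is to make the flag keyword-only, so it cannot be passed positionally. The function below is a deliberately small, hypothetical stand-in (append_params is not the proposed API) that only shows the calling convention:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def append_params(url, params=None, *, allow_dups=True):
    """Hypothetical helper: arguments after the bare * must be passed by name."""
    pairs = list((params or {}).items())
    if not allow_dups:
        # Skip pairs that are already present in the URL's query string.
        existing = set(parse_qsl(urlsplit(url).query))
        pairs = [pair for pair in pairs if pair not in existing]
    if not pairs:
        return url
    joiner = "&" if urlsplit(url).query else "?"
    return url + joiner + urlencode(pairs)


# The flag is now self-describing at the call site:
result = append_params("http://example.com/?a=b", {"a": "b", "c": "d"},
                       allow_dups=False)
```

A call such as append_params(url, {"a": "b"}, True) raises TypeError, so the reader never has to guess what a bare True means.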

Steve

     :param args_dict: optional dictionary of parameters, default is {}.
     :param separator: either ';' or '&', the separator between key-value
     pairs, default is '&'.
     :param kwargs: parameters as keyword arguments.

     :return: original URL with updated query parameters or the original URL
     unchanged if no parameters given.
     

 The commit is

 http://github.com/mrts/qparams/blob/b9bdbec46bf919d142ff63e6b2b822b5d57b6f89/qparams.py

 extensive description of the behaviour is in the doctests.


Re: [Python-Dev] [Email-SIG] headers api for email package

2009-04-13 Thread Steven D'Aprano
On Tue, 14 Apr 2009 03:15:20 am Stephen J. Turnbull wrote:

 *People* see email as (rich-)text.

We do?

It's not clear what you actually mean by "(rich-)text". In the context
of email, I understand it to mean HTML in the body, web-bugs, security 
exploits, 36pt hot-pink bold text on a lime-green background, and all 
the other wonderful things modern mail clients let you put in your 
email. But as far as I know, no mail client tries to render HTML tags 
inside mail headers, so you're probably not talking about HTML 
rich-text. I guess you mean Unicode characters. Am I right?

Now, correct me if I'm wrong, but I don't think mail headers can
actually be anything *but* bytes. I see that my mail client, at least,
sends bytes in the Subject header. If I try to send characters, e.g.
the subject header "Testing-β-" (without the quotes), what actually
gets sent is the bytes "=?utf-8?q?Testing-=CE=B2-?=" (again without the
quotation marks). This seems to be covered by RFC 2047:

http://tools.ietf.org/html/rfc2047

If you're proposing converting those bytes into characters, that's all 
very well and good, but what's your strategy for dealing with the 
inevitable wrongly-formatted headers? If the header can't be correctly 
decoded into text, there still needs to be a way to get to the raw 
bytes. Apart from (e.g.) mail processing apps like SpamBayes which will 
want to inspect the raw bytes, mail readers will need to deal with 
badly formatted mail. The RFC states:

"However, a mail reader MUST NOT prevent the display or handling of a
message because an 'encoded-word' is incorrectly formed."
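
A sketch of what that requirement might mean in code: try to decode the encoded-words, and fall back to showing the undecoded text when that fails. The function name display_subject and the exact set of exceptions caught are assumptions for the example; decode_header() and make_header() are the existing email.header helpers.

```python
from email.errors import HeaderParseError
from email.header import decode_header, make_header


def display_subject(raw):
    """Hypothetical helper: best-effort text for display, never an exception."""
    text = raw.decode("ascii", errors="replace")
    try:
        return str(make_header(decode_header(text)))
    except (HeaderParseError, LookupError, UnicodeError):
        # Malformed encoded-word or unknown charset: show the header as sent.
        return text


good = display_subject(b"=?utf-8?q?Testing-=CE=B2-?=")
odd = display_subject(b"=?x-unknown?q?abc?=")
```

The raw bytes stay available to the caller either way, which covers tools like SpamBayes that want to inspect exactly what was sent.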



[...]
 Then MTAs see email as a string of octets.  So guess what:

    bytes(message['Subject'])

 gives wire format.  Yow!  I think I'm just joking.  Right?

Er, I'm not sure. Are you joking? I hope not, because it is important to 
be able to get to the raw, unmodified bytes that the MTA sees, without 
all the fancy processing you suggest.


[...]
 Otherwise, you should have a unicode, and you simply look
 at the range of the string.  If it fits in ASCII, Bob's your uncle.
 If not, Bob's your aunt (and you use UTF-8).

Again, correct me if I'm wrong, but *all* valid mail headers must fit in 
ASCII. RFC 5335 defines an experimental approach to allowing full 
Unicode in mail headers, but surely it's going to be a while before 
that's common, let alone standard.

http://tools.ietf.org/html/rfc5335



-- 
Steven D'Aprano


Re: [Python-Dev] Dropping bytes support in json

2009-04-13 Thread Daniel Stutzbach
On Mon, Apr 13, 2009 at 12:19 PM, Martin v. Löwis mar...@v.loewis.de wrote:

  I use the json module in 2.6 to communicate with a C# JSON library and a
  JavaScript JSON library.  The C# and JavaScript libraries produce and
  consume the equivalent of str, not the equivalent of bytes.

 I assume there is a TCP connection between the json module and the
 C#/JavaScript libraries?


Yes, there's a TCP connection.  Sorry for not making that clear to begin
with.

I also sometimes store JSON objects in a database.  In that case, I pass
strings to the database API which stores them in a TEXT field.  Obviously
somewhere they get encoded to bytes, but that's handled by the database.


 If so, it doesn't matter what representation these implementations chose
 to use.


True, I can always convert from bytes to str or vice versa.  Sometimes it is
illustrative to see how others have chosen to solve the same problem.  The
JSON specification and other implementations serialize an object to a
string.  Python's json.dumps() needs to either return a str or let the user
specify an encoding.

At least one of these two needs to work:

json.dumps({}).encode('utf-16le')  # dumps() returns str
'{\x00}\x00'

json.dumps({}, encoding='utf-16le')  # dumps() returns bytes
'{\x00}\x00'

In 2.6, the first one works.  The second incorrectly returns '{}'.
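
For comparison, a minimal Python 3 sketch of the first form, where serialization produces text and encoding is a separate, explicit step. The helper name dumps_bytes is hypothetical.

```python
import json


def dumps_bytes(obj, encoding="utf-8"):
    """Serialize to text first, then encode (hypothetical helper)."""
    text = json.dumps(obj)        # str in Python 3
    return text.encode(encoding)  # bytes in the requested encoding


print(dumps_bytes({}, "utf-16le"))
```

Keeping the two steps apart lets the caller choose any encoding without the json module having to know about it.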

--
Daniel Stutzbach, Ph.D.
President, Stutzbach Enterprises, LLC http://stutzbachenterprises.com


Re: [Python-Dev] Dropping bytes support in json

2009-04-13 Thread James Y Knight

On Apr 13, 2009, at 10:11 AM, Barry Warsaw wrote:
The email package does not need a parser for every header, but it  
should provide a framework that applications (or third party  
libraries) can use to extend the built-in header parsers.  A bare  
minimum for functionality requires a Content-Type parser.  I think  
the email package should also include an address header (Originator,  
Destination) parser, and a Message-ID header parser.  Possibly others.


Sure, that's fine...

The default would probably be some unstructured parser for headers  
like Subject.



But for unknown headers, it's not a useful choice to return a str  
object. str is just one possible structured data representation for  
a header: there's no correct useful decoding of all headers into str.  
Of course for the Subject header, str is the correct result type,  
but that's not a default, that's explicit support for Subject. You  
can't correctly decode To into a str, so what makes you think you  
can decode X-Gabazaborph into str?


The only useful and correct representation for unknown (or  
unimplemented) headers is the raw bytes.
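
The idea of per-header parsers with a raw-bytes fallback could be sketched roughly as follows. The registry and the parse_header function are hypothetical names invented for this example; only email.utils.getaddresses is an existing standard library call.

```python
from email.utils import getaddresses

# Hypothetical registry: lower-case header name -> parser taking raw bytes.
HEADER_PARSERS = {
    # Simplified: a real Subject parser would also decode RFC 2047 words.
    "subject": lambda raw: raw.decode("ascii"),
    # Address headers become a list of (display name, address) pairs.
    "to": lambda raw: getaddresses([raw.decode("ascii")]),
}


def parse_header(name, raw):
    """Return structured data for known headers, raw bytes otherwise."""
    parser = HEADER_PARSERS.get(name.lower())
    return parser(raw) if parser is not None else raw


print(parse_header("To", b"Alice <a@example.com>, b@example.com"))
print(parse_header("X-Gabazaborph", b"\xff\x00"))
```

Applications or third-party libraries would extend the registry for headers they understand; everything else stays as bytes.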


James



Re: [Python-Dev] Dropping bytes support in json

2009-04-13 Thread Martin v. Löwis
 Yes, there's a TCP connection.  Sorry for not making that clear to begin
 with.
 
 If so, it doesn't matter what representation these implementations chose
 to use.
 
 
 True, I can always convert from bytes to str or vice versa.

I think you are missing the point. It will not be necessary to convert.
You can write the JSON into the TCP connection in Python, and it will
come out just fine as strings in C# and JavaScript. This
is how middleware works - it abstracts from programming languages, and
allows for different representations in different languages, in a
manner invisible to the participating processes.

 At least one of these two needs to work:
 
 json.dumps({}).encode('utf-16le')  # dumps() returns str
 '{\x00}\x00'
 
 json.dumps({}, encoding='utf-16le')  # dumps() returns bytes
 '{\x00}\x00'
 
 In 2.6, the first one works.  The second incorrectly returns '{}'.

Ok, that might be a bug in the JSON implementation - but you shouldn't
be using utf-16le, anyway. Use UTF-8 always, and it will work fine.

The question is: which of them is more appropriate if what you want
is bytes? I argue that the second form is better, since it saves you
an encode invocation.

Regards,
Martin


Re: [Python-Dev] [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7)

2009-04-13 Thread Mart Sõmermaa
On Mon, Apr 13, 2009 at 8:23 PM, Steven Bethard
steven.beth...@gmail.com wrote:

 On Mon, Apr 13, 2009 at 2:29 AM, Mart Sõmermaa mrts.py...@gmail.com wrote:
 
 
  On Mon, Apr 13, 2009 at 12:56 AM, Antoine Pitrou solip...@pitrou.net
  wrote:
 
  Mart Sõmermaa mrts.pydev at gmail.com writes:
  
   Proposal: add add_query_params() for appending query parameters to an
   URL to
  urllib.parse and urlparse.
 
  Is there anything to /remove/ a query parameter?
 
  I'd say this is outside the scope of add_query_params().
 
  As for the duplicate handling, I've implemented a threefold strategy that
  should address all use cases raised before:
 
   def add_query_params(*args, **kwargs):
  
  add_query_parms(url, [allow_dups, [args_dict, [separator]]], **kwargs)
 
  Appends query parameters to an URL and returns the result.
 
  :param url: the URL to update, a string.
  :param allow_dups: if
  * True: plainly append new parameters, allowing all duplicates
(default),
  * False: disallow duplicates in values and regroup keys so that
different values for the same key are adjacent,
  * None: disallow duplicates in keys -- each key can have a single
value and later values override the value (like dict.update()).

 Unnamed flag parameters are unfriendly to the reader. If I see something like:

  add_query_params(url, True, dict(a=b, c=d))

 I can pretty much guess what the first and third arguments are, but I
 have no clue for the second. Even if I have read the documentation
 before, I may not remember whether the middle argument is allow_dups
 or keep_dups.

Keyword arguments are already used for specifying the arguments to the
query, so naming can't be used. Someone may need an 'allow_dups' key
in their query and forget to pass it in params_dict.

A default behaviour should be found that works according to most
user's expectations so that they don't need to use the positional
arguments generally.

Antoine Pitrou wrote:
 You could e.g. rename the function to update_query_params() and decide that
 every parameter whose specified value is None must actually be removed from
 the URL.

I agree that removing parameters is useful. Currently, None is used
for signifying a key with no value. Instead, booleans could be used:
if a key is True (but obviously not any other value that evaluates to
True), it is a key with no value, if False (under the same evaluation
restriction), it should be removed from the query if present. None
should not be treated specially under that scheme. As an example:

 update_query_params('http://example.com/?q=foo', q=False, a=True, b='c', d=None)
'http://example.com/?a&b=c&d=None'

However,
1) I'm not sure about the implications of 'foo is True'. I have never
used it and PEP 8 explicitly warns against it. Does it work
consistently across different Python implementations? (I assume it
should, on the grounds that True should be a singleton just like None.)
2) the API gets overly complicated -- as per the complaint above, it's
usability-challenged already.
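
A minimal sketch of the boolean scheme described above, written against Python 3's urllib.parse. This is not the qparams implementation; the body is an assumption made only to show how identity tests on True and False would behave.

```python
from urllib.parse import parse_qsl, quote_plus, urlsplit, urlunsplit


def update_query_params(url, **params):
    """Sketch only: False removes a key, True adds a key with no value."""
    scheme, netloc, path, query, fragment = urlsplit(url)
    # Identity tests, so that 0 and 1 are treated as ordinary values.
    removed = {key for key, value in params.items() if value is False}
    pieces = [
        # Simplification: existing valueless keys are re-emitted as "key=".
        f"{quote_plus(key)}={quote_plus(value)}"
        for key, value in parse_qsl(query, keep_blank_values=True)
        if key not in removed
    ]
    for key, value in params.items():
        if value is False:
            continue
        if value is True:
            pieces.append(quote_plus(key))
        else:
            # None gets no special treatment and is rendered as "None".
            pieces.append(f"{quote_plus(key)}={quote_plus(str(value))}")
    return urlunsplit((scheme, netloc, path, "&".join(pieces), fragment))


print(update_query_params("http://example.com/?q=foo",
                          q=False, a=True, b="c", d=None))
```

Because the checks use "is", passing a=1 yields "a=1" rather than a bare "a", even though 1 == True.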


Re: [Python-Dev] Dropping bytes support in json

2009-04-13 Thread Bob Ippolito
On Mon, Apr 13, 2009 at 1:02 PM, Martin v. Löwis mar...@v.loewis.de wrote:
 Yes, there's a TCP connection.  Sorry for not making that clear to begin
 with.

     If so, it doesn't matter what representation these implementations chose
     to use.


 True, I can always convert from bytes to str or vice versa.

 I think you are missing the point. It will not be necessary to convert.
 You can write the JSON into the TCP connection in Python, and it will
 come out just fine as strings in C# and JavaScript. This
 is how middleware works - it abstracts from programming languages, and
 allows for different representations in different languages, in a
 manner invisible to the participating processes.

 At least one of these two needs to work:

 json.dumps({}).encode('utf-16le')  # dumps() returns str
 '{\x00}\x00'

 json.dumps({}, encoding='utf-16le')  # dumps() returns bytes
 '{\x00}\x00'

 In 2.6, the first one works.  The second incorrectly returns '{}'.

 Ok, that might be a bug in the JSON implementation - but you shouldn't
 be using utf-16le, anyway. Use UTF-8 always, and it will work fine.

 The question is: which of them is more appropriate if what you want
 is bytes? I argue that the second form is better, since it saves you
 an encode invocation.

It's not a bug in dumps, it's a matter of not reading the
documentation. The encoding parameter of dumps decides how byte
strings should be interpreted, not what the output encoding is.

The output of json/simplejson dumps for Python 2.x is either an ASCII
bytestring (default) or a unicode string (when ensure_ascii=False).
This is very practical in 2.x because an ASCII bytestring can be
treated as either text or bytes in most situations, isn't going to get
mangled over any kind of encoding mismatch (as long as it's an ASCII
superset), and skips an encoding step if getting sent over the wire.

 simplejson.dumps(['\x00f\x00o\x00o'], encoding='utf-16be')
'["foo"]'
 simplejson.dumps(['\x00f\x00o\x00o'], encoding='utf-16be', ensure_ascii=False)
u'["foo"]'
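
The session above is Python 2. As a small Python 3 illustration of the ensure_ascii switch, where dumps() returns str in both cases and the flag only decides whether non-ASCII characters are escaped:

```python
import json

escaped = json.dumps(["β"])                      # non-ASCII is \u-escaped
literal = json.dumps(["β"], ensure_ascii=False)  # non-ASCII is kept as-is

print(escaped)
print(literal)

# The default output is plain ASCII, so encoding it cannot fail.
wire = escaped.encode("ascii")
```

The escaped form survives any ASCII-compatible transport; the literal form needs an explicit encoding such as UTF-8 before it goes over the wire.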

-bob


Re: [Python-Dev] Dropping bytes support in json

2009-04-13 Thread Daniel Stutzbach
On Mon, Apr 13, 2009 at 3:28 PM, Bob Ippolito b...@redivi.com wrote:

 It's not a bug in dumps, it's a matter of not reading the
 documentation. The encoding parameter of dumps decides how byte
 strings should be interpreted, not what the output encoding is.


You're right; I apologize for not reading more closely.

--
Daniel Stutzbach, Ph.D.
President, Stutzbach Enterprises, LLC http://stutzbachenterprises.com


Re: [Python-Dev] Dropping bytes support in json

2009-04-13 Thread Daniel Stutzbach
On Mon, Apr 13, 2009 at 3:02 PM, Martin v. Löwis mar...@v.loewis.de wrote:

  True, I can always convert from bytes to str or vice versa.

 I think you are missing the point. It will not be necessary to convert.


Sometimes I want bytes and sometimes I want str.  I am going to be
converting some of the time. ;-)

Below is a basic CGI application that assumes that json module works with
str, not bytes.  How would you write it if the json module does not support
returning a str?

print("Content-Type: application/json; charset=utf-8")
input_object = json.loads(sys.stdin.read())
output_object = do_some_work(input_object)
print(json.dumps(output_object))
print()

 The question is: which of them is more appropriate if what you want
 is bytes? I argue that the second form is better, since it saves you
 an encode invocation.


If what you want is bytes, encoding has to happen somewhere.  If the json
module has some optimizations to do the encoding at the same time as the
serialization, great.  However, based on the original post of this thread,
it sounds like that code doesn't exist or doesn't work correctly.

What's the benefit of preventing users from getting a str out if that's what
they want?

--
Daniel Stutzbach, Ph.D.
President, Stutzbach Enterprises, LLC http://stutzbachenterprises.com


Re: [Python-Dev] [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7)

2009-04-13 Thread Greg Ewing

Antoine Pitrou wrote:


Say you are filtering or sorting data based on some URL parameters. If the user
wants to remove one of those filters, you have to remove the corresponding query
parameter.


For an application like that, I would be keeping the
parameters as a list or some other structured way and
only converting them to a URL when needed.
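
A small sketch of that approach, assuming Python 3's urllib.parse; build_url and the filter names are made up for the example.

```python
from urllib.parse import urlencode


def build_url(base, params):
    """Render the URL only when it is needed (hypothetical helper)."""
    return f"{base}?{urlencode(params)}" if params else base


# The application keeps its filters as structured data...
filters = {"sort": "date", "tag": "python"}
print(build_url("http://example.com/items", filters))

# ...so removing a filter is an ordinary dict operation.
del filters["tag"]
print(build_url("http://example.com/items", filters))
```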

--
Greg


Re: [Python-Dev] Dropping bytes support in json

2009-04-13 Thread Greg Ewing

Barry Warsaw wrote:
The default 
would probably be some unstructured parser for  headers like Subject.


Only for headers known to be unstructured, I think.
Completely unknown headers should be available only
as bytes.

--
Greg


Re: [Python-Dev] [Email-SIG] Dropping bytes support in json

2009-04-13 Thread Greg Ewing

Barry Warsaw wrote:
For an  
Originator or Destination address, what does str(header) return?


It should be an error, I think.

--
Greg


Re: [Python-Dev] Dropping bytes support in json

2009-04-13 Thread Alexandre Vassalotti
On Mon, Apr 13, 2009 at 5:25 PM, Daniel Stutzbach
dan...@stutzbachenterprises.com wrote:
 On Mon, Apr 13, 2009 at 3:02 PM, Martin v. Löwis mar...@v.loewis.de
 wrote:

  True, I can always convert from bytes to str or vice versa.

 I think you are missing the point. It will not be necessary to convert.

 Sometimes I want bytes and sometimes I want str.  I am going to be
 converting some of the time. ;-)

 Below is a basic CGI application that assumes that json module works with
 str, not bytes.  How would you write it if the json module does not support
 returning a str?

 print("Content-Type: application/json; charset=utf-8")
 input_object = json.loads(sys.stdin.read())
 output_object = do_some_work(input_object)
 print(json.dumps(output_object))
 print()


Like this?

print("Content-Type: application/json; charset=utf-8")
input_object = json.loads(sys.stdin.buffer.read())
output_object = do_some_work(input_object)
sys.stdout.buffer.write(json.dumps(output_object))


-- Alexandre


Re: [Python-Dev] [Email-SIG] Dropping bytes support in json

2009-04-13 Thread R. David Murray

On Tue, 14 Apr 2009 at 11:28, Greg Ewing wrote:


Barry Warsaw wrote:

 For an  Originator or Destination address, what does str(header) return?


It should be an error, I think.


That doesn't make sense to me.  str(arbitrary object) should return
_something_.

--David


Re: [Python-Dev] [Email-SIG] Dropping bytes support in json

2009-04-13 Thread Greg Ewing

R. David Murray wrote:


That doesn't make sense to me.  str(arbitrary object) should return
_something_.


Well, it might return something like <AddressList
object at 0x123456>. But you shouldn't rely on it
to give you anything useful for an arbitrary header.
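
To illustrate the difference between "returns something" and "returns something useful", here is a tiny sketch. Both classes are hypothetical stand-ins, not types from the email package.

```python
class AddressList:
    """Hypothetical structured header with no __str__ of its own."""

    def __init__(self, addresses):
        self.addresses = list(addresses)


class PrintableAddressList(AddressList):
    """Same data, but with an explicit, deliberate text rendering."""

    def __str__(self):
        return ", ".join(self.addresses)


plain = AddressList(["a@example.com", "b@example.com"])
pretty = PrintableAddressList(["a@example.com", "b@example.com"])

print(str(plain))   # default object repr, not meant for display
print(str(pretty))
```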

--
Greg



Re: [Python-Dev] Dropping bytes support in json

2009-04-13 Thread Antoine Pitrou
Bob Ippolito bob at redivi.com writes:
 
 The output of json/simplejson dumps for Python 2.x is either an ASCII
 bytestring (default) or a unicode string (when ensure_ascii=False).
 This is very practical in 2.x because an ASCII bytestring can be
 treated as either text or bytes in most situations, isn't going to get
 mangled over any kind of encoding mismatch (as long as it's an ASCII
 superset), and skips an encoding step if getting sent over the wire..

Which means that the json module already deals with text rather than bytes,
apart from the optimization that pure ASCII text is returned as 8-bit strings.

Regards

Antoine.




Re: [Python-Dev] Dropping bytes support in json

2009-04-13 Thread Greg Ewing

Alexandre Vassalotti wrote:


print("Content-Type: application/json; charset=utf-8")
input_object = json.loads(sys.stdin.read())
output_object = do_some_work(input_object)
print(json.dumps(output_object))
print()


That assumes the encoding being used by stdout has
ascii as a subset.
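
To make the point concrete, the sketch below prints the same text through text streams with different encodings and compares the resulting bytes. The helper printed_bytes is hypothetical; it only wraps io.BytesIO in io.TextIOWrapper.

```python
import io


def printed_bytes(text, encoding):
    """Return the bytes print() emits on a stream with the given encoding."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding, newline="\n")
    print(text, file=stream)
    stream.flush()
    return raw.getvalue()


print(printed_bytes("{}", "utf-8"))
print(printed_bytes("{}", "utf-16-le"))
```

With a UTF-16 stream the output is no longer what a client expecting charset=utf-8 would read, even though the text is pure ASCII.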

--
Greg


Re: [Python-Dev] Shorter float repr in Python 3.1?

2009-04-13 Thread Eric Smith
Mark has uploaded our newest work to Rietveld, again at 
http://codereview.appspot.com/33084/show. Since the last version, Mark 
has added 387 support (and other fixes) and I've added localized 
formatting ('n') back in as well as ',' formatting for float and int. I 
think this addresses all open issues. If you have time, please review 
the code on Rietveld.
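
For readers who have not followed the branch, a short illustration of the user-visible behaviour, assuming an interpreter that includes this work (Python 3.1 or later):

```python
x = 0.1

# The short repr is the shortest string that still round-trips exactly.
short = repr(x)
assert float(short) == x
print(short)

# ',' as a thousands separator for float and int.
grouped_float = format(1234567.891, ",.2f")
grouped_int = format(1234567, ",")
print(grouped_float)
print(grouped_int)
```

The 'n' format code groups digits according to the current locale, so its output is not shown here.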


We believe we're ready to merge this back into the py3k branch. Pending 
any comments here or on Rietveld, we'll do the merge in the next day or so.


Before then, if anyone could build and test the py3k-short-float-repr 
branch on any of the following machines, that would be great:


Windows (preferably 64-bit)
Itanium
Old Intel/Linux (e.g., the snakebite nitrogen box)
Something bigendian, like a G4 Mac

We're pretty well tested on x86 Mac and Linux, and I've run it once on 
my Windows 32-bit machine.


I have a Snakebite account, and I'll try running on nitrogen once I 
figure out how to log in again.


I just had Itanium and PPC buildbots test our branch, and they both 
succeeded (or at least failed with errors not related to our changes).


Eric.



Re: [Python-Dev] Shorter float repr in Python 3.1?

2009-04-13 Thread Benjamin Peterson
2009/4/13 Eric Smith e...@trueblade.com:
 Mark has uploaded our newest work to Rietveld, again at
 http://codereview.appspot.com/33084/show. Since the last version, Mark has
 added 387 support (and other fixes) and I've added localized formatting
 ('n') back in as well as ',' formatting for float and int. I think this
 addresses all open issues. If you have time, please review the code on
 Rietveld.

 We believe we're ready to merge this back into the py3k branch. Pending any
 comments here or on Rietveld, we'll do the merge in the next day or so.

Cool. Will you use svnmerge.py to integrate the branch? After having
some odd behavior merging the io-c branch, I suggest you just apply a
patch to the py3k branch.



-- 
Regards,
Benjamin


Re: [Python-Dev] Shorter float repr in Python 3.1?

2009-04-13 Thread Eric Smith

Benjamin Peterson wrote:

Cool. Will you use svnmerge.py to integrate the branch? After having
some odd behavior merging the io-c branch, I suggest you just apply a
patch to the py3k branch.


We're just going to apply 2 patches, without using svnmerge. First we'll 
add new files and the configure changes. Once we're sure that builds 
everywhere, then the second step will actually hook in the new functions 
and will have the formatting changes.



Re: [Python-Dev] [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7)

2009-04-13 Thread Steven Bethard
On Mon, Apr 13, 2009 at 1:14 PM, Mart Sõmermaa mrts.py...@gmail.com wrote:
 On Mon, Apr 13, 2009 at 8:23 PM, Steven Bethard steven.beth...@gmail.com 
 wrote:
 On Mon, Apr 13, 2009 at 2:29 AM, Mart Sõmermaa mrts.py...@gmail.com wrote:
  As for the duplicate handling, I've implemented a threefold strategy that
  should address all use cases raised before:
 
   def add_query_params(*args, **kwargs):
      
      add_query_parms(url, [allow_dups, [args_dict, [separator]]], **kwargs)
 
      Appends query parameters to an URL and returns the result.
 
      :param url: the URL to update, a string.
      :param allow_dups: if
          * True: plainly append new parameters, allowing all duplicates
            (default),
          * False: disallow duplicates in values and regroup keys so that
            different values for the same key are adjacent,
          * None: disallow duplicates in keys -- each key can have a single
            value and later values override the value (like dict.update()).

 Unnamed flag parameters are unfriendly to the reader. If I see something 
 like:

  add_query_params(url, True, dict(a=b, c=d))

 I can pretty much guess what the first and third arguments are, but I
 have no clue for the second. Even if I have read the documentation
 before, I may not remember whether the middle argument is allow_dups
 or keep_dups.

 Keyword arguments are already used for specifying the arguments to the
 query, so naming can't be used. Someone may need an 'allow_dups' key
 in their query and forget to pass it in params_dict.

 A default behaviour should be found that works according to most
 user's expectations so that they don't need to use the positional
 arguments generally.

I believe the usual Python approach here is to have two variants of
the function, add_query_params and add_query_params_no_dups (or
whatever you want to name them). That way the flag parameter is
named right in the function name.
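
Roughly what I have in mind, as a sketch on top of Python 3's urllib.parse. The bodies are assumptions for illustration and do not reproduce the qparams code.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_query_params(url, **params):
    """Append parameters, keeping any duplicates (sketch)."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    pairs += list(params.items())
    return urlunsplit(parts._replace(query=urlencode(pairs)))


def add_query_params_no_dups(url, **params):
    """Later values override earlier ones, like dict.update() (sketch)."""
    parts = urlsplit(url)
    merged = dict(parse_qsl(parts.query, keep_blank_values=True))
    merged.update(params)
    return urlunsplit(parts._replace(query=urlencode(merged)))


print(add_query_params("http://example.com/?a=1", a="2"))
print(add_query_params_no_dups("http://example.com/?a=1", a="2"))
```

With the behaviour in the function name, every keyword argument remains available as a query key.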

Steve
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
--- Bucky Katt, Get Fuzzy


Re: [Python-Dev] Shorter float repr in Python 3.1?

2009-04-13 Thread Ned Deily
In article 49e3d34e.8040...@trueblade.com,
 Eric Smith e...@trueblade.com wrote:
 Before then, if anyone could build and test the py3k-short-float-repr 
 branch on any of the following machines, that would be great:
 
[...]
 Something bigendian, like a G4 Mac

I'll crank up some OS X installer builds and run them on G3 and G4 Macs
vs 32-/64-bit Intel.  Any tests of interest beyond the default regrtest.py?

-- 
 Ned Deily,
 n...@acm.org



Re: [Python-Dev] Dropping bytes support in json

2009-04-13 Thread Martin v. Löwis
 Below is a basic CGI application that assumes that json module works
 with str, not bytes.  How would you write it if the json module does not
 support returning a str?

In a CGI application, you shouldn't be using sys.stdin or print().
Instead, you should be using sys.stdin.buffer (or sys.stdin.buffer.raw),
and sys.stdout.buffer.raw. A CGI script essentially does binary IO;
if you use TextIO, there likely will be bugs (e.g. if you have
attachments of type application/octet-stream).

 print("Content-Type: application/json; charset=utf-8")
 input_object = json.loads(sys.stdin.read())
 output_object = do_some_work(input_object)
 print(json.dumps(output_object))
 print()

out = sys.stdout.buffer.raw
out.write(b"Content-Type: application/json; charset=utf-8\n\n")
input_object = json.loads(sys.stdin.buffer.raw.read())
output_object = do_some_work(input_object)
out.write(json.dumps(output_object))
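
The snippet above assumes a json module that reads and writes bytes. For comparison, here is a sketch of the same binary-IO approach with a json module that works on str, where the encode and decode steps become explicit. handle_request and the body of do_some_work are hypothetical placeholders.

```python
import io
import json


def do_some_work(obj):
    """Hypothetical placeholder for the application logic."""
    return {"echo": obj}


def handle_request(stdin, stdout):
    """Read a JSON request from a binary stream and write a binary reply."""
    input_object = json.loads(stdin.read().decode("utf-8"))
    output_object = do_some_work(input_object)
    stdout.write(b"Content-Type: application/json; charset=utf-8\r\n\r\n")
    stdout.write(json.dumps(output_object).encode("utf-8"))


# Exercise the handler with in-memory streams.
reply = io.BytesIO()
handle_request(io.BytesIO(b'{"a": 1}'), reply)
print(reply.getvalue())
```

In a real CGI script the two arguments would be sys.stdin.buffer and sys.stdout.buffer.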

 What's the benefit of preventing users from getting a str out if that's
 what they want?

If they really want it, there is no benefit from preventing them.
I'm just puzzled why they want it, and what possible applications
might be where they want it. Perhaps they misunderstand something
when they think they want it.

Regards,
Martin
