[issue874900] threading module can deadlock after fork

2008-07-12 Thread Gregory P. Smith

Gregory P. Smith <[EMAIL PROTECTED]> added the comment:

I still don't like the _after_fork() implementation.  It's O(n), where n
is the number of threads the parent process had.

That is very wasteful in the most common case, where the fork() is
followed by an exec or a call to os._exit().  It won't scale nicely with
system load (forks will take longer and longer the more threads
exist).

Could os.fork() be extended to have an optional will_exec_or_die
parameter that determines whether _after_fork() is called at all?
Things like subprocess should pass in True.  The default should be False
for compatibility.
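
A minimal sketch of how such a flag could behave, written as a wrapper
rather than as a change to os.fork() itself. The names fork_with_hint,
after_fork and _fork are hypothetical; after_fork stands in for
threading._after_fork(), and the fork function is injectable only so the
control flow can be exercised without really forking.

```python
import os


def fork_with_hint(after_fork, will_exec_or_die=False, _fork=None):
    """Fork, running after_fork() in the child only when it is needed.

    Hypothetical helper: will_exec_or_die mirrors the parameter proposed
    in this comment and is not an existing os.fork() argument.
    """
    # Look up os.fork lazily so this module imports on platforms without it.
    fork = _fork if _fork is not None else os.fork
    pid = fork()
    if pid == 0 and not will_exec_or_die:
        # Child that keeps running Python code: pay for the O(n) cleanup.
        after_fork()
    return pid
```

With will_exec_or_die=True the child skips the cleanup entirely, which is
the saving being asked for when the next step is an exec or os._exit().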

___
Python tracker <[EMAIL PROTECTED]>

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue2828] Clean up undoc.rst

2008-07-12 Thread Brett Cannon

Brett Cannon <[EMAIL PROTECTED]> added the comment:

Only three modules remain: ntpath, posixpath, and sunaudio. Waiting on
python-3000 to reply to my inquiry as to whether I can take sunaudio out.

--
resolution:  -> fixed
status: open -> closed




[issue3339] dummy_thread LockType.acquire() always returns None, should be True or False

2008-07-12 Thread Brett Cannon

Changes by Brett Cannon <[EMAIL PROTECTED]>:


--
resolution: accepted -> fixed




[issue3339] dummy_thread LockType.acquire() always returns None, should be True or False

2008-07-12 Thread Brett Cannon

Brett Cannon <[EMAIL PROTECTED]> added the comment:

r64903-64905 have the fixes for trunk, 3.0, and 2.5, respectively.
Thanks for reporting this, Henk. Andrii, I never even looked at your
patch since while I was evaluating the bug the problem was rather
obvious and simple, as you said. =) Thanks for the work, though.

--
resolution:  -> accepted
status: open -> closed




[issue1778443] robotparser.py fixes

2008-07-12 Thread Benjamin Peterson

Benjamin Peterson <[EMAIL PROTECTED]> added the comment:

OK. I committed the patch in r64901. Thanks for the work!

--
resolution: remind -> accepted
status: open -> closed




[issue874900] threading module can deadlock after fork

2008-07-12 Thread Gregory P. Smith

Changes by Gregory P. Smith <[EMAIL PROTECTED]>:


Removed file: http://bugs.python.org/file10872/fork-and-thread4.patch




[issue874900] threading module can deadlock after fork

2008-07-12 Thread Gregory P. Smith

Changes by Gregory P. Smith <[EMAIL PROTECTED]>:


Removed file: http://bugs.python.org/file10869/fork-and-thread3.patch




[issue874900] threading module can deadlock after fork

2008-07-12 Thread Gregory P. Smith

Changes by Gregory P. Smith <[EMAIL PROTECTED]>:


Removed file: http://bugs.python.org/file10859/fork-and-thread2.patch




[issue874900] threading module can deadlock after fork

2008-07-12 Thread Gregory P. Smith

Changes by Gregory P. Smith <[EMAIL PROTECTED]>:


Removed file: http://bugs.python.org/file10855/fork-and-thread.patch




[issue874900] threading module can deadlock after fork

2008-07-12 Thread Gregory P. Smith

Gregory P. Smith <[EMAIL PROTECTED]> added the comment:

And a few more bugs.  A new patch is attached.  With this applied,
pitrou's fork_threading.py bug demonstration script finally does the
right thing.

test_threading, including the new deadlock tests, and
test_multiprocessing both pass.

Tested on x86 MacOS X 10.4 & x86 Ubuntu 8.04.

Would someone please try this on other platforms?

(The new threading test's use of subprocess with [sys.executable, '-c',
"""long script"""] makes me slightly nervous about portability outside
of Linux and BSD.)
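
For readers unfamiliar with the pattern, a small sketch of launching a
child interpreter with an inline script. The script below is a made-up
stand-in, much shorter than the one in the actual test, and
subprocess.run with capture_output assumes a much newer Python than the
versions discussed here.

```python
import subprocess
import sys

# Hypothetical child script; the real test passes a far longer one.
script = """\
import threading
print(threading.active_count())
"""

# The whole script travels as a single command-line argument, which is
# the part that may behave differently across platforms.
result = subprocess.run(
    [sys.executable, "-c", script],
    capture_output=True,
    text=True,
)
child_output = result.stdout.strip()
```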

--
versions: +Python 2.5
Added file: http://bugs.python.org/file10889/fork-and-threading5-gps01.patch




[issue3349] search for 'patch' produces roundup error

2008-07-12 Thread anatoly techtonik

anatoly techtonik <[EMAIL PROTECTED]> added the comment:

Thanks.




[issue874900] threading module can deadlock after fork

2008-07-12 Thread Gregory P. Smith

Gregory P. Smith <[EMAIL PROTECTED]> added the comment:

The existing fork-and-thread4.patch doesn't actually reset
_active_limbo_lock.  It's missing a "global _active_limbo_lock" statement
in the threading._after_fork() function.
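
The effect of the missing statement can be shown in isolation. This is a
sketch, not the code from the patch: after_fork_buggy, after_fork_fixed
and current_lock are hypothetical names, and only the lock variable is
named after the one in threading.

```python
import threading

_active_limbo_lock = threading.Lock()


def after_fork_buggy():
    # No global statement: this binds a new local name and leaves the
    # module-level lock untouched.
    _active_limbo_lock = threading.Lock()


def after_fork_fixed():
    # With the global statement the module-level lock is really replaced.
    global _active_limbo_lock
    _active_limbo_lock = threading.Lock()


def current_lock():
    """Return whichever lock the module-level name refers to right now."""
    return _active_limbo_lock
```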




[issue3317] duplicate lines in zipfile.py

2008-07-12 Thread Alan McIntyre

Alan McIntyre <[EMAIL PROTECTED]> added the comment:

Thanks for fixing this, Amaury.  I ran the test_zipfile64 and
test_zipfile tests on Linux and OS X, and they pass.




[issue874900] threading module can deadlock after fork

2008-07-12 Thread Gregory P. Smith

Changes by Gregory P. Smith <[EMAIL PROTECTED]>:


--
assignee:  -> gregory.p.smith
nosy: +gregory.p.smith




[issue2377] Replace import.c with a pure Python implementation

2008-07-12 Thread Benjamin Peterson

Changes by Benjamin Peterson <[EMAIL PROTECTED]>:


--
nosy: +benjamin.peterson




[issue3320] various doc typos

2008-07-12 Thread Benjamin Peterson

Changes by Benjamin Peterson <[EMAIL PROTECTED]>:


--
resolution:  -> fixed
status: open -> closed




[issue3320] various doc typos

2008-07-12 Thread Benjamin Peterson

Benjamin Peterson <[EMAIL PROTECTED]> added the comment:

Thanks very much, and feel free to be bored anytime you want! :)
Committed in r64897.

Jesse, can you look at the mp item?

--
nosy: +benjamin.peterson, jnoller




[issue2910] Remove plat-mac from 3.0

2008-07-12 Thread Benjamin Peterson

Benjamin Peterson <[EMAIL PROTECTED]> added the comment:

Carbon may still be in your wc because svn didn't delete it on update.
(I can't remember the exact behavior of SVN with dir deletes, but it's
weird.) See [1]. I killed plat-mac off in r64896.

[1] http://svn.python.org/view/python/branches/py3k/Lib/plat-mac/

--
resolution:  -> fixed
status: open -> closed




[issue2775] Implement PEP 3108

2008-07-12 Thread Brett Cannon

Changes by Brett Cannon <[EMAIL PROTECTED]>:


--
dependencies:  -Fixer for dbm is failing




[issue2885] Create the urllib package

2008-07-12 Thread Brett Cannon

Brett Cannon <[EMAIL PROTECTED]> added the comment:

Docs have been updated. At this point the fixers for urllib and urllib2
are all that is left.




[issue2910] Remove plat-mac from 3.0

2008-07-12 Thread Brett Cannon

Brett Cannon <[EMAIL PROTECTED]> added the comment:

I notice that Carbon ended back up in plat-mac. Is that an accident?
What is the status of being able to ditch the directory?

--
assignee: brett.cannon -> 
priority: normal -> critical




[issue2828] Clean up undoc.rst

2008-07-12 Thread Brett Cannon

Changes by Brett Cannon <[EMAIL PROTECTED]>:


--
priority: high -> critical




[issue2377] Replace import.c with a pure Python implementation

2008-07-12 Thread Brett Cannon

Changes by Brett Cannon <[EMAIL PROTECTED]>:


--
priority: high -> normal




[issue2377] Replace import.c with a pure Python implementation

2008-07-12 Thread Brett Cannon

Changes by Brett Cannon <[EMAIL PROTECTED]>:


--
priority: critical -> high




[issue2377] Replace import.c with a pure Python implementation

2008-07-12 Thread Brett Cannon

Changes by Brett Cannon <[EMAIL PROTECTED]>:


--
versions: +Python 3.1 -Python 3.0




[issue3297] Python interpreter uses Unicode surrogate pairs only before the pyc is created

2008-07-12 Thread Adam Olsen

Adam Olsen <[EMAIL PROTECTED]> added the comment:

Err, to clarify, the parse/compile/whatever stage is producing broken
UTF-32 (surrogates are ill-formed there too), and that gets transformed
into CESU-8 when the .pyc is saved.




[issue3297] Python interpreter uses Unicode surrogate pairs only before the pyc is created

2008-07-12 Thread Adam Olsen

Adam Olsen <[EMAIL PROTECTED]> added the comment:

Marc, perhaps Unicode has refined their definitions since you last looked?

Valid UTF-8 *cannot* contain surrogates[1].  If it does, you have
CESU-8[2][3], not UTF-8.

So there are two bugs: first, the UTF-8 codec should refuse to load
surrogates.  Second, since the original bug showed up before the .pyc is
created, something in the parse/compilation/whatever stage is producing
CESU-8.


[1] 4th bullet point of D92 in
http://www.unicode.org/versions/Unicode5.0.0/ch03.pdf
[2] http://unicode.org/reports/tr26/
[3] http://en.wikipedia.org/wiki/CESU-8
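
The distinction can be illustrated with a short sketch. It assumes a
current Python 3 interpreter, where the UTF-8 codec is strict about
surrogates and the "surrogatepass" error handler exists; neither was the
case for the interpreters discussed in this issue.

```python
lone_surrogate = "\ud800"

# Strict UTF-8 refuses to encode a surrogate code point ...
try:
    lone_surrogate.encode("utf-8")
    strict_encode_ok = True
except UnicodeEncodeError:
    strict_encode_ok = False

# ... and refuses to decode the three bytes that would represent one.
try:
    b"\xed\xa0\x80".decode("utf-8")
    strict_decode_ok = True
except UnicodeDecodeError:
    strict_decode_ok = False

astral = "\U00010000"
utf8 = astral.encode("utf-8")  # one four-byte sequence

# CESU-8 style: split into UTF-16 surrogates, then encode each of them
# as its own three-byte sequence.
utf16 = astral.encode("utf-16-be")
high = int.from_bytes(utf16[:2], "big")
low = int.from_bytes(utf16[2:], "big")
cesu8 = b"".join(chr(u).encode("utf-8", "surrogatepass") for u in (high, low))
```

The same character therefore takes four bytes in UTF-8 and six in CESU-8,
and the six-byte form is not valid UTF-8.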




[issue3324] Broken link in online doc

2008-07-12 Thread Benjamin Peterson

Benjamin Peterson <[EMAIL PROTECTED]> added the comment:

This has been fixed in the development docs:
http://docs.python.org/dev/install/index.html

--
nosy: +benjamin.peterson
resolution:  -> out of date
status: open -> closed




[issue3347] urllib.robotparser doesn't work in Py3k

2008-07-12 Thread Brett Cannon

Changes by Brett Cannon <[EMAIL PROTECTED]>:


--
nosy: +brett.cannon




[issue3349] search for 'patch' produces roundup error

2008-07-12 Thread Brett Cannon

Brett Cannon <[EMAIL PROTECTED]> added the comment:

I already reported it:
http://psf.upfronthosting.co.za/roundup/meta/issue213 .

--
nosy: +brett.cannon




[issue2280] parser module chokes on unusual characters

2008-07-12 Thread Benjamin Peterson

Changes by Benjamin Peterson <[EMAIL PROTECTED]>:


--
resolution:  -> out of date
status: open -> closed




[issue3300] urllib.quote and unquote - Unicode issues

2008-07-12 Thread Matt Giuca

Matt Giuca <[EMAIL PROTECTED]> added the comment:

So today I grepped for "urllib" in the entire library in an effort to
track down every dependency on quote and unquote to see exactly how my
patch breaks other code. I've now investigated every module in the
library which uses quote, unquote or urlencode, and my findings are
documented below in detail.

So far I have found no code "breakage" except for the original
email.util issue I fixed in patch 2. Of course that doesn't mean the
behaviour hasn't changed. Nearly all modules in the report below have
changed their behaviour so they used to deal with Latin-1-encoded URLs
and now deal with UTF-8-encoded URLs. As discussed at length above, I
see this as a positive change, since nearly everybody encodes URLs in
UTF-8, and of course it allows for all characters.

I also point out that the http.server module (unpatched) is internally
broken when dealing with filenames with characters outside range(0,256);
my patch fixes it.

I'm attaching patch 5, which adds a bunch of new test cases to various
modules which demonstrate those modules correctly handling UTF-8-encoded
URLs. It also fixes a bug in email.utils which I introduced in patch 2.

Note that I haven't yet fully investigated urllib.request.

Aside from that, the only remaining matter is whether or not it's better
to encode URLs as UTF-8 or Latin-1 by default, and I'm pretty sure that
question doesn't need debate.

So basically I think if there's support for it, this patch is just about
ready to be accepted. I'm hoping it can be included in the 3.0b2 release
next week.

I'd be glad to hear any feedback about this proposal.

Not Yet Investigated


./urllib/request.py
By far the biggest user of quote and unquote.
username, password, hostname and paths are now all converted
to/from UTF-8 percent-encodings.
Other concerns are:
* Data in the form application/x-www-form-urlencoded
* FTP access
I think this needs to be tested further.

Looks fine, not tested
--

./xmlrpc/client.py
Just used to decode URI auth string (user:pass). This will change
to UTF-8, but is probably OK.
./logging/handlers.py
Just uses it in the HTTP handler to encode a dictionary. Probably
preferable to use UTF-8 to encode an arbitrary string.
./macurl2path.py
Calls to urllib look broken. Not tested.

Tested manually, fine
-

./wsgiref/simple_server.py
Just used to set PATH_INFO, fine if URLs are UTF-8 encoded.
./http/server.py
All uses are for translating between actual file-system paths to
URLs. This works fine for UTF-8 URLs. Note that since it uses
quote to create URLs in a dir listing, and unquote to handle
them, it breaks when unquote is not the inverse of quote.

Consider the following simple script:

import http.server
s = http.server.HTTPServer(('', 8000),
                           http.server.SimpleHTTPRequestHandler)
s.serve_forever()

This will "kind of" work in the unpatched version, using
Latin-1 URLs, but filenames containing characters above U+00FF
will break (give a 404 error).
The patch fixes this.
./urllib/robotparser.py
No test cases. Manually tested, URLs properly match when
percent-encoded in UTF-8.
./nturl2path.py
No test cases available. Manually tested, fine if URLs are
UTF-8 encoded.
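
The http.server note above depends on unquote being the inverse of
quote. A short sketch of that round trip, assuming the UTF-8 default
proposed in this patch (which is how current Python 3 releases behave);
the filename is a made-up example.

```python
from urllib.parse import quote, unquote

# Two characters outside range(0, 256), as in the http.server case.
filename = "\u6587\u4ef6.txt"

url_path = quote(filename)        # what a directory listing would emit
round_tripped = unquote(url_path)  # what the request handler recovers
```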

Test cases either exist or added, fine
--

./test/test_urllib.py
I wrote a large wad of test cases for all the new functionality.
./wsgiref/util.py
Added test cases expecting UTF-8.
./http/cookiejar.py
I changed a test case to expect UTF-8.
./email/utils.py
I changed this file to behave as it used to, to satisfy its
existing test cases.
./cgi.py
Added test cases for UTF-8-encoded query strings.

Commit log:

urllib.parse.unquote: Added "encoding" and "errors" optional arguments,
allowing the caller to determine the decoding of percent-encoded octets.
As per RFC 3986, default is "utf-8" (previously implicitly decoded as
ISO-8859-1).

urllib.parse.quote: Added "encoding" and "errors" optional arguments,
allowing the caller to determine the encoding of non-ASCII characters
before being percent-encoded. Default is "utf-8" (previously characters
in range(128, 256) were encoded as ISO-8859-1, and characters above that
as UTF-8). Also characters above 128 are no longer allowed to be "safe".

Doc/library/urllib.parse.rst: Updated docs on quote and unquote to
reflect new interface.

Lib/test/test_urllib.py: Added several new test cases testing encoding
and decoding Unicode strings with various encodings. This includes
updating one test case to now expect UTF-8 by default.

Lib/test/test_http_cookiejar.py, Lib/test/test_cgi.py,
Lib/test/test_wsgiref.py: Updated and added test cases to deal with
UTF-8-encoded URIs.

Lib/email/utils.py: Calls urllib.parse.quote and urllib.parse.unquote
with encoding="latin-1", to preserve existing 

[issue3349] search for 'patch' produces roundup error

2008-07-12 Thread Martin v. Löwis

Martin v. Löwis <[EMAIL PROTECTED]> added the comment:

lower-right -> lower-left (i.e. labelled "Report Tracker Problem")




[issue3349] search for 'patch' produces roundup error

2008-07-12 Thread Martin v. Löwis

Martin v. Löwis <[EMAIL PROTECTED]> added the comment:

Can you please re-report this at the tracker linked in the lower-right
corner of this page, i.e.

http://psf.upfronthosting.co.za/roundup/meta

Closing it here.

--
nosy: +loewis
resolution:  -> invalid
status: open -> closed




[issue3349] search for 'patch' produces roundup error

2008-07-12 Thread anatoly techtonik

New submission from anatoly techtonik <[EMAIL PROTECTED]>:

If you try to search for the word 'patch' in this bug tracker, you will
likely get the error saved in the attached file.

--
files: bugs.python.org.txt
messages: 69588
nosy: techtonik
severity: normal
status: open
title: search for 'patch' produces roundup error
Added file: http://bugs.python.org/file10887/bugs.python.org.txt




[issue3348] Cannot start wsgiref simple server in Py3k

2008-07-12 Thread Matt Giuca

New submission from Matt Giuca <[EMAIL PROTECTED]>:

The wsgiref "simple server" module has a demo server, which fails to
start in Python 3.0 for a bunch of reasons.

To verify this, just go into the Lib/wsgiref directory, and run:
python3.0 ./simple_server.py
(which launches the demo server).

This opens your web browser and points it at the server, and you get the
following error:

ValueError: need more than 1 value to unpack

I fixed a number of issues which simply killed the server:

* In get_environ, it did not iterate over the headers mapping properly
at all (it was expecting a sequence of strings, but it is actually a
mapping). I think the email.message.Message class changed. Fixed.
* In demo_app, it calls sort on the output of dict.items(), which is a
list in Python 2 but a view object in Python 3, so it fails. Fixed (using
"sorted").
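
The second fix can be sketched as follows. The environ dictionary is a
made-up stand-in for the WSGI environment that demo_app prints.

```python
# Hypothetical stand-in for the WSGI environ dictionary.
environ = {"SERVER_NAME": "localhost", "PATH_INFO": "/", "REQUEST_METHOD": "GET"}

items = environ.items()

# In Python 3 the result of items() has no sort() method ...
has_sort = hasattr(items, "sort")

# ... so the builtin sorted() is used to get an ordered list instead.
lines = ["%s = %r" % (key, value) for key, value in sorted(items)]
```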

Unfortunately, the final issue is a bit harder to fix. It seems when I
run the demo server, it opens a binary stream, but handlers.py sends
strings to be written, giving the error

TypeError: send() argument 1 must be bytes or read-only buffer, not str

However in the test case, it opens a text stream, so handlers.py works fine.
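
The str versus bytes mismatch can be reproduced without a socket. This
sketch uses io.BytesIO as a stand-in for the binary stream the demo
server opens; the header text is only an example.

```python
import io

binary_stream = io.BytesIO()

# Writing str to a binary stream is rejected in Python 3.
try:
    binary_stream.write("Status: 200 OK\r\n")
    str_write_failed = False
except TypeError:
    str_write_failed = True

# Encoding first produces bytes, which the binary stream accepts.
binary_stream.write("Status: 200 OK\r\n".encode("utf-8"))
```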

The following *HACK* fixes it so the demo server works, but breaks the
test suite (it is NOT included in the attached patch):

--- Lib/wsgiref/handlers.py (revision 64895)
+++ Lib/wsgiref/handlers.py (working copy)
@@ -382,8 +382,8 @@
         self.environ.update(self.base_env)
 
     def _write(self,data):
-        self.stdout.write(data)
-        self._write = self.stdout.write
+        self.stdout.write(data.encode('utf-8'))
+        #self._write = self.stdout.write
 
I can't figure out right away what to do about this, but the best
solution would be to get the demo server to open the socket in text mode.

In any case, the patch is attached for branch /branches/py3k, revision
64895.

Commit log:

* Lib/wsgiref/simple_server.py: Fixed two fatal errors which prevent the
demo server from running (broken due to Python 3.0).
Note: Demo server may still not run due to an issue between strings and
bytes.

--
components: Library (Lib)
files: simple_server.py.patch
keywords: patch
messages: 69587
nosy: mgiuca
severity: normal
status: open
title: Cannot start wsgiref simple server in Py3k
type: behavior
versions: Python 3.0
Added file: http://bugs.python.org/file10886/simple_server.py.patch




[issue3347] urllib.robotparser doesn't work in Py3k

2008-07-12 Thread Matt Giuca

New submission from Matt Giuca <[EMAIL PROTECTED]>:

urllib.robotparser is broken in Python 3.0, due to a bytes object
appearing where a str is expected.

Example:

>>> import urllib.robotparser
>>> r = urllib.robotparser.RobotFileParser('http://www.python.org/robots.txt')
>>> r.read()
TypeError: expected an object with the buffer interface

This is because the variable f in RobotFileParser.read is opened by
urlopen as a binary file, so f.read() returns a bytes object.

I've included a patch, which checks if it's a bytes, and if so, decodes
it with 'utf-8'. A more thorough fix might figure out what the charset
of the document is (in f.headers['Content-Type']), but at least this
works, and will be sufficient in almost all cases.
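
A sketch of the more thorough variant mentioned above, which takes the
charset from a Content-Type header value and falls back to UTF-8. The
function decode_body and its parameters are hypothetical and are not
taken from the attached patch; Message.get_content_charset() is the
standard library call used to parse the header.

```python
from email.message import Message


def decode_body(raw, content_type=None, default="utf-8"):
    """Return raw as str, decoding bytes with the declared charset."""
    if isinstance(raw, str):
        return raw  # already text, nothing to do
    msg = Message()
    if content_type:
        msg["Content-Type"] = content_type
    # get_content_charset() returns None when no charset is declared.
    charset = msg.get_content_charset() or default
    return raw.decode(charset)
```

For example, decode_body(b"caf\xe9", "text/plain; charset=ISO-8859-1")
yields the text "caf\u00e9", while a body with no declared charset is
decoded as UTF-8.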

Also there are no test cases for urllib.robotparser.

Patch (robotparser.py.patch) is for branch /branches/py3k, revision 64891.

Commit log:

Lib/urllib/robotparser.py: Fixed robotparser for Python 3.0. urlopen
returns bytes objects where str expected. Decode the bytes using UTF-8.

--
components: Library (Lib)
files: robotparser.py.patch
keywords: patch
messages: 69586
nosy: mgiuca
severity: normal
status: open
title: urllib.robotparser doesn't work in Py3k
type: behavior
versions: Python 3.0
Added file: http://bugs.python.org/file10885/robotparser.py.patch




[issue3338] cPickle segfault with deep recursion

2008-07-12 Thread Darryl Dixon

Darryl Dixon <[EMAIL PROTECTED]> added the comment:

Happens with Python 2.5.2 on 64bit also:

Python 2.5.2 (r252:60911, Apr 21 2008, 11:17:30) 
[GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import platform
>>> platform.architecture()
('64bit', '')
>>> from cPickle import Pickler
>>> class rec:
...   child = None
...   def __init__(self, counter):
...     if counter > 0:
...       self.child = rec(counter-1)
... 
>>> import sys
>>> sys.setrecursionlimit(1)
>>> mychain = rec(2600)
>>> from cStringIO import StringIO
>>> stream = StringIO()
>>> p = Pickler(stream, 1)
>>> res = p.dump(mychain)
Segmentation fault

--
versions: +Python 2.5




[issue3008] Let bin/oct/hex show floats

2008-07-12 Thread Mark Dickinson

Mark Dickinson <[EMAIL PROTECTED]> added the comment:

Some final tinkering:

 - docstrings and docs expanded slightly;  docs mention interoperability
with C and Java.

 - in float.hex(), there's always a sign included in the exponent (e.g. 
"0x1p+0" instead of "0x1p0").  This just makes for a little bit more 
consistency with repr(float), with C99 and with the way the Decimal module 
behaves (but not with Java, which omits the + sign).
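
For illustration, this is how the signed exponent looks in a Python
release that includes float.hex(); the variable names are arbitrary.

```python
one = (1.0).hex()    # exponent zero is written with an explicit '+'
half = (0.5).hex()   # negative exponents carry '-' as usual
big = repr(1e16)     # repr(float) also signs its decimal exponent
```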

Added file: http://bugs.python.org/file10884/hex_float8.patch




[issue3300] urllib.quote and unquote - Unicode issues

2008-07-12 Thread Matt Giuca

Matt Giuca <[EMAIL PROTECTED]> added the comment:

OK I spent awhile writing test cases for quote and unquote, encoding and
decoding various Unicode strings with different encodings. As a result,
I found a bunch of issues in my previous patch, so I've rewritten the
patches to both quote and unquote. They're both actually more similar to
the original version now.

I'd be interested in hearing if anyone disagrees with my expected output
for these test cases.

I'm now confident I have good test coverage directly on the quote and
unquote functions. However, I haven't tested the other library functions
which depend upon them (though the entire test suite passes). Though as
I showed in that big post I made yesterday, other modules such as cgi
seem to be working fine (their behaviour has changed; they use UTF-8
now; but that's the whole point of this patch).

I still haven't figured out what the behaviour of "safe" should be in
quote. Should it only allow ASCII characters (thereby limiting the
output to an ASCII string, as specified by RFC 3986)? Should it also
allow Latin-1 characters, or all Unicode characters as well (perhaps
allowing you to create IRIs -- admittedly I don't know much about IRIs).
The new implementation of quote makes it rather difficult to allow
non-Latin-1 characters to be made "safe", as it encodes the string into
bytes before any processing.

Patch (parse.py.patch4) is for branch /branches/py3k, revision 64891.

Commit log:

urllib.parse.unquote: Added "encoding" and "errors" optional arguments,
allowing the caller to determine the decoding of percent-encoded octets.
As per RFC 3986, default is "utf-8" (previously implicitly decoded as
ISO-8859-1).

urllib.parse.quote: Added "encoding" and "errors" optional arguments,
allowing the caller to determine the encoding of non-ASCII characters
before being percent-encoded. Default is "utf-8" (previously characters
in range(128, 256) were encoded as ISO-8859-1, and characters above that
as UTF-8). Also characters above 128 are no longer allowed to be "safe".

Doc/library/urllib.parse.rst: Updated docs on quote and unquote to
reflect new interface.

Lib/test/test_urllib.py: Added several new test cases testing encoding
and decoding Unicode strings with various encodings. This includes
updating one test case to now expect UTF-8 by default.

Lib/test/test_http_cookiejar.py: Updated test case which expected output
in ISO-8859-1, now expects UTF-8.

Lib/email/utils.py: Calls urllib.parse.quote and urllib.parse.unquote
with encoding="latin-1", to preserve existing behaviour (which the whole
email module is dependent upon).
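
A short sketch of the interface described in this commit log, assuming
the patch is applied (current Python 3 releases accept the same
arguments). The sample character is arbitrary.

```python
from urllib.parse import quote, unquote

e_acute = "\u00e9"

utf8_quoted = quote(e_acute)                         # default encoding
latin1_quoted = quote(e_acute, encoding="latin-1")   # previous behaviour

utf8_unquoted = unquote("%C3%A9")
latin1_unquoted = unquote("%E9", encoding="latin-1")

# %E9 alone is not valid UTF-8; the default errors="replace" substitutes
# U+FFFD instead of raising.
replaced = unquote("%E9")
```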

Added file: http://bugs.python.org/file10883/parse.py.patch4




[issue3346] test_ossaudiodev fails

2008-07-12 Thread Ismail Donmez

New submission from Ismail Donmez <[EMAIL PROTECTED]>:

This is a rather new regression:

test_ossaudiodev
test test_ossaudiodev failed -- Traceback (most recent call last):
  File "/home/cartman/Sources/py3k/Lib/test/test_ossaudiodev.py", line 146, in test_playback
    self.play_sound_file(*sound_info)
  File "/home/cartman/Sources/py3k/Lib/test/test_ossaudiodev.py", line 66, in play_sound_file
    setattr(dsp, attr, 42)
AttributeError: 'ossaudiodev.oss_audio_device' object has no attribute 'closed'

--
components: Tests
messages: 69582
nosy: cartman
severity: normal
status: open
title: test_ossaudiodev fails
versions: Python 3.0




[issue3297] Python interpreter uses Unicode surrogate pairs only before the pyc is created

2008-07-12 Thread Marc-Andre Lemburg

Marc-Andre Lemburg <[EMAIL PROTECTED]> added the comment:

Adam, I do know what I'm talking about: I was the lead designer of the
Unicode integration you find in Python and implemented most of it.

What you see as repr() of a Unicode object is the result of applying a
codec to the internal representation. Please don't confuse the output of
the codec ("unicode-escape") with the internal representation.

That said, Ezio did uncover a bug and we need to find the cause. It's
likely caused by the fact that the UTF-8 codec does not recombine
surrogates on UCS4 builds. See this comment in the codec implementation:

    case 3:
        if ((s[1] & 0xc0) != 0x80 ||
            (s[2] & 0xc0) != 0x80) {
            errmsg = "invalid data";
            startinpos = s-starts;
            endinpos = startinpos+3;
            goto utf8Error;
        }
        ch = ((s[0] & 0x0f) << 12) + ((s[1] & 0x3f) << 6) + (s[2] & 0x3f);
        if (ch < 0x0800) {
            /* Note: UTF-8 encodings of surrogates are considered
               legal UTF-8 sequences;

               XXX For wide builds (UCS-4) we should probably try
               to recombine the surrogates into a single code
               unit.
            */
            errmsg = "illegal encoding";
            startinpos = s-starts;
            endinpos = startinpos+3;
            goto utf8Error;
        }
        else
            *p++ = (Py_UNICODE)ch;
        break;
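
The bit arithmetic of this branch can be mirrored in Python to see why an
encoded surrogate passes through unchanged. This is a sketch under the
assumption that it follows the C excerpt above; decode_three_bytes is a
hypothetical helper, not a CPython function.

```python
def decode_three_bytes(seq):
    """Decode one three-byte UTF-8 sequence the way the C branch does."""
    b0, b1, b2 = seq
    # Both trailing bytes must look like 10xxxxxx.
    if (b1 & 0xC0) != 0x80 or (b2 & 0xC0) != 0x80:
        raise ValueError("invalid data")
    ch = ((b0 & 0x0F) << 12) + ((b1 & 0x3F) << 6) + (b2 & 0x3F)
    # Only overlong forms are rejected; surrogates are not checked.
    if ch < 0x0800:
        raise ValueError("illegal encoding")
    return ch


euro = decode_three_bytes(b"\xe2\x82\xac")
surrogate = decode_three_bytes(b"\xed\xa0\x80")
```

Because the only range check is ch < 0x0800, the bytes ED A0 80 decode to
the code unit 0xD800 rather than being rejected or recombined.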




[issue3008] Let bin/oct/hex show floats

2008-07-12 Thread Mark Dickinson

Mark Dickinson <[EMAIL PROTECTED]> added the comment:

In the spirit of being "liberal in what you accept, but strict in what you 
emit", here's a version that makes both the leading '0x' and the trailing 
'p...' exponent optional on input.  Both of these are still produced on 
output.

Note that this version is still perfectly interoperable with C99 and Java 
1.5+:  fromhex accepts anything produced by C and Java (e.g. via C's '%a', 
or Java's toHexString), and the output of hex can be read by C99's 
strtod/sscanf and Java's Double constructor, and/or used as hex literals 
in C or Java source.
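
A short illustration of the liberal input handling, assuming a Python
release that includes float.fromhex() with this behaviour; the sample
values are arbitrary.

```python
with_both = float.fromhex("0x1.8p+1")   # prefix and exponent present
no_prefix = float.fromhex("1.8p+1")     # leading '0x' omitted
no_exponent = float.fromhex("0x1.8")    # trailing 'p...' omitted
bare = float.fromhex("45")              # neither present

# Output always carries both parts, so it reads back exactly.
x = 0.1
round_trip = float.fromhex(x.hex())
```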

Added file: http://bugs.python.org/file10882/hex_float7.patch




[issue3008] Let bin/oct/hex show floats

2008-07-12 Thread Mark Dickinson

Mark Dickinson <[EMAIL PROTECTED]> added the comment:

Here's an updated patch that makes the trailing 'p123' exponent optional 
in fromhex.  (This matches the behaviour of C99's strtod and sscanf;  in 
contrast, Java always requires the exponent.)

I'm beginning to wonder whether the '0x' shouldn't also be optional on 
input as well, in the same way that it's optional in int():

>>> int('0x45', 16)
69
>>> int('45', 16)
69

This would then allow, e.g.,

>>> float.fromhex('45')
69.0

Added file: http://bugs.python.org/file10881/hex_float6.patch
