[issue5950] Make zimport work with zipfile containing comments

2010-07-09 Thread Dmitry

Dmitry  added the comment:

I'm talking about the internal zipimport mechanism (see the attached testcase), i.e.

import sys
sys.path.insert(0, 'test.zip')

import test
test.testme()

doesn't work if test.zip contains a comment.

--

___
Python tracker 
<http://bugs.python.org/issue5950>
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue5950] zimport doesn't work with zipfile containing comments

2009-05-06 Thread Dmitry

New submission from Dmitry :

Synopsis:
zipimport is not able to import a module from a zipfile if the zipfile
contains a comment.
Versions:
This is Zip 2.32 (June 19th 2006), by Info-ZIP
Python 2.5.2 or 2.6.2

Steps to reproduce:
Create a module, and create an app that imports this module.
Zip the module and make sure the app still works.
Run: echo "Some comments" | zip -z module.zip
The app stops working.
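
The steps above can also be scripted without the zip command-line tool. The following is a minimal sketch in Python 3, not the attached testcase: the module name zcdemo and the helper names are hypothetical, and whether the import succeeds depends on the interpreter version in use.

```python
import os
import sys
import tempfile
import zipfile


def build_commented_zip(path, comment=b"Some comments"):
    """Write a one-module archive and attach an archive-level comment."""
    with zipfile.ZipFile(path, "w") as zf:
        # "zcdemo" is a made-up module name used only for this sketch.
        zf.writestr("zcdemo.py", "def testme():\n    return 'ok'\n")
        zf.comment = comment
    return path


def try_import_from(path, module_name="zcdemo"):
    """Return True if the module can be imported from the archive."""
    sys.path.insert(0, path)
    try:
        __import__(module_name)
        return True
    except ImportError:  # zipimport.ZipImportError is a subclass
        return False
    finally:
        sys.path.remove(path)
        sys.modules.pop(module_name, None)


archive = build_commented_zip(os.path.join(tempfile.mkdtemp(), "module.zip"))
with zipfile.ZipFile(archive) as zf:
    stored_comment = zf.comment
print(stored_comment, try_import_from(archive))
```

On an affected interpreter the second value printed would be False; on one where the problem is fixed it would be True.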

--
components: Interpreter Core
files: testcase.zip
messages: 87340
nosy: dsamersoff
severity: normal
status: open
title: zimport doesn't work with zipfile containing comments
versions: Python 2.5, Python 2.6
Added file: http://bugs.python.org/file13905/testcase.zip

___
Python tracker 
<http://bugs.python.org/issue5950>
___



[issue8123] TypeError in urllib when trying to use HTTP authentication

2010-03-12 Thread Dmitry Jemerov

New submission from Dmitry Jemerov :

I'm trying to download a file from a site using HTTP authentication. I'm 
subclassing FancyURLOpener, returning my credentials from the 
prompt_user_passwd() method, and using opener.retrieve() to download the file. 
I get the following error:

  File "C:/JetBrains/IDEA/build/eap/downandup.py", line 36, in download
    opener.retrieve(url, os.path.join(target_path, name))
  File "C:\Python31\lib\urllib\request.py", line 1467, in retrieve
    fp = self.open(url, data)
  File "C:\Python31\lib\urllib\request.py", line 1435, in open
    return getattr(self, name)(url)
  File "C:\Python31\lib\urllib\request.py", line 1609, in open_http
    return self._open_generic_http(http.client.HTTPConnection, url, data)
  File "C:\Python31\lib\urllib\request.py", line 1605, in _open_generic_http
    response.status, response.reason, response.msg, data)
  File "C:\Python31\lib\urllib\request.py", line 1621, in http_error
    result = method(url, fp, errcode, errmsg, headers)
  File "C:\Python31\lib\urllib\request.py", line 1859, in http_error_401
    return getattr(self,name)(url, realm)
  File "C:\Python31\lib\urllib\request.py", line 1931, in retry_http_basic_auth
    return self.open(newurl)
  File "C:\Python31\lib\urllib\request.py", line 1435, in open
    return getattr(self, name)(url)
  File "C:\Python31\lib\urllib\request.py", line 1609, in open_http
    return self._open_generic_http(http.client.HTTPConnection, url, data)
  File "C:\Python31\lib\urllib\request.py", line 1571, in _open_generic_http
    auth = base64.b64encode(user_passwd).strip()
  File "C:\Python31\lib\base64.py", line 56, in b64encode
    raise TypeError("expected bytes, not %s" % s.__class__.__name__)
TypeError: expected bytes, not str

The problem happens because _open_generic_http extracts the user password from 
the string URL, and passes the string to the b64encode method, which only 
accepts bytes and not strings. The problem happens with Python 3.1.1 for me, 
but as far as I can see it's still not fixed in the py3k branch as of now.
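
The str/bytes mismatch described above can be shown in isolation. This is a minimal sketch, not the urllib code and not the attached patch: the helper name basic_auth_token is hypothetical, and Latin-1 is assumed as the encoding for the credentials.

```python
import base64


def basic_auth_token(user_passwd):
    """Encode a 'user:password' str for an HTTP Basic Authorization header."""
    # b64encode only accepts bytes-like input, so encode the str first.
    raw = user_passwd.encode("latin-1")  # encoding choice is an assumption
    return base64.b64encode(raw).decode("ascii")


try:
    base64.b64encode("user:pass")  # passing a str, as in the traceback
    str_rejected = False
except TypeError:
    str_rejected = True

token = basic_auth_token("user:pass")
print(str_rejected, token)
```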

--
components: Library (Lib)
messages: 100938
nosy: Dmitry.Jemerov
severity: normal
status: open
title: TypeError in urllib when trying to use HTTP authentication
versions: Python 3.1

___
Python tracker 
<http://bugs.python.org/issue8123>
___



[issue8123] TypeError in urllib when trying to use HTTP authentication

2010-03-13 Thread Dmitry Jemerov

Dmitry Jemerov  added the comment:

from urllib.request import *

opener = FancyURLopener()
opener.retrieve("http://username:passw...@google.com/index.html", "index.html")

--

___
Python tracker 
<http://bugs.python.org/issue8123>
___



[issue8228] pprint, single/multiple items per line parameter

2010-03-24 Thread Dmitry Chichkov

New submission from Dmitry Chichkov :

I've run into a case where pprint isn't really pretty. 

import pprint
pprint.PrettyPrinter().pprint([1]*100)

Prints a lengthy column of '1'; Not pretty at all. Look:

[1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1,
 1]
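
For comparison, later Python 3 releases (3.4 onward) give pprint a compact parameter that packs as many items as fit on each line. A minimal sketch of the difference, written for Python 3 rather than the 2.6 targeted by this report:

```python
import pprint

items = [1] * 100

# Default behaviour: the list does not fit in the width, so one item per line.
tall = pprint.pformat(items)

# compact=True fits several items on each line within the same width.
compact = pprint.pformat(items, compact=True)

print(len(tall.splitlines()), len(compact.splitlines()))
```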

--
components: Library (Lib)
messages: 101672
nosy: Dmitry.Chichkov
severity: normal
status: open
title: pprint, single/multiple items per line parameter
type: feature request
versions: Python 2.6

___
Python tracker 
<http://bugs.python.org/issue8228>
___



[issue8228] pprint, single/multiple items per line parameter

2010-03-25 Thread Dmitry Chichkov

Dmitry Chichkov  added the comment:

Quick, dirty and utterly incorrect patch that works for me. Includes 
issue_5131.patch (defaultdict support, etc). Targets trunk (2.6), revision 
77310.

--
keywords: +patch
Added file: http://bugs.python.org/file16640/issue_8228.dirty.patch

___
Python tracker 
<http://bugs.python.org/issue8228>
___



[issue8228] pprint, single/multiple items per line parameter

2010-04-01 Thread Dmitry Chichkov

Dmitry Chichkov  added the comment:

Yes. This patch is nowhere near production level. Unfortunately it works 
for me, and at the moment I don't have time to improve it further. The current 
version doesn't check the item's width upfront, so there is definitely room for 
improvement. 

There is also an issue with the included issue_5131.patch: pretty-printed code 
cannot be executed unless the 'defaultdict(' ')' type-specs are removed.

--

___
Python tracker 
<http://bugs.python.org/issue8228>
___



[issue4171] SSL handshake fails after TCP connection in getpeername()

2010-04-25 Thread Dmitry Dvoinikov

Dmitry Dvoinikov  added the comment:

The problem does not reproduce in either 3.1.1 or 3.1.2
(either x86 or x64).

Antoine Pitrou wrote:
> Antoine Pitrou  added the comment:
> 
> What happens if you remove the call to settimeout()?
> Also, it would be nice if you could try with the latest py3k checkout. 
> There's a couple of fixes for do_handshake there (including timeout issues).
> 
> --
> nosy: +pitrou
> priority:  -> normal
> versions: +Python 3.1, Python 3.2 -Python 3.0
> 
> ___
> Python tracker 
> <http://bugs.python.org/issue4171>
> ___
>

--

___
Python tracker 
<http://bugs.python.org/issue4171>
___



[issue8583] Hardcoded namespace_separator in the cElementTree.XMLParser

2010-04-30 Thread Dmitry Chichkov

New submission from Dmitry Chichkov :

The namespace_separator parameter is hard-coded in the cElementTree.XMLParser 
class, which rules out the option of ignoring XML namespaces with the 
cElementTree library.

Here's the code example:
 from xml.etree.cElementTree import iterparse
 from StringIO import StringIO
 xml = """<root xmlns="http://www.very_long_url.com"><child/></root>"""
 for event, elem in iterparse(StringIO(xml)): print event, elem

It produces:
 end <Element '{http://www.very_long_url.com}child' at 0xb7ddfa58>
 end <Element '{http://www.very_long_url.com}root' at 0xb7ddfa40>

In the current implementation local tags get forcibly concatenated with URIs, 
often resulting in ugly code on the user's side and in performance degradation 
(at least due to the extra concatenations and the extra lengthy compare 
operations in the element-matching code).
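
To illustrate the kind of user-side code this refers to, here is a minimal Python 3 sketch that strips the URI from every tag after parsing. It uses xml.etree.ElementTree rather than cElementTree, the sample document and the helper name local_name are assumptions, and it is a workaround rather than the proposed patch.

```python
import io
import xml.etree.ElementTree as ET

XML = b'<root xmlns="http://www.very_long_url.com"><child>text</child></root>'


def local_name(tag):
    """Drop the '{uri}' prefix that the parser glues onto every tag."""
    return tag.rsplit("}", 1)[-1]


# Each element still carries the full '{uri}name' string; the extra string
# work happens here, in user code, for every element visited.
tags = [local_name(elem.tag) for _event, elem in ET.iterparse(io.BytesIO(XML))]
print(tags)
```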

Internally cElementTree uses the EXPAT parser, which does namespace processing 
only optionally; it is enabled by providing a value for the namespace_separator 
argument. This argument is hard-coded in cElementTree: 
 self->parser = EXPAT(ParserCreate_MM)(encoding, &memory_handler, "}");

Well, attached is a patch exposing this parameter in the 
cElementTree.XMLParser() arguments. This parameter is optional and the default 
behavior should be unchanged.  Here's the test code:

import cElementTree

x = """<root xmlns="http://www.very_long_url.com"><child>text</child></root>"""

parser = cElementTree.XMLParser()
parser.feed(x)
elem = parser.close()
print elem

parser = cElementTree.XMLParser(namespace_separator="}")
parser.feed(x)
elem = parser.close()
print elem

parser = cElementTree.XMLParser(namespace_separator=None)
parser.feed(x)
elem = parser.close()
print elem

The resulting output:
<Element '{http://www.very_long_url.com}root' at 0xb7e885f0>
<Element '{http://www.very_long_url.com}root' at 0xb7e88608>


--
components: Library (Lib)
messages: 104671
nosy: dmtr
priority: normal
severity: normal
status: open
title: Hardcoded namespace_separator in the cElementTree.XMLParser
type: performance
versions: Python 2.5, Python 2.6, Python 2.7

___
Python tracker 
<http://bugs.python.org/issue8583>
___



[issue8583] Hardcoded namespace_separator in the cElementTree.XMLParser

2010-04-30 Thread Dmitry Chichkov

Changes by Dmitry Chichkov :


--
keywords: +patch
Added file: http://bugs.python.org/file17153/issue-8583.patch

___
Python tracker 
<http://bugs.python.org/issue8583>
___



[issue8583] Hardcoded namespace_separator in the cElementTree.XMLParser

2010-04-30 Thread Dmitry Chichkov

Dmitry Chichkov  added the comment:

And obviously iterparse can be either overridden in the local user code or 
patched in the library. Here's the iterparse code/test code:

import cElementTree
from cStringIO import StringIO

class iterparse(object):
    root = None
    def __init__(self, file, events=None, namespace_separator="}"):
        if not hasattr(file, 'read'):
            file = open(file, 'rb')
        self._file = file
        self._events = events
        self._namespace_separator = namespace_separator
    def __iter__(self):
        events = []
        b = cElementTree.TreeBuilder()
        p = cElementTree.XMLParser(b, namespace_separator=self._namespace_separator)
        p._setevents(events, self._events)
        # Feed the file in chunks and yield events as they become available
        while 1:
            data = self._file.read(16384)
            if not data:
                break
            p.feed(data)
            for event in events:
                yield event
            del events[:]
        root = p.close()
        for event in events:
            yield event
        self.root = root


x = """<root xmlns="http://www.very_long_url.com"><child>text</child></root>"""
context = iterparse(StringIO(x), events=("start", "end", "start-ns"))
for event, elem in context: print event, elem

context = iterparse(StringIO(x), events=("start", "end", "start-ns"),
                    namespace_separator=None)
for event, elem in context: print event, elem


It produces:
start-ns ('', 'http://www.very_long_url.com')
start <Element '{http://www.very_long_url.com}root' at 0xb7ccf650>
start <Element '{http://www.very_long_url.com}child' at 0xb7ccf5a8>
end <Element '{http://www.very_long_url.com}child' at 0xb7ccf5a8>
end <Element '{http://www.very_long_url.com}root' at 0xb7ccf650>
start 
start 
end 
end 

Note the absence of URIs and the ignored start-ns events in the 
'namespace_separator = None' version.

--

___
Python tracker 
<http://bugs.python.org/issue8583>
___



[issue8583] Hardcoded namespace_separator in the cElementTree.XMLParser

2010-05-01 Thread Dmitry Chichkov

Dmitry Chichkov  added the comment:

This patch does not modify the existing behavior of the library. The 
namespace_separator parameter is optional. The parameter already exists in the 
EXPAT library, but it is hard-coded in the cElementTree.XMLParser code.

Fredrik, yes, namespaces are a fundamental part of the XML information model. 
Yet an option of having them ignored is a very valuable one in the performance 
critical code.

--

___
Python tracker 
<http://bugs.python.org/issue8583>
___



[issue8583] Hardcoded namespace_separator in the cElementTree.XMLParser

2010-05-02 Thread Dmitry Chichkov

Dmitry Chichkov  added the comment:

I agree that the argument name choice is poor. But it has already been made by 
whoever coded the EXPAT parser which cElementTree.XMLParser wraps. So there is 
not much room here.

As to 'proposed feature have to be used with great care by users' - this is 
already taken care of. If you look, the cElementTree.XMLParser class is a rather 
obscure one. As I understand it, it is only being used by users requiring 
high-performance XML parsing for large datasets (10GB - 10TB range) in 
data-mining applications.

--

___
Python tracker 
<http://bugs.python.org/issue8583>
___



[issue8583] Hardcoded namespace_separator in the cElementTree.XMLParser

2010-05-02 Thread Dmitry Chichkov

Dmitry Chichkov  added the comment:

Interestingly, in precisely these applications you often don't care about 
namespaces at all. Often all you need is to extract 'text' or 'name' elements 
regardless of the namespace.

--

___
Python tracker 
<http://bugs.python.org/issue8583>
___



[issue4171] SSL handshake fails after TCP connection in getpeername()

2010-05-07 Thread Dmitry Dvoinikov

Dmitry Dvoinikov  added the comment:

Well, I'm sorry to bring this up again, but the problem persists
with Python 3.1.2 (x86, Windows XP). The difference with the
test script behaviour is that now it doesn't break every time.
Perhaps this is the reason I said the problem was gone.
In fact, now that I run the aforementioned script I may get

worked so far
but not here it didn't

and some other time I may get

worked so far
Traceback (most recent call last):
  File "test.py", line 23, in <module>
    test_handshake(address, False)
  File "test.py", line 17, in test_handshake
    ssl.do_handshake()
  File "C:\Python31\lib\ssl.py", line 327, in do_handshake
    self._sslobj.do_handshake()
AttributeError: 'NoneType' object has no attribute 'do_handshake'

and the outcome is unpredictable. It may work many times in a row
and it may break many times in a row.

If this is of any relevance, I've had pywin32-2.14 installed since.

--
status: closed -> open

___
Python tracker 
<http://bugs.python.org/issue4171>
___



[issue4171] SSL handshake fails after TCP connection in getpeername()

2010-05-07 Thread Dmitry Dvoinikov

Dmitry Dvoinikov  added the comment:

Checked out and built revision 80956 of py3k against OpenSSL 0.9.8n. Here is 
the banner:

Python 3.2a0 (py3k:80956, May  8 2010, 11:31:45) [MSC v.1500 32 bit (Intel)] on 
win32

Now the breaking script appears not to be breaking any more: I tried it in a 
loop, and 1000 attempts to execute it were all successful. 

It seems to be fine now, thank you for your help.

--

___
Python tracker 
<http://bugs.python.org/issue4171>
___



[issue9291] mimetypes initialization fails on Windows because of non-Latin characters in registry

2010-07-18 Thread Dmitry Jemerov

New submission from Dmitry Jemerov :

On Windows, mimetypes initialization reads the list of MIME types from the 
Windows registry. It assumes that all characters are Latin-1 encoded, and fails 
when it's not the case, with the following exception:

Traceback (most recent call last):
  File "mttest.py", line 3, in <module>
    mimetypes.init()
  File "c:\Python27\lib\mimetypes.py", line 355, in init
    db.read_windows_registry()
  File "c:\Python27\lib\mimetypes.py", line 260, in read_windows_registry
    for ctype in enum_types(mimedb):
  File "c:\Python27\lib\mimetypes.py", line 250, in enum_types
    ctype = ctype.encode(default_encoding) # omit in 3.x!
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe0 in position 0: ordinal not in range(128)

This can be reproduced, for example, on a Russian Windows XP installation which 
has QuickTime installed (QuickTime creates the non-Latin entries in the 
registry). The following line causes the exception to happen:

import mimetypes; mimetypes.init()
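
The failure mode can be sketched outside of mimetypes. The example below is written for Python 3 and is not the mimetypes code: the helper name decode_registry_value is hypothetical, and the cp1251 fallback is an assumption based on the Russian Windows installation mentioned above.

```python
raw = b"\xe0"  # the byte reported in the traceback


def decode_registry_value(value, fallback="cp1251"):
    """Decode a byte string, tolerating non-ASCII content."""
    try:
        return value.decode("ascii")
    except UnicodeDecodeError:
        # Assumed code page; undecodable bytes are replaced, not fatal.
        return value.decode(fallback, errors="replace")


try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

decoded = decode_registry_value(raw)
print(ascii_ok, decoded)
```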

--
components: Library (Lib), Windows
messages: 110637
nosy: Dmitry.Jemerov
priority: normal
severity: normal
status: open
title: mimetypes initialization fails on Windows because of non-Latin 
characters in registry
versions: Python 2.7

___
Python tracker 
<http://bugs.python.org/issue9291>
___



[issue9291] mimetypes initialization fails on Windows because of non-Latin characters in registry

2010-07-20 Thread Dmitry Jemerov

Dmitry Jemerov  added the comment:

The problem doesn't happen on Python 3.1.2 because it doesn't have the code in 
mimetypes that accesses the Windows registry. Haven't tried the 3.2 alphas yet.

--

___
Python tracker 
<http://bugs.python.org/issue9291>
___



[issue8123] TypeError in urllib when trying to use HTTP authentication

2010-07-23 Thread Dmitry Jemerov

Dmitry Jemerov  added the comment:

Patch (with unittest) attached.

--
keywords: +patch
Added file: http://bugs.python.org/file18139/8123.patch

___
Python tracker 
<http://bugs.python.org/issue8123>
___



[issue9291] mimetypes initialization fails on Windows because of non-Latin characters in registry

2010-07-23 Thread Dmitry Jemerov

Dmitry Jemerov  added the comment:

Patch (suggested fix and unittest) attached.

--
keywords: +patch
Added file: http://bugs.python.org/file18143/9291.patch

___
Python tracker 
<http://bugs.python.org/issue9291>
___



[issue9291] mimetypes initialization fails on Windows because of non-Latin characters in registry

2010-07-23 Thread Dmitry Jemerov

Dmitry Jemerov  added the comment:

And by the way I've verified that the problem doesn't happen in py3k trunk.

--

___
Python tracker 
<http://bugs.python.org/issue9291>
___



[issue9360] nntplib cleanup

2010-07-23 Thread Dmitry Jemerov

New submission from Dmitry Jemerov :

The patch performs an extensive cleanup of nntplib:
 - Change API methods to return strings instead of bytes. This breaks API 
compatibility, but given that the parameters need to be passed as strings and 
many of the returned values would need to be passed to other API methods, I 
consider the current API to be broken. I've discussed this with Brett at the 
EuroPython sprint, and he agrees.
 - Add tests.
 - Add pending deprecation warnings for xgtitle() and xpath() methods, which 
are not useful in modern environments.
 - Use named tuples for returned values where appropriate.
 - Modernize the implementation a little bit.
 - Clean up the docstrings.

--
components: Library (Lib)
files: nntplib_cleanup.patch
keywords: patch
messages: 111364
nosy: Dmitry.Jemerov
priority: normal
severity: normal
status: open
title: nntplib cleanup
versions: Python 3.2
Added file: http://bugs.python.org/file18159/nntplib_cleanup.patch

___
Python tracker 
<http://bugs.python.org/issue9360>
___



[issue9360] nntplib cleanup

2010-07-23 Thread Dmitry Jemerov

Dmitry Jemerov  added the comment:

This is an issue only for the actual article content, right? I'll be happy to 
extend the API to allow getting the original bytes of the article content, 
while keeping the rest of API (like group names) string-based.

--

___
Python tracker 
<http://bugs.python.org/issue9360>
___



[issue9520] Add Patricia Trie high performance container (python's defaultdict(int) is unusable on datasets with 10, 000, 000+ keys.)

2010-08-04 Thread Dmitry Chichkov

New submission from Dmitry Chichkov :

On large data sets (10-100 million keys) the default python dictionary 
implementation fails to meet memory and performance constraints. It also 
apparently fails to keep O(1) complexity (after just 1M keys). As such, there 
is a need for good, optimized, practical implementation of a high-performance 
container that can store such datasets. 

One of the alternatives is a regular Patricia Trie data structure. It can meet 
performance requirements on such datasets. Criteria:
* strictly O(1);
* works well on large datasets (~100M keys); memory efficient;
* same or better performance as dict();
* supports regular dict | counter interface;
* supports unicode keys;
* supports any characters in the keys;
* persistent (can be pickled); 
* in-memory library;

There are a few existing implementations available:
* BioPython trie - http://biopython.org/
* py-radix - http://www.mindrot.org/projects/py-radix/
* PyPi trie - http://pypi.python.org/pypi/trie

A few other relevant alternatives/implementations:
* NIST trie - http://www.itl.nist.gov/div897/sqg/dads/HTML/patriciatree.html
* C++ Trie Library - http://wikipedia-clustering.speedblue.org/trie.php
* PyAvl trie - http://pypi.python.org/pypi/pyavl/1.12_1
* PyJudy tree - http://www.dalkescientific.com/Python/PyJudy.html
* Redis - http://code.google.com/p/redis/
* Memcached - http://memcached.org/

An alternative to a basic Patricia Trie could be some state-of-the-art approach 
from modern research (e.g. 
http://www.aclweb.org/anthology/W/W09/W09-1505.pdf).


The best existing implementation I've been able to find so far is the one in 
BioPython. I compared it to defaultdict(int) on the task of counting words. 
Dataset: 123,981,712 words (6,504,484 unique), 1..21 characters long:
* Bio.trie - 459 Mb/0.13 Hours, good O(1) behavior
* defaultdict(int) - 693 Mb/0.32 Hours, poor, almost O(N) behavior

At 8,000, keys python defaultdict(int) starts showing almost O(N) behavior 
and gets unusable with 10,000,000+ unique keys. 



A few usage/installation notes on BioPython trie:
$ sudo apt-get install python-biopython

>>> from Bio import trie
>>> trieobj = trie.trie()
>>> trieobj["hello"] = 5
>>> trieobj["hello"] += 1
>>> print trieobj["hello"]
>>> print trieobj.keys()

More examples at:
http://python-biopython.sourcearchive.com/documentation/1.54/test__triefind_8py-source.html
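
For readers unfamiliar with the data structure, here is a minimal pure-Python sketch of a word counter built on a plain (uncompressed) trie. It is not a Patricia trie, not the BioPython implementation, and makes no claim about memory or speed; the class name TrieCounter and its methods are hypothetical.

```python
_END = ""  # marker key; cannot collide with a single-character edge label


class TrieCounter:
    """Count words by walking one nested dict per character."""

    def __init__(self):
        self._root = {}

    def add(self, word, n=1):
        node = self._root
        for ch in word:
            node = node.setdefault(ch, {})
        node[_END] = node.get(_END, 0) + n

    def count(self, word):
        node = self._root
        for ch in word:
            node = node.get(ch)
            if node is None:
                return 0
        return node.get(_END, 0)


counter = TrieCounter()
for w in ["hello", "help", "hello"]:
    counter.add(w)
print(counter.count("hello"), counter.count("help"), counter.count("hel"))
```

Words sharing a prefix share the nodes for that prefix, which is where a trie can save memory on natural-language keys; a Patricia trie additionally merges chains of single-child nodes.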

--
components: Library (Lib)
messages: 112937
nosy: dmtr
priority: normal
severity: normal
status: open
title: Add Patricia Trie high performance container (python's defaultdict(int) 
is unusable on datasets with 10,000,000+ keys.)
type: performance
versions: Python 3.1, Python 3.2, Python 3.3

___
Python tracker 
<http://bugs.python.org/issue9520>
___



[issue9520] Add Patricia Trie high performance container

2010-08-05 Thread Dmitry Chichkov

Dmitry Chichkov  added the comment:

Thank you for your comment. Perhaps we should try to separate this into two issues:
1) Bug. Python's dict() is unusable on datasets with 10,000,000+ keys. Here I 
should provide a solid test case showing a deviation from O(1);

2) Feature request/idea. Add Patricia Trie high performance container.

--

___
Python tracker 
<http://bugs.python.org/issue9520>
___



[issue9520] Add Patricia Trie high performance container

2010-08-05 Thread Dmitry Chichkov

Dmitry Chichkov  added the comment:

No. I'm not simply running out of system memory. 8Gb/x64/linux. And in my test 
cases I've only seen ~25% of memory utilized. And good idea. I'll try to play 
with the cyclic garbage collector.

It is harder than I thought to make a solid synthetic test case addressing that 
issue. The trouble is that you need to be able to generate data (e.g. 
100,000,000 words/5,000,000 unique) with a distribution close to that in a 
real-life scenario (e.g. word lengths, frequencies and uniqueness in English 
text). If somebody has a good idea on how to do it nicely, you'd be very welcome. 

My best shot so far is in the attachment.
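
One possible way to approximate such data, sketched here in Python 3 and unrelated to the attached script, is to draw words from a fixed vocabulary with weights proportional to 1/rank (a Zipf-like distribution). The function name, the vocabulary of "w0", "w1", ... and the choice of distribution are all assumptions.

```python
import random
from collections import Counter


def synthetic_words(n_words, vocab_size, seed=0):
    """Draw n_words from a made-up vocabulary with 1/rank frequencies."""
    rng = random.Random(seed)
    vocab = ["w%d" % i for i in range(vocab_size)]
    weights = [1.0 / rank for rank in range(1, vocab_size + 1)]
    return rng.choices(vocab, weights=weights, k=n_words)


words = synthetic_words(10000, 2000)
counts = Counter(words)
print(len(words), len(counts))
```

This reproduces skewed frequencies but not realistic word lengths, so it is only a partial answer to the question raised above.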

--
Added file: http://bugs.python.org/file18406/dc.dict.bench.0.01.py

___
Python tracker 
<http://bugs.python.org/issue9520>
___



[issue9520] Add Patricia Trie high performance container

2010-08-08 Thread Dmitry Chichkov

Dmitry Chichkov  added the comment:

Yes. Data containers optimized for very large datasets, compactness and strict 
adherence to O(1) can be beneficial. 

Python has great high-performance containers, but there is a certain lack of 
compact ones. For example, on an x64 machine the following dict() mapping 
10,000,000 very short unicode keys (~7 chars) to integers eats 149 bytes per 
entry. 
>>> import os, re
>>> d = dict()
>>> for i in xrange(0, 10000000): d[unicode(i)] = i
>>> print re.findall("(VmPeak.*|VmSize.*)", open('/proc/%d/status' % os.getpid()).read())
['VmPeak:\t 1458324 kB', 'VmSize:\t 1458324 kB']

I can understand that there are all kinds of reasons why it is so and even why 
it is good. But having an unobtrusive *compact* container could be nice 
(although you'd be most welcome if you could tweak default containers, so they 
would adjust to the large datasets appropriately),

Also, I can't emphasize enough that compactness is still important sometimes. 
Modern datasets are getting larger and larger (literally terabytes) and the 
'just add more memory' strategy is not always feasible. 
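
A per-entry figure like the one above can also be estimated from inside the interpreter. The following Python 3 sketch uses tracemalloc instead of reading /proc, so it runs on any platform; the helper name bytes_per_entry is hypothetical, and the numbers it reports will differ from the 2.x figures quoted in this message.

```python
import tracemalloc


def bytes_per_entry(n):
    """Approximate memory used per dict entry for n short string keys."""
    tracemalloc.start()
    d = {str(i): i for i in range(n)}
    current, _peak = tracemalloc.get_traced_memory()  # measured while d is alive
    tracemalloc.stop()
    return current / len(d)


estimate = bytes_per_entry(100000)
print("%.1f bytes per entry" % estimate)
```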


Regarding the dict() violation of O(1). So far I was unable to reproduce it in 
the test. I can certainly see it on the real dataset, and trust me it was very 
annoying to see ETA 10 hours going down to 8 hours and then gradually up again 
to 17 hours and hanging there. This was _solved_ by switching from dict() to 
Bio.trie(). So this problem certainly had something to do with dict(). I don't 
know what is causing it though.

--

___
Python tracker 
<http://bugs.python.org/issue9520>
___



[issue9520] Add Patricia Trie high performance container

2010-08-13 Thread Dmitry Chichkov

Changes by Dmitry Chichkov :


Added file: http://bugs.python.org/file18515/dc.dict.bench.0.02.py

___
Python tracker 
<http://bugs.python.org/issue9520>
___



[issue9520] Add Patricia Trie high performance container

2010-08-14 Thread Dmitry Chichkov

Dmitry Chichkov  added the comment:

Yes, it looks like you are right. And while there is some slight performance 
degradation, at least nothing drastic is happening up to 30M keys. Using your 
modified test:

    1000 words (     961 keys), 3609555 words/s, 19239926 lookups/s, 51 bytes/key (0.0MB)
   10000 words (    9042 keys), 3390980 words/s, 18180771 lookups/s, 87 bytes/key (0.0MB)
  100000 words (   83168 keys), 3298809 words/s, 12509481 lookups/s, 37 bytes/key (3.0MB)
 1000000 words (  755372 keys), 2477793 words/s,  7537963 lookups/s, 66 bytes/key (48.0MB)
 5000000 words ( 3501140 keys), 2291004 words/s,  6487727 lookups/s, 57 bytes/key (192.0MB)
10000000 words ( 6764089 keys), 2238081 words/s,  6216454 lookups/s, 59 bytes/key (384.0MB)
20000000 words (13061821 keys), 2175688 words/s,  5817085 lookups/s, 61 bytes/key (768.0MB)
50000000 words (31188460 keys), 2117751 words/s,  5137209 lookups/s, 51 bytes/key (1536.0MB)

It looks like internal memory estimates (sys.getsizeof()) are also pretty good:

Before: ['VmPeak:\t10199764 kB', 'VmSize:\t 5552416 kB']
50000000 words (31188460 keys), 2117751 words/s, 5137209 lookups/s, 51 bytes/key (1536.0MB)
After: ['VmPeak:\t10199764 kB', 'VmSize:\t 7125284 kB']
 
7 125 284 kB - 5 552 416 kB = 1 536.00391 MB
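
The same arithmetic, spelled out as a quick Python 3 check using only the figures quoted above (the variable names are illustrative):

```python
vm_before_kb = 5552416   # VmSize before building the dict
vm_after_kb = 7125284    # VmSize after building the dict
keys = 31188460          # number of keys in the largest run

delta_kb = vm_after_kb - vm_before_kb
delta_mb = delta_kb / 1024
bytes_per_key = delta_kb * 1024 / keys

print(delta_kb, delta_mb, int(bytes_per_key))
```

The process-level growth and the per-key estimate agree with the 1536.0MB and 51 bytes/key reported by the test.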

--

___
Python tracker 
<http://bugs.python.org/issue9520>
___



[issue8797] urllib2 basicauth broken in 2.6.5: RuntimeError: maximum recursion depth exceeded in cmp

2010-08-25 Thread Dmitry Jemerov

Dmitry Jemerov  added the comment:

I've also run into this problem after upgrading to Python 2.6.6. My code, which 
uses the same HTTPBasicAuthHandler instance for many requests to the same 
server, worked correctly with Python 2.6.2 and broke with 2.6.6. It would be 
great if zenyatta's patch to fix the regression was included in 2.6.7.
Also, unfortunately NEWS.txt doesn't mention this change at all.

--
nosy: +Dmitry.Jemerov

___
Python tracker 
<http://bugs.python.org/issue8797>
___



[issue9892] Event spends less time in wait() than requested

2010-09-18 Thread Dmitry Dvoinikov

New submission from Dmitry Dvoinikov :

If you request Event.wait(x), the call consistently returns in less than x 
seconds.

Sample:
-
from threading import Event
from time import time

e = Event()

before = time()
e.wait(0.1)
after = time()

print(after - before)

# under Python 3.1 prints 0.10...
# under Python 3.2 prints 0.092999...
-
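
When reproducing this, it may help to rule out the granularity of the clock used for the measurement. The sketch below repeats the sample with time.perf_counter(), which exists in later Python 3 releases (3.3 onward) and is intended for measuring short intervals; the helper name measure_wait is hypothetical and this is not a fix for the reported behavior.

```python
import threading
import time


def measure_wait(timeout):
    """Return how long Event.wait(timeout) actually blocked, in seconds."""
    event = threading.Event()  # never set, so wait() should time out
    start = time.perf_counter()
    event.wait(timeout)
    return time.perf_counter() - start


elapsed = measure_wait(0.1)
print("requested 0.1 s, waited %.6f s" % elapsed)
```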

--
components: Library (Lib)
messages: 116772
nosy: ddvoinikov
priority: normal
severity: normal
status: open
title: Event spends less time in wait() than requested
type: behavior
versions: Python 3.2

___
Python tracker 
<http://bugs.python.org/issue9892>
___



[issue9892] Event spends less time in wait() than requested

2010-09-18 Thread Dmitry Dvoinikov

Dmitry Dvoinikov  added the comment:

You are right, sorry. It's Windows XP Prof, Python 3.2a2.

The differences in OS may be the cause, but the problem doesn't appear in 3.1 
on the same machine.

--

___
Python tracker 
<http://bugs.python.org/issue9892>
___



[issue14360] email.encoders.encode_quopri doesn't work with python 3.2

2012-03-18 Thread Dmitry Shachnev

New submission from Dmitry Shachnev :

Currently my /usr/lib/python3.2/email/encoders.py has this code:

def _qencode(s):
    enc = _encodestring(s, quotetabs=True)
    # Must encode spaces, which quopri.encodestring() doesn't do
    return enc.replace(' ', '=20')

The problem is that _encodestring (which is just quopri.encodestring) always 
returns bytes, and calling replace() with str arguments on bytes raises 
"TypeError: expected an object with the buffer interface".

This leads to email.encoders.encode_quopri never working.

So, I think this should be changed to something like this:

    <...>
    return enc.decode().replace(' ', '=20')

Example log:

Python 3.2.3rc1 (default, Mar  9 2012, 23:02:43) 
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import email.encoders
>>> from email.mime.text import MIMEText
>>> msg = MIMEText(b'some text here')
>>> email.encoders.encode_quopri(msg)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.2/email/encoders.py", line 44, in encode_quopri
    encdata = _qencode(orig)
  File "/usr/lib/python3.2/email/encoders.py", line 23, in _qencode
    return enc.replace(' ', '=20')
TypeError: expected an object with the buffer interface

Reproduced on Ubuntu precise with Python 3.2.3rc1. Replacing encode_quopri with 
encode_base64 works fine.
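
The type mismatch can be reproduced without the email package. The sketch
below is not the stdlib code: qencode_bytes and str_replace_on_bytes_fails are
hypothetical helpers. The first keeps everything as bytes, which is one
possible alternative to decoding before the replace; the second only shows
that bytes.replace() rejects str arguments.

```python
import quopri


def qencode_bytes(data):
    """Quoted-printable encode bytes and keep the result as bytes."""
    enc = quopri.encodestring(data, quotetabs=True)
    # bytes.replace() needs bytes arguments on both sides.
    return enc.replace(b' ', b'=20')


def str_replace_on_bytes_fails(data):
    """Return True if calling replace() with str arguments raises TypeError."""
    try:
        data.replace(' ', '=20')
    except TypeError:
        return True
    return False


if __name__ == "__main__":
    print(qencode_bytes(b'some text here'))
    print(str_replace_on_bytes_fails(b'some text here'))
```

Whether the encoder should hand back bytes or str is a separate design
question; the sketch only shows that the two types cannot be mixed in
replace().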

--
components: Library (Lib)
messages: 156238
nosy: barry, mitya57
priority: normal
severity: normal
status: open
title: email.encoders.encode_quopri doesn't work with python 3.2
type: crash
versions: Python 3.2

___
Python tracker 
<http://bugs.python.org/issue14360>
___



[issue14360] email.encoders.encode_quopri doesn't work with python 3.2

2012-03-18 Thread Dmitry Shachnev

Changes by Dmitry Shachnev :


--
type: crash -> behavior

___
Python tracker 
<http://bugs.python.org/issue14360>
___



[issue14360] email.encoders.encode_quopri doesn't work with python 3.2

2012-03-21 Thread Dmitry Shachnev

Dmitry Shachnev  added the comment:

(Sorry for not replying earlier).

I think the main priority here is getting things working, not the tests (so I 
have little interest in that).

First of all, should quopri.encodestring() really return bytes? Everything it 
returns is ascii text, obviously.

Then, which types of argument should encode_* functions take (I think str 
should be supported, and it's not a case here as encode_quopri will only accept 
bytes)?

--

___
Python tracker 
<http://bugs.python.org/issue14360>
___



[issue14360] email.encoders.encode_quopri doesn't work with python 3.2

2012-03-21 Thread Dmitry Shachnev

Dmitry Shachnev  added the comment:

> In fact, there's really no reason to call an encode_ method at all, since if 
> you pass a string to MIMEText when giving it a non-ascii unicode string, it 
> will default to utf-8 and do the appropriate CTE encoding.

No, it doesn't:
Python 3.2.3rc1 (default, Mar  9 2012, 23:02:43) 
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from email.mime.text import MIMEText
>>> print(MIMEText('йцукен'))
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

йцукен
>>>

As you can see, it leaves the Russian text unmodified and sets the charset 
to "us-ascii". Should this be considered a bug?

> What is your use case, by the way?
I'm writing a "send via e-mail" plugin for my ReText editor 
(http://retext.sourceforge.net/).

--

___
Python tracker 
<http://bugs.python.org/issue14360>
___



[issue14360] email.encoders.encode_quopri doesn't work with python 3.2

2012-03-21 Thread Dmitry Shachnev

Dmitry Shachnev  added the comment:

> You can get it to work by explicitly passing the charset
Thanks, I didn't know about that.
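
For reference, a minimal sketch of what passing the charset explicitly can
look like. The helper name build_text_message and the default of utf-8 are
assumptions made for illustration; only MIMEText and its charset argument come
from the discussion.

```python
from email.mime.text import MIMEText


def build_text_message(text, charset='utf-8'):
    """Build a text/plain part with an explicitly chosen charset."""
    return MIMEText(text, 'plain', charset)


if __name__ == "__main__":
    msg = build_text_message('йцукен')
    print(msg['Content-Type'])
    # Decoding the payload again should give back the original text.
    print(msg.get_payload(decode=True).decode('utf-8'))
```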

> Given the above, do you need it anymore?
No.

--

___
Python tracker 
<http://bugs.python.org/issue14360>
___



[issue14360] email.encoders.encode_quopri doesn't work with python 3.2

2012-03-21 Thread Dmitry Shachnev

Dmitry Shachnev  added the comment:

> Then, which types of argument should encode_* functions take (I think str 
> should be supported, and it's not a case here as encode_quopri will only 
> accept bytes)?

I meant which types of *payload* should they accept. Here's an illustration of 
what I mean: http://paste.ubuntu.com/894731/.

--

___
Python tracker 
<http://bugs.python.org/issue14360>
___



[issue14648] Attempt to format ascii and non-ascii strings together fails with "... UCS2 ..."

2012-04-23 Thread Dmitry Dvoinikov

New submission from Dmitry Dvoinikov :

Using Python 3.3.0a2 (default, Apr  1 2012, 19:34:58) [MSC v.1500 64 bit 
(AMD64)] on win32.

This line of code

"{0:s}{1:s}".format("ABC", "\u0410\u0411\u0412")

results in

SystemError: Cannot copy UCS2 characters into a string of ascii characters

--
components: Interpreter Core
messages: 159014
nosy: ddvoinikov
priority: normal
severity: normal
status: open
title: Attempt to format ascii and non-ascii strings together fails with "... 
UCS2 ..."
type: behavior
versions: Python 3.3

___
Python tracker 
<http://bugs.python.org/issue14648>
___



[issue11035] Segmentation fault

2011-01-28 Thread Dmitry Groshev

New submission from Dmitry Groshev :

Here is the console output:
si14@si14-work:~/repos/monitoring/root$ python2.7 server.py 

127.0.0.1 - - [2011-01-28 12:29:30] "GET /update HTTP/1.1" 200 320 "-" 
"Python-urllib/2.7"
{"seenby":[1],"received":1296207058.993983,"observer":1,"type":"ping","source":102,"time":1296207058.990101,"data":[[1296206970.543701,0.010154962539672852],[1296206980.383922,0.010203123092651367],[1296206990.222841,0.01015615463256836],[1296207000.050695,0.010264873504638672],[1296207009.876834,0.011881113052368164],[1296207019.698611,0.010120153427124023],[1296207029.519147,0.010251045227050781],[1296207039.342266,0.010113000869750977],[1296207049.167352,0.010238885879516602],[1296207058.990089,0.010174989700317383]],"class":"statistics"}
127.0.0.1 - - [2011-01-28 12:30:59] "POST / HTTP/1.1" 200 2 "-" 
"Python-urllib/2.7"
Segmentation fault
si14@si14-work:~/repos/monitoring/root$ 

I'm not sure that this is a gevent issue, so I'm posting it here. The server.py 
source is attached.

--
components: Interpreter Core
files: server.py
messages: 127266
nosy: Dmitry.Groshev
priority: normal
severity: normal
status: open
title: Segmentation fault
type: crash
versions: Python 2.7
Added file: http://bugs.python.org/file20570/server.py

___
Python tracker 
<http://bugs.python.org/issue11035>
___



[issue11035] Segmentation fault

2011-01-28 Thread Dmitry Groshev

Dmitry Groshev  added the comment:

I should also say that this bug has appeared for the first time, but that 
doesn't make it less scary. The packet that crashed Python was the same as the 
one shown, and I had already been running this tiny "server" for a day or two 
without modifications, so it seems to me that this is not my mistake.

--

___
Python tracker 
<http://bugs.python.org/issue11035>
___



[issue11035] Segmentation fault

2011-01-28 Thread Dmitry Groshev

Dmitry Groshev  added the comment:

OK, I've played with this some more and got some more segmentation faults. It 
looks like a gevent problem, but I think that Python shouldn't crash completely 
so easily. Here are more traces:

si14@si14-work:~/repos/monitoring/root$ python2.7 server.py 
{"seenby":[1],"received":1296208139.606481,"observer":1,"type":"ping","source":102,"time":1296208139.603046,"data":[[1296208051.083743,0.010155200958251953],[1296208060.923999,0.011048078536987305],[1296208070.76751,0.010570049285888672],[1296208080.613247,0.011930227279663086],[1296208090.454012,0.010123968124389648],[1296208100.283144,0.010128021240234375],[1296208110.114118,0.010215997695922852],[1296208119.943081,0.010147809982299805],[1296208129.774567,0.010593891143798828],[1296208139.603033,0.010541915893554688]],"class":"statistics"}
127.0.0.1 - - [2011-01-28 12:48:59] "POST / HTTP/1.1" 200 2 "-" 
"Python-urllib/2.7"
Traceback (most recent call last):
  File "evhttp.pxi", line 473, in gevent.core._http_cb_handler (gevent/core.c:13165)
Segmentation fault

127.0.0.1 - - [2011-01-28 12:47:30] "GET /update HTTP/1.1" 200 320 "-" 
"Python-urllib/2.7"
^CTraceback (most recent call last):
  File "server.py", line 34, in <module>
    gevent.wsgi.WSGIServer(("localhost", 8020), printer).serve_forever()
  File "build/bdist.linux-i686/egg/gevent/baseserver.py", line 183, in serve_forever
TypeError: raise: arg 3 must be a traceback or None
Segmentation fault
si14@si14-work:~/repos/monitoring/root$

--

___
Python tracker 
<http://bugs.python.org/issue11035>
___



[issue11035] Segmentation fault

2011-01-28 Thread Dmitry Groshev

Dmitry Groshev  added the comment:

I've changed the gevent.wsgi server to gevent.pywsgi and it works as expected. 
Now I've switched it back to gevent.wsgi and it doesn't crash either! That's 
strange, but I understand that you can't fix it without a proper backtrace. I'm 
sorry for the inconvenience.

--

___
Python tracker 
<http://bugs.python.org/issue11035>
___



[issue11328] NESTED WHILE CYCLES ERROR

2011-02-25 Thread Dmitry Negius

New submission from Dmitry Negius :

Nested "while" cycles (loops) do not work. This makes it impossible to write a 
class of programs with nested while loops.

--
components: Interpreter Core
files: bug.py
messages: 129506
nosy: negius
priority: normal
severity: normal
status: open
title: NESTED WHILE CYCLES ERROR
type: compile error
versions: Python 2.6, Python 2.7, Python 3.2
Added file: http://bugs.python.org/file20903/bug.py

___
Python tracker 
<http://bugs.python.org/issue11328>
___



[issue14983] [patch] email.generator should always add newlines after closing boundaries

2012-06-01 Thread Dmitry Shachnev

New submission from Dmitry Shachnev :

While trying to write an email-sending script with PGP-signing functionality, I 
stumbled upon a problem (see [1]): it was impossible to sign multipart emails 
(actually the signing was performed, but the verifying programs thought that 
the signature was bad).

After comparing messages produced by email.generator and popular mail clients 
(Evolution, KMail), I've found out that the mail clients always add line breaks 
after ending boundaries.

The attached patch makes email.generator behave like those email clients. After 
applying it, it's possible to sign even complicated mails like 
"multipart/alternative with attachments".

An illustration:

 --1==     # Part 1 (base message) begin
 ...
 --2==     # Part 1.1 begin
 ...
 --2==     # Part 1.2 begin
 ...
 --2==--   # Part 1 end
           # There should be an empty line here
 --1==     # Part 2 (signature) begin
 ...
 --1==--   # End of the message

[1]: 
http://stackoverflow.com/questions/10496902/pgp-signing-multipart-e-mails-with-python
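
To see what a given Python version writes between the inner closing boundary
and the next outer boundary, the nested structure from the illustration can be
rebuilt with the email package. This is only a sketch: the boundary strings
are copied from the illustration, the second part is a plain-text placeholder
rather than a real signature, and the helper names are hypothetical.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_nested_message():
    """Build a signed-style message whose first part is itself multipart."""
    # Part 1: the base message, a multipart/alternative with two bodies.
    inner = MIMEMultipart('alternative', boundary='2==')
    inner.attach(MIMEText('plain body', 'plain'))
    inner.attach(MIMEText('<p>html body</p>', 'html'))

    # Part 2 stands in for the detached signature (placeholder text only).
    outer = MIMEMultipart('signed', boundary='1==')
    outer.attach(inner)
    outer.attach(MIMEText('placeholder signature', 'plain'))
    return outer


def text_after_inner_close(flat, boundary='2=='):
    """Return whatever the generator wrote after the inner closing boundary."""
    return flat.split('--' + boundary + '--', 1)[1]


if __name__ == "__main__":
    flat = build_nested_message().as_string()
    # repr() makes the line breaks between the two boundaries visible.
    print(repr(text_after_inner_close(flat)[:12]))
```

The number of line breaks printed before the outer boundary is the detail a
signature verifier would be sensitive to.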

--
components: email
files: always_add_newlines.patch
keywords: patch
messages: 162126
nosy: barry, mitya57, r.david.murray
priority: normal
severity: normal
status: open
title: [patch] email.generator should always add newlines after closing 
boundaries
type: behavior
versions: Python 3.2
Added file: http://bugs.python.org/file25798/always_add_newlines.patch

___
Python tracker 
<http://bugs.python.org/issue14983>
___



[issue14380] MIMEText should default to utf8 charset if input text contains non-ASCII

2012-06-02 Thread Dmitry Shachnev

Dmitry Shachnev  added the comment:

Maybe it'll be better to use 'latin-1' charset for latin-1 texts?

Something like this:

if _charset == 'us-ascii':
    try:
        _text.encode(_charset)
    except UnicodeEncodeError:
        try:
            _text.encode('latin-1')
        except UnicodeEncodeError:
            _charset = 'utf-8'
        else:
            _charset = 'latin-1'

This will make messages in most latin languages use quoted-printable encoding.
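
The same fallback can be tried outside the MIMEText constructor. The function
below is a standalone sketch of the idea, not the patch itself; the name
guess_charset is made up for illustration.

```python
def guess_charset(text):
    """Pick the narrowest of us-ascii, latin-1 and utf-8 that can encode text."""
    for charset in ('us-ascii', 'latin-1'):
        try:
            text.encode(charset)
        except UnicodeEncodeError:
            continue  # text does not fit, try the next wider charset
        return charset
    return 'utf-8'


if __name__ == "__main__":
    for sample in ('hello', 'café', 'йцукен'):
        print(repr(sample), guess_charset(sample))
```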

--
components: +email -Library (Lib)
nosy: +barry

___
Python tracker 
<http://bugs.python.org/issue14380>
___



[issue14983] email.generator should always add newlines after closing boundaries

2012-06-02 Thread Dmitry Shachnev

Dmitry Shachnev  added the comment:

> Looking at your stackoverflow post, you might be able to fix this by doing an 
> rstrip on the string body before signing it.

My body doesn't end with \n, so that doesn't help. If you can suggest any 
(easy) way to fix this at the level of my script, I will be happy.

> Probably adding the CRLF is the way to go, but I wonder, what are the 
> controlling RFCs for signing/verifying and what do they say?

The standard is RFC 1847; it doesn't say anything about multipart emails. I've 
just looked at what other mail clients do. It works, but I can't say anything 
about its correctness. Also, looking at some multipart signed emails in my 
inbox, they *all* have the blank line before the signature part.

--

___
Python tracker 
<http://bugs.python.org/issue14983>
___



[issue14983] email.generator should always add newlines after closing boundaries

2012-06-02 Thread Dmitry Shachnev

Dmitry Shachnev  added the comment:

> By 'body' I actually meant the multipart part you are signing.

Yes, I've understood you, and I mean the same :) The signature is created 
against the not-ending-with-newline string, in any case.

--

___
Python tracker 
<http://bugs.python.org/issue14983>
___



[issue14983] email.generator should always add newlines after closing boundaries

2012-06-03 Thread Dmitry Shachnev

Dmitry Shachnev  added the comment:

> Hmm.  So that means the verifiers are not paying attention to the MIME RFC?
> That's unfortunate.

It seems that's true...

--

___
Python tracker 
<http://bugs.python.org/issue14983>
___



[issue7304] email.message.Message.set_payload and as_string given charset 'us-ascii' plus 8bit data produces invalid message

2012-06-06 Thread Dmitry Shachnev

Changes by Dmitry Shachnev :


--
nosy: +mitya57

___
Python tracker 
<http://bugs.python.org/issue7304>
___



[issue15016] [patch] add special case for latin messages in email.mime.text

2012-06-06 Thread Dmitry Shachnev

New submission from Dmitry Shachnev :

(Follow-up to issue 14380)

The attached patch makes the email.mime.text.MIMEText constructor use the 
iso-8859-1 (aka latin-1) encoding for messages where all characters are in 
range(256). This also makes them use quoted-printable transfer encoding instead 
of base64.

So, with the patch applied, the algorithm for guessing the encoding is as follows:

- all characters are in range(128) -> encoding is us-ascii
- all characters are in range(256) -> encoding is iso-8859-1 (aka latin-1)
- else -> encoding is utf-8
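
The effect of the chosen charset on the transfer encoding can be checked
directly. This sketch passes each charset explicitly instead of relying on any
guessing, and the helper name transfer_encoding_for is hypothetical. In the
Python 3 releases I am aware of, the three calls report 7bit,
quoted-printable and base64 respectively, but that is worth confirming on the
version at hand.

```python
from email.mime.text import MIMEText


def transfer_encoding_for(text, charset):
    """Return the Content-Transfer-Encoding MIMEText picks for this charset."""
    msg = MIMEText(text, 'plain', charset)
    return msg['Content-Transfer-Encoding']


if __name__ == "__main__":
    print(transfer_encoding_for('plain text', 'us-ascii'))
    print(transfer_encoding_for('café', 'iso-8859-1'))
    print(transfer_encoding_for(chr(256), 'utf-8'))
```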

--
components: email
messages: 162399
nosy: barry, mitya57, r.david.murray
priority: normal
severity: normal
status: open
title: [patch] add special case for latin messages in email.mime.text
type: behavior
versions: Python 3.2, Python 3.3

___
Python tracker 
<http://bugs.python.org/issue15016>
___



[issue15016] [patch] add special case for latin messages in email.mime.text

2012-06-06 Thread Dmitry Shachnev

Changes by Dmitry Shachnev :


--
keywords: +patch
Added file: http://bugs.python.org/file25845/issue_15016.patch

___
Python tracker 
<http://bugs.python.org/issue15016>
___



[issue15016] [patch] add special case for latin messages in email.mime.text

2012-06-06 Thread Dmitry Shachnev

Dmitry Shachnev  added the comment:

Updated the patch:

- Avoid using letter Ш in test, it's better to use chr(256) as the test case;
- Updated the comment in MIMEText constructor to reflect the new behaviour.

--
hgrepos: +135
Added file: http://bugs.python.org/file25846/issue_15016_v2.patch

___
Python tracker 
<http://bugs.python.org/issue15016>
___



[issue15016] [patch] add special case for latin messages in email.mime.text

2012-06-06 Thread Dmitry Shachnev

Changes by Dmitry Shachnev :


--
hgrepos:  -135

___
Python tracker 
<http://bugs.python.org/issue15016>
___



[issue15016] [patch] add special case for latin messages in email.mime.text

2012-06-06 Thread Dmitry Shachnev

Changes by Dmitry Shachnev :


Removed file: http://bugs.python.org/file25845/issue_15016.patch

___
Python tracker 
<http://bugs.python.org/issue15016>
___



[issue15016] Add special case for latin messages in email.mime.text

2012-06-06 Thread Dmitry Shachnev

Dmitry Shachnev  added the comment:

Done, sent an e-mail to contribut...@python.org.

--

___
Python tracker 
<http://bugs.python.org/issue15016>
___



[issue15016] Add special case for latin messages in email.mime.text

2012-06-23 Thread Dmitry Shachnev

Dmitry Shachnev  added the comment:

> This seems to be an enhancement sort of request rather than a bug... so I 
> wonder why Python 3.2 is listed?

Fixed.

> ... although I see nothing in the PEP about how to access that information 
> from Python.

You are right, it seems there is no Python API for that (yet?), so I don't see 
any better solutions for determining the maximum character for now. Also, note 
that this algorithm had already been used before my patch.

--
type: behavior -> enhancement
versions:  -Python 3.2

___
Python tracker 
<http://bugs.python.org/issue15016>
___



[issue15524] Dict items() ordering varies across interpreter invocations

2012-08-01 Thread Dmitry Dvoinikov

New submission from Dmitry Dvoinikov:

The following line prints different things each time you run it:

python3 -c "print(', '.join({ '1': '2', '3': '4' }.keys()))"

The output is either "1, 3" or "3, 1". Is such nondeterministic behavior 
intentional?

Using Python 3.3.0b1 (default, Aug  1 2012, 06:09:44)

--
components: Interpreter Core
messages: 167116
nosy: ddvoinikov
priority: normal
severity: normal
status: open
title: Dict items() ordering varies across interpreter invocations
type: behavior
versions: Python 3.3

___
Python tracker 
<http://bugs.python.org/issue15524>
___



[issue15534] xmlrpc escaping breaks on unicode \u043c

2012-08-02 Thread Dmitry Dvoinikov

New submission from Dmitry Dvoinikov:

For the following script

import xmlrpc.client; from xmlrpc.client import escape
text = "...\u043c..<"
print(escape(text))

Python 3.3.0b1 produces
...ь..&lt;...&lt;
whereas Python 3.2 produces
...ь..&lt;
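
For comparison, a small check of what escape() is expected to return for this
input: the text unchanged except that the less-than sign becomes its XML
entity, with nothing duplicated. The wrapper name escape_sample is
hypothetical; escape itself is the function from the report.

```python
from xmlrpc.client import escape


def escape_sample():
    """Escape the string from the report and return the result."""
    return escape("...\u043c..<")


if __name__ == "__main__":
    result = escape_sample()
    # ascii() avoids depending on the console encoding for the Cyrillic letter.
    print(ascii(result))
    print(len(result))
```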

--
components: Library (Lib)
messages: 167199
nosy: ddvoinikov
priority: normal
severity: normal
status: open
title: xmlrpc escaping breaks on unicode \u043c
type: behavior
versions: Python 3.3

___
Python tracker 
<http://bugs.python.org/issue15534>
___



[issue15570] email.header.decode_header parses differently

2012-08-06 Thread Dmitry Dvoinikov

New submission from Dmitry Dvoinikov:

The following script
---
import email.header
print(email.header.decode_header("foo =?windows-1251?Q?bar?="))
---
produces

[(b'foo', None), (b'bar', 'windows-1251')]

in Python 3.2 but

[(b'foo ', None), (b'bar', 'windows-1251')]

in Python 3.3.0b1
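
Code that only needs the decoded text, rather than the individual chunks, can
sidestep the difference by letting the email package reassemble the header. A
minimal sketch, with header_to_text as a made-up helper name:

```python
from email.header import decode_header, make_header


def header_to_text(raw):
    """Decode an RFC 2047 header value into a single str."""
    return str(make_header(decode_header(raw)))


if __name__ == "__main__":
    print(repr(header_to_text("foo =?windows-1251?Q?bar?=")))
```

Callers that iterate over the decode_header() result themselves still have to
decide how to treat the whitespace between chunks.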

--
components: Library (Lib)
messages: 167602
nosy: ddvoinikov
priority: normal
severity: normal
status: open
title: email.header.decode_header parses differently
type: behavior
versions: Python 3.3

___
Python tracker 
<http://bugs.python.org/issue15570>
___



[issue6327] [mimetext] long lines get cut with exclamation mark and newline

2011-09-06 Thread Dmitry Simonov

Dmitry Simonov  added the comment:

Quote:
==
Notes

Note that mailservers have a 990-character limit on each line contained within 
an email message. If an email message is sent that contains lines longer than 
990-characters, those lines will be subdivided by additional line ending 
characters, which can cause corruption in the email message, particularly for 
HTML content. To prevent this from occurring, add your own line-ending 
characters at appropriate locations within the email message to ensure that no 
lines are longer than 990 characters.
==
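
One way to follow that advice before building the message is to re-wrap
overly long lines at whitespace. The sketch below is an assumption about how
that could be done, not something the email package does for you: the helper
name wrap_long_lines is made up, the limit of 990 is taken from the quote, and
textwrap normalizes whitespace, so it is not suitable for preformatted
content.

```python
import textwrap


def wrap_long_lines(text, limit=990):
    """Break lines longer than `limit` characters at whitespace."""
    wrapped = []
    for line in text.splitlines():
        if len(line) <= limit:
            wrapped.append(line)  # short lines are kept untouched
        else:
            wrapped.extend(textwrap.wrap(line, width=limit))
    return "\n".join(wrapped)


if __name__ == "__main__":
    long_line = "word " * 400
    result = wrap_long_lines(long_line)
    print(max(len(line) for line in result.splitlines()))
```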

--
nosy: +Dmitry.Simonov

___
Python tracker 
<http://bugs.python.org/issue6327>
___



[issue13323] urllib2 does not correctly handle multiple www-authenticate headers in an HTTP response

2011-11-02 Thread dmitry be

Changes by dmitry be :


--
nosy: +Dmitry.Beransky

___
Python tracker 
<http://bugs.python.org/issue13323>
___



[issue13413] time.daylight incorrect behavior in linux glibc

2011-11-15 Thread Dmitry Balabanov

New submission from Dmitry Balabanov :

In the Europe/Moscow timezone:
>>> import time
>>> time.daylight
1
>>> time.timezone
-10800

But if you compile and run the attached program, the result is:
timezone: -14400, daylight: 0

Daylight saving time no longer applies in the Europe/Moscow timezone as of 
this winter, but Python derives the daylight flag from the difference between 
January and July local time.

Why not use tzset()?
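
Independently of how time.daylight is computed, code that needs to know
whether DST applies at a particular moment can ask localtime() for that
moment. A minimal sketch; dst_in_effect is a hypothetical helper name:

```python
import time


def dst_in_effect(timestamp=None):
    """Return True if DST applies at the given time (default: now)."""
    return time.localtime(timestamp).tm_isdst > 0


if __name__ == "__main__":
    print("time.daylight:", time.daylight)
    print("time.timezone:", time.timezone)
    print("DST in effect now:", dst_in_effect())
```

time.daylight only says whether a DST zone is defined at all, so the two
values answer different questions.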

--
files: daylight.c
messages: 147752
nosy: dimonb
priority: normal
severity: normal
status: open
title: time.daylight incorrect behavior in linux glibc
type: behavior
Added file: http://bugs.python.org/file23704/daylight.c

___
Python tracker 
<http://bugs.python.org/issue13413>
___



[issue16089] _elementtree.TreeBuilder broken with a non-C-deriving element_factory

2012-11-09 Thread Dmitry Shachnev

Dmitry Shachnev added the comment:

There are still some false-positive warnings caused by C code that affect 
docutils (see 
http://sourceforge.net/tracker/?func=detail&atid=422030&aid=3555164&group_id=38414).

Look at this traceback for example:

http://paste.ubuntu.com/1345164/

There, _ElementInterfaceWrapper is a subclass of etree._ElementInterface 
(http://repo.or.cz/w/docutils.git/blob/HEAD:/docutils/docutils/writers/odf_odt/__init__.py#l91).

--
nosy: +larry, mitya57

___
Python tracker 
<http://bugs.python.org/issue16089>
___



[issue16089] _elementtree.TreeBuilder broken with a non-C-deriving element_factory

2012-11-09 Thread Dmitry Shachnev

Changes by Dmitry Shachnev :


___
Python tracker 
<http://bugs.python.org/issue16089>
___



[issue16891] Fix docs about module search order

2013-01-08 Thread Dmitry Mugtasimov

New submission from Dmitry Mugtasimov:

http://docs.python.org/2/tutorial/modules.html should be rewritten.
AS IS
6.1.2. The Module Search Path

When a module named spam is imported, the interpreter first searches for a 
built-in module with that name. If not found, it then searches for a file named 
spam.py in a list of directories given by the variable sys.path. sys.path is 
initialized from these locations:

TO BE
6.1.2. The Module Search Path

When a module named spam is imported, the interpreter first searches for a 
built-in module with that name. If not found, it looks in the containing 
package (the package of which the current module is a submodule). If not found, 
it then searches for a file named spam.py in a list of directories given by the 
variable sys.path. sys.path is initialized from these locations:

--
Note that now "6.1.2. The Module Search Path" and "6.4.2. Intra-package 
References" are contradictory, since 6.4.2 says: "In fact, such 
references are so common that the import statement first looks in the 
containing package before looking in the standard module search path.", but 
this is not reflected in 6.1.2.
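
When it is unclear which file an import would pick up, the lookup can be
inspected without importing the module. The sketch below targets Python 3,
where the implicit relative imports discussed here no longer exist, so it
illustrates the built-in-then-sys.path order rather than the Python 2
behaviour in this report. The helper name describe_import is hypothetical.

```python
import importlib.util
import sys


def describe_import(name):
    """Say where a top-level `import name` would be satisfied from."""
    if name in sys.builtin_module_names:
        return "built-in"
    spec = importlib.util.find_spec(name)
    if spec is None:
        return "not found"
    return spec.origin  # usually the path of the file that would be loaded


if __name__ == "__main__":
    for name in ("sys", "json", "no_such_module_for_this_demo"):
        print(name, "->", describe_import(name))
```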

--
EXAMPLE (for more information see  
http://stackoverflow.com/questions/14183541/why-python-finds-module-instead-of-package-if-they-have-the-same-name#comment19687166_14183541
 ):
/home/dmugtasimov/tmp/name-res3/xyz
    __init__.py
    a.py
    b.py
    t.py
    xyz.py

Files __init__.py, b.py and xyz.py are empty.
File a.py:

import os, sys
ROOT_DIRECTORY = os.path.abspath(os.path.join(os.path.dirname(__file__), '..'))
if not sys.path or ROOT_DIRECTORY not in sys.path:
    print 'sys.path is modified in a.py'
    sys.path.insert(0, ROOT_DIRECTORY)
else:
    print 'sys.path is NOT modified in a.py'

print 'sys.path:', sys.path
print 'BEFORE import xyz.b'
import xyz.b
print 'AFTER import xyz.b'

File t.py:

import os, sys
ROOT_DIRECTORY = os.path.abspath(os.path.join(os.path.dirname(__file__), '..'))
if not sys.path or ROOT_DIRECTORY not in sys.path:
    print 'sys.path is modified in t.py'
    sys.path.insert(0, ROOT_DIRECTORY)
else:
    print 'sys.path is NOT modified in t.py'

import xyz.a

Run:

python a.py

Output:

sys.path is modified in a.py
sys.path: ['/home/dmugtasimov/tmp/name-res3', 
'/home/dmugtasimov/tmp/name-res3/xyz',
 '/usr/local/lib/python2.7/dist-packages/tornado-2.3-py2.7.egg',
 '/home/dmugtasimov/tmp/name-res3/xyz', '/usr/lib/python2.7',
 '/usr/lib/python2.7/plat-linux2', '/usr/lib/python2.7/lib-tk',
 '/usr/lib/python2.7/lib-old', '/usr/lib/python2.7/lib-dynload',
 '/usr/local/lib/python2.7/dist-packages',
 '/usr/local/lib/python2.7/dist-packages/setuptools-0.6c11-py2.7.egg-info',
 '/usr/lib/python2.7/dist-packages',
 '/usr/lib/python2.7/dist-packages/PIL',
 '/usr/lib/python2.7/dist-packages/gst-0.10',
 '/usr/lib/python2.7/dist-packages/gtk-2.0',
 '/usr/lib/python2.7/dist-packages/ubuntu-sso-client']
BEFORE import xyz.b
AFTER import xyz.b

Run:

python -vv a.py

Output:

import xyz # directory /home/dmugtasimov/tmp/name-res3/xyz
# trying /home/dmugtasimov/tmp/name-res3/xyz/__init__.so
# trying /home/dmugtasimov/tmp/name-res3/xyz/__init__module.so
# trying /home/dmugtasimov/tmp/name-res3/xyz/__init__.py
# /home/dmugtasimov/tmp/name-res3/xyz/__init__.pyc matches 
/home/dmugtasimov/tmp/name-res3/xyz/__init__.py
import xyz # precompiled from 
/home/dmugtasimov/tmp/name-res3/xyz/__init__.pyc
# trying /home/dmugtasimov/tmp/name-res3/xyz/b.so
# trying /home/dmugtasimov/tmp/name-res3/xyz/bmodule.so
# trying /home/dmugtasimov/tmp/name-res3/xyz/b.py
# /home/dmugtasimov/tmp/name-res3/xyz/b.pyc matches 
/home/dmugtasimov/tmp/name-res3/xyz/b.py
import xyz.b # precompiled from /home/dmugtasimov/tmp/name-res3/xyz/b.pyc

Run:

python t.py

Output:

sys.path is modified in t.py
sys.path is NOT modified in a.py
sys.path: ['/home/dmugtasimov/tmp/name-res3', 
'/home/dmugtasimov/tmp/name-res3/xyz',
 '/usr/local/lib/python2.7/dist-packages/tornado-2.3-py2.7.egg',
 '/home/dmugtasimov/tmp/name-res3/xyz', '/usr/lib/python2.7',
 '/usr/lib/python2.7/plat-linux2', '/usr/lib/python2.7/lib-tk',
 '/usr/lib/python2.7/lib-old', '/usr/lib/python2.7/lib-dynload',
 '/usr/local/lib/python2.7/dist-packages',
 '/usr/local/lib/python2.7/dist-packages/setuptools-0.6c11-py2.7.egg-info',
 '/usr/lib/python2.7/dist-packages',
 '/usr/lib/python2.7/dist-packages/PIL',
 '/usr/lib/python2.7/dist-packages/gst-0.10',
 '/u

[issue16891] Fix docs about module search order

2013-01-08 Thread Dmitry Mugtasimov

Dmitry Mugtasimov added the comment:

UPDATE:
CHANGE
http://stackoverflow.com/questions/14183541/why-python-finds-module-instead-of-package-if-they-have-the-same-name#comment19687166_14183541

TO
http://stackoverflow.com/questions/14183541/why-python-finds-module-instead-of-package-if-they-have-the-same-name

Because the whole question and replies are important.

--

___
Python tracker 
<http://bugs.python.org/issue16891>
___



[issue16891] Fix docs about module search order

2013-01-08 Thread Dmitry Mugtasimov

Dmitry Mugtasimov added the comment:

As I investigate it a little closer it seems to me that it is not a 
documentation issue, but an implementation issue.

http://docs.python.org/2/reference/simple_stmts.html#import
"A package can contain other packages and modules while modules cannot contain 
other modules or packages."

The only way to import a name from a module is
from xyz import b
where "b" is a name defined inside the xyz module.

Issuing
import xyz.b
means that "b" is a module or package.

Therefore xyz cannot be a module, since "...modules cannot contain other 
modules or packages." This means that xyz is a package.

The problem is that it is treated as a module in this case:
python t.py
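
The difference between the two import forms can be reproduced in isolation. 
The following is a minimal sketch, assuming Python 3 and a throwaway 
directory; the names xyz and b mirror the example above, while try_imports 
and the temporary layout are hypothetical.

```python
import importlib
import os
import sys
import tempfile


def try_imports():
    """Create a plain module xyz.py defining b, then try both import forms."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "xyz.py"), "w") as f:
            f.write("b = 42\n")
        sys.path.insert(0, tmp)
        try:
            # Equivalent of "from xyz import b": b is a name inside the module.
            module = importlib.import_module("xyz")
            results["from xyz import b"] = module.b
            # Equivalent of "import xyz.b": this requires xyz to be a package.
            try:
                importlib.import_module("xyz.b")
                results["import xyz.b"] = "ok"
            except ImportError as exc:
                results["import xyz.b"] = type(exc).__name__
        finally:
            # Undo the changes made to the interpreter state.
            sys.path.remove(tmp)
            sys.modules.pop("xyz", None)
    return results


results = try_imports()
```

When xyz resolves to a plain module, the first form succeeds and the second 
is expected to fail with an ImportError subclass.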

--

___
Python tracker 
<http://bugs.python.org/issue16891>
___



[issue16891] Fix docs about module search order

2013-01-08 Thread Dmitry Mugtasimov

Dmitry Mugtasimov added the comment:

A lot of people are still using Python 2.7, even 2.6. For me it would be a nice 
fix in the docs, since I spent plenty of time trying to figure out what is 
going on.

In my previous comment I also pointed out that the implementation probably 
should be fixed too, since the interpreter tries to import a module from a 
module, which should not be possible according to the docs. It may be an issue 
for Python 3 too, but I did not check.

--

___
Python tracker 
<http://bugs.python.org/issue16891>
___



[issue16891] Fix docs about module search order

2013-01-08 Thread Dmitry Mugtasimov

Dmitry Mugtasimov added the comment:

Further investigation led me to the conclusion that "TO BE" should look like 
this:

6.1.2. The Module Search Path

When a module named spam is imported, the interpreter first searches in the 
containing package (the package of which the current module is a submodule), if 
applicable (a module is not required to be a submodule of a package). If it is 
not found, or the module is not part of any package, it searches for a built-in 
module with that name. If not found, it then searches for a file named spam.py 
in a list of directories given by the variable sys.path. sys.path is 
initialized from these locations:

--

___
Python tracker 
<http://bugs.python.org/issue16891>
___



[issue17143] trace.py uses _warn without importing it

2013-02-06 Thread Dmitry Jemerov

New submission from Dmitry Jemerov:

trace.py in Python 3.3 standard library uses the _warn function without 
importing it. As a result, an attempt to use a now-deprecated function fails 
with a NameError:

> python3
Python 3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 01:25:11) 
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import trace
>>> trace.modname('')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/trace.py", line 827, in modname
    _warn("The trace.modname() function is deprecated",
NameError: global name '_warn' is not defined

--
components: Library (Lib)
messages: 181531
nosy: Dmitry.Jemerov
priority: normal
severity: normal
status: open
title: trace.py uses _warn without importing it
versions: Python 3.3

___
Python tracker 
<http://bugs.python.org/issue17143>
___



[issue17143] trace.py uses _warn without importing it

2013-02-06 Thread Dmitry Jemerov

Dmitry Jemerov added the comment:

Workaround: "import warnings; trace._warn = warnings.warn" after importing trace
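
Spelled out as a script, the workaround might look like the sketch below. The 
hasattr guard is an addition (an assumption, not part of the original 
workaround) so that an interpreter where the name already exists is left 
untouched.

```python
import trace
import warnings

# Assumption: only patch when the attribute is really missing, so that an
# interpreter that already defines trace._warn is not modified.
if not hasattr(trace, "_warn"):
    trace._warn = warnings.warn
```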

--

___
Python tracker 
<http://bugs.python.org/issue17143>
___



[issue2832] Line numbers reported by extract_stack are offset by the #-*- encoding line

2008-05-11 Thread Dmitry Dvoinikov

New submission from Dmitry Dvoinikov <[EMAIL PROTECTED]>:

Stack trace information extracted with traceback.extract_stack is
incorrect in that the #-*- line causes double counting. For example:

#comment
from traceback import extract_stack
print("this is line", extract_stack()[-1][1])

prints 'this is line 3', but

#comment
#-*- coding: windows-1251 -*-
from traceback import extract_stack
print("this is line", extract_stack()[-1][1])

prints 'this is line 6'
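
A self-contained check for this behaviour could look like the following 
sketch. It compiles the second example from bytes, so that the coding line is 
honoured, and returns the line number that extract_stack() reports. The names 
SOURCE and reported_line are hypothetical, and .lineno assumes a Python 
version where stack entries expose that attribute.

```python
# The second example from the report, as bytes so the coding line is honoured.
SOURCE = (
    b"#comment\n"
    b"#-*- coding: windows-1251 -*-\n"
    b"from traceback import extract_stack\n"
    b"reported = extract_stack()[-1].lineno\n"
)


def reported_line(source):
    """Run source and return the line number extract_stack() reported."""
    namespace = {}
    exec(compile(source, "<demo>", "exec"), namespace)
    return namespace["reported"]


line = reported_line(SOURCE)
```

On an interpreter without the bug the call sits on line 4 of the source, so 
that is the value to expect.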

--
components: Library (Lib)
messages: 66708
nosy: ddvoinikov
severity: normal
status: open
title: Line numbers reported by extract_stack are offset by the #-*- encoding 
line
type: behavior
versions: Python 3.0

__
Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue2832>
__



[issue2833] __exit__ silences the active exception

2008-05-12 Thread Dmitry Dvoinikov

New submission from Dmitry Dvoinikov <[EMAIL PROTECTED]>:

If a context manager is used within an exception-handling block, the active
exception is silenced after the context block completes and __exit__ exits.

try:
    raise Exception("foo")
except:
    with SomeContextManager():
        pass
    raise # in Py2.5 throws 'foo', in Py3.0 fails with RuntimeError
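
For a runnable version of the snippet, SomeContextManager needs a definition. 
The sketch below uses a do-nothing context manager (an assumption; the report 
does not say which one was used) and records what the bare raise propagates.

```python
class SomeContextManager:
    """Hypothetical do-nothing context manager standing in for the real one."""

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        return False  # do not suppress anything raised inside the block


def reraise_after_with():
    try:
        raise Exception("foo")
    except:
        with SomeContextManager():
            pass
        raise  # expected to re-raise the original 'foo' exception


try:
    reraise_after_with()
    message = None
except Exception as exc:
    message = str(exc)
```

With the expected behaviour, message ends up as 'foo'.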

--
components: Interpreter Core
messages: 66713
nosy: ddvoinikov
severity: normal
status: open
title: __exit__ silences the active exception
type: behavior
versions: Python 3.0

__
Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue2833>
__



[issue3613] base64.encodestring does not actually accept strings

2008-08-19 Thread Dmitry Dvoinikov

New submission from Dmitry Dvoinikov <[EMAIL PROTECTED]>:

This quote from base64.py:

---
bytes_types = (bytes, bytearray)  # Types acceptable as binary data
...
def encodestring(s):
    """Encode a string into multiple lines of base-64 data.

    Argument and return value are bytes.
    """
    if not isinstance(s, bytes_types):
        raise TypeError("expected bytes, not %s" % s.__class__.__name__)
    ...
---

shows that the encodestring method won't accept str as an argument, only
bytes. Perhaps this is by design, but then wouldn't it make sense to
change the name of the method?

Anyway, this behavior clashes with (at least) xmlrpc.client, line
1168, when basic authentication is present:

---
auth = base64.encodestring(urllib.parse.unquote(auth))
---

because unquote() returns str, not bytes.
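
One possible way to reconcile the two, sketched here under assumptions: the 
str returned by unquote() is encoded to bytes before it is passed on. The 
helper name encode_basic_auth and the UTF-8 default are hypothetical, and 
base64.encodebytes is the name the bytes-only function has in current 
Python 3 releases.

```python
import base64
import urllib.parse


def encode_basic_auth(auth, encoding="utf-8"):
    """Unquote a 'user:password' string and base64-encode it as bytes."""
    unquoted = urllib.parse.unquote(auth)                  # str
    return base64.encodebytes(unquoted.encode(encoding))   # bytes in, bytes out


token = encode_basic_auth("user:pass")
```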

--
components: Library (Lib)
messages: 71513
nosy: ddvoinikov
severity: normal
status: open
title: base64.encodestring does not actually accept strings
type: behavior
versions: Python 3.0

___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue3613>
___



[issue3614] typo in xmlrpc.client

2008-08-19 Thread Dmitry Dvoinikov

New submission from Dmitry Dvoinikov <[EMAIL PROTECTED]>:

In xmlrpc.client:1204:
---
headers = {}
if extra_headers:
    for key, val in extra_headers:
        header[key] = val
---
shouldn't it read 
---
        headers[key] = val
              ^
---
?

Otherwise it bails out with 
---
NameError: global name 'header' is not defined
---
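
For reference, the corrected loop as a standalone function. This is only a 
sketch: build_headers is a hypothetical wrapper, and extra_headers is assumed 
to be a sequence of (key, value) pairs as in the quoted code.

```python
def build_headers(extra_headers):
    """Collect (key, value) pairs into a dict, using the corrected name."""
    headers = {}
    if extra_headers:
        for key, val in extra_headers:
            headers[key] = val
    return headers


example = build_headers([("X-Example", "1")])
```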

--
components: Library (Lib)
messages: 71514
nosy: ddvoinikov
severity: normal
status: open
title: typo in xmlrpc.client
type: behavior
versions: Python 3.0

___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue3614>
___



[issue3613] base64.encodestring does not actually accept strings

2008-08-20 Thread Dmitry Dvoinikov

Dmitry Dvoinikov <[EMAIL PROTECTED]> added the comment:

> I think it probably is correct to NOT accept a string

I agree.

> it should be renamed to encodestring

Huh ? It is already called that :) IMO it should be renamed to
encodebytes or simply encode if the module is only (or most frequently)
used to encode bytes.

> Best we can do is document them.

Oh well.

___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue3613>
___



[issue3714] nntplib module broken by str to unicode conversion

2008-08-28 Thread Dmitry Vasiliev

New submission from Dmitry Vasiliev <[EMAIL PROTECTED]>:

The following commands fail badly:

>>> from nntplib import NNTP
>>> s = NNTP("free-text.usenetserver.com")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/py3k/Lib/nntplib.py", line 116, in __init__
    self.welcome = self.getresp()
  File "/py3k/Lib/nntplib.py", line 215, in getresp
    resp = self.getline()
  File "/py3k/Lib/nntplib.py", line 209, in getline
    elif line[-1:] in CRLF: line = line[:-1]
TypeError: 'in <string>' requires string as left operand, not bytes

Actually, there are many places in the nntplib module that need to be
converted to bytes, or socket input/output values need to be converted
from/to str. I think the API needs to be updated to accept a user-defined
encoding at some stages. I can make a patch later if needed.
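
The failing comparison mixes a bytes line with a str constant. A minimal 
sketch of the bytes-only variant follows; it is not the eventual patch, and 
strip_line_ending is a hypothetical helper that mirrors the logic around the 
failing line.

```python
CRLF = b"\r\n"  # bytes constant, so it can be compared with socket data


def strip_line_ending(line):
    """Remove a trailing CRLF, or a lone CR or LF, from a bytes line."""
    if line[-2:] == CRLF:
        return line[:-2]
    if line[-1:] in CRLF:
        return line[:-1]
    return line


stripped = strip_line_ending(b"200 server ready\r\n")
```

Decoding to str with a caller-supplied encoding could then happen in one 
place, after the line ending has been removed.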

--
components: Library (Lib)
messages: 72090
nosy: hdima
severity: normal
status: open
title: nntplib module broken by str to unicode conversion
type: crash
versions: Python 3.0

___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue3714>
___



[issue3714] nntplib module broken by str to unicode conversion

2008-08-28 Thread Dmitry Vasiliev

Dmitry Vasiliev <[EMAIL PROTECTED]> added the comment:

I've attached a patch which adds an encoding parameter to the NNTP class.

--
keywords: +patch
Added file: http://bugs.python.org/file11292/nntplib.patch

___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue3714>
___



[issue3725] telnetlib module broken by str to unicode conversion

2008-08-29 Thread Dmitry Vasiliev

New submission from Dmitry Vasiliev <[EMAIL PROTECTED]>:

Simple example:

>>> from telnetlib import Telnet
>>> t = Telnet("google.com", 80)
>>> t.write("GET / HTTP/1.1\r\n")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/py3k/Lib/telnetlib.py", line 280, in write
    self.sock.sendall(buffer)
TypeError: sendall() argument 1 must be string or buffer, not str
>>> t.write(b"GET / HTTP/1.1\r\n")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/py3k/Lib/telnetlib.py", line 277, in write
    if IAC in buffer:
TypeError: Type str doesn't support the buffer API

--
components: Library (Lib)
messages: 72131
nosy: hdima
severity: normal
status: open
title: telnetlib module broken by str to unicode conversion
type: crash
versions: Python 3.0

___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue3725>
___



[issue3725] telnetlib module broken by str to unicode conversion

2008-08-29 Thread Dmitry Vasiliev

Dmitry Vasiliev <[EMAIL PROTECTED]> added the comment:

I think only bytes should be allowed for write() and read*() because of the
low-level nature of Telnet. I can create a patch later.
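
A bytes-only write path could be sketched as follows. This is an illustration 
rather than the patch itself: escape_iac is a hypothetical helper that rejects 
str and doubles the Telnet IAC byte (255), which is how a literal 255 is sent 
in the protocol.

```python
IAC = bytes([255])  # Telnet "Interpret As Command" byte


def escape_iac(buffer):
    """Return buffer with every IAC byte doubled; accept bytes only."""
    if not isinstance(buffer, (bytes, bytearray)):
        raise TypeError("expected bytes, not %s" % type(buffer).__name__)
    return bytes(buffer).replace(IAC, IAC + IAC)


escaped = escape_iac(b"GET / HTTP/1.1\r\n")
```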

___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue3725>
___



[issue3727] poplib module broken by str to unicode conversion

2008-08-29 Thread Dmitry Vasiliev

New submission from Dmitry Vasiliev <[EMAIL PROTECTED]>:

Example:

>>> from poplib import POP3
>>> p = POP3("localhost")
>>> p.user("user")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/py3k/Lib/poplib.py", line 179, in user
    return self._shortcmd('USER %s' % user)
  File "/py3k/Lib/poplib.py", line 151, in _shortcmd
    self._putcmd(line)
  File "/py3k/Lib/poplib.py", line 98, in _putcmd
    self._putline(line)
  File "/py3k/Lib/poplib.py", line 91, in _putline
    self.sock.sendall('%s%s' % (line, CRLF))
TypeError: sendall() argument 1 must be string or buffer, not str
>>> p.user(b"user")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/py3k/Lib/poplib.py", line 179, in user
    return self._shortcmd('USER %s' % user)
  File "/py3k/Lib/poplib.py", line 151, in _shortcmd
    self._putcmd(line)
  File "/py3k/Lib/poplib.py", line 98, in _putcmd
    self._putline(line)
  File "/py3k/Lib/poplib.py", line 91, in _putline
    self.sock.sendall('%s%s' % (line, CRLF))
TypeError: sendall() argument 1 must be string or buffer, not str

--
components: Library (Lib)
messages: 72136
nosy: hdima
severity: normal
status: open
title: poplib module broken by str to unicode conversion
type: crash
versions: Python 3.0

___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue3727>
___



[issue3728] imaplib module broken by str to unicode conversion

2008-08-29 Thread Dmitry Vasiliev

New submission from Dmitry Vasiliev <[EMAIL PROTECTED]>:

Example:

>>> from imaplib import IMAP4
>>> m = IMAP4("localhost")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/py3k/Lib/imaplib.py", line 185, in __init__
    self.welcome = self._get_response()
  File "/py3k/Lib/imaplib.py", line 912, in _get_response
    if self._match(self.tagre, resp):
  File "/py3k/Lib/imaplib.py", line 1021, in _match
    self.mo = cre.match(s)
TypeError: can't use a string pattern on a bytes-like object
>>> m = IMAP4(b"localhost")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/py3k/Lib/imaplib.py", line 185, in __init__
    self.welcome = self._get_response()
  File "/py3k/Lib/imaplib.py", line 912, in _get_response
    if self._match(self.tagre, resp):
  File "/py3k/Lib/imaplib.py", line 1021, in _match
    self.mo = cre.match(s)
TypeError: can't use a string pattern on a bytes-like object
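
The error comes from matching a str regular expression against bytes read 
from the socket. A minimal sketch of the bytes-pattern alternative is shown 
below; TAGGED and parse_tagged are hypothetical and only loosely modelled on 
an IMAP tagged response, not taken from imaplib.

```python
import re

# Hypothetical pattern, compiled from a bytes literal so it can match bytes.
TAGGED = re.compile(br"(?P<tag>[A-Z]+\d+) (?P<type>[A-Z]+) (?P<data>.*)")


def parse_tagged(line):
    """Split a tagged response line (bytes) into tag, type and data."""
    match = TAGGED.match(line)
    if match is None:
        return None
    return match.group("tag"), match.group("type"), match.group("data")


parsed = parse_tagged(b"A001 OK LOGIN completed")
```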

--
components: Library (Lib)
messages: 72137
nosy: hdima
severity: normal
status: open
title: imaplib module broken by str to unicode conversion
type: crash
versions: Python 3.0

___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue3728>
___



[issue3728] imaplib module broken by str to unicode conversion

2008-08-29 Thread Dmitry Vasiliev

Dmitry Vasiliev <[EMAIL PROTECTED]> added the comment:

Ah, yes.

___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue3728>
___



[issue3714] nntplib module broken by str to unicode conversion

2008-09-08 Thread Dmitry Vasiliev

Dmitry Vasiliev <[EMAIL PROTECTED]> added the comment:

Actually, RFC-977 said all characters must be in ASCII, but RFC-3977
changed the default character set to UTF-8. So I think UTF-8 must be the
default encoding, not Latin-1. Moreover, Latin-1 can silently hide the real
encoding, for example:

>>> u'\u0422\u0435\u0441\u0442'.encode("koi8-r").decode("latin1")
u'\xf4\xc5\xd3\xd4'

Additionally in the future it would be a good idea to look in the
article headers for article body encoding.
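
The same effect in Python 3 syntax, as a small sketch: decoding KOI8-R bytes 
as Latin-1 never fails and yields the wrong characters, whereas decoding them 
as UTF-8 raises an error that at least makes the mismatch visible. The 
variable names are illustrative only.

```python
text = "\u0422\u0435\u0441\u0442"   # the Cyrillic word from the example above
raw = text.encode("koi8-r")          # bytes as they might arrive from a server

as_latin1 = raw.decode("latin-1")    # always succeeds, but gives mojibake

try:
    raw.decode("utf-8")
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True               # the wrong guess is detected
```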

___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue3714>
___



[issue3714] nntplib module broken by str to unicode conversion

2008-09-08 Thread Dmitry Vasiliev

Dmitry Vasiliev <[EMAIL PROTECTED]> added the comment:

If I understand it correctly, there is no "character set used by server"
because every article can be in a different encoding. RFC-3977 says:

"""
The character set of article bodies SHOULD be indicated in the
article headers, and this SHOULD be done in accordance with MIME.
"""

But that's not always the case; for example, fido7.* groups are known to
use the "KOI-8R" encoding, but I didn't find any relevant headers.

___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue3714>
___



[issue3714] nntplib module broken by str to unicode conversion

2008-09-08 Thread Dmitry Vasiliev

Dmitry Vasiliev <[EMAIL PROTECTED]> added the comment:

RFC-3977 says the following about headers:

- The names of headers (e.g., "From" or "Subject") MUST be in
  US-ASCII.

- Header values SHOULD use US-ASCII or an encoding based on it, such
  as RFC 2047 [RFC2047], until such time as another approach has
  been standardised.  At present, 8-bit encodings (including UTF-8)
  SHOULD NOT be used because they are likely to cause
  interoperability problems.

But in practice, for now there is no way to reliably find a header's
encoding.
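
When a header value does follow RFC 2047, the charset travels with the value 
and can be recovered with the standard email package. The round trip below is 
a sketch under that assumption; it does not help with raw 8-bit headers, 
which are the problematic case described above.

```python
from email.header import Header, decode_header, make_header

original = "\u0422\u0435\u0441\u0442"

# Encode as an RFC 2047 "encoded word"; the result is plain ASCII and names
# the charset explicitly.
encoded = Header(original, charset="koi8-r").encode()

# Decoding recovers both the bytes and the declared charset.
decoded = str(make_header(decode_header(encoded)))
```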

___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue3714>
___



[issue3935] bisect insort C implementation ignores methods on list subclasses

2008-09-26 Thread Dmitry Vasiliev

Dmitry Vasiliev <[EMAIL PROTECTED]> added the comment:

Actually it was an optimization. PyList_Insert() was used for list and
list-derived objects.

I've attached a patch which fixes the issue, and to me the new code
looks even cleaner than the original code.
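
The behaviour in question can be checked with a list subclass that records 
calls to insert(). This is a sketch, and RecordingList is a hypothetical 
class. On an interpreter where insort() honours overridden methods, the call 
is recorded.

```python
import bisect


class RecordingList(list):
    """List subclass that records every call to insert()."""

    def __init__(self, *args):
        super().__init__(*args)
        self.insert_calls = []

    def insert(self, index, value):
        self.insert_calls.append((index, value))
        super().insert(index, value)


items = RecordingList([1, 3, 5])
bisect.insort(items, 4)
```

If insert_calls stays empty after the call, the C implementation bypassed 
the overridden method.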

--
keywords: +patch
nosy: +hdima
Added file: http://bugs.python.org/file11614/bisect.diff

___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue3935>
___



[issue3935] bisect insort C implementation ignores methods on list subclasses

2008-09-26 Thread Dmitry Vasiliev

Dmitry Vasiliev <[EMAIL PROTECTED]> added the comment:

Good idea! Don't know why I didn't use it in the very first version. :-)

New patch attached.

Added file: http://bugs.python.org/file11623/bisect2.patch

___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue3935>
___



[issue3714] nntplib module broken by str to unicode conversion

2008-10-14 Thread Dmitry Vasiliev

Dmitry Vasiliev <[EMAIL PROTECTED]> added the comment:

Oh, you need to read the comments first:

- Using ISO-8859-1 is a bad idea here. See msg72776 for details.
Moreover, RFC-3977 explicitly specifies UTF-8, so I think we need to use
UTF-8.
- Maybe set_encoding() isn't needed, but you need to be able to change the
encoding after object creation, because different groups can use different
encodings. With the makefile() addition you remove this possibility.

___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue3714>
___



[issue3725] telnetlib module broken by str to unicode conversion

2008-10-14 Thread Dmitry Vasiliev

Dmitry Vasiliev <[EMAIL PROTECTED]> added the comment:

The patch is good. It's exactly what I described in msg72132.

___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue3725>
___



[issue3890] ssl.SSLSocket.recv() implementation may not work with non-blocking sockets

2008-10-20 Thread Dmitry Dvoinikov

Changes by Dmitry Dvoinikov <[EMAIL PROTECTED]>:


--
nosy: +ddvoinikov

___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue3890>
___



[issue4171] SSL handshake fails after TCP connection in getpeername()

2008-10-22 Thread Dmitry Dvoinikov

New submission from Dmitry Dvoinikov <[EMAIL PROTECTED]>:

If I connect a TCP socket s using regular s.connect(), then wrap it
using ssl.wrap_socket(s) and call do_handshake on the resulting SSL
socket, handshake fails in ssl.py:320 with 

AttributeError: 'NoneType' object has no attribute 'do_handshake'

The problem is that when the TCP socket is wrapped in ssl.py:116, a call to
getpeername() does not recognize it as connected; the exception
thrown in ssl.py:116 and silenced is this:

[Errno 10057] A request to send or receive data was disallowed because
the socket is not connected and (when sending on a datagram socket using
a sendto call) no address was supplied

This is awkward, because the synchronous s.connect() has just returned
successfully. Even weirder, if I insert s.getpeername() between TCP
connect() and SSL do_handshake(), the latter works fine.

Here is a working sample:

---

from socket import socket, AF_INET, SOCK_STREAM
from ssl import wrap_socket, PROTOCOL_TLSv1, CERT_NONE

def test_handshake(address, WORKAROUND):

    s = socket(AF_INET, SOCK_STREAM)
    s.settimeout(3.0)
    s.connect(address)

    if WORKAROUND:
        s.getpeername()

    ssl = wrap_socket(s, server_side = False,
                      ssl_version = PROTOCOL_TLSv1,
                      cert_reqs = CERT_NONE,
                      do_handshake_on_connect = False)
    ssl.do_handshake()

address = ("www.amazon.com", 443)

test_handshake(address, True) # with workaround
print("worked so far")
test_handshake(address, False)
print("but not here it didn't")

---

I'm using Python 3.0rc1 under Windows.

--
components: Library (Lib)
messages: 75077
nosy: ddvoinikov
severity: normal
status: open
title: SSL handshake fails after TCP connection in getpeername()
type: behavior
versions: Python 3.0

___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue4171>
___



[issue4171] SSL handshake fails after TCP connection in getpeername()

2008-11-09 Thread Dmitry Dvoinikov

Dmitry Dvoinikov <[EMAIL PROTECTED]> added the comment:

1.py == test.py obviously :)

___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue4171>
___



[issue4171] SSL handshake fails after TCP connection in getpeername()

2008-11-09 Thread Dmitry Dvoinikov

Dmitry Dvoinikov <[EMAIL PROTECTED]> added the comment:

Same thing on Python 3.0rc2:

C:\TEMP>python test.py
worked so far
Traceback (most recent call last):
  File "1.py", line 23, in <module>
    test_handshake(address, False)
  File "1.py", line 17, in test_handshake
    ssl.do_handshake()
  File "C:\Python30\lib\ssl.py", line 327, in do_handshake
    self._sslobj.do_handshake()
AttributeError: 'NoneType' object has no attribute 'do_handshake'

___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue4171>
___



[issue4363] Make uuid module functions usable without ctypes

2008-11-20 Thread Dmitry Vasiliev

New submission from Dmitry Vasiliev <[EMAIL PROTECTED]>:

The attached patch removes the dependency on ctypes from the uuid.uuid1() and
uuid.uuid4() functions.
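
As an illustration of the idea only (this is not necessarily what the 
attached patch does), a random UUID can be built without ctypes from 
os.urandom(). The helper name random_uuid is hypothetical.

```python
import os
import uuid


def random_uuid():
    """Build a version 4 UUID from 16 random bytes, without ctypes."""
    # Passing version=4 makes the constructor set the version and variant
    # bits, so the raw random bytes do not need to be masked by hand.
    return uuid.UUID(bytes=os.urandom(16), version=4)


value = random_uuid()
```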

--
components: Library (Lib)
files: uuid.patch
keywords: patch
messages: 76107
nosy: hdima
severity: normal
status: open
title: Make uuid module functions usable without ctypes
type: behavior
versions: Python 2.6
Added file: http://bugs.python.org/file12071/uuid.patch

___
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue4363>
___



[issue17354] TypeError when running setup.py upload --show-response

2013-03-05 Thread Dmitry Shachnev

New submission from Dmitry Shachnev:

When running `python3 setup.py sdist upload --show-response`, one may get this 
exception:

Traceback (most recent call last):
  File "setup.py", line 47, in <module>
    requires=['dbus']
  File "/usr/lib/python3.2/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/usr/lib/python3.2/distutils/dist.py", line 917, in run_commands
    self.run_command(cmd)
  File "/usr/lib/python3.2/distutils/dist.py", line 936, in run_command
    cmd_obj.run()
  File "/usr/lib/python3.2/distutils/command/upload.py", line 66, in run
    self.upload_file(command, pyversion, filename)
  File "/usr/lib/python3.2/distutils/command/upload.py", line 201, in upload_file
    msg = '\n'.join(('-' * 75, r.read(), '-' * 75))
TypeError: sequence item 1: expected str instance, bytes found

This happens because r is a binary stream, so r.read() returns bytes. A trivial 
patch that fixes the problem is attached.
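
The general shape of such a fix, sketched under assumptions: the response 
body is decoded before it is joined with the str separators. The function 
name format_response, the UTF-8 default and the "replace" error handler are 
illustrative choices, not taken from the attached patch.

```python
import io


def format_response(stream, encoding="utf-8"):
    """Read a response body and frame it with separator lines."""
    body = stream.read()
    if isinstance(body, bytes):
        # Decode so that all items passed to str.join() are str.
        body = body.decode(encoding, "replace")
    return "\n".join(("-" * 75, body, "-" * 75))


message = format_response(io.BytesIO(b"Server response (200): OK"))
```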

--
components: Library (Lib)
files: distutils-decode-server-response.patch
keywords: patch
messages: 183517
nosy: mitya57
priority: normal
severity: normal
status: open
title: TypeError when running setup.py upload --show-response
versions: Python 3.3
Added file: 
http://bugs.python.org/file29312/distutils-decode-server-response.patch

___
Python tracker 
<http://bugs.python.org/issue17354>
___



[issue17640] from distutils.util import byte_compile hangs

2013-04-05 Thread Dmitry Sivachenko

New submission from Dmitry Sivachenko:

The following problem exists in python3.3.0 and python3.3.1rc1.

From the command line it works:

root@dhcp175-40-red:~ # python3
Python 3.3.1rc1 (default, Apr  5 2013, 18:03:56) 
[GCC 4.2.1 20070831 patched [FreeBSD]] on freebsd10
Type "help", "copyright", "credits" or "license" for more information.
>>> from distutils.util import byte_compile
>>> 


From a script it hangs:
root@dhcp175-40-red:~ # cat /tmp/comp.py 
from distutils.util import byte_compile
files = [
    '/usr/local/lib/python3.3/site-packages/yaml/__init__.py',]

byte_compile(files, optimize=1, force=None,
             prefix=None, base_dir=None,
             verbose=1, dry_run=0,
             direct=1)

# python3 /tmp/comp.py
--> Now it hangs forever, if I press Ctrl+D, I get:
Traceback (most recent call last):
  File "/tmp/comp.py", line 1, in <module>
    from distutils.util import byte_compile
  File "/usr/local/lib/python3.3/distutils/util.py", line 9, in <module>
    import imp
  File "/usr/local/lib/python3.3/imp.py", line 28, in <module>
    import tokenize
  File "/usr/local/lib/python3.3/tokenize.py", line 37, in <module>
    __all__ = token.__all__ + ["COMMENT", "tokenize", "detect_encoding",
AttributeError: 'module' object has no attribute '__all__'

--
assignee: eric.araujo
components: Distutils
messages: 186084
nosy: Dmitry.Sivachenko, eric.araujo, tarek
priority: normal
severity: normal
status: open
title: from distutils.util import byte_compile hangs
type: behavior
versions: Python 3.3

___
Python tracker 
<http://bugs.python.org/issue17640>
___



[issue17640] from distutils.util import byte_compile hangs

2013-04-13 Thread Dmitry Sivachenko

Dmitry Sivachenko added the comment:

No, I meant Ctrl+D.
I use FreeBSD-10.

--

___
Python tracker 
<http://bugs.python.org/issue17640>
___


