Re: [Python-Dev] openSSL and windows binaries - license

2006-08-08 Thread Martin v. Löwis
Greg Ewing wrote:
> If distributing the source doesn't violate the patent,
> and distributing a binary doesn't violate the patent,
> then what *would* constitute a violation of a software
> patent?

IANAL, but AFAICT, the rights controlled by patent law
are the right to make, to use, to sell, to offer to sell,
and to import.

In the context of an encryption algorithm, the right to
use would be the most prominent one; you wouldn't be
allowed to use the algorithm unless you have a patent
license. In general, the right to sell and to offer to
sell would be relevant for software as well, but not
so for free software (I assume).

Regards,
Martin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Weekly Python Patch/Bug Summary

2006-08-08 Thread Kurt B. Kaiser
Patch / Bug Summary
___

Patches :  402 open ( +6) /  3360 closed ( +6) /  3762 total (+12)
Bugs:  861 open ( -3) /  6114 closed (+27) /  6975 total (+24)
RFE :  228 open ( +2) /   234 closed ( +0) /   462 total ( +2)

New / Reopened Patches
__

Replace the ctypes internal '_as_parameter_' mechanism  (2006-08-02)
   http://python.org/sf/1532975  opened by  Thomas Heller

Remove mentions of "PyUnit" from unittest docs  (2006-08-02)
CLOSED http://python.org/sf/156  opened by  Collin Winter

Allow thread(ing) tests to pass without setting stack size  (2006-08-02)
   http://python.org/sf/1533520  opened by  Matt Fleming

Let timeit accept functions  (2006-08-03)
   http://python.org/sf/1533909  opened by  Erik Demaine

Add notes on locale module changes to whatsnew25.tex  (2006-08-03)
   http://python.org/sf/1534027  opened by  Iain Lowe

Typo in weakref error message  (2006-08-03)
CLOSED http://python.org/sf/1534048  opened by  Christopher Tur Lesniewski-Laas

Fix code generation bug in 'compiler' package  (2006-08-03)
CLOSED http://python.org/sf/1534084  opened by  Neil Schemenauer

Cleanup/error-correction for unittest's docs  (2006-08-05)
CLOSED http://python.org/sf/1534922  opened by  Collin Winter

writelines() in bz2 module does not raise check for errors  (2006-08-06)
   http://python.org/sf/1535500  opened by  Lawrence Oluyede

CGIHTTPServer doesn't handle path names with embeded space  (2006-08-06)
   http://python.org/sf/1535504  opened by  Hartmut Goebel

NNTPS support in nntplib  (2006-08-06)
   http://python.org/sf/1535659  opened by  Aurojit Panda

trace.py on win32 has problems with lowercase drive names  (2006-08-07)
   http://python.org/sf/1536071  opened by  Adam Groszer

Build ctypes on OpenBSD x86_64  (2006-08-08)
   http://python.org/sf/1536908  opened by  Thomas Heller

Patches Closed
__

New ver. of 1102879: Fix for 926423: socket timeouts  (2006-07-07)
   http://python.org/sf/1519025  closed by  nnorwitz

Remove mentions of "PyUnit" from unittest docs  (2006-08-02)
   http://python.org/sf/156  deleted by  collinwinter

Typo in weakref error message  (2006-08-03)
   http://python.org/sf/1534048  closed by  fdrake

Fix code generation bug in 'compiler' package  (2006-08-03)
   http://python.org/sf/1534084  closed by  nascheme

Cleanup/error-correction for unittest's docs  (2006-08-05)
   http://python.org/sf/1534922  closed by  gbrandl

New / Reopened Bugs
___

NetBSD build with --with-pydebug causes SIGSEGV  (2006-08-02)
   http://python.org/sf/1533105  opened by  Matt Fleming

the csv module writes files that Excel sees as SYLK files  (2006-08-01)
CLOSED http://python.org/sf/1532483  reopened by  madewokherd

Installed but not listed *.pyo break bdist_rpm  (2006-08-02)
   http://python.org/sf/1533164  opened by  Shmyrev Nick

CTypes _as_parameter_ not working as documented  (2006-08-02)
   http://python.org/sf/1533481  opened by  Shane Holloway

long -> Py_ssize_t (C/API 1.2.1)  (2006-08-02)
   http://python.org/sf/1533486  opened by  Jim Jewett

C/API sec 10 is clipped  (2006-08-02)
   http://python.org/sf/1533491  opened by  Jim Jewett

Tools/modulator does not exist (ext 1.4)  (2006-08-02)
   http://python.org/sf/1533493  opened by  Jim Jewett

botched html for index subheadings  (2004-05-26)
CLOSED http://python.org/sf/960860  reopened by  jimjjewett

__name__ doesn't show up in dir() of class  (2006-08-03)
   http://python.org/sf/1534014  opened by  Tim Chase

getlines() in linecache.py raises TypeError  (2006-08-04)
CLOSED http://python.org/sf/1534517  opened by  Stefan Behnel

Python 2.5 svn crash in _elementtree.c  (2006-08-04)
   http://python.org/sf/1534630  opened by  Barry A. Warsaw

Win32 debug version of _msi creates _msi.pyd, not _msi_d.pyd  (2006-08-04)
   http://python.org/sf/1534738  opened by  John Ehresman

sys.path gets munged with certain directory structures  (2006-08-04)
   http://python.org/sf/1534764  opened by  Gustavo Tabares

logging's fileConfig causes KeyError on shutdown  (2006-08-04)
   http://python.org/sf/1534765  opened by  mdbeachy

Identical floats print inconsistently  (2006-08-04)
CLOSED http://python.org/sf/1534769  opened by  Marc W. Abel

termios.c in qnx4.25  (2005-09-19)
   http://python.org/sf/1295179  reopened by  gbrandl

can't staticaly build modules md5 and sha  (2006-08-06)
CLOSED http://python.org/sf/1535081  opened by  kbob_ru

python segfaults when reading from closed stdin  (2006-08-05)
CLOSED http://python.org/sf/1535165  opened by  Patrick Mezard

typo in test_bz2.py  (2006-08-06)
CLOSED http://python.org/sf/1535182  opened by  Lawrence Oluyede

Python 2.5 windows builds should link hashlib with OpenSSL  (2006-08-06)
   http://python.org/sf/1535502  opened by  Gregory P. Smith

hash(method) sometimes raises OverflowError  (2006-08-07)
   htt

Re: [Python-Dev] openSSL and windows binaries - license

2006-08-08 Thread Greg Ewing
Martin v. Löwis wrote:
> I personally don't think there is a risk
> distributing the code (if there was, distribution of OpenSSL would also
> be a risk); anybody /using/ a patented algorithm would violate the
> patent.

If distributing the source doesn't violate the patent,
and distributing a binary doesn't violate the patent,
then what *would* constitute a violation of a software
patent?

Writing new code using the algorithm? Compiling
something which uses it?

--
Greg


Re: [Python-Dev] Dicts are broken Was: unicode hell/mixing str and unicode as dictionary keys

2006-08-08 Thread Greg Ewing
M.-A. Lemburg wrote:

> Hiding programmer errors is not making life easier in the
> long run, so I'm -1 on having the equality comparison return
> False.

I don't see how this is greatly different from, e.g.

   [1, 2] == (1, 2)

returning False. Comparing things of different types
may or may not indicate a bug in the code as well,
but we don't seem to worry that it doesn't raise an
exception.
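
A minimal sketch of that point, assuming a current Python 3 interpreter rather than the 2.5 under discussion. The helper name same_items is made up for illustration; it shows what a caller has to do when a content comparison is actually wanted:

```python
# Comparing a list with a tuple never raises; the two simply compare unequal.
def same_items(a, b):
    """Compare two sequences by content, ignoring their container type."""
    return list(a) == list(b)

print([1, 2] == (1, 2))             # False, and no exception
print(same_items([1, 2], (1, 2)))   # True once both sides share a type
```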

--
Greg


Re: [Python-Dev] 2.5 status

2006-08-08 Thread Tim Peters
[Georg Brandl, on
 http://python.org/sf/1523610 - PyArg_ParseTupleAndKeywords
potential core dump
]
> This one's almost fixed if we can decide what to do with "levels".
> I wrote some time ago:
>
> """
> With respect to this bug (which is about stack issues in Python/getargs.c
> involving misuse of the "levels" array), I think that we can drop the
> "levels" thing completely. It's only there to tell the user which exact item
> passed as part of a tuple argument cannot be accepted (and only if that
> function is implemented in C code). As tuple arguments
> are very rare "argument x" should be enough to tell the user that
> something's wrong with that tuple.
> """

More, the problem that remains is purely "a head bug":  nobody ever
bumped into it, and the only way to provoke it is to write C (calling,
e.g., PyArg_ParseTupleAndKeywords) nesting tuple codes in an argument
descriptor string to an absurd depth.  This is far from serious --
heck, it's far from even interesting <0.5 wink>.

I suggest closing this bug, getting it out of PEP 356, and opening a
new low-priority bug report against it.  Fine by me if the `levels`
convolutions went away entirely, but at this stage in the 2.5 release
process I expect that's a bad idea:  there is no /actual/ real-life
bug here, and getting rid of `levels` seems far more likely (because
of the amount of code touched) to introduce a new error than that
someone will nest tuple argument descriptors 32 deep.  If someone's
actually afraid of that, fine, s/32/320/g in getargs.c instead :-)


Re: [Python-Dev] openSSL and windows binaries - license

2006-08-08 Thread Gregory P. Smith
On Tue, Aug 08, 2006 at 04:54:44PM -0400, Jim Jewett wrote:
> On 8/8/06, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> > Jim Jewett wrote:
> > > The OpenSSL library implements some algorithms that are patented.  The
> > > source code should be fine to (re)distribute, but there may be a
> > > slight legal risk with distributing a binary.
> 
> > I don't want to change the build process in that way (i.e. dropping a
> > feature) just before a release.
> 
> OK, but this does argue against making the fast version available by
> default on windows.  :{

disabling/enabling a cipher in openssl that isn't commonly used and
isn't even directly exposed via any API to a python user hardly sounds
like dropping a feature to me.  it'll make your _ssl.pyd smaller if
anything at all.  (any sane SSL connection will negotiate AES or 3DES
as its cipher; IDEA isn't required)

If the release manager declares, "absolutely no changes to the windows
build process!"  Then clearly none of the changes I submitted will
make it in and neither would removing any hint of IDEA in 2.5 as
they're both too late.

> The 2.5c1 windows binary does not ship with _hashlib, so IDEA is only
> available if someone else has compiled it.

IDEA is a cipher not a hash algorithm.  it won't appear in _hashlib.
the code is probably already linked and present in _ssl.pyd even if
the ssl protocol itself doesn't allow that as a cipher.
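
A quick way to check the first claim on any given build is to ask hashlib what it offers. This is only a sketch, assuming a modern Python 3 where hashlib.algorithms_available exists; the helper name is_hash_algorithm is hypothetical:

```python
import hashlib

def is_hash_algorithm(name):
    """Return True if hashlib on this build offers `name` as a digest."""
    available = {algo.lower() for algo in hashlib.algorithms_available}
    return name.lower() in available

print(is_hash_algorithm("sha256"))  # a digest, so hashlib knows it
print(is_hash_algorithm("idea"))    # a cipher, so hashlib does not list it
```

This says nothing about whether the cipher code is linked into the SSL extension module; it only shows that ciphers are not exposed through the hashing interface.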

> But for a binary release, I think that IDEA should be added to the
> Configure exclude.
> http://svn.python.org/view/external/openssl-0.9.8a/Configure
> 
> # All of the following is disabled by default (RC5 was enabled before 0.9.8):
> 
> my %disabled = ( # "what"         => "comment"
>                  "gmp"            => "default",
> +                "idea"           => "default",
>                  "mdc2"           => "default",
>                  "rc5"            => "default",
>                  "shared"         => "default",
>                  "zlib"           => "default",
>                  "zlib-dynamic"   => "default"
>                );

yeah i'd just do that if you're worried about the code being in the
binary causing a problem.



Re: [Python-Dev] returning longs from __hash__()

2006-08-08 Thread Martin v. Löwis
Tim Peters wrote:
> It sounds fine to me, except I'm not immediately clear on which code
> needs to be changed.

My change would essentially be the code below, in instance_hash and
slot_tp_hash; I have yet to add test cases and check for documentation
changes.

Regards,
Martin

Index: Objects/typeobject.c
===
--- Objects/typeobject.c (revision 51155)
+++ Objects/typeobject.c (working copy)
@@ -4559,7 +4559,10 @@
Py_DECREF(func);
if (res == NULL)
return -1;
-   h = PyInt_AsLong(res);
+   if (PyLong_Check(res))
+   h = res->ob_type->tp_hash(res);
+   else
+   h = PyInt_AsLong(res);
Py_DECREF(res);
}
else {
Index: Objects/classobject.c
===
--- Objects/classobject.c (revision 51155)
+++ Objects/classobject.c (working copy)
@@ -934,11 +934,9 @@
Py_DECREF(func);
if (res == NULL)
return -1;
-   if (PyInt_Check(res)) {
-   outcome = PyInt_AsLong(res);
-   if (outcome == -1)
-   outcome = -2;
-   }
+   if (PyInt_Check(res) || PyLong_Check(res))
+   /* This already converts a -1 result to -2. */
+   outcome = res->ob_type->tp_hash(res);
else {
PyErr_SetString(PyExc_TypeError,
"__hash__() should return an int");


Re: [Python-Dev] openSSL and windows binaries - license

2006-08-08 Thread Jim Jewett
On 8/8/06, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> Jim Jewett wrote:
> > The OpenSSL library implements some algorithms that are patented.  The
> > source code should be fine to (re)distribute, but there may be a
> > slight legal risk with distributing a binary.

> I don't want to change the build process in that way (i.e. dropping a
> feature) just before a release.

OK, but this does argue against making the fast version available by
default on windows.  :{

The 2.5c1 windows binary does not ship with _hashlib, so IDEA is only
available if someone else has compiled it.

The problem is that if the binary is recompiled as part of the python
build (as is typical on unix), then a default build will make IDEA
available.

> I personally don't think there is a risk
> distributing the code (if there was, distribution of OpenSSL would also
> be a risk); anybody /using/ a patented algorithm would violate the

There is at least some legal precedent for treating source code
differently from binaries.  (http://jya.com/bernstein-9th.htm)  The
source code is a communication or description (which would not violate
the patent), while the binary is an implementation (which would).

Note that openssl.org do not distribute binaries themselves, and at
least some vendors (such as Red Hat) have excluded the questionable
algorithms from their binaries.

> If you think it's necessary, a note could be added to the readme. Would
> you like to create a patch?

I believe the openSSL readme
http://svn.python.org/view/external/openssl-0.9.8a/README
is sufficient for a source release.

But for a binary release, I think that IDEA should be added to the
Configure exclude.
http://svn.python.org/view/external/openssl-0.9.8a/Configure

# All of the following is disabled by default (RC5 was enabled before 0.9.8):

my %disabled = ( # "what"         => "comment"
                 "gmp"            => "default",
+                "idea"           => "default",
                 "mdc2"           => "default",
                 "rc5"            => "default",
                 "shared"         => "default",
                 "zlib"           => "default",
                 "zlib-dynamic"   => "default"
               );

-jJ


Re: [Python-Dev] returning longs from __hash__()

2006-08-08 Thread Tim Peters
[Armin]
>> Maybe the user should just be able to return any integer value from a
>> custom __hash__() without having to worry about not exceeding
>> sys.maxint.
>>
>> After all the returned value has no real meaning.  If a long is returned
>> we could take its hash again, and use that number internally.

[Martin]
> Nick Coghlan already suggested that, when __hash__ returns a long int,
> the tp_hash of long int should be used to compute the true hash value.
>
> Could you see any problems with that approach? If not, and if I don't
> hear other objections, I'd like to go ahead and fix it that way.

It sounds fine to me, except I'm not immediately clear on which code
needs to be changed.  The internal _Py_HashPointer() already does
exactly this (return the hash of a Python long) when PyObject_Hash()
decides to hash an address on a SIZEOF_LONG < SIZEOF_VOID_P box ...
but on a SIZEOF_LONG == SIZEOF_VOID_P box, _Py_HashPointer() may still
return a negative C long.  I /hope/ that a class that decides to add

def __hash__(self):
 return id(self)

will end up using the same hash code internally as when that
supposedly do-nothing-different definition doesn't exist.

Note that a while back I changed several custom __hash__ methods in
Python's test suite to stop returning id(self) (as a result of tests
failing after the "make id() non-negative" change).  That's why we
haven't seen such complaints from the buildbots recently.  I expect
that few Python programmers realize it was never legit for __hash__()
to return id(self), and that it's not worth forcing them to learn that
now ;-)
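
For reference, here is a small sketch of how the proposed behaviour looks from Python code. It assumes a current CPython 3 interpreter, where an oversized __hash__() result is rehashed with the integer's own hash and -1 is mapped to -2; the two classes are made up for illustration:

```python
import sys

class BigHash:
    """Hypothetical class whose __hash__ returns a value wider than a C long."""
    def __hash__(self):
        return 2 ** 100

class MinusOne:
    """Hypothetical class returning -1, which is reserved for errors at the C level."""
    def __hash__(self):
        return -1

h = hash(BigHash())
print(-sys.maxsize - 1 <= h <= sys.maxsize)  # the result fits the native hash type
print(h == hash(2 ** 100))                   # it is the hash of the returned integer
print(hash(MinusOne()))                      # -1 is reported as -2
```

Other implementations may reduce oversized values differently, so only the range guarantee should be relied upon.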


Re: [Python-Dev] Releasemanager, please approve #1532975

2006-08-08 Thread Martin v. Löwis
Thomas Heller wrote:
> Thomas Heller wrote:
>> Approval requested for patch:
>> http://python.org/sf/1532975
>>
> 
> What does the silence mean?  Should I go ahead and commit this patch?

If it's not there already, you should add it to the PEP. If you think
it is "release-critical" (i.e. it would be really embarrassing for
Python 2.5 if that change wasn't made), you should put it into the
appropriate section.

Regards,
Martin


Re: [Python-Dev] Releasemanager, please approve #1532975

2006-08-08 Thread Thomas Heller
Thomas Heller wrote:
> Approval requested for patch:
> http://python.org/sf/1532975
> 

What does the silence mean?  Should I go ahead and commit this patch?

Thanks,
Thomas



Re: [Python-Dev] Dicts are broken Was: unicode hell/mixing str and unicode as dictionary keys

2006-08-08 Thread skip

Martin> Programmers make all kinds of mistakes when comparing objects,
Martin> assuming that things ought to be equal that actually aren't:

py> 1.6/math.pi*math.pi == 1.6
False

By extension, perhaps Computer Science departments should begin offering
Unicode Analysis as an advanced undergraduate class. ;-)

(Sorry, couldn't resist...)

Skip


Re: [Python-Dev] openSSL and windows binaries - license

2006-08-08 Thread Martin v. Löwis
Jim Jewett wrote:
> The OpenSSL library implements some algorithms that are patented.  The
> source code should be fine to (re)distribute, but there may be a
> slight legal risk with distributing a binary.

I don't want to change the build process in that way (i.e. dropping a
feature) just before a release. I personally don't think there is a risk
distributing the code (if there was, distribution of OpenSSL would also
be a risk); anybody /using/ a patented algorithm would violate the
patent. I am not a lawyer, so if you want to know for sure, ask one.

If you think it's necessary, a note could be added to the readme. Would
you like to create a patch?

Regards,
Martin


Re: [Python-Dev] returning longs from __hash__()

2006-08-08 Thread Martin v. Löwis
Armin Rigo wrote:
> Maybe the user should just be able to return any integer value from a
> custom __hash__() without having to worry about not exceeding
> sys.maxint.
> 
> After all the returned value has no real meaning.  If a long is returned
> we could take its hash again, and use that number internally.

Nick Coghlan already suggested that, when __hash__ returns a long int,
the tp_hash of long int should be used to compute the true hash value.

Could you see any problems with that approach? If not, and if I don't
hear other objections, I'd like to go ahead and fix it that way.

Regards,
Martin


Re: [Python-Dev] Dicts are broken Was: unicode hell/mixing str and unicode as dictionary keys

2006-08-08 Thread Martin v. Löwis
M.-A. Lemburg wrote:
> If the programmer writes:
> 
> x = 'äöü'
> y = u'äöü'
> ...
> if x == y:
> do_something()
> 
> then he clearly has had the intention to compare two character
> strings.

Programmers make all kinds of mistakes when comparing objects,
assuming that things ought to be equal that actually aren't:

py> 1.6/math.pi*math.pi == 1.6
False
py> if 10*10 is 100:
...   print "yes"
... else:
...   print "no"
...
no

> Now, if what you were saying were true, then the above would
> simply continue to work without raising an exception, possibly
> causing the application to return wrong results.

That's correct. It is a programming mistake, hence you get a wrong
result. However, you cannot assume that every comparison between
a string and a Unicode object is always a programming mistake.
You must not raise exceptions just because of a *potential*
programming mistake; that's what PyChecker is there for.
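
A sketch of the same class of mistake on a current Python 3 interpreter. It uses 0.1 + 0.2 instead of the math.pi expression above because that result is the same on every IEEE 754 platform, and it leaves out the "is" example because integer caching differs between interpreter versions:

```python
import math

total = 0.1 + 0.2

print(total == 0.3)              # False: the two floats differ in the last bit
print(math.isclose(total, 0.3))  # True: compare with a tolerance instead

# Use == for values; "is" tests object identity, and whether two equal
# integers happen to be the same object is an interpreter detail.
left = int("1000")
right = int("1000")
print(left == right)             # True
```

None of these comparisons raises, even though the first one is very likely a programming mistake.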

> Note that we are not discussing changing the behavior of the
> __eq__ comparison between strings and Unicode, since this has
> always been to raise exceptions in case the automatic propagation
> fails.

Not sure what you are discussing: This is *precisely* what I'm
discussing. Making that change would solve this problem.

> The discussion is about silencing exceptions in the dict lookup
> mechanism - something which used to happen and now no longer
> is done.

No, that's not what the discussion is about. The discussion
is about the backwards incompatibility in Python 2.5 wrt.
Python 2.4. There are several ways to solve that; silencing
the exception is just one way.

I think it is the wrong way, as I think that
string-unicode-comparison should have a consistent behaviour
no matter where the comparison occurs.

> Since this behavior is an implementation detail of the
> dictionary implementation, users perceive this change as random
> exceptions occurring in their application.

That key comparison occurs is *not* an implementation detail.
It is a necessary and documented aspect of the dictionary
lookup.
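
To make that concrete, here is a sketch (Python 3, with a made-up key class) showing that a dictionary lookup really does call __eq__ when the hashes collide, and that an exception raised there reaches the caller:

```python
class Fussy:
    """Hypothetical key type: all instances hash alike and __eq__ refuses to answer."""
    def __hash__(self):
        return 1

    def __eq__(self, other):
        raise ValueError("comparison not allowed")

# Inserting the first key needs no comparison, so this succeeds.
table = {Fussy(): "stored"}

def lookup(mapping, key):
    """Look up `key`, reporting any exception raised by the key comparison."""
    try:
        return mapping[key]
    except ValueError as exc:
        return "lookup raised: %s" % exc

# Same hash, different object: the dict must call __eq__, which raises.
print(lookup(table, Fussy()))
```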

> I've suggested to go about this in a slightly more user-friendly
> way, namely by giving a warning instead of raising an exception
> in Python 2.5 and then going for the exception in Python 2.6.

Yes, and I have suggested to make it even more user-friendly
by defining string-unicode-__eq__ in a sensible manner. It
is more user-friendly, because it doesn't show the inconsistency
Michael Hudson documented in

http://mail.python.org/pipermail/python-dev/2006-August/067981.html

Regards,
Martin



[Python-Dev] openSSL and windows binaries - license

2006-08-08 Thread Jim Jewett
The OpenSSL library implements some algorithms that are patented.  The
source code should be fine to (re)distribute, but there may be a
slight legal risk with distributing a binary.

Note that http://www.openssl.org/support/faq.html#LEGAL1 says that we
can avoid building the problem sections with

./config no-idea no-mdc2 no-rc5

As best I could tell from

(search for %disabled in)
http://svn.python.org/view/external/openssl-0.9.8a/Configure
(or search for OPTIONS in)
http://svn.python.org/view/external/openssl-0.9.8a/Makefile

python just takes the OpenSSL defaults, which excludes RC5 and MDC2
but does build IDEA.  The documentation does not promise any of these
three, and it doesn't look like they're used internally or advertised,
but they are available if built.  It might be safer to explicitly
exclude IDEA from the binary distribution.

(Well, unless the PSF actually has an appropriate license, which is possible.

 http://svn.python.org/view/external/openssl-0.9.8a/README

says where to get them.)

-jJ


Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-08 Thread David Hopwood
Martin v. Löwis wrote:
> David Hopwood wrote:
>>Michael Foord wrote:
>>>David Hopwood wrote:[snip..]
>>>
>>we should, of course, continue to use the one we always used (for
>>"ascii", there is no difference between the two).
>
>+1
>
>This seems the most (only ?) logical solution.

No; always considering Unicode and non-ASCII byte strings to be distinct
is just as logical.
>>
>>I think you must have misread my comment:
> 
> Indeed. The misunderstanding originates from your sentence starting with
> "no", when, in fact, you seem to be supporting the proposal I made.

I had misunderstood what the existing Python behaviour is. I now think
the current behaviour (which uses "B.decode(system_encoding) == U") is
definitely a bad idea, especially in cases where the system encoding is
not US-ASCII, but I agree that it can't be changed for 2.5.
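
The alternative discussed in this thread (decode explicitly, and treat undecodable bytes as simply unequal) can be sketched as follows. This is written for Python 3, where bytes and str are separate types; the function name bytes_equal_text and its default encoding are assumptions for illustration only:

```python
def bytes_equal_text(raw, text, encoding="ascii"):
    """Compare bytes with text by decoding explicitly.

    Bytes that cannot be decoded with `encoding` count as unequal
    instead of raising.
    """
    try:
        return raw.decode(encoding) == text
    except UnicodeDecodeError:
        return False

umlauts = "\u00e4\u00f6\u00fc"  # the text 'äöü'

print(bytes_equal_text(b"abc", "abc"))                               # True
print(bytes_equal_text(b"\xe4\xf6\xfc", umlauts))                    # False: not ASCII
print(bytes_equal_text(b"\xe4\xf6\xfc", umlauts, encoding="latin-1"))  # True
```

The last two calls show why depending on a system-wide default is fragile: the same bytes compare equal or unequal to the same text depending only on the encoding chosen.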

-- 
David Hopwood <[EMAIL PROTECTED]>





Re: [Python-Dev] Release manager pronouncement needed: PEP 302 Fix

2006-08-08 Thread Jean-Paul Calderone
On Tue, 08 Aug 2006 12:04:16 -0400, "Phillip J. Eby" <[EMAIL PROTECTED]> wrote:
>
>PEP 302 doesn't need to be changed, since Python now conforms to it again. 
>That is, every object in sys.path_importer_cache is either an importer or 
>None.  It's just that there is an additional type of importer there that 
>didn't occur before.  If you have code that expects only certain types of 
>importers to be there, that code was already broken.
>

Thanks.

Jean-Paul


Re: [Python-Dev] Release manager pronouncement needed: PEP 302 Fix

2006-08-08 Thread Phillip J. Eby
At 11:11 AM 8/8/2006 -0400, Jean-Paul Calderone wrote:
>On Fri, 28 Jul 2006 18:00:36 -0400, "Phillip J. Eby" 
><[EMAIL PROTECTED]> wrote:
>>At 10:55 PM 7/28/2006 +0200, Martin v. Löwis wrote:
>>>Phillip J. Eby wrote:
>>> > The issue is that a proper fix that caches existence requires adding new
>>> > types to import.c and thus might appear to be more of a feature.  I was
>>> > therefore reluctant to embark upon the work without some assurance 
>>> that it
>>> > wouldn't be rejected as adding a last-minute feature.
>>>
>>>So do you have a patch, or are going to write one?
>>
>>Yes, it's checked in as r50916.
>>
>>It ultimately turned out to be simpler than I thought; only one new type
>>(imp.NullImporter) was required.
>
>Is this going to be the final state of PEP 302 support in Python 2.5?  I
>don't particularly care how this ends up, but I'd like to know what has
>been decided on (PEP 302 doesn't seem to have been updated yet) so I can
>fix Twisted's test suite (which cannot even be run with Python 2.5b3
>right now).

PEP 302 doesn't need to be changed, since Python now conforms to it 
again.  That is, every object in sys.path_importer_cache is either an 
importer or None.  It's just that there is an additional type of importer 
there that didn't occur before.  If you have code that expects only certain 
types of importers to be there, that code was already broken.
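
Code that inspects the cache can stay independent of the concrete importer types. A rough sketch for a current Python 3 interpreter (where imp.NullImporter no longer exists and None marks a path entry without an importer); the function name summarize_importer_cache is hypothetical:

```python
import sys
from collections import Counter

def summarize_importer_cache():
    """Count sys.path_importer_cache values by type name.

    No particular importer type is assumed; None means that no
    importer handles that path entry.
    """
    return Counter(
        "None" if finder is None else type(finder).__name__
        for finder in sys.path_importer_cache.values()
    )

for kind, count in sorted(summarize_importer_cache().items()):
    print(kind, count)
```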



Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-08 Thread M.-A. Lemburg
Armin Rigo wrote:
> Hi,
> 
> On Thu, Aug 03, 2006 at 07:53:11PM +0200, M.-A. Lemburg wrote:
>>> I thought I'd heard (from Guido here or on the py3k list) that it was only
>>> 1 < u'abc' that would raise an exception, and that 1 == u'abc' would still
>>> evaluate to False.  Did I misunderstand?
>> Could be that I'm wrong.
> 
> I also seem to remember that TypeErrors should only signal ordering
> nonsense, not equality.  In this case, I'm of the opinion that unicode
> objects and completely unrelated strings of random bytes should
> successfully compare as unequal, but I'm not enough of a unicode user to
> be sure.

Agreed - for Py3k where strings no longer exist and Unicode is
the only text type.

In Python 2.x the situation is a little different, since strings
are still very often used as container for text data.
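
For comparison, this is how the Py3k rule mentioned above looks on a current Python 3 interpreter: equality between unrelated types quietly answers False, and only ordering raises. The helper try_less_than is made up for this sketch:

```python
def try_less_than(a, b):
    """Return a < b, or the string 'TypeError' if the operands cannot be ordered."""
    try:
        return a < b
    except TypeError:
        return "TypeError"

print(1 == "abc")               # False: unrelated types are just unequal
print(try_less_than(1, "abc"))  # 'TypeError': ordering them makes no sense
print(try_less_than(1, 2))      # True
```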

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Aug 08 2006)
>>> Python/Zope Consulting and Support ...http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/


::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! 


Re: [Python-Dev] Release manager pronouncement needed: PEP 302 Fix

2006-08-08 Thread Jean-Paul Calderone
On Fri, 28 Jul 2006 18:00:36 -0400, "Phillip J. Eby" <[EMAIL PROTECTED]> wrote:
>At 10:55 PM 7/28/2006 +0200, Martin v. Löwis wrote:
>>Phillip J. Eby wrote:
>> > The issue is that a proper fix that caches existence requires adding new
>> > types to import.c and thus might appear to be more of a feature.  I was
>> > therefore reluctant to embark upon the work without some assurance that it
>> > wouldn't be rejected as adding a last-minute feature.
>>
>>So do you have a patch, or are going to write one?
>
>Yes, it's checked in as r50916.
>
>It ultimately turned out to be simpler than I thought; only one new type
>(imp.NullImporter) was required.
>

Is this going to be the final state of PEP 302 support in Python 2.5?  I
don't particularly care how this ends up, but I'd like to know what has
been decided on (PEP 302 doesn't seem to have been updated yet) so I can
fix Twisted's test suite (which cannot even be run with Python 2.5b3
right now).

Jean-Paul


Re: [Python-Dev] Dicts are broken Was: unicode hell/mixing str and unicode as dictionary keys

2006-08-08 Thread M.-A. Lemburg
Martin v. Löwis wrote:
> M.-A. Lemburg wrote:
>> Failure to decode a string doesn't imply inequality.
> 
> If the failure is "these bytes don't have a meaningful character
> interpretation", then the bytes are *clearly* not equal to
> some character string.
>
>> It implies
>> that the programmer needs to step in and correct the problem by
>> making an explicit and conscious decision.
> 
> There is no problem to correct. The strings *are* unequal.

If the programmer writes:

x = 'äöü'
y = u'äöü'
...
if x == y:
do_something()

then he clearly has had the intention to compare two character
strings.

Now, if what you were saying were true, then the above would
simply continue to work without raising an exception, possibly
causing the application to return wrong results.

With the exception, the programmer will have a chance to correct
the problem (in this case, probably a forgotten u-prefix) and also
be safe in not having the application produce wrong data -
something that's usually hard to detect, debug and, more
importantly, can have effects which are a lot worse than
a failing application.

Note that we are not discussing changing the behavior of the
__eq__ comparison between strings and Unicode, since this has
always been to raise exceptions in case the automatic propagation
fails.

The discussion is about silencing exceptions in the dict lookup
mechanism - something which used to happen and now no longer
is done.
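
The mechanism behind this can be sketched as follows. A dict lookup only
calls __eq__ when the looked-up key has the same hash as a stored key but
is not the identical object, so whatever __eq__ raises now reaches the
caller. This is a minimal sketch for a current Python 3 interpreter, not
2.5; the class Touchy and the helper lookup() are hypothetical names used
only for illustration.

```python
class Touchy(object):
    """Hypothetical key type: every instance hashes alike but refuses to compare."""

    def __init__(self, label):
        self.label = label

    def __hash__(self):
        # Identical hash for all instances, so a lookup must fall back to __eq__.
        return 42

    def __eq__(self, other):
        # Stand-in for a comparison that fails, e.g. a decoding error.
        raise ValueError("comparison failed for %s" % self.label)


table = {Touchy("a"): 1}


def lookup(key):
    """Return the stored value, or a message if the comparison raised."""
    try:
        return table[key]
    except ValueError as exc:
        return "lookup raised: %s" % exc


print(lookup(Touchy("b")))
```

Looking up the very same object that was stored does not raise, because
the identity check succeeds before __eq__ is ever called.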

Since this behavior is an implementation detail of the
dictionary implementation, users perceive this change as random
exceptions occurring in their application.

While these exceptions do hint at programming errors (the main
reason for no longer silencing them), the particular case in
the dict implementation requires some extra thought.

I've suggested going about this in a slightly more user-friendly
way, namely by giving a warning instead of raising an exception
in Python 2.5 and then going for the exception in Python 2.6.
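
A rough sketch of what "warn instead of raise" could look like, written
for Python 3 where bytes and str stand in for the 2.x str and unicode
types. The function name compare_with_warning and the message text are
assumptions made for illustration; this is not the code that went into
the interpreter.

```python
import warnings


def compare_with_warning(text, raw, encoding="ascii"):
    """Compare a character string with a byte string.

    If the bytes cannot be decoded, emit a warning and treat the two
    values as unequal instead of raising.
    """
    try:
        return raw.decode(encoding) == text
    except UnicodeDecodeError:
        warnings.warn(
            "byte string could not be decoded; treating operands as unequal",
            UnicodeWarning,
            stacklevel=2,
        )
        return False


print(compare_with_warning("abc", b"abc"))
```

With the default warning filters the message is shown once per call
site, so the programmer still gets a hint about the forgotten u-prefix
without the application failing outright.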

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] should i put this on the bug tracker ?

2006-08-08 Thread Bart Thate


On Tue, 8 Aug 2006, Hye-Shik Chang wrote:

sorry i should reply a little better ;]

> On 8/8/06, Bart Thate <[EMAIL PROTECTED]> wrote:
> > hello python-dev,
> >
> > the following code hangs on FreeBSD 6.1-STABLE,
> > Python 2.5b3 (r25b3:51041, Aug 5 2006, 20:46:57)
> >
>
> Python 2.5 now uses system scope threads on FreeBSD, just like
> on other platforms.  So Python may behave differently in corner cases.
>

is there a way i can compile python with the old settings ?

> [snip]
> >
> > works fine on python2.4
> >
> > is this a bug or is it something i "should not do" ?
>
> On my machine (FreeBSD 6.1), 2.4 and 2.5 behave the same.
> What problem did you see when running it?  Did you install
> it from the port?

in python2.5 i cannot start threads after the os.execl,
python2.4 lets me reboot my app and has no problem with
starting threads after that.


> BTW, calling os.execl() from a sub-thread does not look safe.

i know it has problems but since it was working fine
i thought it was ok.

>
>
> Hye-Shik

Bart


Re: [Python-Dev] should i put this on the bug tracker ?

2006-08-08 Thread Bart Thate


On Tue, 8 Aug 2006, Hye-Shik Chang wrote:

> On my machine (FreeBSD 6.1), 2.4 and 2.5 behave the same.
> What problem did you see when running it?  Did you install
> it from the port?

i installed it from the python-devel port

Bart


Re: [Python-Dev] should i put this on the bug tracker ?

2006-08-08 Thread Hye-Shik Chang
On 8/8/06, Bart Thate <[EMAIL PROTECTED]> wrote:
> hello python-dev,
>
> the following code hangs on FreeBSD 6.1-STABLE,
> Python 2.5b3 (r25b3:51041, Aug 5 2006, 20:46:57)
>

Python 2.5 now uses system scope threads on FreeBSD, just like
on other platforms.  So Python may behave differently in corner cases.

[snip]
>
> works fine on python2.4
>
> is this a bug or is it something i "should not do" ?

On my machine (FreeBSD 6.1), 2.4 and 2.5 behave the same.
What problem did you see when running it?  Did you install
it from the port?

BTW, calling os.execl() from a sub-thread does not look safe.
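
One way to avoid calling exec from a sub-thread is to let the worker only
signal a restart and have the main thread perform the exec once the
worker has finished. The following is a minimal sketch for a current
Python 3 interpreter, not a tested fix for the FreeBSD hang; the names
restart_requested, worker and main_loop, and the exec_fn parameter (there
so the exec call can be replaced in a dry run), are assumptions.

```python
import os
import sys
import threading

# Set by a worker thread when the application should re-exec itself.
restart_requested = threading.Event()


def worker():
    """Hypothetical worker: it only asks for a restart, it never calls exec."""
    restart_requested.set()


def main_loop(exec_fn=os.execv, poll_seconds=0.1):
    """Run the worker, wait for a restart request, then exec from the main thread."""
    thread = threading.Thread(target=worker)
    thread.start()
    while not restart_requested.wait(poll_seconds):
        pass
    thread.join()  # no other threads are running when exec happens
    exec_fn(sys.executable, [sys.executable] + sys.argv)
```

Calling main_loop() with the default exec_fn replaces the running
process, so for a dry run pass a function that just records its
arguments.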


Hye-Shik


[Python-Dev] should i put this on the bug tracker ?

2006-08-08 Thread Bart Thate

hello python-dev,

the following code hangs on FreeBSD 6.1-STABLE,
Python 2.5b3 (r25b3:51041, Aug 5 2006, 20:46:57)

$ cat pythontest
#!/usr/local/bin/python2.5

import os, thread, time, sys

def reboot():
    print 'reboot'
    os.execl(sys.argv[0], sys.argv[0])

thread.start_new_thread(reboot, ())

while 1:
    time.sleep(1)


works fine on python2.4

is this a bug or is it something i "should not do" ?

Bart


Re: [Python-Dev] free(): invalid pointer

2006-08-08 Thread Ralf Schmitt
Ralf Schmitt wrote:
> 
> 
> Sorry for not using the bugtracker (sf sucks). Did you guys already 
> settle on a new one?
> 

And sorry for bothering this list. It seems like this problem is related 
to the python cdb module.

- Ralf



[Python-Dev] returning longs from __hash__()

2006-08-08 Thread Armin Rigo
Hi all,

The 2.5 change of id() to return positive ints-or-longs is likely to
cause quite some breakage in user programs that erroneously implemented
custom __hash__() functions returning a value based on an id().  This
was discussed a few times already but it showed up again as a bug report
(#1536021).  Of course it has always been documented that id() is not
directly suitable as the return value of a custom __hash__(), but so far
it worked on 32-bit machines so people have been doing it all the time.

Maybe the user should just be able to return any integer value from a
custom __hash__() without having to worry about not exceeding
sys.maxint.

After all the returned value has no real meaning.  If a long is returned
we could take its hash again, and use that number internally.
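
The "take its hash again" idea can be sketched in pure Python. This is an
illustration for a current Python 3 interpreter, where sys.maxsize stands
in for the 2.x sys.maxint; the helper fold_hash and the class Node are
hypothetical names, and the real change would live in the C slot that
calls __hash__.

```python
import sys


def fold_hash(value):
    """Fold an arbitrary integer into the native hash range.

    Values that already fit are returned unchanged; anything larger is
    hashed once more, which always yields a native-sized integer.
    """
    if -sys.maxsize - 1 <= value <= sys.maxsize:
        return value
    return hash(value)


class Node(object):
    """Hypothetical class whose hash is derived from an oversized integer."""

    def __init__(self, big_number):
        self._big_number = big_number

    def __hash__(self):
        return fold_hash(self._big_number)


node = Node(2 ** 100 + 12345)
registry = {node: "found"}
print(registry[node])
```

The folded value carries no meaning of its own; it only has to be stable
for the lifetime of the object, which is all a dict requires.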


See you soon,

Armin


[Python-Dev] free(): invalid pointer

2006-08-08 Thread Ralf Schmitt
Hi all,

I've got another error porting our apps. It's a django app
and stops with (when pressing CTRL-C):

*** glibc detected *** free(): invalid pointer: 0xb684c650 ***



with MALLOC_CHECK_=1 and gdb I get the following backtrace:

Program received signal SIGINT, Interrupt.
[Switching to Thread -1209690432 (LWP 10036)]
0xe410 in __kernel_vsyscall ()
(gdb) *** glibc detected *** free(): invalid pointer: 0xb66a22a8 ***

(gdb) bt
#0  0xe410 in __kernel_vsyscall ()
#1  0xb7fbd463 in __waitpid_nocancel () from 
/lib/tls/i686/cmov/libpthread.so.0
#2  0x080ed524 in posix_waitpid (self=0x0, args=0xfe00)
 at ./Modules/posixmodule.c:5502
#3  0x080bec34 in PyEval_EvalFrameEx (f=0x825a3b4, throwflag=0)
 at Python/ceval.c:3565
#4  0x080be459 in PyEval_EvalFrameEx (f=0x81bd60c, throwflag=0)
 at Python/ceval.c:3650
#5  0x080be459 in PyEval_EvalFrameEx (f=0x82354f4, throwflag=0)
 at Python/ceval.c:3650
#6  0x080be459 in PyEval_EvalFrameEx (f=0x81dbb94, throwflag=0)
 at Python/ceval.c:3650
#7  0x080bff75 in PyEval_EvalCodeEx (co=0xb7c91c38, globals=0xb7a3713c,
 locals=0x0, args=0x822fcd0, argcount=1, kws=0x822fcd4, kwcount=0,
 defs=0xb7a33058, defcount=2, closure=0x0) at Python/ceval.c:2832
#8  0x080bd771 in PyEval_EvalFrameEx (f=0x822fb6c, throwflag=0)
 at Python/ceval.c:3661
#9  0x080bff75 in PyEval_EvalCodeEx (co=0xb7dfeba8, globals=0xb7dfd714,
 locals=0x0, args=0x820165c, argcount=2, kws=0x8201664, kwcount=0,
 defs=0x0, defcount=0, closure=0x0) at Python/ceval.c:2832
#10 0x080bd771 in PyEval_EvalFrameEx (f=0x82014cc, throwflag=0)
 at Python/ceval.c:3661
#11 0x080bff75 in PyEval_EvalCodeEx (co=0xb7dfede8, globals=0xb7dfd714,
 locals=0x0, args=0x81ec784, argcount=1, kws=0x81ec788, kwcount=0,
 defs=0xb7cfcdb8, defcount=1, closure=0x0) at Python/ceval.c:2832
#12 0x080bd771 in PyEval_EvalFrameEx (f=0x81ec634, throwflag=0)
 at Python/ceval.c:3661
#13 0x080be459 in PyEval_EvalFrameEx (f=0x81ae054, throwflag=0)
 at Python/ceval.c:3650
#14 0x080bff75 in PyEval_EvalCodeEx (co=0xb7de2770, globals=0xb7e319bc,
 locals=0xb7e319bc, args=0x0, argcount=0, kws=0x0, kwcount=0, defs=0x0,
 defcount=0, closure=0x0) at Python/ceval.c:2832
---Type <return> to continue, or q <return> to quit---
#15 0x080c00f6 in PyEval_EvalCode (co=0xfe00, globals=0xfe00,
 locals=0xfe00) at Python/ceval.c:494
#16 0x080de682 in PyRun_FileExFlags (fp=0x815f008,
 filename=0xbfdf18b1 "manage.py", start=-512, globals=0xfe00,
 locals=0xfe00, closeit=1, flags=0xbfdefbb8) at 
Python/pythonrun.c:1255
#17 0x080de9f3 in PyRun_SimpleFileExFlags (fp=,
 filename=0xbfdf18b1 "manage.py", closeit=1, flags=0xbfdefbb8)
 at Python/pythonrun.c:861
#18 0x08056a69 in Py_Main (argc=2, argv=0xbfdefc54) at Modules/main.c:496
#19 0xb7e6fea2 in __libc_start_main () from /lib/tls/i686/cmov/libc.so.6
#20 0x08055fa1 in _start () at ../sysdeps/i386/elf/start.S:119


I'm using Python 2.5b3 (trunk:51066M, Aug  3 2006, 16:55:04).

Sorry for not using the bugtracker (sf sucks). Did you guys already 
settle on a new one?

- Ralf



Re: [Python-Dev] windows 2.5 build: use OpenSSL for hashlib [bug 1535502]

2006-08-08 Thread Gregory P. Smith
On Tue, Aug 08, 2006 at 08:26:08AM +0200, "Martin v. Löwis" wrote:
> Gregory P. Smith wrote:
> > Widely deployed popular applications use python for both large scale
> > hashing and ssl communications.
> 
> Yet, nobody has worried about performance in all these years to notice
> that the assembler code isn't being used. So it can't be that bad.
> For SSL specifically, the usage of hashing is minimal, as the actual
> communication uses symmetric encryption.

OpenSSL uses x86 asm to speed up other things besides hashes.  For SSL
sockets each new connection requires an RSA or DSA public key
operation for the symmetric key exchange.  Huge speedup there:

2ghz pentium-m with C ssl:
                   sign      verify    sign/s  verify/s
 rsa 1024 bits  0.007576s  0.000368s   132.0    2715.4
 dsa 1024 bits  0.003655s  0.004409s   273.6     226.8

2ghz pentium-m with x86 asm ssl:
                   sign      verify    sign/s  verify/s
 rsa 1024 bits  0.003410s  0.000171s   293.3    5843.5
 dsa 1024 bits  0.001632s  0.001987s   612.9     503.4

[data comes from running "out32\openssl speed" in openssl 0.9.8b]
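
To put a number on the speedup, the operations-per-second columns can be
divided directly. A small sketch; the figures are copied from the
"openssl speed" output quoted in this message, and the dictionary names
are chosen here for illustration.

```python
# Operations per second for 1024-bit keys, C build versus x86 asm build.
c_build = {
    "rsa sign": 132.0,
    "rsa verify": 2715.4,
    "dsa sign": 273.6,
    "dsa verify": 226.8,
}
asm_build = {
    "rsa sign": 293.3,
    "rsa verify": 5843.5,
    "dsa sign": 612.9,
    "dsa verify": 503.4,
}

# Ratio of asm throughput to C throughput for each operation.
speedup = {op: asm_build[op] / c_build[op] for op in c_build}

for op in sorted(speedup):
    print("%-10s %.2fx" % (op, speedup[op]))
```

All four ratios come out a little above 2x, so the asm build roughly
doubles the public key throughput on this machine.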



Re: [Python-Dev] Dicts are broken Was: unicode hell/mixing str and unicode as dictionary keys

2006-08-08 Thread Ralf Schmitt
Martin v. Löwis wrote:
> M.-A. Lemburg wrote:
>> Hiding programmer errors is not making life easier in the
>> long run, so I'm -1 on having the equality comparison return
>> False.
> 
> There is no error to hide here. The objects are inequal, period.

And in the case of dicts it hides errors randomly...

> 
>> Instead we should generate a warning in Python 2.5 and introduce
>> the exception in Python 2.6.
> 
> A warning about what? That you can't put byte string and Unicode
> strings into the same dictionary (as keys)? Next we start not allowing
> to put numbers and strings into the same dictionary, because there
> is no conversion defined between them?

A warning that an exception has been ignored while adding a key to a
dict, I guess. I'd keep those dict changes; this is where real
programmer errors are hidden.

>> I disagree with this part.
>>
>> Failure to decode a string doesn't imply inequality.
> 
> If the failure is "these bytes don't have a meaningful character
> interpretation", then the bytes are *clearly* not equal to
> some character string.

One could also think of a "magic encoding", which decodes non-ascii 
strings to None, making them compare unequal to any unicode string.



Re: [Python-Dev] windows 2.5 build: use OpenSSL for hashlib [bug 1535502]

2006-08-08 Thread Gregory P. Smith

I have supplied a patch that does everything needed to both make the
windows build process build OpenSSL with x86 assembly optimizations on
Win32 and to build the _hashlib.pyd module.  It works for me.

The only thing the patch doesn't do is add _hashlib.pyd to the .msi
windows installer because I want to go to bed and haven't looked into
how that works.

 
http://sourceforge.net/tracker/index.php?func=detail&aid=1535502&group_id=5470&atid=105470

I know Anthony suggested "no" earlier (which is why I haven't
committed it) but I really think that should be reconsidered.
This doesn't change python at all, only fixes a build process.

Ship with optimized code by default!  help delay the heat death of the
universe from cpu fans. :)

either way, enjoy.

> So is it worth my time doing this in a hurry for 2.5 or do other
> people really just not care if python for windows uses a slow OpenSSL?
> 
> Widely deployed popular applications use python for both large scale
> hashing and ssl communications.
> 
> If no, can this go in 2.5.1?  Its not an API change.
> 
> -g
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Dicts are broken Was: unicode hell/mixing str and unicode as dictionary keys

2006-08-08 Thread Martin v. Löwis
M.-A. Lemburg wrote:
> Hiding programmer errors is not making life easier in the
> long run, so I'm -1 on having the equality comparison return
> False.

There is no error to hide here. The objects are inequal, period.

> Instead we should generate a warning in Python 2.5 and introduce
> the exception in Python 2.6.

A warning about what? That you can't put byte string and Unicode
strings into the same dictionary (as keys)? Next we start not allowing
to put numbers and strings into the same dictionary, because there
is no conversion defined between them?

> In the above example, you clearly know that the two are
> unequal due to the relationship between complex numbers
> having an imaginary part and integers..

Right. And so I do when the byte string does not convert to
Unicode.

> However, this is not the case for 8-bit string vs. Unicode,
> since you cannot use such extra knowledge if you find that ASCII
> encoding assumption obviously doesn't match the string
> in question.

It's not the question "Could there be a conversion under which
they are equal?" If you ask that question, then

py> "3"==3
False

should raise an exception, because there exists a conversion under
which these objects are equal:

py> int("3")==3
True

It's just that, under the conversion Python applies, the byte
string and the Unicode string are not equal.

> Note that Python always coerces to the "bigger" type. As a result,
> the second option is what is actually implemented in Python.
[which is decode-to-unicode]

It might be debatable which of the types is the "bigger" type. It's
not that byte strings are a true subset of Unicode strings, under
some conversion, since there are byte strings which have no Unicode
equivalent (because they are not characters, and don't convert under
the encoding), and there are Unicode strings that have no byte string
equivalent.

For example, if the system encoding is UTF-8, then byte string is
the bigger type (all Unicode strings convert to byte strings, but
not all byte strings convert to Unicode strings).

However, this is a red herring: Python has, for whatever reason,
chosen to convert byte->unicode, and nobody is questioning that
choice.

> I disagree with this part.
> 
> Failure to decode a string doesn't imply inequality.

If the failure is "these bytes don't have a meaningful character
interpretation", then the bytes are *clearly* not equal to
some character string.

> It implies
> that the programmer needs to step in and correct the problem by
> making an explicit and conscious decision.

There is no problem to correct. The strings *are* inequal.

> The alternative would be to decide that equal comparisons should never
> be allowed to raise exceptions and instead have the equal comparison
> return False.

There are many reasons why comparison could raise an exception.
It could be out of memory, it could be that there is an
internal/programming error in the codec being used, it could be
that the codec is not found (likewise for other comparisons).

However, if the codec is working properly, and clearly determines
that the byte string has no character string equivalent, then
it can't be equal to some character (unicode) string.
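
The distinction drawn here can be sketched as a comparison that maps only
the "not decodable" failure to False and lets every other error through.
This is an illustration for Python 3, with bytes and str standing in for
the 2.x str and unicode types; the function name text_equals_bytes is an
assumption, not the interpreter's actual code.

```python
def text_equals_bytes(text, raw, encoding="ascii"):
    """Return True only if the bytes decode to exactly the given text.

    A UnicodeDecodeError means the bytes have no character interpretation
    under this encoding, so they cannot equal any character string.
    Other failures (for example an unknown codec) still propagate.
    """
    try:
        decoded = raw.decode(encoding)
    except UnicodeDecodeError:
        return False
    return decoded == text


print(text_equals_bytes("abc", b"abc"))
print(text_equals_bytes("\u00e4\u00f6\u00fc", b"\xe4\xf6\xfc"))
```

A missing codec still raises LookupError here, which matches the point
that some comparison failures are genuine errors and should stay visible.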

Regards,
Martin


Re: [Python-Dev] Dicts are broken Was: unicode hell/mixing str and unicode as dictionary keys

2006-08-08 Thread M.-A. Lemburg
Martin v. Löwis wrote:
> M.-A. Lemburg wrote:
>> Python just doesn't know the encoding of the 8-bit string, so can't
>> make any assumptions on it. As a result, it raises an exception to inform
>> the programmer.
> 
> Oh, Python does make an assumption what the encoding is: it assumes
> it is the system encoding (i.e. "ascii"). Then invoking the ascii
> codec raises an exception, because the string clearly isn't ascii.

Right, and as a consequence, Python raises an exception to let the
programmer correct the problem.

The subsequent solution to the problem may result in the
string being decoded into Unicode and the two resulting Unicode
objects being unequal, or it may also result in them being equal.
Python doesn't have this knowledge, so always returning false
is clearly wrong.

Hiding programmer errors is not making life easier in the
long run, so I'm -1 on having the equality comparison return
False.

Instead we should generate a warning in Python 2.5 and introduce
the exception in Python 2.6.

>> Note that you do have to interpret the string as characters
>> if you compare it to Unicode and there's nothing wrong with
>> that.
> 
> Consider this:
> py> int(3+4j)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> TypeError: can't convert complex to int; use int(abs(z))
> py> 3 == 3+4j
> False
>
> So even though the conversion raises an exception, the
> values are determined to be not equal. Again, because int
> is a nearly true subset of complex, the conversion goes
> the other way, but *if* it would use the complex->int
> conversion, then the TypeError should be taken as
> a guarantee that the objects don't compare equal.

In the above example, you clearly know that the two are
unequal due to the relationship between complex numbers
having an imaginary part and integers..

The same is true for the overflow case:

>>> 2**10000 == 1.23
False
>>> float(2**10000)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OverflowError: long int too large to convert to float

(Note that in Python 2.3 this used to raise an exception as well.)

However, this is not the case for 8-bit string vs. Unicode,
since you cannot use such extra knowledge if you find that ASCII
encoding assumption obviously doesn't match the string
in question.

> Expanding this view to Unicode should mean that a unicode
> string U equals a byte string B if
> U.encode(system_encode) == B or B.decode(system_encoding) == U,
> and that they don't equal otherwise 

Agreed.

Note that Python always coerces to the "bigger" type. As a result,
the second option is what is actually implemented in Python.

> (e.g. if the conversion
> fails with a "not convertible" exception). 

I disagree with this part.

Failure to decode a string doesn't imply inequality. It implies
that the programmer needs to step in and correct the problem by
making an explicit and conscious decision.

The alternative would be to decide that equal comparisons should never
be allowed to raise exceptions and instead have the equal comparison
return False. In which case, we'd have to revert the dict patch
altogether and instead silence all exceptions that
are generated during the equal comparison (not only in the dict
implementation), replacing them with a False return value.

-- 
Marc-Andre Lemburg
eGenix.com
