[issue1263] PEP 3137 patch - str8/str comparison should return false

2007-10-22 Thread Brett Cannon

Brett Cannon added the comment:

Everything committed in r58596.  Thanks for the help, Thomas!

--
resolution:  -> accepted
status: open -> closed

__
Tracker <[EMAIL PROTECTED]>

__
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1263] PEP 3137 patch - str8/str comparison should return false

2007-10-22 Thread Brett Cannon

Changes by Brett Cannon:


--
keywords: +py3k
priority:  -> immediate
type: rfe -> behavior




[issue1263] PEP 3137 patch - str8/str comparison should return false

2007-10-22 Thread Brett Cannon

Brett Cannon added the comment:

Attached is a fix for test_struct.

All of the string tests now assume str8 is returned when arguments of
bytes, str8 or str are given for the various string formats.

All tests now pass.  Re-assigning to myself to check everything in when
it isn't so late at night.  =)

--
assignee: gvanrossum -> brett.cannon
Added file: http://bugs.python.org/file8590/fix_test_struct.diff




[issue1263] PEP 3137 patch - str8/str comparison should return false

2007-10-21 Thread Guido van Rossum

Guido van Rossum added the comment:

> Guido, what do you want to do about the struct module for the various
> string formats (i.e., c, s, p)?  Should they return str8 or Unicode?

Oh, tough call. I think they should return str8 (i.e. bytes after the
rename) because the encoding isn't known, even though this will break
more code: I'm sure there's lots of code out there that assumes they
return "text".
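For readers on a released Python 3, where str8 has become bytes, the outcome of this decision can be checked directly. The snippet below is only an illustrative sketch using the standard struct module; the variable names are made up for the example.

```python
import struct

# Round-trip one value through each of the string-like format codes.
char_value = struct.unpack("c", struct.pack("c", b"x"))[0]        # single char
fixed_value = struct.unpack("3s", struct.pack("3s", b"abc"))[0]   # fixed-size string
pascal_value = struct.unpack("4p", struct.pack("4p", b"abc"))[0]  # length-prefixed string

# All three come back as bytes; turning them into text is an explicit
# step because struct has no way to know the encoding.
text = fixed_value.decode("ascii")
```

Code that previously assumed these formats return text needs an explicit decode() with a known encoding, which is the breakage the comment above anticipates.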




[issue1263] PEP 3137 patch - str8/str comparison should return false

2007-10-20 Thread Brett Cannon

Brett Cannon added the comment:

Attached is a fix for test_subprocess.

Simply had to change a call to str8() to str().

I am going to run the test suite, but that should leave only test_struct
failing and that can be fixed as soon as Guido makes a call on whether
str8 or str should be used for the string formats.

Added file: http://bugs.python.org/file8582/fix_test_subprocess.diff




[issue1263] PEP 3137 patch - str8/str comparison should return false

2007-10-20 Thread Brett Cannon

Brett Cannon added the comment:

Guido, what do you want to do about the struct module for the various
string formats (i.e., c, s, p)?  Should they return str8 or Unicode?




[issue1263] PEP 3137 patch - str8/str comparison should return false

2007-10-20 Thread Brett Cannon

Brett Cannon added the comment:

Attached is a patch to fix test_str.

Basically there were a bunch of redundant tests for making sure that
calling str() on an object called its __str__ method.  str8 is no longer
directly relevant since it is no longer an actual string type.

Added file: http://bugs.python.org/file8581/fix_test_str.diff




[issue1263] PEP 3137 patch - str8/str comparison should return false

2007-10-20 Thread Brett Cannon

Brett Cannon added the comment:

Attached is a fix for sqlite3.

The first issue was that the dictionary used to store converters had
Unicode keys on the Python side, but the C code was comparing them
against str8.

The second issue was that when an object was serialized using
__conform__ and a Unicode object was returned, it was unserialized as a
str8 no matter what type had been returned.  That makes the most sense
if only a single type is going to be returned, so I left it as such and
fixed the test to decode the str8 as UTF-8 when it is passed to __init__.
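The round trip described above can be sketched with the sqlite3 module as it exists in released Python 3, where the converter receives bytes. The Point class, the "point" type name, and the "x;y" text format are hypothetical choices for the example, not taken from the patch or the test suite.

```python
import sqlite3

raw_types = []  # records what type the converter is handed


class Point:
    """Hypothetical value type used only for this illustration."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __conform__(self, protocol):
        # Serialize to text when sqlite3 asks for an adapted value.
        if protocol is sqlite3.PrepareProtocol:
            return f"{self.x};{self.y}"


def convert_point(raw):
    # The stored value arrives as bytes, so decode it explicitly.
    raw_types.append(type(raw))
    x, y = raw.decode("utf-8").split(";")
    return Point(float(x), float(y))


sqlite3.register_converter("point", convert_point)

con = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)
con.execute("CREATE TABLE t (p point)")
con.execute("INSERT INTO t (p) VALUES (?)", (Point(1.0, 2.5),))
restored = con.execute("SELECT p FROM t").fetchone()[0]
con.close()
```

The converter always gets the same type regardless of what __conform__ returned, so the decode step lives in one place.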

Added file: http://bugs.python.org/file8580/sqlite_fix.diff




[issue1263] PEP 3137 patch - str8/str comparison should return false

2007-10-20 Thread Brett Cannon

Brett Cannon added the comment:

Attached is a fix for modulefinder.

It is an ugly hack: modulefinder took the numeric opcodes from dis and
passed them to chr().  That no longer works since chr() returns Unicode,
so I took those single characters and passed them to str8().

Once the str8() constructor matches bytes(), the chr() call can be
ditched and the dis values can be passed in as a single-item list.
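In a released Python 3 the cleanup described above looks roughly like the following. This is a sketch of the idea rather than the modulefinder code itself, and RETURN_VALUE is just an arbitrary opcode picked for the example.

```python
import dis

opcode = dis.opmap["RETURN_VALUE"]  # a small integer opcode number

as_text = chr(opcode)       # str (Unicode): cannot be matched against bytecode
as_bytes = bytes([opcode])  # single-item list -> one-byte bytes object
```

Only the bytes form can be searched for inside a code object's co_code, which is itself bytes.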

Added file: http://bugs.python.org/file8579/fix_modulefinder.diff




[issue1263] PEP 3137 patch - str8/str comparison should return false

2007-10-19 Thread Brett Cannon

Brett Cannon added the comment:

Attached a fix for test_format.

It was testing string interpolation on both str8 and str and using a str
for the comparison.  Since string interpolation is going away for str8
once it becomes bytes, I just removed the str8 tests.

The failures I know of left are:
test_modulefinder
test_sqlite
test_str
test_struct
test_subprocess

Added file: http://bugs.python.org/file8574/fix_test_format.diff




[issue1263] PEP 3137 patch - str8/str comparison should return false

2007-10-19 Thread Brett Cannon

Brett Cannon added the comment:

Attached is a patch to fix test_compile.  Simple fix of turning an empty
string into ``str8('')``.

Added file: http://bugs.python.org/file8573/fix_test_compile.diff




[issue1263] PEP 3137 patch - str8/str comparison should return false

2007-10-19 Thread Brett Cannon

Brett Cannon added the comment:

The file I just uploaded is unicode-string-eq-false-all-r4.patch with
the codecs.c and structmember.c parts of the patch removed.

--
nosy: +brett.cannon
Added file: http://bugs.python.org/file8572/r4-revised.patch




[issue1263] PEP 3137 patch - str8/str comparison should return false

2007-10-19 Thread Brett Cannon

Changes by Brett Cannon:


Removed file: http://bugs.python.org/file8543/unnamed




[issue1263] PEP 3137 patch - str8/str comparison should return false

2007-10-19 Thread Guido van Rossum

Guido van Rossum added the comment:

I've committed the half of this patch that doesn't break any tests: the
changes to codecs.c and structmember.c.

Committed revision 58551.

I'm seeking help getting the remaining unit tests to pass. (Thanks
Thomas for the enumeration of the test failures!)




[issue1263] PEP 3137 patch - str8/str comparison should return false

2007-10-15 Thread Guido van Rossum

Guido van Rossum added the comment:

I'll look at this at some point. One quick comment: the lnotab and bytecode
should use PyString, which will be 'bytes' in 3.0a2. They must be immutable
because code objects must be immutable (it must not be possible to modify an
existing code object).
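The immutability requirement can be observed from Python code in a released Python 3, where the immutable type is spelled bytes and the mutable one bytearray. The sketch below is illustrative only; it also suggests why swapping in the mutable type, as reported in the quoted message, is likely to end in a hashing TypeError.

```python
def sample():
    return 1


code = sample.__code__
bytecode = code.co_code  # immutable bytes object

# Attempting to swap the bytecode on an existing code object is rejected.
try:
    code.co_code = b""
    replaced = True
except (AttributeError, TypeError):
    replaced = False

# Immutable bytes are hashable; the mutable counterpart is not.
bytes_hashable = isinstance(hash(bytecode), int)
try:
    hash(bytearray(bytecode))
    bytearray_hashable = True
except TypeError:
    bytearray_hashable = False
```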

On 10/15/07, Thomas Lee <[EMAIL PROTECTED]> wrote:
>
>
> Thomas Lee added the comment:
>
> Hack to make Python/codecs.c use Unicode strings internally. I recognize
> the way I have fixed it here is probably not ideal (basically ripped out
> PyString_*, replaced with a PyMem_Malloc/PyMem_Free call) but it fixes
> 10-12 tests that were failing with my earlier changes. If anybody can
> recommend a "nice" way to fix this, I'm all ears.
>
> The following still fail for me with this patch applied:
>
> -- test_compile
>
> This is due to PyString_*/PyUnicode_*/PyBytes_* confusion in the
> assembler struct (specifically: a_lnotab and a_bytecode) in
> Python/compile.c - tried replacing PyString_* calls with PyBytes_*
> calls, but this raises a TypeError because PyBytes is not hashable ...
> not sure what exactly is causing this.
>
> -- test_ctypes
> Looks like a simple case of ctypes using str8 instead of str. Appears to
> be an easy fix.
>
> -- test_modulefinder
> Failing because str8 >= str is now an invalid operation
>
> -- test_set
> This test needs some love.
>
> -- test_sqlite
> Not sure what's going on here.
>
> -- test_str
>
> This one is a little tricky: str8/str with __str__/__unicode__ ... how
> is this test supposed to behave with the fix in this patch?
>
> -- test_struct
> "unpack/pack not transitive" - what does that mean?
>
> -- test_subprocess
> Like modulefinder, this is probably just due to the use of str8 over str
> internally in the subprocess module. Likely to be an easy fix.
>
> The following tests fail for me irrespective of whether or not I have r4
> of my patch applied:
>
> -- test_doctest
> -- test_email
> -- test_nis
> -- test_pty
>
> If anybody can give this new patch a try and let me know the result it
> would be much appreciated.



[issue1263] PEP 3137 patch - str8/str comparison should return false

2007-10-15 Thread Thomas Lee

Thomas Lee added the comment:

Hack to make Python/codecs.c use Unicode strings internally. I recognize
the way I have fixed it here is probably not ideal (basically ripped out
PyString_*, replaced with a PyMem_Malloc/PyMem_Free call) but it fixes
10-12 tests that were failing with my earlier changes. If anybody can
recommend a "nice" way to fix this, I'm all ears.
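As a rough Python rendering of what the loop in the patched normalizestring() appears to do, the sketch below lowercases each character and maps spaces to hyphens, then returns a Unicode string. The space-to-hyphen step sits in lines that the diff hunk elides, so it is an assumption here, and the function name is made up for the example. This mirrors the patch as posted, not necessarily what current CPython does.

```python
def normalize_encoding_name(name: str) -> str:
    """Sketch of the patched C loop for ASCII encoding names."""
    chars = []
    for ch in name:
        if ch == " ":
            chars.append("-")         # assumed: hidden between the diff hunks
        else:
            chars.append(ch.lower())  # tolower() in the C code
    return "".join(chars)             # a str, i.e. Unicode, not str8


normalized = normalize_encoding_name("ISO 8859 1")
```

Returning a Unicode string matters because the result is used as a dictionary key in the codec search cache, and keys of different string types no longer compare equal.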

The following still fail for me with this patch applied:

-- test_compile
This is due to PyString_*/PyUnicode_*/PyBytes_* confusion in the
assembler struct (specifically: a_lnotab and a_bytecode) in
Python/compile.c - tried replacing PyString_* calls with PyBytes_*
calls, but this raises a TypeError because PyBytes is not hashable ...
not sure what exactly is causing this.

-- test_ctypes
Looks like a simple case of ctypes using str8 instead of str. Appears to
be an easy fix.

-- test_modulefinder
Failing because str8 >= str is now an invalid operation.

-- test_set
This test needs some love.

-- test_sqlite
Not sure what's going on here.

-- test_str
This one is a little tricky: str8/str with __str__/__unicode__ ... how
is this test supposed to behave with the fix in this patch?

-- test_struct
"unpack/pack not transitive" - what does that mean?

-- test_subprocess
Like modulefinder, this is probably just due to the use of str8 over str
internally in the subprocess module. Likely to be an easy fix.

The following tests fail for me irrespective of whether or not I have r4
of my patch applied:

-- test_doctest
-- test_email
-- test_nis
-- test_pty

If anybody can give this new patch a try and let me know the result it
would be much appreciated.

Index: Python/codecs.c
===
--- Python/codecs.c	(revision 58468)
+++ Python/codecs.c	(working copy)
@@ -55,16 +55,15 @@
 size_t len = strlen(string);
 char *p;
 PyObject *v;
-
+
 if (len > PY_SSIZE_T_MAX) {
 	PyErr_SetString(PyExc_OverflowError, "string is too large");
 	return NULL;
 }
 	
-v = PyString_FromStringAndSize(NULL, len);
-if (v == NULL)
-	return NULL;
-p = PyString_AS_STRING(v);
+p = PyMem_Malloc(len + 1);
+if (p == NULL)
+return NULL;
 for (i = 0; i < len; i++) {
 register char ch = string[i];
 if (ch == ' ')
@@ -73,6 +72,11 @@
 ch = tolower(Py_CHARMASK(ch));
 	p[i] = ch;
 }
+p[i] = '\0';
+v = PyUnicode_FromString(p);
+if (v == NULL)
+return NULL;
+PyMem_Free(p);
 return v;
 }
 
@@ -112,7 +116,7 @@
 v = normalizestring(encoding);
 if (v == NULL)
 	goto onError;
-PyString_InternInPlace(&v);
+PyUnicode_InternInPlace(&v);
 
 /* First, try to lookup the name in the registry dictionary */
 result = PyDict_GetItem(interp->codec_search_cache, v);
@@ -193,7 +197,7 @@
 if (errors) {
 	PyObject *v;
 	
-	v = PyString_FromString(errors);
+	v = PyUnicode_FromString(errors);
 	if (v == NULL) {
 	Py_DECREF(args);
 	return NULL;
Index: Python/structmember.c
===
--- Python/structmember.c	(revision 58468)
+++ Python/structmember.c	(working copy)
@@ -51,13 +51,13 @@
 			v = Py_None;
 		}
 		else
-			v = PyString_FromString(*(char**)addr);
+			v = PyUnicode_FromString(*(char**)addr);
 		break;
 	case T_STRING_INPLACE:
-		v = PyString_FromString((char*)addr);
+		v = PyUnicode_FromString((char*)addr);
 		break;
 	case T_CHAR:
-		v = PyString_FromStringAndSize((char*)addr, 1);
+		v = PyUnicode_FromStringAndSize((char*)addr, 1);
 		break;
 	case T_OBJECT:
 		v = *(PyObject **)addr;
@@ -225,8 +225,8 @@
 		Py_XDECREF(oldv);
 		break;
 	case T_CHAR:
-		if (PyString_Check(v) && PyString_Size(v) == 1) {
-			*(char*)addr = PyString_AsString(v)[0];
+		if (PyUnicode_Check(v) && PyUnicode_GetSize(v) == 1) {
+			*(char*)addr = PyUnicode_AsString(v)[0];
 		}
 		else {
 			PyErr_BadArgument();
Index: Objects/unicodeobject.c
===
--- Objects/unicodeobject.c	(revision 58468)
+++ Objects/unicodeobject.c	(working copy)
@@ -6224,16 +6224,6 @@
 if (PyUnicode_Check(left) && PyUnicode_Check(right))
 return unicode_compare((PyUnicodeObject *)left,
(PyUnicodeObject *)right);
-if ((PyString_Check(left) && PyUnicode_Check(right)) ||
-(PyUnicode_Check(left) && PyString_Check(right))) {
-if (PyUnicode_Check(left))
-left = _PyUnicode_AsDefaultEncodedString(left, NULL);
-if (PyUnicode_Check(right))
-right = _PyUnicode_AsDefaultEncodedString(right, NULL);
-assert(PyString_Check(left));
-assert(PyString_Check(right));
-return PyObject_Compare(left, right);
-}
 PyErr_Format(PyExc_TypeError,
  "Can't compare %.100s and %.100s",
 

[issue1263] PEP 3137 patch - str8/str comparison should return false

2007-10-11 Thread Martin v. Löwis

Changes by Martin v. Löwis:


--
keywords: +patch




[issue1263] PEP 3137 patch - str8/str comparison should return false

2007-10-11 Thread Guido van Rossum

Changes by Guido van Rossum:


--
assignee:  -> gvanrossum
nosy: +gvanrossum




[issue1263] PEP 3137 patch - str8/str comparison should return false

2007-10-11 Thread Thomas Lee

Thomas Lee added the comment:

Oops - use unicode-string-eq-false-r3.patch, not
unicode-string-eq-false-r2.patch.

Index: Objects/unicodeobject.c
===
--- Objects/unicodeobject.c	(revision 58389)
+++ Objects/unicodeobject.c	(working copy)
@@ -6191,16 +6191,6 @@
 if (PyUnicode_Check(left) && PyUnicode_Check(right))
 return unicode_compare((PyUnicodeObject *)left,
(PyUnicodeObject *)right);
-if ((PyString_Check(left) && PyUnicode_Check(right)) ||
-(PyUnicode_Check(left) && PyString_Check(right))) {
-if (PyUnicode_Check(left))
-left = _PyUnicode_AsDefaultEncodedString(left, NULL);
-if (PyUnicode_Check(right))
-right = _PyUnicode_AsDefaultEncodedString(right, NULL);
-assert(PyString_Check(left));
-assert(PyString_Check(right));
-return PyObject_Compare(left, right);
-}
 PyErr_Format(PyExc_TypeError,
  "Can't compare %.100s and %.100s",
  left->ob_type->tp_name,
Index: Lib/test/test_unicode.py
===
--- Lib/test/test_unicode.py	(revision 58389)
+++ Lib/test/test_unicode.py	(working copy)
@@ -200,6 +200,10 @@
 self.checkequalnofix('one@two!three!', 'one!two!three!', 'replace', '!', '@', 1)
 self.assertRaises(TypeError, 'replace'.replace, "r", 42)
 
+def test_str8_comparison(self):
+self.assertEqual('abc' == str8('abc'), False)
+self.assertEqual('abc' != str8('abc'), True)
+
 def test_comparison(self):
 # Comparisons:
 self.assertEqual('abc', 'abc')



[issue1263] PEP 3137 patch - str8/str comparison should return false

2007-10-11 Thread Thomas Lee

Changes by Thomas Lee:





[issue1263] PEP 3137 patch - str8/str comparison should return false

2007-10-11 Thread Thomas Lee

New submission from Thomas Lee:

The main patch - while exactly what is needed to make str8/str equality
checks return False - breaks a bunch of tests due to PyString_* still
being used elsewhere when it should be using PyUnicode.

The second patch modifies structmember.c to use PyUnicode_* where it was
previously using PyString_*, which fixes the first problem I stumbled
across in trying to get test_unicode to run.

Unfortunately, similar errors are present in Python/codecs.c and other
places (maybe Python/modsupport.c too? not 100% sure yet) - these still
need to be fixed!
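The behaviour this issue asks for is what a released Python 3 does between str and bytes (the eventual name for str8). The snippet below is a small illustration of the semantics, not part of the patch.

```python
text = "abc"
data = b"abc"  # bytes, the type str8 was later renamed to

equal = text == data      # False: never equal, whatever the contents
not_equal = text != data  # True

# Ordering comparisons between the two types are rejected outright.
try:
    text >= data
    ordering_allowed = True
except TypeError:
    ordering_allowed = False
```

Running the interpreter with the -b option makes such mixed comparisons emit a BytesWarning, which can help locate code that still relies on the old implicit conversion.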

--
components: Interpreter Core
files: unicode-string-eq-false-r2.patch
messages: 56343
nosy: thomas.lee
severity: normal
status: open
title: PEP 3137 patch - str8/str comparison should return false
type: rfe
versions: Python 3.0

Index: Objects/unicodeobject.c
===
--- Objects/unicodeobject.c	(revision 58389)
+++ Objects/unicodeobject.c	(working copy)
@@ -6191,16 +6191,6 @@
 if (PyUnicode_Check(left) && PyUnicode_Check(right))
 return unicode_compare((PyUnicodeObject *)left,
(PyUnicodeObject *)right);
-if ((PyString_Check(left) && PyUnicode_Check(right)) ||
-(PyUnicode_Check(left) && PyString_Check(right))) {
-if (PyUnicode_Check(left))
-left = _PyUnicode_AsDefaultEncodedString(left, NULL);
-if (PyUnicode_Check(right))
-right = _PyUnicode_AsDefaultEncodedString(right, NULL);
-assert(PyString_Check(left));
-assert(PyString_Check(right));
-return PyObject_Compare(left, right);
-}
 PyErr_Format(PyExc_TypeError,
  "Can't compare %.100s and %.100s",
  left->ob_type->tp_name,
Index: Lib/stringprep.py
===
--- Lib/stringprep.py	(revision 58389)
+++ Lib/stringprep.py	(working copy)
@@ -5,6 +5,8 @@
 and mappings, for which a mapping function is provided.
 """
 
+import sys
+
 from unicodedata import ucd_3_2_0 as unicodedata
 
 assert unicodedata.unidata_version == '3.2.0'
Index: Lib/test/test_unicode.py
===
--- Lib/test/test_unicode.py	(revision 58389)
+++ Lib/test/test_unicode.py	(working copy)
@@ -200,6 +200,10 @@
 self.checkequalnofix('one@two!three!', 'one!two!three!', 'replace', '!', '@', 1)
 self.assertRaises(TypeError, 'replace'.replace, "r", 42)
 
+def test_str8_comparison(self):
+self.assertEqual('abc' == str8('abc'), False)
+self.assertEqual('abc' != str8('abc'), True)
+
 def test_comparison(self):
 # Comparisons:
 self.assertEqual('abc', 'abc')