Re: [Python-Dev] Strange error importing a Pickle from 2.7 to 3.2

2011-02-24 Thread M.-A. Lemburg
Alexander Belopolsky wrote:
 On Wed, Feb 23, 2011 at 6:32 PM, M.-A. Lemburg m...@egenix.com wrote:
 Alexander Belopolsky wrote:
 ..
 In what sense is Latin-1 the official name?  The IANA charset
 registry has the following listing


 Name: ISO_8859-1:1987 [RFC1345,KXS2]
 MIBenum: 4
 Source: ECMA registry
 Alias: iso-ir-100
 Alias: ISO_8859-1
 Alias: ISO-8859-1 (preferred MIME name)
 Alias: latin1
 ..
 Latin-1 is short for Latin Alphabet No. 1 and
 started out as ECMA-94 in 1985 and 1986:
 
 This does not explain your preference of Latin-1 over Latin1.

This is not my preference. See e.g. Wikipedia
http://en.wikipedia.org/wiki/ISO/IEC_8859-1

It is common practice to replace spaces in descriptive names with
a hyphen to come up with an identifier string (even Google
does or undoes this when searching the net).

Replacing spaces with an empty string is also an option, but
doesn't read as well.

 Both are perfectly valid abbreviations for Latin Alphabet No. 1.
 The spelling without - has the advantage of being a valid Python
 identifier and a module name.

The hyphens are converted to underscores by the lookup function
in the encodings package. That turns the name into a valid
Python module name.
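
A minimal sketch of that normalization, assuming a current CPython 3 in which encodings.normalize_encoding() and the encodings.aliases.aliases table are available (both are standard-library internals rather than a documented stable API, so the printed values are an assumption about the running interpreter):

```python
import codecs
import encodings
from encodings import aliases

# Hyphens (and other punctuation) are collapsed to underscores, which
# yields a name usable as a module name inside the encodings package.
normalized = encodings.normalize_encoding("latin-1")
print(normalized)

# The hyphen-less spelling is mapped to the same module by the alias table.
aliased = aliases.aliases.get("latin1")
print(aliased)

# Both spellings therefore resolve to one and the same codec.
same_codec = codecs.lookup("latin-1").name == codecs.lookup("latin1").name
print(same_codec)
```

Either spelling ends up at the same codec module; the difference discussed in this thread is only which route the lookup takes to get there.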

  The IANA registration for latin1 and
 lack of that for latin-1 most likely indicates that the former is
 more commonly found in machine readable metadata.

I don't know why you put so much emphasis on machine-readable metadata.
Python source code is machine readable, the Internet is machine
readable, all documents found there are machine readable.

As I said earlier on: the IANA registry is just that - a registry
of names with the purpose of avoiding name clashes in the respective
name space. As such, it is not a standard, but merely a tool
to map various aliases to a canonical name.

The fact that an alias is registered says nothing about
whether it's in widespread use or not, e.g.
csISOLatin1 gives me 6810 hits on Google.

I get 788,000 hits for 'latin1 -latin-1' on Google,
'latin-1' gives 2,600,000 hits. Looks like it's still
the preferred way to write that encoding name.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Feb 24 2011)
 Python/Zope Consulting and Support ...http://www.egenix.com/
 mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
 mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/


::: Try our new mxODBC.Connect Python Database Interface for free ! 


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
   Registered at Amtsgericht Duesseldorf: HRB 46611
   http://www.egenix.com/company/contact/
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Strange error importing a Pickle from 2.7 to 3.2

2011-02-24 Thread Scott Dial
On 2/24/2011 4:02 AM, M.-A. Lemburg wrote:
 I get 788,000 hits for 'latin1 -latin-1' on Google,
 'latin-1' gives 2,600,000 hits. Looks like it's still
 the preferred way to write that encoding name.

That's bogus. You can't search for latin-1 on Google, it isn't strict
enough. The third hit is a url containing latin1 and a title of Latin
1. And it picks up things like Latin 1: The Easy Way, which is a book
on Latin.

However, you *can* search much more strictly on Google Code Search,
which gives 4,014 (latin-1) to 13,597 (latin1).

http://www.google.com/codesearch?hl=enlr=q=%28\%22latin1\%22|\%27latin1\%27%29sbtn=Search
http://www.google.com/codesearch?hl=enlr=q=%28\%22latin-1\%22|\%27latin-1\%27%29sbtn=Search

So, no, I don't think the development world aligns with your pedantry.
That's not to say this is a popularity contest, but then let's not cite
google hit counts as proof.

-- 
Scott Dial
sc...@scottdial.com
scod...@cs.indiana.edu


Re: [Python-Dev] Strange error importing a Pickle from 2.7 to 3.2

2011-02-23 Thread Nick Coghlan
On Wed, Feb 23, 2011 at 3:51 PM, Stephen J. Turnbull step...@xemacs.org wrote:
 Jesus Cea writes:

   Every time I read a message from [long, incomplete <wink> list] and
   so many other python-devs (not an exhaustive list, if you are not
   there, you probably should be, sorry :), I feel I am faking my
   knowledge of Python :-). I am a pretender :).

 Sure.  I suspect even some of those *on* the list feel that way
 sometimes.  That's what's so great about the list, the people who
 contribute!

I personally feel that way every time I realise just how much of the
standard library I have never even seriously looked at (let alone
used) and how much more there is to the Python ecosystem than just
CPython (subscribing to Planet Python and the python tag on Stack
Overflow has been truly eye opening in that regard).

Heck, some day I may even get around to learning how to build a proper
Python package ;)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] Strange error importing a Pickle from 2.7 to 3.2

2011-02-23 Thread Alexander Belopolsky
On Tue, Feb 22, 2011 at 11:34 PM, Jesus Cea j...@jcea.es wrote:
..
 Issue filed. It already has a patch. That was fast!. Now I can sit back
 waiting for 3.2.1 before touching my project again :). Mixed feelings
 about the waiting. I hope it is short.

It looks like you don't need to delay your project: if you spell the
encoding as "latin-1", your pickle loads just fine:


>>> with open("z.pickle", mode="rb") as f: pickle.load(f, encoding="latin-1")
...
{'ya_volcados': {'comment': ''}}


With encoding="latin1", it does fail:

>>> with open("z.pickle", mode="rb") as f: pickle.load(f, encoding="latin1")
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: operation forbidden on released memoryview object

(For those not following the tracker issue, the z.pickle file was
posted at http://bugs.python.org/file20839/z.pickle.)

There is still a bug, which is best demonstrated by

>>> pickle.loads(b'\x80\x02U\x00.', encoding='latin1')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: operation forbidden on released memoryview object

The fact that the above works with encoding='latin-1',

>>> pickle.loads(b'\x80\x02U\x00.', encoding='latin-1')
''

shows that there is probably more than one bug.
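
For reference, a small self-contained sketch of what both spellings are expected to do on an interpreter where the bug is fixed (an assumption; on 3.2.0 the "latin1" call raised the ValueError shown above). The payloads are hand-written protocol-2 pickles in the same style as the one above, and load_py2_str is a hypothetical helper name:

```python
import pickle

def load_py2_str(payload, encoding):
    """Unpickle a Python 2 str pickle, decoding its bytes with `encoding`."""
    return pickle.loads(payload, encoding=encoding)

# Protocol 2 header, SHORT_BINSTRING of length 0, STOP: the empty Python 2 str.
empty = b'\x80\x02U\x00.'
# Same layout with a single non-ASCII byte (0xE9) as the string content.
one_byte = b'\x80\x02U\x01\xe9.'

results = {enc: (load_py2_str(empty, enc), load_py2_str(one_byte, enc))
           for enc in ("latin1", "latin-1")}
print(results)
```

If the two spellings produce different results, or one of them raises, the interpreter still has the problem described in this message.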


Re: [Python-Dev] Strange error importing a Pickle from 2.7 to 3.2

2011-02-23 Thread Guido van Rossum
I'm guessing that one of these encoding names is recognized by the C
code while the other one takes the slow path via the aliasing code.

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] Strange error importing a Pickle from 2.7 to 3.2

2011-02-23 Thread Alexander Belopolsky
On Wed, Feb 23, 2011 at 4:07 PM, Guido van Rossum gu...@python.org wrote:
 I'm guessing that one of these encoding names is recognized by the C
 code while the other one takes the slow path via the aliasing code.

This is absolutely right.  In fact I am going to propose adding
strcmp(lower, "latin1") to the following test in
PyUnicode_AsEncodedString():


    else if ((strcmp(lower, "latin-1") == 0) ||
             (strcmp(lower, "iso-8859-1") == 0))
        return PyUnicode_EncodeLatin1(...

I'll open a separate issue for that.  In Python's own stdlib and tests
"latin1" is a more common spelling than "latin-1", so it makes sense
to optimize it.
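
The idea can be modelled in a few lines of Python. This is only an illustration of a fast path versus the generic codec lookup, not the actual C code: encode_with_fast_path and _FAST_LATIN1_NAMES are made-up names, and str.encode("latin-1") merely stands in for the direct PyUnicode_EncodeLatin1() call.

```python
import codecs

# Spellings checked by plain string comparison; "latin1" is the proposed addition.
_FAST_LATIN1_NAMES = {"latin-1", "iso-8859-1", "latin1"}

def encode_with_fast_path(text, encoding):
    """Return (encoded bytes, path taken) for `text` in `encoding`."""
    lower = encoding.lower()
    if lower in _FAST_LATIN1_NAMES:
        # Fast path: no registry lookup, no alias resolution.
        return text.encode("latin-1"), "fast"
    # Slow path: go through the codec registry and its alias handling.
    codec_info = codecs.lookup(encoding)
    data, _consumed = codec_info.encode(text)
    return data, "slow"

print(encode_with_fast_path("caf\xe9", "latin1"))
print(encode_with_fast_path("caf\xe9", "l1"))
```

Both calls yield the same bytes; only the route differs, which is where the speed difference on short strings comes from.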


Re: [Python-Dev] Strange error importing a Pickle from 2.7 to 3.2

2011-02-23 Thread M.-A. Lemburg
Alexander Belopolsky wrote:
 On Wed, Feb 23, 2011 at 4:07 PM, Guido van Rossum gu...@python.org wrote:
 I'm guessing that one of these encoding names is recognized by the C
 code while the other one takes the slow path via the aliasing code.
 
 This is absolutely right.  In fact I am going to propose adding
 strcmp(lower, "latin1") to the following test in
 PyUnicode_AsEncodedString():
 
 
     else if ((strcmp(lower, "latin-1") == 0) ||
              (strcmp(lower, "iso-8859-1") == 0))
         return PyUnicode_EncodeLatin1(...
 
 I'll open a separate issue for that.  In Python's own stdlib and tests
 "latin1" is a more common spelling than "latin-1", so it makes sense
 to optimize it.

Latin-1 is the official name and the one used internally by Python,
so it would be good to have the test suite and Python code in general
to use that variant of the name (just as utf-8 is preferred over
utf8).

Instead of adding more aliases to the C code, please change the
encoding names in the stdlib and test suite.

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Strange error importing a Pickle from 2.7 to 3.2

2011-02-23 Thread Alexander Belopolsky
On Wed, Feb 23, 2011 at 4:23 PM, M.-A. Lemburg m...@egenix.com wrote:
..
 Latin-1 is the official name and the one used internally by Python,
 so it would be good to have the test suite and Python code in general
 to use that variant of the name (just as utf-8 is preferred over
 utf8).

 Instead of adding more aliases to the C code, please change the
 encoding names in the stdlib and test suite.

I cannot agree with you on this one.  Official or not, "latin-1" is
much less commonly used than "latin1".   Currently decode("latin1") is
10x slower than decode("latin-1") on short strings.  We already have
a check for the "iso-8859-1" alias in PyUnicode_AsEncodedString().  Adding
"latin1" (and possibly "utf8" as well) is likely to speed up many
applications at minimal cost.
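
A rough way to check such a claim on a given interpreter is sketched below with timeit. The helper name time_decode and the sample bytes are assumptions, and no particular ratio is implied; interpreters that already have the extra fast path are likely to show little or no difference.

```python
import timeit

def time_decode(encoding, data=b"short string", number=200_000):
    """Return the seconds taken by `number` calls of data.decode(encoding)."""
    return timeit.timeit(lambda: data.decode(encoding), number=number)

timings = {name: time_decode(name)
           for name in ("latin-1", "latin1", "iso-8859-1")}
for name, seconds in timings.items():
    print(f"{name:12s} {seconds:.4f}s")
```

Short inputs are used on purpose: the lookup overhead is a fixed cost per call, so it only dominates when the decoding work itself is tiny.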


Re: [Python-Dev] Strange error importing a Pickle from 2.7 to 3.2

2011-02-23 Thread M.-A. Lemburg
Alexander Belopolsky wrote:
 On Wed, Feb 23, 2011 at 4:23 PM, M.-A. Lemburg m...@egenix.com wrote:
 ..
 Latin-1 is the official name and the one used internally by Python,
 so it would be good to have the test suite and Python code in general
 to use that variant of the name (just as utf-8 is preferred over
 utf8).

 Instead of adding more aliases to the C code, please change the
 encoding names in the stdlib and test suite.
 
 I cannot agree with you on this one.  Official or not, "latin-1" is
 much less commonly used than "latin1".   Currently decode("latin1") is
 10x slower than decode("latin-1") on short strings.  We already have
 a check for the "iso-8859-1" alias in PyUnicode_AsEncodedString().  Adding
 "latin1" (and possibly "utf8" as well) is likely to speed up many
 applications at minimal cost.

Fair enough, then add latin1 and utf8 to both PyUnicode_Decode()
and PyUnicode_AsEncodedString().

Still, the stdlib and test suite should be examples of using the
correct names.

I only found these few cases where the wrong Latin-1 name is used
in the stdlib:

./distutils/command/bdist_wininst.py:
-- # convert back to bytes. "latin1" simply avoids any possible
-- encoding="latin1") as script:
-- script_data = script.read().encode("latin1")
./urllib/request.py:
-- data = base64.decodebytes(data.encode('ascii')).decode('latin1')
./asynchat.py:
-- encoding= 'latin1'
./ftplib.py:
-- encoding = "latin1"
./sre_parse.py:
-- encode = lambda x: x.encode('latin1')

I get 12 hits for the test suite.

Yet 108 for the correct name, so I can't follow your statement
that the wrong variant is used more often.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Strange error importing a Pickle from 2.7 to 3.2

2011-02-23 Thread Alexander Belopolsky
On Wed, Feb 23, 2011 at 4:54 PM, M.-A. Lemburg m...@egenix.com wrote:
..
 Yet 108 for the correct name, so I can't follow your statement
 that the wrong variant is used more often.

Hmm, your grepping skills are probably better than mine. I get


$ grep -iw latin-1 Lib/*.py | wc -l
  24

and

$ grep -iw latin1 Lib/test/*.py | wc -l
  25

(I did get spurious hits with a naive grep for latin1, so I retract my
"more often" claim and just say that both spellings are equally
common.)


Re: [Python-Dev] Strange error importing a Pickle from 2.7 to 3.2

2011-02-23 Thread Alexander Belopolsky
On Wed, Feb 23, 2011 at 4:23 PM, M.-A. Lemburg m...@egenix.com wrote:
..
 Latin-1 is the official name and the one used internally by Python,

In what sense is Latin-1 the official name?  The IANA charset
registry has the following listing


Name: ISO_8859-1:1987 [RFC1345,KXS2]
MIBenum: 4
Source: ECMA registry
Alias: iso-ir-100
Alias: ISO_8859-1
Alias: ISO-8859-1 (preferred MIME name)
Alias: latin1
Alias: l1
Alias: IBM819
Alias: CP819
Alias: csISOLatin1

(See http://www.iana.org/assignments/character-sets)

Latin-1 spelling does appear in various unicode.org documents, but
not in machine readable files as far as I can tell.


Re: [Python-Dev] Strange error importing a Pickle from 2.7 to 3.2

2011-02-23 Thread M.-A. Lemburg
Alexander Belopolsky wrote:
 On Wed, Feb 23, 2011 at 4:54 PM, M.-A. Lemburg m...@egenix.com wrote:
 ..
 Yet 108 for the correct name, so I can't follow your statement
 that the wrong variant is used more often.
 
 Hmm, your grepping skills are probably better than mine. I get
 
 
 $ grep -iw latin-1 Lib/*.py | wc -l
   24
 
 and
 
 $ grep -iw latin1 Lib/test/*.py | wc -l
   25
 
 (I did get spurious hits with naive grep latin1, so I retract my
 more often claim and just say that both spellings are equally
 common.)

I used a Python script based on re, perhaps that's why :-)

grep only counts lines, not multiple instances on a single line.
Looking through the hits I found, there are also a few false
positives such as 'latin-10' or 'iso-latin-1'. Without those,
I get 83 hits.
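
A counting script along those lines might look like the sketch below. It is not the script used for the numbers above; the pattern, the function name count_spelling and the sample text are assumptions chosen to show per-occurrence counting and the exclusion of names like 'latin-10' or 'iso-latin-1'.

```python
import re

def count_spelling(text, spelling):
    """Count standalone, case-insensitive occurrences of `spelling` in `text`."""
    # Reject matches glued to a word character or hyphen on either side,
    # so 'latin-10' and 'iso-latin-1' are not counted as 'latin-1'.
    pattern = re.compile(
        r"(?<![\w-])" + re.escape(spelling) + r"(?![\w-])",
        re.IGNORECASE,
    )
    return len(pattern.findall(text))

sample = (
    "a = s.encode('latin-1'); b = t.decode('Latin-1')\n"
    "# not counted: iso-latin-1, latin-10\n"
    "c = u.encode('latin1')\n"
)
latin_dash_hits = count_spelling(sample, "latin-1")
latin_plain_hits = count_spelling(sample, "latin1")
print(latin_dash_hits, latin_plain_hits)
```

Unlike grep piped into wc -l, this counts two hits for the first sample line, which is one reason the totals in this thread differ.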

If you open a ticket for this, I'll add the list of hits to
that ticket.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Strange error importing a Pickle from 2.7 to 3.2

2011-02-23 Thread Alexander Belopolsky
On Wed, Feb 23, 2011 at 5:21 PM, M.-A. Lemburg m...@egenix.com wrote:
..
 If you open a ticket for this, I'll add the list of hits to
 that ticket.


http://bugs.python.org/issue11303


Re: [Python-Dev] Strange error importing a Pickle from 2.7 to 3.2

2011-02-23 Thread Ethan Furman

M.-A. Lemburg wrote:

Still, the stdlib and test suite should be examples of using the
correct names.


I won't argue with the stdlib portion of your argument, but I would 
think that the best example of test code would be a complete and 
thorough check of all options.
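
One way to read that suggestion is a check that walks through every spelling rather than picking one. The sketch below assumes that the listed spellings (taken from the IANA listing quoted in this thread, plus "latin-1") are all known to the running interpreter's codec registry; check_latin1_spellings is a hypothetical helper, not part of the test suite.

```python
import codecs

LATIN1_SPELLINGS = ["latin-1", "latin1", "ISO-8859-1", "ISO_8859-1",
                    "iso-ir-100", "l1", "IBM819", "CP819", "csISOLatin1"]

def check_latin1_spellings(spellings=LATIN1_SPELLINGS):
    """Return the spellings that do not behave like Latin-1 (ideally none)."""
    raw = bytes(range(256))
    expected_name = codecs.lookup("latin-1").name
    failures = []
    for spelling in spellings:
        try:
            info = codecs.lookup(spelling)
            text = raw.decode(spelling)
            # Latin-1 maps each byte to the code point with the same number.
            ok = (info.name == expected_name
                  and [ord(ch) for ch in text] == list(range(256))
                  and text.encode(spelling) == raw)
        except (LookupError, UnicodeError):
            ok = False
        if not ok:
            failures.append(spelling)
    return failures

print(check_latin1_spellings())
```

An empty list means every spelling resolved to the same codec and round-tripped all 256 byte values.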


~Ethan~


Re: [Python-Dev] Strange error importing a Pickle from 2.7 to 3.2

2011-02-23 Thread M.-A. Lemburg
Alexander Belopolsky wrote:
 On Wed, Feb 23, 2011 at 4:23 PM, M.-A. Lemburg m...@egenix.com wrote:
 ..
 Latin-1 is the official name and the one used internally by Python,
 
 In what sense is Latin-1 the official name?  The IANA charset
 registry has the following listing
 
 
 Name: ISO_8859-1:1987 [RFC1345,KXS2]
 MIBenum: 4
 Source: ECMA registry
 Alias: iso-ir-100
 Alias: ISO_8859-1
 Alias: ISO-8859-1 (preferred MIME name)
 Alias: latin1
 Alias: l1
 Alias: IBM819
 Alias: CP819
 Alias: csISOLatin1
 
 (See http://www.iana.org/assignments/character-sets)

Those are registered character set names, not necessarily
standard names. Anyone can apply for new aliases to get
added to that list.

 Latin-1 spelling does appear in various unicode.org documents, but
 not in machine readable files as far as I can tell.

Latin-1 is short for Latin Alphabet No. 1 and
started out as ECMA-94 in 1985 and 1986:

http://www.ecma-international.org/publications/standards/Ecma-094.htm

ISO then applied their numbering scheme for the character set
standard ISO-8859 in 1987 where Latin-1 became ISO-8859-1.
Note that this was before the Internet took off.

I assume that since the HTML standard used the more popular
name Latin-1 for its definition of the default character set
and also made use of the term throughout the spec, it
became the de-facto standard name for that character set
at the time. I only learned about the term ISO-8859-1
when starting to dive into the Unicode world late in the
1990s.

Latin-1 is also sometimes written as ISO Latin-1, e.g.
http://msdn.microsoft.com/en-us/library/ms537495(v=vs.85).aspx

For much the same reasons, ISO-10646 never really became
popular, but Unicode eventually did.

ECMA-262 or ISO/IEC 16262 just doesn't sound as good as
JavaScript either :-)

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Strange error importing a Pickle from 2.7 to 3.2

2011-02-23 Thread Alexander Belopolsky
On Wed, Feb 23, 2011 at 6:32 PM, M.-A. Lemburg m...@egenix.com wrote:
 Alexander Belopolsky wrote:
..
 In what sense is Latin-1 the official name?  The IANA charset
 registry has the following listing


 Name: ISO_8859-1:1987                                    [RFC1345,KXS2]
 MIBenum: 4
 Source: ECMA registry
 Alias: iso-ir-100
 Alias: ISO_8859-1
 Alias: ISO-8859-1 (preferred MIME name)
 Alias: latin1
..
 Latin-1 is short for Latin Alphabet No. 1 and
 started out as ECMA-94 in 1985 and 1986:

This does not explain your preference of Latin-1 over Latin1.
Both are perfectly valid abbreviations for Latin Alphabet No. 1.
The spelling without - has the advantage of being a valid Python
identifier and a module name.  The IANA registration for latin1 and
lack of that for latin-1 most likely indicates that the former is
more commonly found in machine readable metadata.


Re: [Python-Dev] Strange error importing a Pickle from 2.7 to 3.2

2011-02-23 Thread Stephen J. Turnbull
M.-A. Lemburg writes:

  Latin-1 is short for Latin Alphabet No. 1 [...].

  I assume that since the HTML standard used the more popular
  name Latin-1 for its definition of the default character set
  and also made use of the term throughout the spec, it
  became the de-facto standard name for that character set
  at the time.

As usual with de facto standards, it got embraced and extended.
I've seen people seriously contend that Windows-1252 is an
implementation or (conformant) extension of Latin-1, and that the
EURO SIGN is now a member of Latin-1.  It's just too ambiguous for
my taste; I avoid it in discussions of character sets, preferring to
be thought idiosyncratic and pedantic.

As for the spelling, I think Latin-1 is slightly more readable than
Latin1, but the latter is in the same degree more typable. <wink>

  For much the same reasons, ISO-10646 never really became
  popular, but Unicode eventually did.

No, there are much more important reasons why Unicode became
popular.  IMHO, as an encoding standard ISO-10646 had a slight edge
over Unicode in the early 1990s, before the two were unified as coded
character sets.  However, as a text processing system there simply was
no comparison.  Unicode provided a large number of standard
facilities that were way outside the scope of ISO 10646, and was
clearly set to add to those.  Claiming Unicode conformance was a much
bigger deal than ISO 10646 (not to mention having the advantage that
you could *optionally* save Intel shorts to disk without swabbing them
first).



Re: [Python-Dev] Strange error importing a Pickle from 2.7 to 3.2

2011-02-23 Thread Dj Gilcrease
Google Code search limited to python

latin1: 3,489 
http://www.google.com/codesearch?hl=enlr=q=latin1+lang%3Apythonsbtn=Search
latin-1: 5,604 
http://www.google.com/codesearch?hl=enlr=q=latin-1+lang%3Apythonsbtn=Search

utf8: 25,341 
http://www.google.com/codesearch?hl=enlr=q=utf8+lang%3Apythonsbtn=Search
utf-8: 179,806 
http://www.google.com/codesearch?hl=enlr=q=utf-8+lang%3Apythonsbtn=Search


Interesting that for Latin-1 the split of wrong/right is 40/60 and
the split for utf8 is 15/85


Re: [Python-Dev] Strange error importing a Pickle from 2.7 to 3.2

2011-02-23 Thread Alexander Belopolsky
On Wed, Feb 23, 2011 at 8:48 PM, Dj Gilcrease digitalx...@gmail.com wrote:
 Google Code search limited to python

 latin1: 3,489 
 http://www.google.com/codesearch?hl=enlr=q=latin1+lang%3Apythonsbtn=Search
 latin-1: 5,604 
 http://www.google.com/codesearch?hl=enlr=q=latin-1+lang%3Apythonsbtn=Search

 utf8: 25,341 
 http://www.google.com/codesearch?hl=enlr=q=utf8+lang%3Apythonsbtn=Search
 utf-8: 179,806 
 http://www.google.com/codesearch?hl=enlr=q=utf-8+lang%3Apythonsbtn=Search


 Interesting that for Latin-1 the split of wrong/right is 40/60 and
 the split for utf8 is 15/85

Your search is invalid.  You hit things such as Latin1ClassModel which
have no relevance to the issue at hand.


Re: [Python-Dev] Strange error importing a Pickle from 2.7 to 3.2

2011-02-23 Thread Scott Dial
On 2/23/2011 9:19 PM, Alexander Belopolsky wrote:
 On Wed, Feb 23, 2011 at 8:48 PM, Dj Gilcrease digitalx...@gmail.com wrote:
 Google Code search limited to python

 latin1: 3,489 
 http://www.google.com/codesearch?hl=enlr=q=latin1+lang%3Apythonsbtn=Search
 latin-1: 5,604 
 http://www.google.com/codesearch?hl=enlr=q=latin-1+lang%3Apythonsbtn=Search

latin1: 1,618
http://www.google.com/codesearch?hl=enlr=q=%28\%22latin1\%22|\%27latin1\%27%29+lang%3Apythonsbtn=Search
latin-1: 2,241
http://www.google.com/codesearch?hl=enlr=q=%28\%22latin-1\%22|\%27latin-1\%27%29+lang%3Apythonsbtn=Search

 utf8: 25,341 
 http://www.google.com/codesearch?hl=enlr=q=utf8+lang%3Apythonsbtn=Search
 utf-8: 179,806 
 http://www.google.com/codesearch?hl=enlr=q=utf-8+lang%3Apythonsbtn=Search

utf8 9,676
http://www.google.com/codesearch?hl=enlr=q=%28\%22utf8\%22|\%27utf8\%27%29+lang%3Apythonsbtn=Search
utf-8 44,795
http://www.google.com/codesearch?hl=enlr=q=%28\%22utf-8\%22|\%27utf-8\%27%29+lang%3Apythonsbtn=Search

 Your search is invalid.  You hit things such as Latin1ClassModel which
 have no relevance to the issue at hand.

You get about the same ratio if you filter out only the quoted strings.

-- 
Scott Dial
sc...@scottdial.com
scod...@cs.indiana.edu


Re: [Python-Dev] Strange error importing a Pickle from 2.7 to 3.2

2011-02-22 Thread Michael Foord

On 22/02/2011 12:14, Jesus Cea wrote:


I have a 10MB pickled structure generated in Python 2.7. I only use basic
types (no classes) like sets, dictionaries, lists, strings, etc.

The pickle stores a lot of strings. Some of them should be bytes,
while others should be unicode. My idea is to import ALL the strings as
bytes in Python 3.2 and navigate the data structure to convert the
appropriate values to unicode, in a one-time operation (I version the
structure, so I can know if this conversion is already done, simply
storing a new version value).

But I get this error:


Python 3.2 (r32:88445, Feb 21 2011, 13:34:07)
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> f=open("file.pickle", mode="rb").read()
>>> len(f)
9847316
>>> b=pickle.loads(f,encoding="latin1")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: operation forbidden on released memoryview object




That seems like an odd error, but the decision was made that Python 2 
byte-strings would be unpickled on Python 3 as Unicode strings.


See the discussion here:

http://bugs.python.org/issue6784

This is basically because many people do the wrong thing and use
Python 2 byte strings for storing text. What it means though is that
people who do the right thing and store binary data in Python 2 byte
strings can't use Python 2 pickles from Python 3. It also means that
only ASCII data can be unpickled.


A custom pickler / unpickler is suggested as the solution in this issue.

All the best,

Michael Foord


I use the encoding latin1 for transparent byte/unicode conversion (do
not touch the values!).
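
The round trip behind that idea can be sketched as follows: a str obtained by decoding bytes as Latin-1 can be turned back into the identical bytes, so the choice between bytes and text can be postponed until after loading. to_bytes_deep is a hypothetical helper and the sample structure is made up for illustration; a real conversion would pick which values to convert rather than converting everything.

```python
def to_bytes_deep(obj):
    """Recursively turn every str (decoded as Latin-1) back into bytes."""
    if isinstance(obj, str):
        return obj.encode("latin-1")
    if isinstance(obj, dict):
        return {to_bytes_deep(k): to_bytes_deep(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple, set, frozenset)):
        return type(obj)(to_bytes_deep(item) for item in obj)
    return obj

# Every byte value survives the decode/encode round trip unchanged.
raw = bytes(range(256))
round_trip_ok = raw.decode("latin-1").encode("latin-1") == raw

# Made-up stand-in for a structure loaded with encoding="latin1".
loaded = {"ya_volcados": {"comment": "", "blob": "\x00\xff"}, "ids": ["a", "b"]}
converted = to_bytes_deep(loaded)
print(round_trip_ok, converted)
```

This only works because Latin-1 assigns a character to each of the 256 byte values; most other encodings would either fail or alter some bytes.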

This seems to be a bug in Python 3.2. Any suggestion?.

PS: The bytestream is protocol 2.

PPS: If there is consensus that this is a real bug, I would create an
issue in the tracker and try to get a minimal testcase.

- -- 
Jesus Cea Avion _/_/  _/_/_/_/_/_/

j...@jcea.es - http://www.jcea.es/ _/_/_/_/  _/_/_/_/  _/_/
jabber / xmpp:j...@jabber.org _/_/_/_/  _/_/_/_/_/
.  _/_/  _/_/_/_/  _/_/  _/_/
Things are not so easy  _/_/  _/_/_/_/  _/_/_/_/  _/_/
My name is Dump, Core Dump   _/_/_/_/_/_/  _/_/  _/_/
El amor es poner tu felicidad en la felicidad de otro - Leibniz



--
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html



Re: [Python-Dev] Strange error importing a Pickle from 2.7 to 3.2

2011-02-22 Thread Antoine Pitrou
On Tue, 22 Feb 2011 13:14:18 +0100
Jesus Cea j...@jcea.es wrote:
 
 This seems to be a bug in Python 3.2. Any suggestions?

Report an issue and investigate :)

Antoine.




Re: [Python-Dev] Strange error importing a Pickle from 2.7 to 3.2

2011-02-22 Thread Jesus Cea
On 22/02/11 13:20, Michael Foord wrote:
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 ValueError: operation forbidden on released memoryview object
 
 That seems like an odd error, but the decision was made that Python 2
 byte-strings would be unpickled on Python 3 as Unicode strings.

This problem seems NOT related to unicode. In fact, when saying
'encoding=latin1', my binary strings should be converted to unicode
without any kind of issue (my plan was, then, to scan the datastructure
and convert them to native bytes).
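
The plan described above (load with encoding='latin1', then scan the
data structure and convert the strings back to bytes) might look
roughly like this. It is only a sketch: to_native_bytes is a
hypothetical helper, the pickle is a hand-written protocol 2 example,
and the code assumes that every str in the loaded data was binary
data in Python 2.

```python
import pickle


def to_native_bytes(obj, encoding="latin1"):
    """Recursively convert every str in common containers to bytes.

    Assumes every str in the loaded data was a Python 2 byte string.
    """
    if isinstance(obj, str):
        return obj.encode(encoding)
    if isinstance(obj, list):
        return [to_native_bytes(item, encoding) for item in obj]
    if isinstance(obj, tuple):
        return tuple(to_native_bytes(item, encoding) for item in obj)
    if isinstance(obj, (set, frozenset)):
        return type(obj)(to_native_bytes(item, encoding) for item in obj)
    if isinstance(obj, dict):
        return {
            to_native_bytes(key, encoding): to_native_bytes(value, encoding)
            for key, value in obj.items()
        }
    # Numbers, None, bytes and anything else are returned unchanged.
    return obj


# Hand-written protocol 2 pickle of the Python 2 dict {'k': '\xff\x00'}.
PY2_PICKLE = b"\x80\x02}q\x00U\x01kq\x01U\x02\xff\x00q\x02s."

as_text = pickle.loads(PY2_PICKLE, encoding="latin1")
as_bytes = to_native_bytes(as_text)
print(as_text, as_bytes)
```

Instances of user-defined classes are not traversed here; a real
migration would need to decide how to handle their attributes.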

The fact is that I get a strange error: "ValueError: operation forbidden
on released memoryview object". It seems like a bug to me. Google shows no
hits. I want to rule out any obvious point I may have overlooked.

PS: Just checked... Python 3.1.3 imports the pickle just fine. I am so busy
migrating my projects to 3.2 (it was my commitment two years ago :) that I
don't have time to debug this :).



Re: [Python-Dev] Strange error importing a Pickle from 2.7 to 3.2

2011-02-22 Thread Eli Bendersky
 PS: Just checked... Python 3.1.3 imports the pickle just fine. So busy
 migrating my projects to 3.2 (it was my compromise two years ago :), I
 don't have time to debug this :).


I hope you do have time to open an issue, though :-)
Eli


Re: [Python-Dev] Strange error importing a Pickle from 2.7 to 3.2

2011-02-22 Thread Jesus Cea
On 22/02/11 15:32, Eli Bendersky wrote:
 PS: Just checked... Python 3.1.3 imports the pickle just fine. So busy
 migrating my projects to 3.2 (it was my compromise two years ago :), I
 don't have time to debug this :).

 
 I hope you do have time to open an issue, though :-)
 Eli

Bugs bug me a lot :). I spent a couple of hours trying to reduce my
pickle to something I could post (the original pickle has tons of
proprietary information):

http://bugs.python.org/issue11286

I got a reproducible pickle testcase in only 41 bytes.

Seems to be a SERIOUS regression in 3.2.

I can't progress further. Pickle internals are outside my expertise.



Re: [Python-Dev] Strange error importing a Pickle from 2.7 to 3.2

2011-02-22 Thread Alexander Belopolsky
On Tue, Feb 22, 2011 at 9:31 PM, Stephen J. Turnbull step...@xemacs.org wrote:
 Jesus Cea writes:

   PPS: If there is consensus that this is a real bug, I would create an
   issue in the tracker and try to get a minimal testcase.

 All bugs are issues, but not all issues are bugs.

 Please don't wait for consensus or even a second opinion to file the
 issue.


AFAICT, it was not mentioned in this thread, but the issue has been
created on the tracker:

http://bugs.python.org/issue11286


Re: [Python-Dev] Strange error importing a Pickle from 2.7 to 3.2

2011-02-22 Thread Jesus Cea
On 23/02/11 03:31, Stephen J. Turnbull wrote:
 Please don't wait for consensus or even a second opinion to file the
 issue.
 
 It's reasonable for a new Python user to ask whether something is a
 bug or not, but if somebody with your experience and contribution
 level to Python doesn't understand something, at the very least we
 have to suspect a doc bug.

Every time I read a message from Antoine Pitrou, Brett Cannon, Nick
Coghlan, Martin Löwis, Eric Smith, Steve Holden, Benjamin Peterson,
Victor Stinner, Georg Brandl, Raymond Hettinger, Guido and so many
other python-devs (not an exhaustive list, if you are not there, you
probably should be, sorry :), I feel I am faking my knowledge of Python
:-). I am a pretender :).

BTW, this project is my first real Python 3 code (I promised myself a
year ago to move after the 3.2 release), for a real/big project, and I
was thinking that maybe I was overlooking something obvious to any
seasoned Python programmer.

I overcame my fear of being seen as a fool last millennium, so I am not
afraid of asking. Sometimes I even ask too much.

 OTOH, the testcase might require a lot of effort on your part.  Of
 course it's reasonable for you to check whether it's a simple
 misunderstanding before exerting that effort.

In fact, I invested *hours* trying to reduce my multimegabyte
problematic pickle to 41 bytes, but by that time I was already convinced
that I had hit an ugly and serious bug.

Issue filed. It already has a patch. That was fast! Now I can sit back
and wait for 3.2.1 before touching my project again :). Mixed feelings
about the waiting. I hope it is short.

Life sucks sometimes :). Thanks $DEITY there are quite a few better
python-devs than me :).



Re: [Python-Dev] Strange error importing a Pickle from 2.7 to 3.2

2011-02-22 Thread Stephen J. Turnbull
Jesus Cea writes:

  Every time I read a message from [long, incomplete <wink> list] and
  so many others python-devs (not an exhaustive list, if you are not
  there, you probably should, sorry :), I feel I am faking my
  knowledge of Python :-). I am a pretender :).

Sure.  I suspect even some of those *on* the list feel that way
sometimes.  That's what's so great about the list, the people who
contribute!
