Re: [Python-Dev] _PyUnicode_CheckConsistency() too strict?

Phil Thompson Mon, 03 Feb 2014 07:46:43 -0800

On 03-02-2014 3:35 pm, Victor Stinner wrote:

2014-02-03 Phil Thompson <p...@riverbankcomputing.com>:
For example, a string created with a maxchar of 255 (i.e. a Latin-1 string) must contain at least one character in the range 128-255, otherwise you get
an assertion failure.
Yes, that is what PEP 393 specifies.
As it stands, when converting Latin-1 strings in my C extension module I must first check each character and specify a maxchar of 127 if the string
happens to only contain ASCII characters.
Use PyUnicode_FromKindAndData(PyUnicode_1BYTE_KIND, latin1_str,
length) which computes the kind for you.
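
As a rough illustration of the scan under discussion, the sketch below picks the maxchar that PEP 393 expects for a Latin-1 buffer: 127 if every byte is ASCII, 255 otherwise. It is written in plain Python rather than C for brevity, and the helper name `latin1_maxchar` is hypothetical. It only models the check and is not CPython's own code.

```python
def latin1_maxchar(data: bytes) -> int:
    """Return the maxchar PEP 393 expects for Latin-1 encoded data.

    Hypothetical helper for illustration only:
    127 for pure-ASCII data, 255 as soon as one byte is above 127.
    """
    # Any byte above 127 means the data is genuinely Latin-1, not ASCII.
    if any(byte > 127 for byte in data):
        return 255
    return 127
```

A C extension would perform the same loop over its own buffer before choosing a maxchar, or leave that scan to PyUnicode_FromKindAndData() as suggested above.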
What is the reasoning behind the checks being so strict?
Different Python functions rely on the exact kind to compare strings.
For example, if you search for a latin1 substring in an ASCII string, the
search returns immediately instead of searching in the string. A latin1
string cannot be found in an ASCII string.
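
The shortcut described here can be modelled in a few lines of Python. This is only a sketch of the idea and not CPython's implementation: `maxchar_bucket` and `find_with_shortcut` are hypothetical names, and the buckets are recomputed from the characters on every call, whereas a real string object would store that information.

```python
def maxchar_bucket(text: str) -> int:
    """Smallest PEP 393 maxchar bucket that can hold every character of text."""
    highest = max(map(ord, text), default=0)
    for limit in (0x7F, 0xFF, 0xFFFF):
        if highest <= limit:
            return limit
    return 0x10FFFF


def find_with_shortcut(haystack: str, needle: str) -> int:
    """Search for needle, skipping the scan when the buckets rule out a match."""
    # A needle from a wider bucket holds a character that the haystack
    # cannot contain, so no scan of the haystack is needed.
    if maxchar_bucket(needle) > maxchar_bucket(haystack):
        return -1
    return haystack.find(needle)
```

In this model the early return is only valid when the needle's bucket is tight, meaning the needle really does contain a character from that bucket's range. A needle labelled as Latin-1 but holding only ASCII characters would be wrongly reported as absent from an ASCII haystack.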
The main reason is in PEP 393 itself: a string must be compact to not
waste memory.

Victor

Are you saying that code will fail if a particular Latin-1 string just happens not to contain any character greater than 127?

I would be very surprised if that was the case. If it isn't the case then I think that particular check shouldn't be made.


Phil
