Re: Valid encodings for a Python source file

2018-06-08 Thread Thomas Jollans
On 2018-06-07 22:40, Daniel Glus wrote:
> I'm trying to figure out the entire list of possible encodings for a Python
> source file - that is, encodings that can go in a PEP 263 encoding
> specification, like # -*- encoding: foo -*-.
> 
> Is this list the same as the list given in the documentation for the codecs
> library, under "Standard Encodings"? If not, where can I find the actual
> list?
> 
> (I know that list is the same as the set of unique values in CPython's
> /Lib/encodings/aliases.py, or equivalently, the set of filenames in
> /Lib/encodings/, but again I'm not sure.)
> -Daniel


It's none of these.

To quote PEP 263:

> Any encoding which allows processing the first two lines in the way indicated 
> above is allowed as source code encoding, this includes ASCII compatible 
> encodings as well as certain multi-byte encodings such as Shift_JIS. It does 
> not include encodings which use two or more bytes for all characters like 
> e.g. UTF-16. The reason for this is to keep the encoding detection algorithm 
> in the tokenizer simple.

All of the lists above include encodings like UTF-16 that are not
sufficiently ASCII-compatible.
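
To make the difference concrete, here is a rough sketch (not what CPython's tokenizer actually does) that walks the codec names in Lib/encodings/aliases.py and keeps only those that decode an ASCII cookie line back to itself. The helper name and the sample line are my own, and passing this check is only an approximation of "sufficiently ASCII-compatible"; it does not prove that the interpreter will accept the codec as a source encoding.

```python
from encodings.aliases import aliases

# Hypothetical sample line; any pure-ASCII cookie line would do.
COOKIE_LINE = "# -*- coding: x -*-\n"


def ascii_transparent_codecs():
    """Return codec names that decode an ASCII cookie line unchanged."""
    ascii_bytes = COOKIE_LINE.encode("ascii")
    names = set()
    for name in set(aliases.values()):
        try:
            if ascii_bytes.decode(name) == COOKIE_LINE:
                names.add(name)
        except Exception:
            # Non-text codecs (e.g. hex_codec), codecs missing on this
            # platform, and decoding failures all land here.
            continue
    return names


if __name__ == "__main__":
    found = ascii_transparent_codecs()
    print("utf_8" in found, "utf_16" in found)
```

On a stock CPython, utf_8 and latin_1 pass this check while utf_16 and utf_32 do not, which matches the PEP's wording.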

Of course, as Terry Reedy writes,
> For new code for python 3, don't use an encoding cookie.  Use an editor that 
> can save in utf-8 and tell it to do so if it does not do so by default. 


-- Thomas
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Valid encodings for a Python source file

2018-06-08 Thread Richard Damon
On 6/7/18 4:40 PM, Daniel Glus wrote:
> I'm trying to figure out the entire list of possible encodings for a Python
> source file - that is, encodings that can go in a PEP 263 encoding
> specification, like # -*- encoding: foo -*-.
>
> Is this list the same as the list given in the documentation for the codecs
> library, under "Standard Encodings"? If not, where can I find the actual
> list?
>
> (I know that list is the same as the set of unique values in CPython's
> /Lib/encodings/aliases.py, or equivalently, the set of filenames in
> /Lib/encodings/, but again I'm not sure.)
> -Daniel

Reading the proposal, I see one thing that seems worthy of a comment.
The proposal specifically calls out the UTF-8 "BOM" sequence (which the
Unicode standard actually doesn't recommend using, as UTF-8 doesn't have
a "byte order problem"), but doesn't allow the UTF-16 BOM (0xFF, 0xFE or
0xFE, 0xFF) or the UCS-4 BOM (0x00, 0x00, 0xFE, 0xFF or 0xFF, 0xFE, 0x00,
0x00). While those formats are unlikely for source files, files that do
use them are very likely to carry the marks, and detecting the marks is
very important for detecting those encodings: they are NOT
"ASCII-compatible" formats, so the rest of the header won't match what
would be expected.
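
For illustration only, here is a minimal sketch of the kind of BOM check I have in mind, using the constants from the standard codecs module. The function name sniff_bom and the returned labels are my own choice, not anything from PEP 263 or the tokenizer. The UTF-32 marks are tested before the UTF-16 ones because the little-endian UTF-32 mark begins with the same two bytes as the little-endian UTF-16 mark.

```python
import codecs

# Longer marks first: BOM_UTF32_LE starts with the bytes of BOM_UTF16_LE.
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]


def sniff_bom(data):
    """Return a codec name suggested by a leading BOM, or None if absent."""
    for mark, name in _BOMS:
        if data.startswith(mark):
            return name
    return None
```

One caveat: a UTF-16-LE file whose first character is U+0000 would be misread as UTF-32-LE by this check, which is unlikely in source code but worth knowing about.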

-- 
Richard Damon



Re: Valid encodings for a Python source file

2018-06-08 Thread Terry Reedy

On 6/7/2018 4:40 PM, Daniel Glus wrote:

> I'm trying to figure out the entire list of possible encodings for a Python
> source file - that is, encodings that can go in a PEP 263 encoding
> specification, like # -*- encoding: foo -*-.


For new code for python 3, don't use an encoding cookie.  Use an editor 
that can save in utf-8 and tell it to do so if it does not do so by default.
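
If you want to see what Python 3 will assume for a given file, the standard tokenize module exposes the detection step. The following is just a small sketch; the helper name source_encoding and the sample byte strings are made up for the example.

```python
import codecs
import io
import tokenize


def source_encoding(source_bytes):
    """Return the encoding name tokenize reports for the given source bytes."""
    readline = io.BytesIO(source_bytes).readline
    encoding, _lines_read = tokenize.detect_encoding(readline)
    return encoding


# No cookie and no BOM: the default applies.
print(source_encoding(b"print('hi')\n"))                    # utf-8
# A UTF-8 BOM is recognised as well.
print(source_encoding(codecs.BOM_UTF8 + b"print('hi')\n"))  # utf-8-sig
# An explicit cookie overrides the default.
print(source_encoding(b"# -*- coding: latin-1 -*-\nx = 1\n"))
```

With no cookie the answer is already utf-8, so a file saved as UTF-8 needs nothing extra.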


--
Terry Jan Reedy



Re: Valid encodings for a Python source file

2018-06-07 Thread Ben Finney via Python-list
Daniel Glus writes:

> I'm trying to figure out the entire list of possible encodings for a Python
> source file - that is, encodings that can go in a PEP 263 encoding
> specification, like # -*- encoding: foo -*-.

What if the answer is not an enumerated set of encodings? That is, I am
pretty sure the set isn't specified, to allow the encoding to be
negotiated. Whatever the interpreter recognises as an encoding can be
the encoding of the source.

So, I guess that leads to the question: Why do you need it to be an
exhaustive set (rather than deliberately unspecified)? What are you
hoping to do with that information?

-- 
 \   “Good design adds value faster than it adds cost.” —Thomas C. |
  `\  Gale |
_o__)  |
Ben Finney
