[issue12632] Python 3 doesn't support cp65001 as the OEM code page

2011-09-30 Thread Bruce Ferris

Bruce Ferris  added the comment:

The PYTHONIOENCODING=utf-8 setting works great if I have code page 65001 set.  
I haven't, however, done a complete console functionality check with that 
setting but, thanks for the input -- it solves the current problem I'm 
experiencing.

I do wonder, however, if switching to that setting should happen automatically 
if it's not specified and the Windows current code page is 65001.

That would solve the problem unless, of course, PYTHONIOENCODING has 
side-effects elsewhere that would cause other problems.

On the other hand, if it does have side-effects elsewhere, then it's not the 
answer I'm looking for.
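
To make the suggestion concrete, here is a minimal sketch of the selection rule I have in mind. The helper name pick_io_encoding and its arguments are hypothetical. This is not how Python actually chooses its stream encoding, only an illustration of "use UTF-8 automatically when the code page is 65001 and PYTHONIOENCODING is not specified".

```python
import os

UTF8_CODE_PAGE = 65001  # Microsoft's code page identifier for UTF-8


def pick_io_encoding(console_code_page, environ=None):
    """Return an encoding name for the standard streams (hypothetical helper).

    An explicit PYTHONIOENCODING always wins. Otherwise code page 65001
    maps to "utf-8" and any other code page maps to "cp<number>".
    """
    environ = os.environ if environ is None else environ
    explicit = environ.get("PYTHONIOENCODING", "")
    # The variable may look like "encoding:errors"; keep only the encoding.
    encoding = explicit.split(":", 1)[0]
    if encoding:
        return encoding
    if console_code_page == UTF8_CODE_PAGE:
        return "utf-8"
    return "cp%d" % console_code_page
```

Because an explicit setting is checked first, anyone who already sets PYTHONIOENCODING would see no change in behaviour under this rule.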

--

___
Python tracker 
<http://bugs.python.org/issue12632>
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue12632] Windows GPF with Code Page 65001

2011-07-26 Thread Bruce Ferris

Bruce Ferris  added the comment:

Victor, thanks for replying and I've had a quick read of everything that went 
on for issue #1602.  I think there's some misunderstanding in what I'm saying 
here.  Maybe this will help clear up what I'm saying...

  D:\>chcp
  Active code page: 850

  D:\>chcp 65001
  Active code page: 65001

  D:\>python27\python
  Python 2.7 (r27:82525, Jul  4 2010, 09:01:59) [MSC v.1500 32 bit 
  (Intel)] on win32
  Type "help", "copyright", "credits" or "license" for more information.
  >>> ^Z

  D:\>python31\python
  Fatal Python error: Py_Initialize: can't initialize sys standard
  streams
  LookupError: unknown encoding: cp65001

  This application has requested the Runtime to terminate it in an
  unusual way.
  Please contact the application's support team for more information.

  D:\>chcp 850
  Active code page: 850

  D:\>python31\python
  Python 3.1.2 (r312:79149, Mar 21 2010, 00:41:52) [MSC v.1500 32 bit
  (Intel)] on win32
  Type "help", "copyright", "credits" or "license" for more information.
  >>> ^Z

  D:\>

You see, I'm NOT trying to output any Unicode or UTF-8 characters.  All I'm 
trying to do is run different versions of Python on the same machine from the 
command line.

Some code inside Python now "breaks" if Python 3.1 is started with Code Page 
65001.

I fully understand the change between Python 2.7 and 3.1 was probably due to 
trying to fix issue #1602 (or some other related issue).

But, as a side-effect of that "fix", if you now start Python 3.1 (and maybe 
beyond) with the code page set to 65001, it refuses to work, whereas it used 
to work.

Evidently, Python now tries using the Code Page as an encoding lookup.  But, it 
didn't in 2.7.  So, there's another compatibility issue introduced.

Setting my cmd.exe code page to 65001 shouldn't mean a thing to Python if it 
can't associate it with an encoding.  It could, at least, just switch to 7-Bit 
ASCII and proceed on.  That would be better than failing!
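
A minimal sketch of the fallback I mean, written against the standard codecs module. The function name resolve_encoding is hypothetical and this is not what Python's startup code does. The example uses a deliberately made-up name, cp99999, because whether cp65001 resolves depends on the Python version.

```python
import codecs


def resolve_encoding(name, fallback="ascii"):
    """Return a usable codec name, falling back when `name` is unknown."""
    try:
        # codecs.lookup raises LookupError for an unknown encoding name.
        return codecs.lookup(name).name
    except LookupError:
        return fallback


# An unknown code page degrades to 7-bit ASCII instead of aborting.
stream_encoding = resolve_encoding("cp99999")
```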

That's my whole point.  If Python wants to do some tweaking with code pages to 
get its job done, that's fine by me, as long as it doesn't "break" and 
restores whatever code page I had set when I started it.

It's not down to a UTF-8 issue, it's about a compatibility issue introduced 
sometime in the last year or so as a side-effect of trying to resolve a UTF-8 
issue, probably #1602.

That's all!




[issue12632] Windows GPF with Code Page 65001

2011-07-25 Thread Bruce Ferris

Bruce Ferris  added the comment:

I disagree with the "it's not really a GPF since it calls Abort".

Consider the following cmd.exe session...

  Microsoft Windows [Version 6.0.6002]
  Copyright (c) 2006 Microsoft Corporation.  All rights reserved.

  D:\>chcp 65001
  Active code page: 65001

  D:\>python >t.txt
  Fatal Python error: Py_Initialize: can't initialize sys standard streams
  LookupError: unknown encoding: cp65001

  This application has requested the Runtime to terminate it in an unusual way.
  Please contact the application's support team for more information.

  D:\>type t.txt

  D:\>dir t.txt
  Volume in drive D is DATA
  Volume Serial Number is 2E61-626C

  Directory of D:\

  25/07/2011  06:10 PM                 0 t.txt
                 1 File(s)              0 bytes
                 0 Dir(s)  16,768,655,360 bytes free

  D:\>

This means that, even if it was "intentional", from a programmatic point of 
view, the Python process in this case leaves no indication other than 
transient bytes in the transient cmd.exe console buffer.  There is no way of 
redirecting the output and examining it.

I strongly disagree with the statement "(If it were a true segfault-like error, 
there would be no message from Python itself.)"

The "no message from Python itself" case is shown above.

My application handles code page 65001 just fine, no problems.  If it attempts 
to use Windows WriteConsole function and it fails, it tries using WriteFile 
instead.  So, when my application fails and output is redirected, it produces 
output.

But, Python 3.1 doesn't.  See the following Microsoft MSDN link, it states the 
WriteConsole point explicitly...

  http://msdn.microsoft.com/en-us/library/ms687401%28v=VS.85%29.aspx

So, if Python doesn't like Code Page 65001, for whatever reason, it can simply 
save it on startup, and change it to whatever makes it happy.  Then, upon 
Python exit (including Abort), change it back to 65001 before calling Abort.

I'm sorry, but the following is "easy" in my book...

  1) At Startup...  Call GetConsoleOutputCP and save that somewhere.
     If code page is 65001, change it to something that
     doesn't cause problems by calling SetConsoleOutputCP.

  2) On Write... If WriteConsole fails, try calling WriteFile instead.

  3) At Abort or Exit... Call SetConsoleOutputCP to set it back
     to whatever it was on Startup.
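
Steps 1 and 3 can be sketched in Python itself as a save-and-restore wrapper. This is only an illustration under stated assumptions: the names temporary_code_page and run_with_safe_code_page are hypothetical, the replacement code page 850 is just the one from my own console, and the Windows wiring assumes ctypes can reach GetConsoleOutputCP and SetConsoleOutputCP in kernel32. Step 2 (the WriteConsole/WriteFile fallback) is not covered, and a finally block runs on normal exit and on exceptions but not on a C-level abort.

```python
import contextlib


@contextlib.contextmanager
def temporary_code_page(get_cp, set_cp, unsupported=65001, replacement=850):
    """Swap an unsupported console code page and restore it afterwards."""
    saved = get_cp()              # step 1: remember the startup code page
    changed = saved == unsupported
    if changed:
        set_cp(replacement)       # step 1: switch to something harmless
    try:
        yield saved
    finally:
        if changed:
            set_cp(saved)         # step 3: put the original code page back


def run_with_safe_code_page(action):
    """Windows-only wiring of the helper to the real console API."""
    import ctypes

    kernel32 = ctypes.windll.kernel32
    with temporary_code_page(kernel32.GetConsoleOutputCP,
                             kernel32.SetConsoleOutputCP):
        return action()
```

Passing the getter and setter in as arguments keeps the save-and-restore logic separate from the Windows calls, so it can be exercised with stand-in functions on any platform.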

I don't care if your app (Python) can display UTF-8 on Microsoft's cmd.exe 
console or if it can't. 

All I'm trying to do is point out a bit of misbehaviour that CAN be easily 
changed and will make your product more resilient.

I don't know the details of how Python deals with character encoding and, quite 
honestly, I shouldn't need to since it's not my product.  However, I DO know 
how I handle a similar scenario in my own app.

Microsoft made it complicated, not me.  But, I can "easily" get around the 
problem using the above scenario.  If Python can't do it just as "easily", then 
it tells me more about Python's implementation and the people behind Python 
than it tells me about Microsoft and the people behind Windows.

Don't get me wrong, I love Python as a tool for solving certain classes of 
problems and, please, keep up the good work.  It's appreciated.




[issue12632] Windows GPF with Code Page 65001

2011-07-25 Thread Bruce Ferris

Bruce Ferris  added the comment:

I use code page 65001 because 1) it displays the UTF-8 characters in my text 
files with "echo " on the command line, and 2) that's Microsoft's 
"official" (whatever that means) code page for UTF-8, and 3) it works in 
cmd.exe.

Setting aside why I use it, it IS used by some, and Python shouldn't GPF for 
ANY reason if it can be easily fixed.  Right?

Essentially, 65001 makes Microsoft's console output behave properly (at least 
with the limited characters in Lucida Console) so I would think Python should 
consider not blowing up when it's set.  

To be honest, I just happened to have it set to 65001 to get the output from 
another program to look right and just happened to run Python to do some quick 
unrelated calculations.

Imagine my surprise when Python blew up, especially when all I did was to run it.  
It's not like I asked it to do any UTF-8 or anything!

Anyway, as far as I understand...  Any GPF is a potential back door.  So, it 
needs closing.




[issue12632] Windows GPF with Code Page 65001

2011-07-24 Thread Bruce Ferris

Changes by Bruce Ferris :


--
type:  -> crash




[issue12632] Windows GPF with Code Page 65001

2011-07-24 Thread Bruce Ferris

New submission from Bruce Ferris :

The following scenario GPFs on Windows Vista using cmd.exe...

  D:\>python
  Python 3.1.2 (r312:79149, Mar 21 2010, 00:41:52) [MSC v.1500 32 bit
   (Intel)] on win32
  Type "help", "copyright", "credits" or "license" for more information.
  >>> ^Z

  D:\>chcp 65001
  Active code page: 65001

  D:\>python
  Fatal Python error: Py_Initialize: can't initialize sys standard
   streams
  LookupError: unknown encoding: cp65001

  This application has requested the Runtime to terminate it in an 
  unusual way.
  Please contact the application's support team for more information.

  D:\>

This is a bit surprising since Code Page 65001 IS the official Microsoft UTF-8 
Code Page.

Please see...

  http://msdn.microsoft.com/en-us/library/dd317756%28v=vs.85%29.aspx

--
components: Unicode
messages: 141067
nosy: bferris57
priority: normal
severity: normal
status: open
title: Windows GPF with Code Page 65001
versions: Python 3.1
