Eryk Sun added the comment:

For anyone who finds this old issue: I investigated this problem in the 
debugger in Windows 7. It turns out that the pager used by Python's help 
calls MultiByteToWideChar [1] with dwFlags passed as MB_PRECOMPOSED (1), 
which is forbidden for UTF-8. The error message is just a generic error that 
incorrectly assumes decoding the byte string failed due to running out of 
memory.

You may be happy to learn that this problem is fixed in Windows 10.


Here are a few snapshots from the debugger. The pager calls 
SetConsoleConversions from its init function, InitializeThings:

    Breakpoint 0 hit
    00000000`ff293058 48895c2408      mov     qword ptr [rsp+8],rbx
    0:000> g
    Breakpoint 2 hit
    000007fe`f6498934 8a05d6a80000    mov     al,byte ptr
                 (000007fe`f64a3210)] ds:000007f

This causes decoding byte strings to use the current console codepage instead 
of the system ANSI or OEM codepage. The intention here is to allow a user to 
correctly display a text file that's in a different encoding. The decoded text 
is written to the console as Unicode via WriteConsoleW.
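
To illustrate why the choice of codepage matters, here is a small Python 
sketch that decodes the same bytes three ways. The codec names are only 
stand-ins: cp437 and cp1252 are common OEM and ANSI codepages, but the actual 
ones depend on the system locale, and decode_with is a hypothetical helper, 
not anything from ulib.

```python
# The UTF-8 encoding of U+00E9 (e with acute accent): b"\xc3\xa9".
SAMPLE = "\u00e9".encode("utf-8")


def decode_with(data, codecs=("utf-8", "cp1252", "cp437")):
    """Decode the same bytes with several codecs (hypothetical helper)."""
    return {name: data.decode(name) for name in codecs}


if __name__ == "__main__":
    for name, text in decode_with(SAMPLE).items():
        # ascii() keeps the output printable on any console encoding.
        print(f"{name:8} {ascii(text)}")
```

Only the decoder that matches the file's real encoding yields the intended 
text; the other two produce mojibake.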

Here is the bad call where dwFlags (register rdx) is passed as MB_PRECOMPOSED 
(1), which is invalid for codepage 65001 (register rcx).

    0:000> g
    Breakpoint 1 hit
    000007fe`fd191f00 fff3            push    rbx
    0:000> ? @rcx
    Evaluate expression: 65001 = 00000000`0000fde9
    0:000> r rdx

In Windows 10 this argument is passed as 0, the correct value.
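
The flag rule can be sketched in Python. This is an illustration under 
assumptions, not the pager's or ulib's code: flags_valid_for_utf8 and 
required_length are hypothetical helper names. The rule encoded is the one 
documented for MultiByteToWideChar (for CP_UTF8, dwFlags must be 0 or 
MB_ERR_INVALID_CHARS, otherwise the call fails with ERROR_INVALID_FLAGS). The 
ctypes call is only attempted on Windows, and the result may differ between 
Windows versions.

```python
import sys

CP_UTF8 = 65001
MB_PRECOMPOSED = 0x00000001
MB_ERR_INVALID_CHARS = 0x00000008
ERROR_INVALID_FLAGS = 1004


def flags_valid_for_utf8(flags):
    """Model of the documented rule: only 0 or MB_ERR_INVALID_CHARS is allowed."""
    return flags in (0, MB_ERR_INVALID_CHARS)


def required_length(data, codepage, flags):
    """Windows only: ask MultiByteToWideChar for the required buffer length.

    Returns (length, last_error); length is 0 when the call fails.
    """
    import ctypes
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    mb2wc = kernel32.MultiByteToWideChar
    mb2wc.argtypes = (wintypes.UINT, wintypes.DWORD, wintypes.LPCSTR,
                      ctypes.c_int, wintypes.LPWSTR, ctypes.c_int)
    mb2wc.restype = ctypes.c_int
    # A NULL output buffer with size 0 requests the length only.
    n = mb2wc(codepage, flags, data, len(data), None, 0)
    return n, (ctypes.get_last_error() if n == 0 else 0)


if __name__ == "__main__" and sys.platform == "win32":
    print(required_length(b"spam", CP_UTF8, 0))
    print(required_length(b"spam", CP_UTF8, MB_PRECOMPOSED))
```

On Windows, comparing the two printed tuples shows whether the system rejects 
MB_PRECOMPOSED for codepage 65001.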

This problem occurs indirectly via a utility library named ulib.dll, which is 
used by Windows command-line utilities. It should only occur when console 
conversions are enabled. Otherwise ulib converts using the system OEM and ANSI 
codepages. I searched for other utilities that import SetConsoleConversions:

    C:\>for %f in (C:\Windows\system32\*.exe) do @(^
    More? dumpbin /imports "%f" | ^
    More? findstr SetConsoleConversions && echo %f)
               7FF713B8934   167 ?SetConsoleConversions@WSTRING@@SAXXZ
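
Where dumpbin is not available, a rough equivalent of the search can be 
sketched in Python. This is a crude heuristic rather than a real import-table 
parse: it only checks whether the symbol name appears anywhere in the file's 
bytes, so false positives are possible. files_mentioning is a hypothetical 
helper name.

```python
from pathlib import Path

SYMBOL = b"SetConsoleConversions"


def files_mentioning(directory, pattern="*.exe", symbol=SYMBOL):
    """Return names of files under directory whose raw bytes contain symbol."""
    hits = []
    for path in sorted(Path(directory).glob(pattern)):
        try:
            data = path.read_bytes()
        except OSError:
            continue  # unreadable file or directory: skip it
        if symbol in data:
            hits.append(path.name)
    return hits


if __name__ == "__main__":
    import sys
    # Pass the directory to scan as the first argument.
    if len(sys.argv) > 1:
        for name in files_mentioning(sys.argv[1]):
            print(name)
```

The decorated import name shown in the dumpbin output contains the plain 
symbol name as a substring, which is why a byte search can find it.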

I found that find.exe is also subject to this bug in Windows 7. It fails to 
print the result if the console is using codepage 65001:

    C:\Temp\test>type test

    C:\Temp\test>find /n "spam" *

    ---------- TEST

    C:\Temp\test>chcp 65001
    Active code page: 65001

    C:\Temp\test>find /n "spam" *

    ---------- TEST

This works correctly in Windows 10.

nosy: +eryksun
