I have some suggestions. With ReadConsoleW, CPython can use the pInputControl parameter to set a CtrlWakeup mask. This enables a Unix-style Ctrl+D for ending a read without having to press enter. For example:

    >>> CTRL_MASK = 1 << 4
    >>> inctrl = (ctypes.c_ulong * 4)(16, 0, CTRL_MASK, 0)
    >>> _ = kernel32.ReadConsoleW(hStdIn, buf, 100, pn, inctrl); print()
    spam
    >>> buf.value
    'spam\x04'
    >>> pn[0]
    5

read() would have to manually replace '\x04' with NUL. Ctrl+Z can also be added to the mask:

    >>> CTRL_MASK = 1 << 4 | 1 << 26
    >>> inctrl = (ctypes.c_ulong * 4)(16, 0, CTRL_MASK, 0)
    >>> _ = kernel32.ReadConsoleW(hStdIn, buf, 100, pn, inctrl); print()
    spam
    >>> buf.value
    'spam\x1a'

I'd like a method to query, set and unset ENABLE_VIRTUAL_TERMINAL_PROCESSING mode for the screen buffer (sys.stdout and sys.stderr) without having to use ctypes. The console in Windows 10 has built-in VT100 emulation, but it's initially disabled. The cmd shell enables it, but Python scripts aren't always run from cmd.exe. Sometimes they're run in a new console from Explorer or via "start", etc. For example, IPython could check for this to provide more bells and whistles when PyReadline isn't installed.

Finally, functions such as WriteConsoleInputW and ReadConsoleOutputCharacter require opening CONIN$ or CONOUT$ with GENERIC_READ | GENERIC_WRITE access. The initial handles given to a console process have read-write access. For opening a new handle by device name, WindowsConsoleIO should first try GENERIC_READ | GENERIC_WRITE -- with a fallback to either GENERIC_READ or GENERIC_WRITE. The fallback is necessary for CON, which uses the desired access to determine whether to open the input buffer or screen buffer.

---

Paul, do you have example code that uses the 'raw' stream? Using the buffer should behave as it always has -- at least in this regard. sys.stdin.buffer requests a large block, such as 8 KB. But since the console defaults to a cooked mode (i.e. processed input and line input -- control keys, command-line editing, input history, and aliases), ReadConsole returns when enter is pressed or when interrupted.
It returns at least '\r\n', unless interrupted by Ctrl+C, Ctrl+Break or a custom CtrlWakeup key. However, if line-input mode is disabled, ReadConsole returns as soon as one or more characters are available in the input buffer.

As to kbhit() returning true, this does not mean that read(1) from console input won't block (not unless line-input mode is disabled). It does mean that getwch() won't block (note the "w" in there; this one reads Unicode characters). The CRT's conio functions (e.g. kbhit, getwch) put the console input buffer in a raw mode (e.g. ^C is read as '\x03' instead of generating a CTRL_C_EVENT) and call the lower-level functions PeekConsoleInputW (kbhit) and ReadConsoleInputW (getwch), to peek at and read input event records.

---

Splitting surrogate pairs across reads is a problem. Granted, this should rarely be an issue given the size of the reads that the buffer requests and the typical line length. In most cases the buffer consumes the entire line in one read. But in principle the raw stream shouldn't replace split surrogates with the U+FFFD replacement character. For example, with Steve's patch from issue 1602:

    >>> _ = write_console_input('\U00010000\r\n');\
    ... b1 = raw_read(4); b2 = raw_read(4); b3 = raw_read(8)
    𐀀
    >>> b1, b2
    (b'\xef\xbf\xbd', b'\xef\xbf\xbd')

Splitting UTF-8 sequences across writes is more common. Currently a raw write doesn't handle this correctly:

    >>> b = 'eggs \U00010000 spam\n'.encode('utf-8')
    >>> _ = raw_write(b[:6]); _ = raw_write(b[6:])
    eggs ���� spam

Also, the console is UCS-2, so it can hold lone surrogate codes, which can't be transcoded between UTF-16 and UTF-8. Supporting UCS-2 in the console would integrate nicely with the filesystem PEP. It makes it always possible to print os.listdir('.'), copy and paste, and read it back without data loss.

It would probably be simpler to use UTF-16 in the main pipeline and implement Martin's suggestion to mix in a UTF-8 buffer. The UTF-16 buffer could be renamed as "wbuffer", for expert use.
However, if you're fully committed to transcoding in the raw layer, I'm certain that these problems can be addressed with small buffers and using Python's codec machinery for a flexible mix of "surrogatepass" and "replace" error handling.
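
To make the codec-machinery idea concrete, here is a minimal sketch of how incremental decoders can carry a split UTF-8 sequence or a split surrogate pair over to the next call. The class names RawWriteTranscoder and RawReadTranscoder are hypothetical and only for illustration; this is not Steve's patch or anything in CPython, and it doesn't touch the console at all. It only shows the byte-level transcoding, with "replace" chosen for the write side and "surrogatepass" for the read side as one possible mix.

```python
import codecs


class RawWriteTranscoder:
    """Hypothetical helper: UTF-8 bytes in, UTF-16-LE bytes out.

    An incomplete trailing UTF-8 sequence is held back by the
    incremental decoder until a later feed() completes it.
    """

    def __init__(self):
        # 'replace' turns invalid bytes into U+FFFD instead of raising.
        self._decoder = codecs.getincrementaldecoder('utf-8')(errors='replace')

    def feed(self, data):
        text = self._decoder.decode(data, final=False)
        return text.encode('utf-16-le')


class RawReadTranscoder:
    """Hypothetical helper: UTF-16-LE bytes in, UTF-8 bytes out.

    A high surrogate at the end of a chunk is held back until the
    next chunk arrives. Lone surrogates that can never be paired are
    passed through via 'surrogatepass' rather than replaced.
    """

    def __init__(self):
        self._decoder = codecs.getincrementaldecoder('utf-16-le')(
            errors='surrogatepass')

    def feed(self, wide_bytes):
        text = self._decoder.decode(wide_bytes, final=False)
        return text.encode('utf-8', 'surrogatepass')


# Write side: split the 4-byte UTF-8 sequence after its first byte.
b = 'eggs \U00010000 spam\n'.encode('utf-8')
writer = RawWriteTranscoder()
first = writer.feed(b[:6])
second = writer.feed(b[6:])

# Read side: split the surrogate pair between two reads.
w = '\U00010000'.encode('utf-16-le')
reader = RawReadTranscoder()
r1 = reader.feed(w[:2])
r2 = reader.feed(w[2:])
```

A real raw layer would also need to size its buffers so that the held-back bytes are accounted for in the counts returned by read() and write(); the sketch leaves that out.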