[issue1602] windows console doesn't print or input Unicode

2012-05-19 Thread David-Sarah Hopwood

David-Sarah Hopwood david-sa...@jacaranda.org added the comment:

Giampaolo: See #msg120700 for why that won't work, and the subsequent comments 
for what will work instead (basically, using WriteConsoleW and a workaround for 
a Windows API bug). Also see the prototype win_console.patch from Victor 
Stinner: #msg145963

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1602
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1602] windows console doesn't print or input Unicode

2011-03-26 Thread David-Sarah Hopwood

David-Sarah Hopwood david-sa...@jacaranda.org added the comment:

Glenn wrote:
> So if flush checks that bit, maybe TextIOWriter could just call buffer.flush,
> and it would be fast if clean and slow if dirty?

Yes. I'll benchmark how much overhead is added by the calls to flush; there's 
no point in breaking the abstraction boundary of BufferedWriter if it doesn't 
give a significant performance benefit. (I suspect that it might not, because 
Windows is very slow at scrolling a console, which might make the cost of 
flushing insignificant in comparison.)

--




[issue11395] print(s) fails on Windows with long strings

2011-03-26 Thread David-Sarah Hopwood

David-Sarah Hopwood david-sa...@jacaranda.org added the comment:

If I understand the bug in the Windows console functions correctly, a limit of 
32767 bytes might not always be small enough. The problem is that if two or 
more threads are concurrently using any console functions (which all use the 
same 64 KiB heap), they could try to allocate up to 32767 bytes plus overhead 
at the same time, which will fail.

I wasn't able to provoke this by writing to sys.stdout.buffer (maybe there is 
locking that prevents concurrent writes), but the following code that calls 
WriteFile directly, does provoke it. GetLastError() returns 8 
(ERROR_NOT_ENOUGH_MEMORY; see 
http://msdn.microsoft.com/en-us/library/ms681382%28v=vs.85%29.aspx), indicating 
that it's the same bug.


# Warning: this test may DoS your system.

from threading import Thread
import sys
from ctypes import WINFUNCTYPE, windll, POINTER, byref, c_int
from ctypes.wintypes import BOOL, HANDLE, DWORD, LPVOID, LPCVOID

GetStdHandle = WINFUNCTYPE(HANDLE, DWORD)(("GetStdHandle", windll.kernel32))
WriteFile = WINFUNCTYPE(BOOL, HANDLE, LPCVOID, DWORD, POINTER(DWORD), LPVOID) \
                (("WriteFile", windll.kernel32))
GetLastError = WINFUNCTYPE(DWORD)(("GetLastError", windll.kernel32))
STD_OUTPUT_HANDLE = DWORD(-11)
INVALID_HANDLE_VALUE = DWORD(-1).value

hStdout = GetStdHandle(STD_OUTPUT_HANDLE)
assert hStdout is not None and hStdout != INVALID_HANDLE_VALUE

L = 32760
data = b'a'*L

def run():
    n = DWORD(0)
    while True:
        ret = WriteFile(hStdout, data, L, byref(n), None)
        if ret == 0 or n.value != L:
            print(ret, n.value, GetLastError())
            sys.exit(1)

[Thread(target=run).start() for i in range(10)]

--
nosy: +davidsarah

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11395
___



[issue1602] windows console doesn't print or input Unicode

2011-03-25 Thread David-Sarah Hopwood

David-Sarah Hopwood david-sa...@jacaranda.org added the comment:

First a minor correction:
> The new requirement would be that a correct app also needs to flush between a
> sequence of buffer.writes (that end in an incomplete line, or always if
> PYTHONUNBUFFERED or python -u is used), and a sequence of writes.

That should be "and only if PYTHONUNBUFFERED or python -u is not used".

I also said:
> If an app sets the .buffer attribute of sys.std{out,err}, it would fall back
> to using that buffer in the same way as when the fd is redirected.

but the .buffer attribute is readonly, so this case can't occur.

Glenn Linderman wrote:
> Would it suffice if the new scheme internally flushed after every
> buffer.write?  It wouldn't be needed after write, because the correct
> application would already do one there?

Yes, that would be sufficient.

> Am I off-base in supposing that the performance of buffer.write is expected
> to include a flush (because it isn't expected to be buffered)?

It is expected to be line-buffered. So an app might expect that printing 
characters one-at-a-time will have reasonable performance.

In any case, given that the buffer of the initial std{out,err} will always be a 
BufferedWriter object (since .buffer is readonly), it would be possible for the 
TextIOWrapper to test a dirty flag in the BufferedWriter, in order to check 
efficiently whether the buffer needs flushing on each write. I've looked at the 
implementation complexity cost of this, and it doesn't seem too bad.
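
The dirty-flag idea can be sketched in pure Python. This is only an illustration under stated assumptions: DirtyTrackingBuffer and ConsoleText are hypothetical names, the console is simulated by a plain list, and the real change would live in the C implementations of BufferedWriter and TextIOWrapper.

```python
import io


class DirtyTrackingBuffer(io.BufferedWriter):
    """BufferedWriter that remembers whether it may hold unflushed bytes."""

    def __init__(self, raw):
        super().__init__(raw)
        self.dirty = False

    def write(self, data):
        written = super().write(data)
        self.dirty = True           # conservative: set on every write
        return written

    def flush(self):
        super().flush()
        self.dirty = False


class ConsoleText:
    """Hypothetical text layer that flushes the byte buffer only when dirty."""

    def __init__(self, buffer, console):
        self.buffer = buffer
        self.console = console      # stand-in for WriteConsoleW: a list of str
        self.flush_calls = 0

    def write(self, text):
        if self.buffer.dirty:       # cheap attribute test on every write
            self.buffer.flush()     # pay for a flush only when it is needed
            self.flush_calls += 1
        self.console.append(text)
        return len(text)
```

With this arrangement a run of text writes costs one attribute test each, and a flush happens only after the app has written bytes through the buffer object.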

A similar issue arises for stdin: to maintain strict compatibility, every read 
from a TextIOWrapper attached to an input console would have to drain the 
buffer of its buffer object, in case the app has read from it. This is a bit 
tricky because the bytes drained from the buffer have to be converted to 
Unicode, so what happens if they end part-way through a multibyte character? 
Ugh, I'll have to think about that one.
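
One possible way to handle bytes that end part-way through a multibyte character (not something this thread settled on) is an incremental decoder, which holds back an incomplete trailing sequence until the rest arrives. A minimal sketch, where drain_as_text is a hypothetical helper name:

```python
import codecs


def drain_as_text(chunks, encoding="utf-8"):
    """Decode byte chunks that may split a multibyte character."""
    decoder = codecs.getincrementaldecoder(encoding)()
    pieces = []
    for chunk in chunks:
        # An incomplete trailing sequence stays inside the decoder and is
        # completed by the bytes at the start of the next chunk.
        pieces.append(decoder.decode(chunk))
    # final=True raises UnicodeDecodeError if bytes are still pending.
    pieces.append(decoder.decode(b"", final=True))
    return pieces


encoded = "é".encode("utf-8")      # two bytes: b"\xc3\xa9"
print(drain_as_text([b"caf" + encoded[:1], encoded[1:] + b"!"]))
```

The first chunk yields only "caf"; the "é" appears once its second byte has been seen.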

Victor STINNER wrote:
> Some developers already think that adding sys.stdout.flush() after
> print("Processing.. ", end='') is too hard (#11633).

IIUC, that bug is about the behaviour of 'print', and didn't suggest changing 
the fact that sys.stdout is line-buffered.


By the way, are these changes going to be in a major release? If I understand 
correctly, the layout of structs (for standard library types not prefixed with 
'_', such as 'buffered' in bufferedio.c or 'textio' in textio.c) can change 
with major releases but not with minor releases, correct?

--




[issue1602] windows console doesn't print or input Unicode

2011-03-25 Thread David-Sarah Hopwood

David-Sarah Hopwood david-sa...@jacaranda.org added the comment:

I wrote:
> A similar issue arises for stdin: to maintain strict compatibility, every
> read from a TextIOWrapper attached to an input console would have to drain
> the buffer of its buffer object, in case the app has read from it. This is a
> bit tricky because the bytes drained from the buffer have to be converted to
> Unicode, so what happens if they end part-way through a multibyte character?
> Ugh, I'll have to think about that one.

It seems like there is no correct way for an app to read from both sys.stdin, 
and sys.stdin.buffer (even without these console changes). It must choose one 
or the other.

--




[issue1602] windows console doesn't print or input Unicode

2011-03-24 Thread David-Sarah Hopwood

David-Sarah Hopwood david-sa...@jacaranda.org added the comment:

I wrote:
> The only caveat would be that if you write a partial line to the buffer
> object (or if you set the buffer object to be fully buffered and write to
> it), and then write to the text stream, the buffer wouldn't be flushed before
> the text is written.

Actually it looks like that already happens (because the sys.std{out,err} 
TextIOWrappers are line-buffered separately to their underlying buffers), so it 
would not be an incompatibility:

$ python3 -c 'import sys; sys.stdout.write("foo"); 
sys.stdout.buffer.write(b"bar"); sys.stdout.write("baz\n")'
barfoobaz
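
The same ordering can be reproduced without a terminal. The sketch below is an illustration only: it substitutes an in-memory BytesIO for the real stdout, interleave is a hypothetical helper name, and the unflushed result relies on how CPython's TextIOWrapper holds pending text.

```python
import io


def interleave(flush_between):
    """Write "foo" (text), b"bar" (bytes), "baz\\n" (text); return raw bytes."""
    raw = io.BytesIO()
    buffer = io.BufferedWriter(raw)
    text = io.TextIOWrapper(buffer, encoding="utf-8", newline="\n",
                            line_buffering=True)
    text.write("foo")            # held in the text layer: no newline yet
    if flush_between:
        text.flush()
    buffer.write(b"bar")         # goes straight into the byte buffer
    if flush_between:
        buffer.flush()
    text.write("baz\n")          # newline pushes the pending text down
    text.flush()
    return raw.getvalue()


print(interleave(flush_between=False))
print(interleave(flush_between=True))
```

On CPython the first call is expected to give the "barfoobaz" ordering shown above, and the second gives "foobarbaz", which is why a correct application flushes whenever it switches between the two layers.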

--




[issue1602] windows console doesn't print or input Unicode

2011-03-24 Thread David-Sarah Hopwood

David-Sarah Hopwood david-sa...@jacaranda.org added the comment:

I wrote:
> $ python3 -c 'import sys; sys.stdout.write("foo");
> sys.stdout.buffer.write(b"bar"); sys.stdout.write("baz\n")'
> barfoobaz

Hmm, the behaviour actually would differ here: the proposed implementation 
would print

foobaz
bar

(the "foobaz\n" is written by a call to WriteConsoleW and then the "bar" gets 
flushed to stdout when the process exits).

But since the naive expectation is "foobarbaz\n" and you already have to flush 
after each call in order to get that, I think this change in behaviour would be 
unlikely to affect correct applications.

--




[issue1602] windows console doesn't print or input Unicode

2011-03-24 Thread David-Sarah Hopwood

David-Sarah Hopwood david-sa...@jacaranda.org added the comment:

Glenn Linderman wrote:
> Presently, a correct application only needs to flush between a sequence of
> writes and a sequence of buffer.writes.

Right. The new requirement would be that a correct app also needs to flush 
between a sequence of buffer.writes (that end in an incomplete line, or always 
if PYTHONUNBUFFERED or python -u is used), and a sequence of writes.

> Don't assume the flush happens after every write, for a correct application.

It's rather hard to implement this without any change in behaviour. Or rather, 
it isn't hard if the TextIOWrapper were to flush its underlying buffer before 
each time it writes to the console, but I'd be concerned about the extra 
overhead of that call. I'd prefer not to do that unless the new requirement 
above leads to incompatibilities in practice.

--




[issue1602] windows console doesn't print or input Unicode

2011-03-22 Thread David-Sarah Hopwood

David-Sarah Hopwood david-sa...@jacaranda.org added the comment:

(For anyone wondering about the hold-up on this bug, I ended up switching to 
Ubuntu. Not to worry, I now have Python 3 building in XP under VirtualBox -- 
which is further than I ever got with my broken Vista install :-/ It seems to 
behave identically to native XP as far as this bug is concerned.)

Victor STINNER wrote:
> The question is now how to integrate WriteConsoleW() into Python without
> breaking the API, for example:
> - Should sys.stdout be a TextIOWrapper or not?

It pretty much has to be a TextIOWrapper for compatibility. Also it's easier to 
implement it that way, because the text stream object has to be able to fall 
back to using the buffer if the fd is redirected.

> - Should sys.stdout.fileno() returns 1 or raise an error?

Return sys.stdout.buffer.fileno(), which is 1 unless redirected.
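
A rough sketch of the shape being described: a text stream that keeps its buffer, reports the buffer's fileno(), and uses the console path only when attached to a console. The class name ConsoleOrBufferWriter and the write_console callable (standing in for a WriteConsoleW wrapper) are assumptions made for illustration, not the proposed patch.

```python
class ConsoleOrBufferWriter:
    """Hypothetical text stream: console API when attached, else the buffer."""

    def __init__(self, buffer, write_console=None, encoding="utf-8"):
        self.buffer = buffer                  # always present, as today
        self._write_console = write_console   # None means the fd is redirected
        self._encoding = encoding

    def fileno(self):
        # Same answer whether or not the console path is in use.
        return self.buffer.fileno()

    def write(self, text):
        if self._write_console is not None:
            self._write_console(text)         # stand-in for WriteConsoleW
        else:
            self.buffer.write(text.encode(self._encoding))
        return len(text)
```

From the application's point of view the fd is unchanged in both cases; only the route the characters take differs.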

This is the Right Thing because in Windows, fds are an abstraction of the C 
runtime library, and the C runtime allows an fd to be associated with a 
console. In that case, from the application's point of view it is still writing 
to the same fd. In fact, we'd be implementing this by calling the WriteConsoleW 
win32 API directly in order to avoid bugs in the CRT's Unicode support, but 
that's an implementation detail.

> - What about sys.stdout.buffer: should sys.stdout.buffer.write() calls
> WriteConsoleA() or sys.stdout should not have a buffer attribute?

I was thinking that sys.std{out,err}.buffer would still be set up exactly as 
they are now. Then if an app writes to that buffer, it will get interleaved 
with any writes via the text stream. (The writes to the buffer go to the 
underlying fd, which probably ends up calling WriteFile at the win32 level.)

> I think that many modules and programs now rely on sys.stdout.buffer to write
> directly bytes into stdout. There is at least python -m base64.

That would just work. The only caveat would be that if you write a partial line 
to the buffer object (or if you set the buffer object to be fully buffered and 
write to it), and then write to the text stream, the buffer wouldn't be flushed 
before the text is written. I think that is fine as long as it is documented.

If an app sets the .buffer attribute of sys.std{out,err}, it would fall back to 
using that buffer in the same way as when the fd is redirected.

> - Should we use ReadConsoleW() for stdin?

Yes. I'll probably start with a patch that just handles std{out,err}, though.

--




[issue1602] windows console doesn't print or input Unicode

2011-02-02 Thread David-Sarah Hopwood

David-Sarah Hopwood david-sa...@jacaranda.org added the comment:

Feedback from Julie Solon of Microsoft:

> These console functions share a per-process heap that is 64K. There is some
> overhead, the heap can get fragmented, and calls from multiple threads all
> affect how much is available for this buffer.
>
> I am working to update the documentation for this function [WriteConsoleW]
> and other affected functions with information along these lines, and will
> post it within the next week or two.

I replied thanking her and asking for clarification:

When you say that the heap can get fragmented, is this true only when
there are concurrent calls to the console functions, or can it occur
even with single-threaded use? I'm trying to determine whether acquiring
a process-global lock while calling these functions would be sufficient
to ensure that the available heap space will not be unexpectedly low.
(This assumes that the functions are not used outside the lock by other
libraries in the same process.)

ReadConsoleW seems also to be affected, incidentally.

I've asked for clarification about whether acquiring a process-global lock when 
using these functions ...
Julie
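
For illustration, this is roughly what "acquiring a process-global lock while calling these functions" could look like. Whether such a lock is sufficient was still an open question at this point in the thread; locked_console_call and _console_lock are hypothetical names, and func stands in for a wrapper around WriteConsoleW or ReadConsoleW.

```python
import threading

_console_lock = threading.Lock()    # hypothetical process-global lock


def locked_console_call(func, *args):
    """Run one console API call while holding the process-global lock."""
    with _console_lock:
        return func(*args)
```

This only serialises callers that go through the helper; code elsewhere in the process that calls the console functions directly would bypass it, which is the assumption stated in the question above.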

--




[issue1602] windows console doesn't print utf8 (Py30a2)

2011-01-11 Thread David-Sarah Hopwood

Changes by David-Sarah Hopwood david-sa...@jacaranda.org:


--
nosy: +BreamoreBoy
versions: +Python 3.1, Python 3.2 -Python 3.3
Added file: http://bugs.python.org/file20360/doc-patch.diff




[issue1602] windows console doesn't print utf8 (Py30a2)

2011-01-11 Thread David-Sarah Hopwood

Changes by David-Sarah Hopwood david-sa...@jacaranda.org:


--
nosy: +BreamoreBoy
versions: +Python 3.1, Python 3.2 -Python 3.3
Added file: http://bugs.python.org/file20361/doc-patch.diff




[issue1602] windows console doesn't print utf8 (Py30a2)

2011-01-11 Thread David-Sarah Hopwood

Changes by David-Sarah Hopwood david-sa...@jacaranda.org:


Added file: http://bugs.python.org/file20362/doc-patch.diff




[issue1602] windows console doesn't print or input Unicode

2011-01-11 Thread David-Sarah Hopwood

Changes by David-Sarah Hopwood david-sa...@jacaranda.org:


--
title: windows console doesn't print utf8 (Py30a2) -> windows console doesn't 
print or input Unicode
Added file: http://bugs.python.org/file20363/doc-patch.diff




[issue1602] windows console doesn't print utf8 (Py30a2)

2011-01-10 Thread David-Sarah Hopwood

David-Sarah Hopwood david-sa...@jacaranda.org added the comment:

I'll have a look at the Py3k I/O internals and see what I can do.
(Reopening a bug appears to need Coordinator permissions.)

--




[issue1602] windows console doesn't print utf8 (Py30a2)

2011-01-10 Thread David-Sarah Hopwood

David-Sarah Hopwood david-sa...@jacaranda.org added the comment:

> The script unicode2.py uses the console STD_OUTPUT_HANDLE iff
> sys.stdout.fileno()==1.

You may have missed "if not_a_console(hStdout): real_stdout = False".
not_a_console uses GetFileType and GetConsoleMode to check whether that handle 
is directed to something other than a console.
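
The shape of that check, sketched so that it can be read and run on any platform. This is not the code from unicode2.py: the two Windows calls are passed in as plain callables (get_file_type and get_console_mode are assumed stand-ins for GetFileType and for "GetConsoleMode succeeded"), and only the constant values are taken from the Windows headers.

```python
FILE_TYPE_CHAR = 0x0002
FILE_TYPE_REMOTE = 0x8000
INVALID_HANDLE_VALUE = 0xFFFFFFFF    # same value as DWORD(-1).value


def not_a_console(handle, get_file_type, get_console_mode):
    """Return True unless `handle` looks like a console handle.

    get_file_type(handle) -> int       stand-in for GetFileType
    get_console_mode(handle) -> bool   True if GetConsoleMode succeeded
    """
    if handle is None or handle == INVALID_HANDLE_VALUE:
        return True
    if (get_file_type(handle) & ~FILE_TYPE_REMOTE) != FILE_TYPE_CHAR:
        return True                  # disk file, pipe, or unknown type
    # Any failure from GetConsoleMode is treated as "not a console".
    return not get_console_mode(handle)
```

On Windows the two callables would be thin ctypes wrappers around kernel32; here they are parameters so the decision logic can be exercised with fakes.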

> But is it always the case?

The technique used here for detecting a console is almost the same as the code 
for IsConsoleRedirected at 
http://blogs.msdn.com/b/michkap/archive/2010/05/07/10008232.aspx , or in 
WriteLineRight at 
http://blogs.msdn.com/b/michkap/archive/2010/04/07/9989346.aspx (I got it from 
that blog, can't remember exactly which page).

[This code will give a false positive in the strange corner case that 
stdout/stderr is redirected to a console *input* handle. It might be better to 
use GetConsoleScreenBufferInfo instead of GetConsoleMode, as suggested by 
http://stackoverflow.com/questions/3648711/detect-nul-file-descriptor-isatty-is-bogus/3650507#3650507
 .]

> What about pythonw.exe?

I just tested that, using pythonw run from cmd.exe with stdout redirected to a 
file; it works as intended. It also works (for both console and non-console 
cases) when the handles are inherited from a parent process.

Incidentally, what's the earliest supported Windows version for Py3k? I see 
that http://www.python.org/download/windows/ mentions Windows ME. I can fairly 
easily make it fall back to never using WriteConsoleW on Windows ME, if that's 
necessary.

--




[issue1602] windows console doesn't print utf8 (Py30a2)

2011-01-10 Thread David-Sarah Hopwood

David-Sarah Hopwood david-sa...@jacaranda.org added the comment:

Note: Michael Kaplan's code checks whether GetConsoleMode failed due to 
ERROR_INVALID_HANDLE. My code intentionally doesn't do that, because it is 
correct and conservative to fall back to the non-console behaviour when there 
is *any* error from GetConsoleMode. (It could also fail due to not having the 
GENERIC_READ right on the handle, for example.)

--




[issue5924] When setting complete PYTHONPATH on Python 3.x, paths in the PYTHONPATH are ignored

2011-01-10 Thread David-Sarah Hopwood

David-Sarah Hopwood david-sa...@jacaranda.org added the comment:

Looking at 
http://svn.python.org/view/python/branches/py3k/PC/getpathp.c?r1=73322&r2=73321&pathrev=73322
 , wouldn't it be better to add a Py_WGETENV function? There are likely to be 
other cases where that would be the correct thing to use.

--
nosy: +davidsarah

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue5924
___



[issue1602] windows console doesn't print utf8 (Py30a2)

2011-01-10 Thread David-Sarah Hopwood

David-Sarah Hopwood david-sa...@jacaranda.org added the comment:

... os.dup2() ...

Good point, thanks.

It would work to change os.dup2 so that if its second argument is 0, 1, or 2, 
it calls _get_osfhandle to get the Windows handle for that fd, and then reruns 
the console-detection logic. That would even allow Unicode output to work after 
redirection to a different console.

Programs that directly called the CRT dup2 or SetStdHandle would bypass this. 
Can we consider such programs to be broken? Methinks a documentation patch for 
os.dup2 would be sufficient, something like:

When fd1 refers to the standard input, output, or error handles (0, 1 and 2 
respectively), this function also ensures that state associated with Python's 
initial sys.{stdin,stdout,stderr} streams is correctly updated if needed. It 
should therefore be used in preference to calling the C library's dup2, or 
similar APIs such as SetStdHandle on Windows.

--




[issue1602] windows console doesn't print utf8 (Py30a2)

2011-01-09 Thread David-Sarah Hopwood

David-Sarah Hopwood david-sa...@jacaranda.org added the comment:

haypo wrote:
> davidsarah wrote:
>> It is certainly possible to write Unicode to the console
>> successfully using WriteConsoleW
>
> Did you try with characters not encodable to the code page and with
> characters that cannot be rendered by the font?

Yes, characters not encodable to the code page do work (as confirmed by Glenn 
Linderman, since code page 437 does not include Cyrillic).

Characters that cannot be rendered by the font print as missing-glyph boxes, as 
expected. They don't cause any other problem, and they can be cut-and-pasted to 
other Unicode-aware applications, showing up as the original characters.

> See msg120414 for my tests with WriteConsoleOutputW

Even if it handled encoding correctly, WriteConsoleOutputW 
(http://msdn.microsoft.com/en-us/library/ms687404%28v=vs.85%29.aspx) would not 
be the right API to use in any case, because it prints to a rectangle of 
characters without scrolling. WriteConsoleW does scroll in the same way that 
printing to a console output stream normally would. (Redirection to a 
non-console stream can be detected and handled differently, as the code in 
unicode2.py does.)

--




[issue410547] os.statvfs support for Windows

2011-01-09 Thread David-Sarah Hopwood

David-Sarah Hopwood david-sa...@jacaranda.org added the comment:

Don't use win32file.GetDiskFreeSpace; the underlying Windows API only supports 
drives up to 2 GB 
(http://blogs.msdn.com/b/oldnewthing/archive/2007/11/01/5807020.aspx). Use 
GetDiskFreeSpaceEx, as the code I linked to does.

I'm not sure it makes sense to provide an exact clone of os.statvfs, since some 
of the statvfs fields don't have equivalents that are obtainable by any Windows 
API as far as I know. What *would* make sense is a cross-platform way to 
get total disk space, and the space free for root/Administrator and for the 
current user. This would actually be somewhat easier to use on Unix as well.

Anyway, here's some code for Windows that only uses ctypes (whichdir should be 
Unicode):

from ctypes import WINFUNCTYPE, windll, POINTER, byref, c_ulonglong
from ctypes.wintypes import BOOL, DWORD, LPCWSTR

# http://msdn.microsoft.com/en-us/library/aa383742%28v=VS.85%29.aspx
PULARGE_INTEGER = POINTER(c_ulonglong)

# http://msdn.microsoft.com/en-us/library/aa364937%28VS.85%29.aspx
GetDiskFreeSpaceExW = WINFUNCTYPE(BOOL, LPCWSTR, PULARGE_INTEGER,
                                  PULARGE_INTEGER, PULARGE_INTEGER)(
    ("GetDiskFreeSpaceExW", windll.kernel32))

# http://msdn.microsoft.com/en-us/library/ms679360%28v=VS.85%29.aspx
GetLastError = WINFUNCTYPE(DWORD)(("GetLastError", windll.kernel32))

# (This might put up an error dialog unless
# SetErrorMode(SEM_FAILCRITICALERRORS | SEM_NOOPENFILEERRORBOX)
# has been called.)

n_free_for_user = c_ulonglong(0)
n_total = c_ulonglong(0)
n_free = c_ulonglong(0)
retval = GetDiskFreeSpaceExW(whichdir,
                             byref(n_free_for_user),
                             byref(n_total),
                             byref(n_free))
if retval == 0:
    raise OSError("Windows error %d attempting to get disk statistics for %r"
                  % (GetLastError(), whichdir))

free_for_user = n_free_for_user.value
total = n_total.value
free = n_free.value
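
For comparison, on Python 3.3 and later the standard library's shutil.disk_usage gives a cross-platform total and free figure, although it does not separate space free for root/Administrator from space free for the current user. A minimal sketch, where disk_space is a hypothetical wrapper name:

```python
import os
import shutil


def disk_space(path):
    """Return (total, free) in bytes for the filesystem containing `path`."""
    usage = shutil.disk_usage(path)    # named tuple: total, used, free
    return usage.total, usage.free


total, free = disk_space(os.getcwd())
print("total=%d free=%d" % (total, free))
```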

--
versions: +Python 2.7, Python 3.3 -Python 2.6

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue410547
___



[issue1602] windows console doesn't print utf8 (Py30a2)

2011-01-08 Thread David-Sarah Hopwood

David-Sarah Hopwood david-sa...@jacaranda.org added the comment:

It is certainly possible to write Unicode to the console successfully using 
WriteConsoleW. This works regardless of the console code page, including 65001. 
The code at 
http://tahoe-lafs.org/trac/tahoe-lafs/browser/src/allmydata/windows/fixups.py 
does so (it's for Python 2.x, but you'd be calling WriteConsoleW from C 
anyway).

WriteConsoleW has one bug that I know of, which is that it fails when writing 
more than 26608 characters at once 
(http://tahoe-lafs.org/trac/tahoe-lafs/ticket/1232). That's easy to work around 
by limiting the amount of data passed in a single call.
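
A minimal sketch of that workaround, under stated assumptions: write_console is a stand-in for a WriteConsoleW wrapper that returns the number of characters written, the 10000 limit is an arbitrary margin below the observed threshold, and characters outside the BMP (which occupy two UTF-16 units each) are not given special treatment.

```python
MAX_CHARS_PER_CALL = 10000    # assumed safety margin, well below 26608


def write_in_chunks(text, write_console, limit=MAX_CHARS_PER_CALL):
    """Pass `text` to `write_console` in pieces of at most `limit` characters.

    `write_console(piece)` must return how many characters it actually wrote.
    """
    remaining = text
    while remaining:
        piece = remaining[:limit]
        written = write_console(piece)
        if written <= 0:
            raise OSError("console write made no progress")
        remaining = remaining[written:]    # retry whatever was not written
```

Advancing by the reported count rather than by the chunk size also copes with a call that writes only part of what it was given.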

Fonts are not Python's problem, but encoding is. It doesn't make sense to fail 
to output the right characters just because some users might not have selected 
fonts that can display those characters. This bug should be reopened.

(For completeness, it is possible to display Unicode on the console using fonts 
other than Lucida Console and Consolas, but it requires a registry hack: 
http://stackoverflow.com/questions/878972/windows-cmd-encoding-change-causes-python-crash/3259271#3259271
 .)

--
nosy: +davidsarah




[issue1602] windows console doesn't print utf8 (Py30a2)

2011-01-08 Thread David-Sarah Hopwood

David-Sarah Hopwood david-sa...@jacaranda.org added the comment:

Glenn Linderman wrote:
> I skipped the unmangling of command-line arguments, because it produced an
> error I didn't understand, about needing a buffer protocol.

If I understand correctly, that part isn't needed on Python 3 because issue2128 
is already fixed there.

--




[issue2128] sys.argv is wrong for unicode strings

2011-01-08 Thread David-Sarah Hopwood

David-Sarah Hopwood david-sa...@jacaranda.org added the comment:

The following code is being used to work around this issue for Python 2.x in 
Tahoe-LAFS:

# This works around http://bugs.python.org/issue2128.
GetCommandLineW = WINFUNCTYPE(LPWSTR)(("GetCommandLineW", windll.kernel32))
CommandLineToArgvW = WINFUNCTYPE(POINTER(LPWSTR), LPCWSTR, POINTER(c_int)) \
                         (("CommandLineToArgvW", windll.shell32))

argc = c_int(0)
argv_unicode = CommandLineToArgvW(GetCommandLineW(), byref(argc))

argv = [argv_unicode[i].encode('utf-8') for i in range(0, argc.value)]

if not hasattr(sys, 'frozen'):
    # If this is an executable produced by py2exe or bbfreeze, then it will
    # have been invoked directly. Otherwise, unicode_argv[0] is the Python
    # interpreter, so skip that.
    argv = argv[1:]

    # Also skip option arguments to the Python interpreter.
    while len(argv) > 0:
        arg = argv[0]
        if not arg.startswith("-") or arg == "-":
            break
        argv = argv[1:]
        if arg == '-m':
            # sys.argv[0] should really be the absolute path of the module
            # source, but never mind
            break
        if arg == '-c':
            argv[0] = '-c'
            break

--
nosy: +davidsarah
versions: +Python 2.5, Python 2.6, Python 2.7

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue2128
___



[issue2128] sys.argv is wrong for unicode strings

2011-01-08 Thread David-Sarah Hopwood

David-Sarah Hopwood david-sa...@jacaranda.org added the comment:

Sorry, missed out the imports:

from ctypes import WINFUNCTYPE, windll, POINTER, byref, c_int
from ctypes.wintypes import LPWSTR, LPCWSTR

--




[issue410547] os.statvfs support for Windows

2011-01-06 Thread David-Sarah Hopwood

David-Sarah Hopwood david-sa...@jacaranda.org added the comment:

Is there a portable way to get the available disk space by now?

No, but 
http://tahoe-lafs.org/trac/tahoe-lafs/browser/src/allmydata/util/fileutil.py?rev=4894#L308
 might be helpful (uses pywin32).

--
nosy: +davidsarah




[issue6058] Add cp65001 to encodings/aliases.py

2010-10-23 Thread David-Sarah Hopwood

David-Sarah Hopwood david-sa...@jacaranda.org added the comment:

This problem causes {{{os.getcwdu()}}} to fail when the console code page is 
set to 65001 (always, I think):
{{{
t:\>ver

Microsoft Windows [Version 6.0.6002]

t:\>chcp
Active code page: 65001

t:\>python -c "import os; print os.getcwdu()"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
LookupError: unknown encoding: cp65001

t:\>chcp 1252
Active code page: 1252

t:\>python -c "import os; print os.getcwdu()"
t:\
}}}

Incidentally, I don't agree that this codepage needs to be distinguished from 
UTF-8. The deviations in the Microsoft codec are just their bugs. There is only 
one correct way to encode/decode UTF-8, and cp65001 is supposed to be UTF-8 
according to Microsoft (e.g. 
http://msdn.microsoft.com/en-us/library/86hf4sb8%28en-US,VS.80%29.aspx ).

--
nosy: +davidsarah

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue6058
___



[issue6058] Add cp65001 to encodings/aliases.py

2010-10-23 Thread David-Sarah Hopwood

David-Sarah Hopwood david-sa...@jacaranda.org added the comment:

I said: "There is only one correct way to encode/decode UTF-8." This is true 
modulo differences in the treatment of initial byte order marks.
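
The byte order mark difference can be seen with Python's own two UTF-8 codecs, which otherwise decode identically:

```python
data = b"\xef\xbb\xbfabc"    # UTF-8 encoded BOM followed by "abc"

print(repr(data.decode("utf-8")))        # BOM is kept as U+FEFF
print(repr(data.decode("utf-8-sig")))    # BOM is stripped
```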

--




[issue6058] Add cp65001 to encodings/aliases.py

2010-10-23 Thread David-Sarah Hopwood

David-Sarah Hopwood david-sa...@jacaranda.org added the comment:

I meant to say that the os.getcwdu() test in msg119440 was done with Windows 
native Python 2.6.2.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue6058
___



[issue6058] Add cp65001 to encodings/aliases.py

2010-10-23 Thread David-Sarah Hopwood

David-Sarah Hopwood david-sa...@jacaranda.org added the comment:

Oops, false alarm. python -c "import os; print repr(os.getcwdu())" works as 
expected, so the exception is part of issue 1602.

(My comment about there being no need to distinguish this codepage from UTF-8 
stands.)

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue6058
___



[issue7952] fileobject.c can switch between fread and fwrite without an intervening flush or seek, invoking undefined behaviour

2010-02-17 Thread David-Sarah Hopwood

New submission from David-Sarah Hopwood david-sa...@jacaranda.org:

The C standard (any version) and POSIX both say, in the description of fopen:
{{{
When a file is opened with update mode ( '+' as the second or third character 
in the mode argument), both input and output may be performed on the associated 
stream. However, the application shall ensure that output is not directly 
followed by input without an intervening call to fflush() or to a file 
positioning function ( fseek(), fsetpos(), or rewind()), and input is not 
directly followed by output without an intervening call to a file positioning 
function, unless the input operation encounters end-of-file.
}}}

Objects/fileobject.c makes calls to fread and fwrite without taking this into 
account. So calls from Python to the read or write methods of a file object 
opened in any update ("+") mode may invoke undefined behaviour. It isn't 
reasonable to rely on Python code to avoid this situation, even if it were 
considered acceptable in C. (Arguably this is a bug in the C standard, but it 
is unlikely to be fixed there or in POSIX, because of differences in 
philosophy about language safety.)

To fix this, fileobject.c should keep track of whether the last I/O operation 
was an input or output, and perform a call to fflush whenever an input follows 
an output or vice versa. This should not significantly affect performance in 
any case where the behaviour was previously defined (in cases where it wasn't, 
correctness trumps performance). fflush does not affect the file position and 
should have no other negative effect, because the stdio implementation is free 
to flush buffered data at any time (and certainly on I/O operations).

Despite the undefined behaviour, I don't currently know of a platform where 
this would lead to an exploitable security bug. I'm marking this issue as 
security-relevant anyway, because it may prevent analysing whether Python 
applications behave securely only on the basis of documented behaviour.

--
components: IO
messages: 99483
nosy: davidsarah
severity: normal
status: open
title: fileobject.c can switch between fread and fwrite without an intervening 
flush or seek, invoking undefined behaviour
type: security
versions: Python 2.7

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue7952
___



[issue7952] fileobject.c can switch between fread and fwrite without an intervening flush or seek, invoking undefined behaviour

2010-02-17 Thread David-Sarah Hopwood

David-Sarah Hopwood david-sa...@jacaranda.org added the comment:

Correction: when input is followed by output, the call needed to avoid 
undefined behaviour has to be to a file positioning function (fseek, fsetpos, 
or rewind, but not fflush). Since fileobject.c does not use wide I/O 
operations, it should be sufficient to use _portable_fseek(fp, 0, SEEK_CUR), 
i.e. a zero-offset seek relative to the current position, which leaves the 
file position unchanged.

(_portable_fseek may call some function that is not strictly defined to be a 
file positioning function, e.g. fseeko() or fseek64(). However, it would be 
insane for a stdio implementation not to treat those as being file positioning 
functions as far as the intent of the C or POSIX standards is concerned.)
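
The bookkeeping described above can be sketched in Python. This only illustrates the state tracking and is not a patch to fileobject.c. The class name DirectionTrackingFile and its sync_calls attribute are hypothetical, and Python 3's io module does not sit on top of stdio, so the calls below merely stand in for fflush and a zero-offset fseek.

```python
import os
import tempfile


class DirectionTrackingFile:
    """Hypothetical wrapper that remembers the direction of the last I/O call."""

    def __init__(self, raw):
        self._raw = raw
        self._last_op = None   # None, "read" or "write"
        self.sync_calls = []   # records which call was made on each switch

    def _switch_to(self, op):
        if self._last_op is not None and self._last_op != op:
            if self._last_op == "write":
                # Output followed by input: a flush is sufficient.
                self._raw.flush()
                self.sync_calls.append("flush")
            else:
                # Input followed by output: a positioning call is required;
                # a zero-offset relative seek keeps the position unchanged.
                self._raw.seek(0, os.SEEK_CUR)
                self.sync_calls.append("seek")
        self._last_op = op

    def read(self, size=-1):
        self._switch_to("read")
        return self._raw.read(size)

    def write(self, data):
        self._switch_to("write")
        return self._raw.write(data)

    def seek(self, offset, whence=os.SEEK_SET):
        # An explicit positioning call satisfies the rule in either direction.
        self._last_op = None
        return self._raw.seek(offset, whence)


with tempfile.TemporaryFile("w+b") as raw:
    f = DirectionTrackingFile(raw)
    f.write(b"hello world")
    f.seek(0)
    head = f.read(5)    # read after an explicit seek: nothing extra needed
    f.write(b"-")       # read -> write: zero-offset seek first
    tail = f.read()     # write -> read: flush first
    f.seek(0)
    whole = f.read()
    calls = list(f.sync_calls)
```

Resetting the state on an explicit seek avoids a redundant call when the caller has already repositioned the stream.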

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue7952
___