> I see two main user-oriented use cases for the resulting Unicode
> strings this PEP will produce on all systems: displaying a list of
> filenames for the user to select from (an open file dialog), and
> allowing a user to edit or supply a filename (a save dialog or a
> rename control).

There are more, in particular the cases "user passes a file name on the
command line" and "web server passes URL in environment variable".

> It's clear what this PEP provides for the former. On well-behaved
> systems where a simpler filesystemencoding approach would work, the
> results are identical; the user can select filenames that are what he
> expects to see on both Unix and Windows. On less well-behaved systems,
> some characters may appear as junk in the middle of the name (or would
> they be invisible?)

Depends on the rendering. Try "print u'\udc00'" in your terminal to see
what happens; for me, it renders the glyph for "replacement character".
In GUI applications, you often see white boxes (rectangles).

> What I don't find clear is what the risks are for the latter. On the
> less well behaved system, a user may well attempt to use this python
> application to fix filenames. Can we estimate a likelihood that edits
> to the names would result in a Unicode string that can no longer be
> encoded with the python-escape? Will a new name fully provided by a
> user on his keyboard (ignoring copy and paste) almost always safely
> encode?

That very much depends on the system setup, and your impression is right
that the PEP doesn't address it - it only deals with cases where you get
random unsupported bytes; getting random unsupported characters from the
user is not considered.

If the user has the locale set up in a way that matches his keyboard, it
should all work fine - and will already, even without the PEP. If the
user enters a character that doesn't directly map to a good file name,
you get an exception, and have to tell the user to pick a different
filename.

Notice that it may fail at several layers:
- it may be that the characters entered are not supported in what Python
  chooses as the file system encoding.
- it may be that the characters are not supported by the file system,
  e.g. leading spaces in Win32.
- it may be that the file cannot be renamed because the target name
  already exists.

In all these cases, the application has to ask the user to reconsider;
for at least the last case, it should be prepared to do that, anyway
(there is also the case where renaming fails because of lack of
permissions; in that case, picking a different file name won't help).

Regards,
Martin
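
P.S. As a rough illustration of the failure layers listed above, here is
a minimal sketch of how a rename control might report each one. It is
only a sketch under assumptions: it is written for Python 3, it uses the
error handler name "surrogateescape" (the name used in released Python 3;
the thread above calls it "python-escape"), and the helper names
check_encodable and rename_with_feedback are hypothetical, not part of
the PEP or the standard library.

```python
import os
import sys


def check_encodable(name, encoding=None):
    """Return True if a user-typed name can be encoded for the file system.

    Hypothetical helper. Escaped bytes (lone surrogates U+DC80..U+DCFF)
    encode back to the original bytes; any other character that the
    encoding lacks raises UnicodeEncodeError.
    """
    encoding = encoding or sys.getfilesystemencoding()
    try:
        name.encode(encoding, "surrogateescape")
    except UnicodeEncodeError:
        return False
    return True


def rename_with_feedback(src, dst):
    """Try to rename src to dst and return (ok, message).

    Hypothetical helper: each failure layer gets its own message so the
    application can ask the user to reconsider.
    """
    # Layer 1: characters not supported by the file system encoding.
    if not check_encodable(dst):
        return False, "name cannot be represented in the file system encoding"
    # Layer 3: target already exists. On POSIX, os.rename would silently
    # replace the target, so check first (this check is racy by nature).
    if os.path.exists(dst):
        return False, "target name already exists"
    try:
        os.rename(src, dst)
    except PermissionError:
        # Picking a different file name will not help here.
        return False, "permission denied"
    except OSError as exc:
        # Layer 2: the file system itself rejected the name.
        return False, "file system rejected the name: %s" % exc
    return True, "renamed"
```

With an ASCII file system encoding, a name that merely carries an escaped
byte, such as "caf\udce9", still encodes, while the user's corrected
"caf\u00e9" does not; that is the situation asked about above.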