Re: [Python-Dev] IDLE in the stdlib
Terry Reedy:
> Broken (and quirky): it has an absurdly limited output buffer (under a
> thousand lines)

The limit is actually lines.

> Quirky: Windows uses cntl-C to copy selected text to the clipboard and (where
> appropriate) cntl-V to insert clipboard text at the cursor pretty much
> everywhere.

CP uses Ctrl+C to interrupt programs, similar to Unix. Therefore it moves copy to a different key, in a similar way to Unix consoles like GNOME Terminal and MATE Terminal, which use Shift+Ctrl+C for copy despite Ctrl+C being the standard for other applications.

Neil
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] cffi in stdlib
Armin Rigo:
> Maybe. Feel like adding an issue to
> https://bitbucket.org/cffi/cffi/issues, with references?

OK, issue #62 added.

> This looks
> like a Windows-specific extension, which means that I don't
> automatically know about it.

While SAL is Windows-specific, gcc supports some similar attributes including nonnull and sentinel.

Neil
Re: [Python-Dev] cffi in stdlib
Armin Rigo:
> So the general answer to your question is: we google MessageBox and
> copy that line from the microsoft site, and manually remove the
> unnecessary WINAPI and _In_opt declarations:

Wouldn't it be better to understand the SAL annotations like _In_opt so that spurious NULLs (for example) produce a good exception from cffi instead of failing inside the system call?

Neil
Re: [Python-Dev] Cut/Copy/Paste items in IDLE right click context menu
Nick Coghlan:
> - no need for extensive cross-OS testing prior to commit, that's a key
> part of the role of the buildbots

Are the buildbots able to test UI features like menu selections?

Neil
Re: [Python-Dev] VS 11 Express is Metro only.
Curt:
>> But will it be able to target Windows XP?

It will likely be possible in a reasonable manner at some point. From http://blogs.msdn.com/b/visualstudio/archive/2012/05/18/a-look-ahead-at-the-visual-studio-11-product-lineup-and-platform-support.aspx :

"""C++ developers can also use the multi-targeting capability included in Visual Studio 11 to continue using the compilers and libraries included in Visual Studio 2010 to target Windows XP and Windows Server 2003. Multi-targeting for C++ applications currently requires a side-by-side installation of Visual Studio 2010. Separately, we are evaluating options for C++ that would enable developers to directly target XP without requiring a side-by-side installation of Visual Studio 2010 and intend to deliver this update post-RTM."""

Martin v. Löwis wrote:
> The only place where platform support matters is the CRT, and this is
> what I still want to test. E.g. it might be that the C RT works on XP,
> and the C++ RT might use newer API.

The C++ runtime is more dependent on post-XP features than the C runtime, but even the C runtime currently needs some thunks: http://tedwvc.wordpress.com/

Neil
Re: [Python-Dev] Python as a Metro-style App
Antoine Pitrou:
> How does it translate to C?

The simplest technique would be to use C++ code to bridge from C to the API. If you really wanted to, you could explicitly call the function pointer in the COM vtable, but doing COM in C is more effort than calling through C++.

> I'm not sure why "responsive user interfaces" would be more important
> today than 10 years ago, but at least I hope Microsoft has found
> something more usable than overlapped I/O.

They are more important now due to the use of phones and tablets together with distant file systems.

Neil
Re: [Python-Dev] Python as a Metro-style App
Antoine Pitrou:
> When you say MoveFile is absent, is MoveFileEx supported instead?

WinRT strongly prefers asynchronous methods for all lengthy operations. The most likely call to use for moving files is StorageFile.MoveAsync.
http://msdn.microsoft.com/en-us/library/windows/apps/br227219.aspx

> Depending on the extent of removed/disabled functionality, it might not
> be very interesting to have a Metro port at all.

Asynchronous APIs will become much more important on all platforms in the future to ensure responsive user interfaces. Python should not be left behind.

Neil
Re: [Python-Dev] Windows 8 support
Austin Fernandes:
> Which versions of python will be compatible with windows8. I am using
> currently 2.7.2 version.

Current releases of both Python 2.7 and Python 3.2 appear to run fine on the Windows 8 Developer Preview. You should download and install the preview to ensure that your own code is compatible.

Neil
Re: [Python-Dev] PEP 393 Summer of Code Project
Stephen J. Turnbull:
> ... Eg, this is why the common GUIs for Unix (X.org, GTK+, and
> Qt) either provide or require UTF-8 coding for their text.

Qt uses UTF-16 for its basic QString type. While QString is mostly treated as a black box which you can create from input buffers in any encoding, the only encoding allowed for a contents-by-reference QString (QString::fromRawData) is UTF-16.
http://doc.qt.nokia.com/latest/qstring.html#fromRawData

Neil
Re: [Python-Dev] PEP 393 Summer of Code Project
Glenn Linderman:
> How many different iterators into the same text would be concurrently needed
> by an application? And why? Seems like if it is dealing with text at the
> level of grapheme clusters, it needs that type of iterator. Of course, if
> it does I/O it needs codec access, but that is by nature sequential from the
> starting point to the end point.

I would expect that there would mostly be a single iterator into a string, but I can imagine scenarios in which multiple iterators are concurrently active and in which these are of different types. For example, say we wanted to search for each code point in a text that fails some test (such as being a member of a set of unwanted vowel diacritics) and then display that failure in context, with its surrounding text of up to 30 graphemes either side.

Neil
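A minimal sketch of the scenario above, in which a code point level search and a cluster level walk operate on the same string. It assumes a simplified notion of a cluster (a base code point plus any following combining marks), not the full Unicode grapheme cluster rules; the names `clusters` and `failures_in_context` are hypothetical.

```python
import unicodedata

def clusters(text):
    """Yield (start_index, cluster) pairs.

    Assumption: a cluster is one code point followed by any combining
    marks. Real grapheme cluster segmentation has more rules than this.
    """
    start = 0
    for i in range(1, len(text) + 1):
        if i == len(text) or unicodedata.combining(text[i]) == 0:
            yield start, text[start:i]
            start = i

def failures_in_context(text, unwanted, width=30):
    """Return (code_point_index, context) for each unwanted code point.

    The context holds up to `width` clusters on either side of the
    cluster that contains the failing code point.
    """
    spans = list(clusters(text))
    results = []
    for n, (start, cluster) in enumerate(spans):
        for offset, ch in enumerate(cluster):
            if ch in unwanted:
                lo = max(0, n - width)
                hi = min(len(spans), n + width + 1)
                context = "".join(c for _, c in spans[lo:hi])
                results.append((start + offset, context))
    return results

# U+0308 COMBINING DIAERESIS stands in for the "unwanted" diacritic.
hits = failures_in_context("abco\u0308def", {"\u0308"}, width=2)
```

Here the inner loop works in code points while the context window is measured in clusters, which is the kind of mixed access the message describes.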
Re: [Python-Dev] PEP 393 Summer of Code Project
Guido van Rossum:
> On Wed, Aug 31, 2011 at 5:58 PM, Neil Hodgson wrote:
>> [...] some text drawing engines draw decomposed characters ("o"
>> followed by " ̈" -> "ö") differently compared to their composite
>> equivalents ("ö") and this may be perceived as better or worse. I'd
>> like to offer an option to replace some decomposed characters with
>> their composite equivalent before drawing but since other characters
>> may look worse, I don't want to do a full normalization.
>
> Isn't this an issue properly solved by various normal forms?

No, since normalization of all cases may actually lead to worse visuals in some situations.

A potential reason for drawing decomposed characters differently is that more room may be allocated for the generic condition, where a character may be combined with a wide variety of accents, than for combining it with a specific accent.

Here is an example on Windows drawing composite and decomposed forms to show the types of difference often encountered.
http://scintilla.org/Composite.png

In the example, the decomposed 'o' is shorter and lighter and the umlauts are round instead of square. Now, this particular example displays both forms quite reasonably so would not justify special processing, but I have seen cases on other platforms and earlier versions of Windows where the umlaut in the decomposed form is displaced to the right, even to the extent of disappearing under the next character.

Neil
Re: [Python-Dev] PEP 393 Summer of Code Project
Glenn Linderman:
> That said, regexp, or some sort of cursor on a string, might be a workable
> solution. Will it have adequate performance? Perhaps, at least for some
> applications. Will it be as conceptually simple as indexing an array of
> graphemes? No. Will it ever reach the efficiency of indexing an array of
> graphemes? No. Does that matter? Depends on the application.

Using an iterator for cluster access is a common technique currently. For example, with the Pango text layout and drawing library, you may create a PangoLayoutIter over a text layout object (which contains a UTF-8 string along with formatting information) and iterate by clusters by calling pango_layout_iter_next_cluster. Direct access to clusters by index is not as useful in this domain as access by pixel positions - for example to examine the portion of a layout visible in a window.
http://developer.gnome.org/pango/stable/pango-Layout-Objects.html#pango-layout-get-iter

In this API, 'index' is used to refer to a byte index into UTF-8, not a character or cluster index.

Rather than discuss functionality in the abstract, we need some use cases involving different levels of character and cluster access to see whether providing indexed access is worthwhile. I'll start with an example: some text drawing engines draw decomposed characters ("o" followed by " ̈" -> "ö") differently compared to their composite equivalents ("ö") and this may be perceived as better or worse. I'd like to offer an option to replace some decomposed characters with their composite equivalent before drawing but since other characters may look worse, I don't want to do a full normalization.

The API style that appears most useful for this example is an iterator over the input string that yields composed and decomposed character strings (that is, it will yield both "ö" and "ö"). Each character string is then converted if it is in a substitution dictionary and written to an output string. This is similar to an iterator over grapheme clusters although, since it is only aimed at composing sequences, the iterator could be simpler than a full grapheme cluster iterator.

One of the benefits of iterator access to text is that many different iterators can be built without burdening the implementation object with extra memory costs, as would be likely with techniques that build indexes into the representation.

Neil
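The iterator plus substitution dictionary described above could be sketched as follows. This is an illustration under stated assumptions, not an existing API: `combining_sequences`, `compose_selected` and `SUBSTITUTIONS` are hypothetical names, and a sequence is taken to be one code point followed by any combining marks.

```python
import unicodedata

def combining_sequences(text):
    """Yield each base code point together with its combining marks.

    Precomposed characters such as U+00F6 come out as one-character
    strings; decomposed ones come out as multi-character strings.
    """
    start = 0
    for i in range(1, len(text) + 1):
        if i == len(text) or unicodedata.combining(text[i]) == 0:
            yield text[start:i]
            start = i

def compose_selected(text, substitutions):
    """Replace only the sequences listed in `substitutions`."""
    return "".join(substitutions.get(seq, seq)
                   for seq in combining_sequences(text))

# Hypothetical table: compose o + diaeresis, leave everything else alone.
SUBSTITUTIONS = {"o\u0308": "\u00f6"}

selective = compose_selected("zo\u0308e\u0301", SUBSTITUTIONS)
full_nfc = unicodedata.normalize("NFC", "zo\u0308e\u0301")
```

`selective` keeps the decomposed "e" plus acute accent untouched, whereas `full_nfc` composes both characters, which is the full normalization the message wants to avoid.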
Re: [Python-Dev] The socket HOWTO
Antoine Pitrou:
> So what you're saying is that the text is mostly useless (or at least
> quite dispensable), but you think it's fine that people waste their
> time trying to read it?

I found it useful when starting to write socket code. Later on I learnt more but, as an introduction, this document was great. It is written in an approachable manner and doesn't spend time on details unimportant to initial understanding.

Neil
Re: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII
Michael Urman:
> That screenshot seems to show UTF-8 is being used. This may just be
> the literal bytes in the .c file, but could it be something more
> dependable?

The file is in UTF-8 so the compiler may just be copying the bytes. There is a setlocale pragma but that seems to be just for string literals.

Neil
Re: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII
Michael Urman:
> I'm not convinced this is correct for this case. GetProcAddress takes
> an "ANSI" string, meaning while it could theoretically use UTF-8, in
> practice I doubt it uses anything outside of ASCII safely. So while
> the name of the library would be encoded in UTF-16, the name of the
> function loaded from the library would not be.

Yes, you are right:
http://scintilla.org/NarrowName.png

Neil
Re: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII
Victor Stinner:
> I read these documents but they don't explain which encoding is used in
> libraries and programs. Does it mean that Windows and Linux may use
> different encodings?

Yes, Windows will use UTF-16 as it does for almost everything. From a user's point of view, these should both just be seen as Unicode.

Neil
Re: [Python-Dev] [Python-checkins] cpython: _PyImport_LoadDynamicModule() encodes the module name explicitly to ASCII
Victor Stinner:
> C and C++ identifiers are restricted to ASCII. I don't know for Fortran
> or Java.

Some C and C++ implementations currently allow non-ASCII identifiers and the forthcoming C1X and C++0x language standards include non-ASCII identifiers. The allowed characters are specified in Annexes of the respective standards.
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf - Annex D
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3225.pdf - Annex E

Neil
Re: [Python-Dev] Bugs in thread_nt.h
Martin v. Löwis:
> I guess all this advice doesn't really apply to this case, though.
> The Microsoft API declares the parameter as a volatile*, indicating
> that they consider it "proper" usage of the API to declare the storage
> volatile.

The 'volatile' here is a modifier on the parameter and does not require a corresponding agreement in the variable declaration. It indicates that all access through the pointer inside the function will be with volatile semantics. As long as all functions that operate on the variable treat access as volatile, everything is fine. You should only need to declare the variable as volatile if there is other code that accesses it directly.

If agreement were required then the compiler would print a warning. It is similar to declaring a function to take a const parameter: there is no need for the variable to also be const.

Neil
Re: [Python-Dev] hgeol
Martin v. Löwis:
> So how can I fix this properly: so that all files have CRLF, but
> are still attributed to whoever last modified them, rather than
> having them attributed to me?

I don't think this is possible from the current state. It may be possible to change the conversion process to 'rewrite history' to produce clean annotations. On other projects, I've just changed the files and accepted a degraded history.

Neil
Re: [Python-Dev] CPython hg transition complete
To minimize differences from previous behaviour, it is probably best to mimic svn more closely by changing .hgeol to either have all the project files as native or allow fall through to the default ** = native.

Another possibility is to set Visual Studio project files to CRLF but this is less compatible with how svn has been used. The advantage of explicit CRLF is that if you clone onto a Unix system and then share that disk with Windows, or create an archive that is expanded on Windows (in binary mode), then you have the expected line ends.

The same applies to sharing from Windows to Unix, where the main problem is that bash can be upset by CRLF line ends: it assumes that the CR is part of the line, so if the line ends with a file name (like "cat .profile\r") it will treat the CR as part of the file name.

Neil
Re: [Python-Dev] CPython hg transition complete
Antoine Pitrou:
> It mimicks their settings in the SVN repository, so it should be ok.

It doesn't match how they are checked out by svn since they have the property svn:eol-style set to 'native'. Therefore these files are checked out by svn with Windows \r\n line ends.

Neil
Re: [Python-Dev] CPython hg transition complete
Georg Brandl:
> I'm very happy to announce that the core Python repository switch
> to Mercurial is complete and the new repository at
> http://hg.python.org/cpython/ is now officially open for cloning,

OK, I just performed a clone successfully. It seems wrong to me that the *.vcproj and *.vsprops files in PCBuild use Unix line ends. These extensions are marked BIN in .hgeol. This machine does not have VS 2008 installed so I can't really check if that is OK. In case it is not all files, here are two with this issue:

cpython\PCbuild\kill_python.vcproj
cpython\PCbuild\debug.vsprops

Neil
Re: [Python-Dev] devguide (hg_transition): Advertise hg import over patch.
Adrian Buehlmann:
> FWIW, we are very close to releasing TortoiseHg 2.0 (due March 1st),
> which ported the current Gtk based TortoiseHg to Qt (although, it was
> more like a rewrite :-).

I hope this is going to be fast. One of the reasons I chose Hg over Bzr for another project was that the Bzr GUI tools, which are written using Qt, are much slower, particularly when starting. A cold start of Bazaar Explorer takes around 7 seconds on a new fast machine compared with under a second to launch Hg Repository Explorer. Warm starts and internal actions are better but the Hg GUI tools are still much smoother than Bzr's.

This slowness is quite common for Qt applications and I think is because of the large set of DLLs that are loaded. Qt Creator is better at around 4 seconds for a cold launch but, naturally, it doesn't matter for an environment which you use for an extended period like Qt Creator. It does matter for a VCS tool that you may invoke hundreds of times in a day.

Neil
Re: [Python-Dev] devguide (hg_transition): Advertise hg import over patch.
Scott Dial:
> I don't believe TortoiseHG has such a feature (or I can't find it),
> although if you have TortoiseSVN, you can still use that as a patch tool.

The Import... command is in the Synchronize menu of Hg Repository Explorer. There is no GUI equivalent to --no-commit but you can exit the commit message editor without saving, which causes the commit to be abandoned with the patch still having been applied.

Neil
Re: [Python-Dev] pymigr: Ask for hgeol-checking hook.
Line end problems do occur in real projects. A scintilla-cocoa project was branched off Scintilla to support the Cocoa GUI framework on OS X. Here is one of the revisions in that project:
http://bazaar.launchpad.net/~mike-lischke/scintilla-cocoa/trunk/revision/5#include/ScintillaWidget.h

If the ScintillaWidget.h changes aren't visible (after a brief wait) then click on the arrow next to it. There are only 3 real changed lines in this file (which are changing comments from C++ to C) but the whole file appears to have been changed. This is far from the worst I have seen, with some revisions showing almost every line in a project changed.

There are several effects from this:
1) The blame command loses usefulness as all lines in the file appear to be from this revision.
2) Downloads become bigger, and take longer.
3) Fixing the issues takes time and effort, and junks the history further.

Neil
Re: [Python-Dev] Mercurial conversion repositories
Antoine Pitrou:
> It should now be fixed in current SVN, meaning the final conversion
> should be perfectly usable with the eol extension enabled.

Good.

> Do you find other issues under Windows? Have you tried pushing changes?

Since I'm not a member of core developers I used a http pull and can't push:

C:\u\cpython>hg push
pushing to http://hg.python.org/cpython
searching for changes
remote: ssl required

Neil
Re: [Python-Dev] Mercurial conversion repositories
With hg 1.7.5 on Windows 7 I performed a non-core checkout:

hg clone http://hg.python.org/cpython

The eol extension is enabled in global settings. I looked at things a bit, opening some files and using the Tortoise Hg Repository Explorer, but made no actual changes.

Running hg diff produces a large amount of output, with almost all the *.decTest files and most of the Windows build files (*.mk, *.sln, *.vcproj, *.bat) showing as changed but with identical text.

I've had problems like this with Hg before (http://mercurial.selenic.com/bts/issue2287). The situation can be fixed by running hg update to another version and then back to default.

Neil
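One way to confirm that files reported as changed really have identical text is to compare them with line ends normalized. The following is a small sketch of that check, independent of Mercurial itself; the function names are hypothetical.

```python
def normalize_eol(data: bytes) -> bytes:
    """Map CRLF and lone CR line ends to LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")

def differs_only_in_eol(a: bytes, b: bytes) -> bool:
    """True when the contents differ, but only in their line ends."""
    return a != b and normalize_eol(a) == normalize_eol(b)

unix_copy = b"@echo off\nrem build script\n"
windows_copy = b"@echo off\r\nrem build script\r\n"
edited_copy = b"@echo on\r\nrem build script\r\n"
```

With these samples, `differs_only_in_eol(unix_copy, windows_copy)` is true while `differs_only_in_eol(unix_copy, edited_copy)` is false, since the latter pair has a real text change.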
Re: [Python-Dev] Import and unicode: part two
Toshio Kuratomi:
> When they update their OS to a version that has
> utf-8 python module names, they will find that they have to make a choice.
> They can either change their locale settings to a utf-8 encoding and have
> the system installed modules work or they can leave their encoding on their
> non-utf-8 encoding and have the modules that they've created on-site work.

When switching to a UTF-8 locale, they can also change the file names of their modules to be encoded in UTF-8. It would be fairly easy to write a script that identifies non-ASCII file names in a directory and offers to transcode their names from their current encoding to UTF-8.

Neil
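A rough sketch of such a script, assuming a POSIX system where file names are byte strings and where the caller knows the current encoding. The function names are hypothetical, and skipping names that already decode as UTF-8 is only a heuristic, since a legacy-encoded name can occasionally be valid UTF-8 by accident.

```python
import os

def plan_renames(names, source_encoding):
    """Return (old, new) byte-string pairs for names needing transcoding."""
    plan = []
    for name in names:
        if all(byte < 0x80 for byte in name):
            continue  # pure ASCII: identical in both encodings
        try:
            name.decode("utf-8")
            continue  # heuristic: already valid UTF-8, leave it alone
        except UnicodeDecodeError:
            pass
        plan.append((name, name.decode(source_encoding).encode("utf-8")))
    return plan

def transcode_directory(directory, source_encoding):
    """Offer to rename each affected file in `directory` (a bytes path)."""
    for old, new in plan_renames(os.listdir(directory), source_encoding):
        answer = input(f"Rename {old!r} to {new!r}? [y/N] ")
        if answer.strip().lower() == "y":
            os.rename(os.path.join(directory, old),
                      os.path.join(directory, new))

# Planning only, no file system access: one Latin-1 name, one ASCII
# name and one name that is already UTF-8.
plan = plan_renames([b"caf\xe9.py", b"plain.py", b"na\xc3\xafve.py"],
                    "iso8859_1")
```

Passing a bytes path such as `b"."` to `transcode_directory` makes `os.listdir` return the raw byte names, which avoids any dependence on the current locale when reading them.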
Re: [Python-Dev] Import and unicode: part two
Toshio Kuratomi:
> My examples that you're replying to involve two "properly
> configured" OS's. The Linux workstations are configured with a UTF-8
> locale. The Windows OS's use wide character unicode. The problem occurs in
> that the code that one of the parties develops (either the students or the
> professors) is developed on one of those OS's and then used on the other OS.

This implies a symmetric issue, but I cannot see how there can be a problem with non-ASCII module names on Windows, as the file system allows all Unicode characters and so can represent any module name. OS X is also based on Unicode file names. While it is possible to mount file systems on Windows or OS X that do not support Unicode file names, this is a very unusual situation that will cause problems in other ways.

Common Linux distributions like Ubuntu and Fedora now default to UTF-8 locales. The situations in which users may encounter installations that do not support Unicode file names have reduced greatly.

Neil
Re: [Python-Dev] Python and the Unicode Character Database
Stephen J. Turnbull:
> Will it accept Arabic on input? (Han might be too much to ask for
> since Unicode considers Han digits to be "impure".)

I couldn't find a direct way to input Arabic digits into OO Calc. The normal use of Alt+number didn't work in Calc, although it did in WordPad, where Alt+1632 is ٠ and so on. OO Calc does have settings in the Complex Text Layout section for choosing different numerals but I don't understand the interaction of choices here.

Neil
Re: [Python-Dev] Python and the Unicode Character Database
Stephen J. Turnbull:
> Here's why: '''print "%d" %
> some_integer''' doesn't now, and never will (unless Kristan gets his
> Python 2.8), produce Arabic or Han numerals. Not in any
> language I know of, not in Microsoft Excel, and definitely not in
> Python 2.

While I don't have Excel to test with, OpenOffice.org Calc will display in Arabic or Han numerals using the NatNum format codes.
http://www.scintilla.org/ArabicNumbers.png

> Ditto Arabic, I
> would imagine; ISO 8859/6 (aka Latin/Arabic) does not contain the
> Arabic digits that have been presented here earlier AFAICT. Note that
> there's plenty of space for them in that code table (eg, 0xB0-0xB9 is
> empty). Apparently nobody *ever* thought it was useful to have them!

DOS code page 864 does use 0xB0-0xB9 for ٠ .. ٩.
http://www.ascii.ca/cp864.htm

Neil
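The code page 864 claim can be checked from Python 3, on the assumption that the standard library's `cp864` codec follows the same table as the page linked above. The sketch also shows that `int()` in Python 3 accepts these digits on input, even though `%d` formatting only produces ASCII digits.

```python
import unicodedata

# Decode the ten bytes 0xB0..0xB9 with the stdlib cp864 codec.
arabic_digits = bytes(range(0xB0, 0xBA)).decode("cp864")

# U+0660 is code point 1632, ARABIC-INDIC DIGIT ZERO.
zero = chr(1632)
zero_name = unicodedata.name(zero)

# Python 3 int() accepts any Unicode decimal digits.
value = int("\u0661\u0662\u0663")

# Output formatting still uses ASCII digits only.
formatted = "%d" % value
```

The last two lines illustrate the asymmetry discussed in this thread: non-ASCII digits are accepted when parsing but are not produced when formatting.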
Re: [Python-Dev] Supporting raw bytes data in urllib.parse.* (was Re: Polymorphic best practices)
Ian Bicking:
> I think the use case everyone has in mind here is where
> you get a URL from one of these sources, and you want to handle it. I have
> a hard time imagining the sequence of events that would lead to mojibake.
> Naive parsing of a document in bytes couldn't do it, because if you have a
> non-ASCII-compatible document your ASCII-based parsing will also fail (e.g.,
> looking for b'href="(.*?)"').

It depends on what the particular ASCII-based parsing is doing. For example, the set of trail bytes in Shift-JIS includes the same bytes as some of the punctuation characters in ASCII as well as all the letters. A search or split on '@' or '|' may find the trail byte in a two-byte character rather than a true occurrence of that character, so the operation 'succeeds' but produces an incorrect result.

Over time, the set of trail bytes used has expanded - in GB18030 digits are possible, although many of the most important characters for parsing such as ''' "#%&.?/''' are still safe as they may not be trail bytes in the common double-byte character sets.

Neil
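A small demonstration of the trail byte problem, using Python's `shift_jis` codec. The sample string is made up for illustration: KATAKANA LETTER SMALL A (U+30A1) encodes as the two bytes 0x83 0x40, and 0x40 is also ASCII '@'.

```python
text = "user=\u30a1@example"  # hypothetical input containing one real '@'

encoded = text.encode("shift_jis")
small_a = "\u30a1".encode("shift_jis")

# Byte-level split: the trail byte 0x40 is mistaken for a separator.
naive_parts = encoded.split(b"@")

# Splitting after decoding finds only the true '@'.
correct_parts = encoded.decode("shift_jis").split("@")
```

The byte-level split does not raise any error. It simply returns three pieces instead of two, and the first piece ends in a dangling lead byte that no longer decodes as Shift-JIS.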
Re: [Python-Dev] PEP 384 status
M.-A. Lemburg:
> Is it possible to have multiple versions of the lib C loaded
> on Windows ?

Yes. It is possible not only to mix C runtimes from different vendors but different variants from a single vendor.

Historically, each vendor has shipped their own C runtime libraries. This was also the case with CP/M and OS/2.

Many applications can be extended with DLLs and if it were not possible to load DLLs which use different runtimes then that would limit which compilers can be used to extend particular applications. If Microsoft were to stop DLLs compiled with Borland or Intel from working inside Internet Explorer or Excel then there would be considerable controversy and likely anti-trust actions.

Neil
Re: [Python-Dev] Fwd: i18n
Terry Reedy:
> File "C:\Python26\lib\socket.py", line 406, in readline
> data = self._sock.recv(self._rbufsize)
> socket.error: [Errno 10054] A lÚtez§ kapcsolatot a tßvoli ßllomßs
> kÚnyszerÝtette n bezßrta

That is pretty good mojibake. One of the problems of providing localized error messages is that the messages may be messed up at different stages.

The original text was

A létező kapcsolatot a távoli állomás kényszerítetten bezárta.

It was printed in iso8859_2 (ISO standard for Eastern European) then those bytes were pasted in as if they were cp852 (MS-DOS Eastern European).

    text = "A létező kapcsolatot a távoli állomás kényszerítetten bezárta."
    print(str(text.encode('iso8859_2'), 'cp852'))

Neil
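Because both code pages map these bytes to distinct characters, the damage in this case can be undone by running the two steps in reverse. A short sketch, assuming the garbled text is available as a Python 3 string:

```python
text = "A létező kapcsolatot a távoli állomás kényszerítetten bezárta."

# Reproduce the mojibake: bytes written as ISO 8859-2, read as cp852.
garbled = text.encode("iso8859_2").decode("cp852")

# Undo it: recover the bytes via cp852, then decode them as ISO 8859-2.
repaired = garbled.encode("cp852").decode("iso8859_2")
```

This only works when every garbled character survives the round trip. Text that was truncated or re-wrapped along the way, as in the quoted traceback, may need manual repair.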
Re: [Python-Dev] mingw support?
Terry Reedy:
> I suspect that the persons who first ported Python to MSDOS simply used what
> they were used to using, perhaps in their paid job. And I am sure that is
> still true of at least some of the people doing Windows support today.

Some Windows developers actually prefer Visual Studio, including me. MingW has become less attractive in recent years because of the difficulty of downloading and installing a current version, and of finding out how to do so. Some projects have moved on to the TDM packaging of MingW.
http://tdm-gcc.tdragon.net/

Neil
Re: [Python-Dev] Removing IDLE from the standard library
Stephen J. Turnbull: > But it's very important to be able to *move* tabs across windows or > panes. ... > In many apps, however, you would have to select the foo.c tab, close > it, bring up a new window, and open foo.c using the long path > (presumably with a file browser interface, but often enough the > default directory is wherever you started the editor, not most > recently used file). The common GUI technique is to drag a tab from one window into another window. Drag onto the desktop for a new top level window. This is supported by, among others, Firefox; Chrome; gedit; and GNOME Terminal. Neil
Re: [Python-Dev] [Idle-dev] Removing IDLE from the standard library
Kurt B. Kaiser: >> The tear off menus are ugly as well as being non-standard on all three >> major platforms. > > Well, would you discard them? They can (occasionally) be useful. Yes, I would replace the menus with ones missing the tear line. Most of the GUI toolkits experimented with tear-offs (Mac in late 80s, GTK+ up until 2002) and dropped them or hid them in a rarely visited API. The idea initially appeared reasonable ("I can have the Run and Check commands available with a single click") but was found to be too confusing in use. IDLE, because it uses a separate top-level window for each file and shell, suffers more than most applications. A menu is torn off from one window and always applies to that window but shows no visual affinity with that window: its window is not even activated when a menu command acts on it. Neil
Re: [Python-Dev] [Idle-dev] Removing IDLE from the standard library
Kurt B. Kaiser: > I'm mystified about the comments that the GUI is ugly. It is minimal. > On XP, it looks exactly like an XP window with a simple menubar. Those > who haven't looked at it for awhile may not be aware of the recent > advances made by Tk in native look and feel. What is ugly? While Tk has improved at emulating native appearance, there are still many differences. On the main editing screen of IDLE, the most noticeable issue is that there is no horizontal scroll bar even though the text will move left when you move the caret beyond the rightmost visible character. The scrollbar and status bar have an appearance that looks to be from Windows 2000, not Windows XP and there is no resizing gripper on the right side of the status bar. The tear off menus are ugly as well as being non-standard on all three major platforms. Use the "Configure IDLE..." and an "idle" dialog appears that also looks to be from Windows 2000. I know Tk can do better than this as Git Gui (the Tk (8.5.8) program I use most often) at least shows XP themed buttons, scrollbars and other controls. However, the "idle" dialog (as well as Git Gui) shows the largest remaining problem for Tk user interfaces: keyboard navigation. When the "idle" dialog opens, try doing anything with the keyboard. Chances are nothing will happen. If you press Tab 16 times (yes, 16!) a focus rectangle will finally show on the "Bold" check box. Another Tab takes you to the "Indentation Width" slider. After that you don't see the focus until it wraps around to "Bold" again. The Enter key doesn't trigger OK and the Escape key doesn't let you escape. The Find and Replace dialogs are better as focus works as do Enter and Escape but none of the buttons have mnemonics. This may all sound like picking nits but details and consistency are important in user interfaces and this is just looking at the most easily discovered problems. 
Neil
Re: [Python-Dev] Licensing // PSF // Motion of non-confidence
anatoly techtonik: > The file consists of several licenses for multiple versions of Python. > It is an unusual mix that negatively affects understanding. A simpler license would be better. There have been moves in the past to simplify the license of Python but this would require agreement from the current rights owners including CWI and CNRI. IIRC not all of the rights owners are willing to agree to a change. Neil
Re: [Python-Dev] email package status in 3.X
Steven D'Aprano: > Do any other languages have any equivalent to this ebtyes type? The String type in Ruby 1.9 is a byte string with an encoding attribute. Most online Ruby documentation is for 1.8 but the API can be examined here: http://ruby-doc.org/ruby-1.9/index.html Here's something more explanatory: http://blog.grayproductions.net/articles/ruby_19s_string My view is that this actually makes things much more complex by making encoding combination an n*n problem (where n is the number of encodings) rather than an n sized problem when you have a single core string type. Neil
Re: [Python-Dev] email package status in 3.X
Michael Foord: > Python 3.0 was *declared* to be an experimental release, and by most > standards 3.1 (in terms of the core language and functionality) was a solid > release. That looks to me like an after-the-event rationalization. The release note for Python 3.0 (and the "What's new") gives no indication that it is experimental but does say """ We are confident that Python 3.0 is of the same high quality as our previous releases ... you can safely choose either version (or both) to use in your projects. """ http://mail.python.org/pipermail/python-dev/2008-December/083824.html Neil
Re: [Python-Dev] Support byte string API of Windows in Python3?
Victor Stinner: > It's a choice, I didn't want to patch Windows because I know that Windows use > unicode internally. I consider that developers using Python3 should use > unicode on Windows, and byte or unicode+surrogates on other OS. The Win32 byte string APIs convert their inputs to Unicode and then run Unicode code. You don't get additional capabilities by calling the byte string APIs and should avoid them completely. Including an easy way to invoke them on Windows will just lead to failures. People may think that Unix code that uses the byte string APIs for better platform fidelity can just run this code on Windows and get equivalent benefits. They won't and instead will see an inverted form of the problems they are trying to avoid on Unix. If there is ever a reason to use a byte string API on Windows (and I can't think of any) then ctypes can be used to explicitly call the API desired. Neil
Re: [Python-Dev] C++
Antoine Pitrou: > Is this concern still valid? We are in the 2010s now. > I'm not saying I want us to put some C++ in the core interpreter, but > the portability argument sounds a little old... There are still viable platforms which only support subsets of C++. IIRC, Android does not support exceptions in C++. Neil
Re: [Python-Dev] Python and Windows 2000
Martin v. Löwis: > See http://bugs.python.org/issue6926 > > The SDK currently hides symbolic constants from us that people are > asking for. Setting the version to 0x501 (XP) doesn't actively try to stop running on version 0x500 (2K), it just reveals the symbols and APIs from 0x501. Including a call to an 0x501-specific API will then fail at load. IPPROTO_IPV6 (the cause of issue 6926) isn't a new symbol that started working in Windows XP - it was present in older SDKs without a version guard so was visible when compiling for any version of Windows. > In addition, we could simplify the code in dl_nt.c around > GetCurrentActCtx and friends, by linking to these functions directly. It would be simpler but it's not as if this code needs any changes at this point. I don't really have a strong need for Windows 2000 although I keep an instance for checking compatibility of my code and I do still get queries from people using old versions of Windows, including 9x. There is the question of whether to force failure on Windows 2000 or just remove it from the list of known-working platforms while still allowing it to run. Neil
Re: [Python-Dev] Python and Windows 2000
Martin v. Löwis: > I don't recall whether we have already decided about continued support > for Windows 2000. > > If not, I'd like to propose that we phase out that support: the Windows > 2.7 installer should display a warning; 3.2 will stop supporting Windows > 2000. Is there any reason for this? I can understand dropping Windows 9x due to the lack of Unicode support but is there anything missing from Windows 2000 that makes supporting it difficult? Neil
Re: [Python-Dev] Proposal for virtualenv functionality in Python
Larry Hastings: > But IIUC telling the compiler how to > do that is only vaguely standardized--Microsoft's CL.EXE doesn't seem to > support any environment variable containing an include /path/. The INCLUDE environment variable is a list of ';' separated paths http://msdn.microsoft.com/en-us/library/36k2cdd4%28VS.100%29.aspx Neil
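For a tool that needs to read or extend that variable, the format is simple enough to handle directly. A small sketch, where `include_dirs` and `with_include_dir` are hypothetical helper names and the example paths are made up:

```python
import os

def include_dirs(environ=None):
    """Return the directories listed in INCLUDE, skipping empty entries."""
    env = os.environ if environ is None else environ
    return [p for p in env.get("INCLUDE", "").split(";") if p]

def with_include_dir(directory, environ=None):
    """Return a new INCLUDE value with directory placed first."""
    return ";".join([directory] + include_dirs(environ))

# Made-up environment for illustration only.
example = {"INCLUDE": r"C:\SDK\include;C:\VC\include;"}
print(include_dirs(example))
print(with_include_dir(r"C:\venv\include", example))
```

The environment mapping is passed in explicitly so the helpers can be tried without touching the real process environment.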
Re: [Python-Dev] Reworking the GIL
Eric Hopper: > I don't suppose it will ever be ported back to Python 2.x? It doesn't > look like the whole GIL concept has changed much between Python 2.x and > 3.x so I expect back-porting it would be pretty easy. There was a patch but it has been rejected. http://bugs.python.org/issue7753 Neil
Re: [Python-Dev] PEP 3147: PYC Repository Directories
Tim Delaney: > I like this solution combined with having a single cache directory and a few > other things I've added below. > ... > 2. /tmp is often on non-volatile memory. If it is (e.g. my Windows system > temp dir is on a RAMdisk) then it seems wise to respect the obvious desire > to throw away temporary files on shutdown. This may create security vulnerabilities. I could, for example, insert a manipulated .pyc that logs passwords when other users run it. I can also see advantages to allowing out of tree compiled cache directories. For example, you could have a locked down .py tree with .pycs going into per-user trees. This prevents another user from spoofing a .pyc I use as well as allowing users to install arbitrary versions of Python without getting an admin to compile the .py tree with the new compiler. Neil
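The per-user tree idea can be sketched as a path mapping. This is an illustration only: `user_pyc_path` and the mirrored-directory layout are hypothetical and not something the PEP specifies, and the example paths are made up.

```python
import os
import sys

def user_pyc_path(source_path, cache_root):
    """Map a source file to a .pyc path under a per-user cache root.

    Hypothetical layout: the source tree is mirrored below cache_root and
    the interpreter's cache tag is added so several Python versions can
    share one cache root without overwriting each other.
    """
    absolute = os.path.abspath(source_path)
    _drive, tail = os.path.splitdrive(absolute)
    relative = tail.lstrip("\\/")
    base, _ext = os.path.splitext(relative)
    tag = sys.implementation.cache_tag or "unknown"
    return os.path.join(cache_root, base + "." + tag + ".pyc")

cache_path = user_pyc_path("/usr/lib/site/foo.py", "/home/sam/.pycache")
print(cache_path)
```

Since each user writes only below their own cache root, one user cannot replace a compiled file that another user will load.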
Re: [Python-Dev] PyPI comments and ratings, *really*?
When SourceForge started having comments and ratings, I was a little upset at having poor negative comments there (like "not work!"). But after it has been running for a while it appears useful. I suppose it helps that Scintilla has 88% thumbs up from 134 respondents. Because there is voting on comments, the more useful comments have bubbled onto the front page. As the system is used more, you'll see a wider range of comments on projects and you'll be able to tell more from them. It should be seen as a completely separate thing to the existing fora and trackers that each project has. While you want people to become involved in your project, many are just having a quick look and don't want to sign up for mailing lists or to interact with project members. They may just want to quickly comment about whether it was suitable or not. Neil
Re: [Python-Dev] please consider changing --enable-unicode default to ucs4
Ronald Oussoren: > Both Carbon and the modern APIs use UTF-16. If Unicode size standardization is seen as sufficiently beneficial then UTF-16 would be more widely applicable than UTF-32. Unix mostly uses 8-bit APIs which are either explicitly UTF-8 (such as GTK+) or can accept UTF-8 when the locale is set to UTF-8. They don't accept UTF-32. It is possible that Unix could move towards UTF-32 but that hasn't been the case up to now and with both OS X and Windows being UTF-16, it is more likely that UTF-16 APIs will become more popular on Unix. Neil
Re: [Python-Dev] Mercurial migration: help needed
Paul Moore: > 1. Given that the "problematic" tools (notepad and Visual Studio) are > Windows tools, we seem to be back to the idea that this extension is > only needed by Windows developers. As I understood the consensus to be > that the extension should be for all users, I suspect I've missed > something. Some of the problems come from users on Unix checking in files with CRLF line ends that they have received using some other mechanism such as sharing a disk between Windows and Linux. I was going to point to a bad revision in a bzr housed project I work on but launchpad isn't working currently. What happened was that an OS X user committed a set of changes but with all the files having a different line ending to the repository. The result is that it is no longer easy to track changes before that revision. It also makes a check out larger. It would help in such cases for the commit command on Unix to either automatically change any CRLF line ends to LF for text files (but not files with an explicitly specified line end) or to display a warning. Neil
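The check described here is cheap to do on the raw bytes of a file. A rough sketch of what such a commit-time check might look like; the function names are hypothetical and this is not how Mercurial or bzr implement it:

```python
def count_line_endings(data):
    """Count CRLF, bare LF and bare CR line ends in a bytes object."""
    crlf = data.count(b"\r\n")
    return {
        "crlf": crlf,
        "lf": data.count(b"\n") - crlf,
        "cr": data.count(b"\r") - crlf,
    }

def to_lf(data):
    """Convert CRLF line ends to LF, leaving other bytes alone."""
    return data.replace(b"\r\n", b"\n")

def commit_warning(name, data):
    """Return a warning string if a file contains CRLF line ends, else None."""
    counts = count_line_endings(data)
    if counts["crlf"]:
        return "%s: %d CRLF line ends" % (name, counts["crlf"])
    return None

sample = b"one\r\ntwo\nthree\r\n"
print(count_line_endings(sample))
print(commit_warning("sample.txt", sample))
print(to_lf(sample))
```

A real hook would also need a way to skip binary files and files that are meant to keep a specific line end.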
Re: [Python-Dev] Mercurial migration: help needed
Dirkjan Ochtman: > I know a lot of projects use Mercurial on Windows as well, I'm not > aware of any big problems with it. If you have a Windows-only project with CRLF files using Mercurial then there is no line end problem as Mercurial preserves the CRLFs for you. Line end problems occur on mixed projects where both Windows and Unix tools are used. Neil
Re: [Python-Dev] PEP 385: the eol-type issue
M.-A. Lemburg: > ... and because of this, the feature is already available if > you use codecs.open() instead of the built-in open(): So should I not add an issue for the basic open because codecs.open should be used for this case? Neil
Re: [Python-Dev] PEP 385: the eol-type issue
Glenn Linderman: > and perhaps other things (and > are there new Unicode control characters that could be used for line > endings?), Unicode includes Line Separator U+2028 and Paragraph Separator U+2029 but they are rarely supported and very rarely used. They are a pain to work with since they are 3 byte sequences in UTF-8. Visual Studio does support them. Python does not currently treat them as line ends when reading a file, as in this example, which reads only 2 lines rather than 3:

    with open("x.txt", "wb") as f:
        f.write("a\nb\u2029c\n".encode('utf-8'))
    with open("x.txt", "r", encoding="utf-8") as f:
        for n, l in enumerate(f.readlines(), 1):
            print(n, repr(l))

Neil
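For comparison, `str.splitlines()` does treat U+2028 and U+2029 as line boundaries, so the same text splits differently depending on which route is used. A short sketch using an in-memory text stream rather than a file on disk:

```python
import io

text = "a\nb\u2029c\n"

# Text streams split only on \n (plus \r and \r\n when newline translation
# is enabled), so the paragraph separator stays inside the second line.
file_lines = io.StringIO(text).readlines()

# str.splitlines() uses the wider Unicode set of line boundaries.
str_lines = text.splitlines()

print(len(file_lines), file_lines)
print(len(str_lines), str_lines)
```

Code that needs the wider set of separators can read the whole text and call `splitlines()` itself.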
Re: [Python-Dev] PEP 385: the eol-type issue
Martin v. Löwis: > Or don't you understand why that single unresolved item didn't manage > to revert the decision? Well, there are many unresolved items in > the Mercurial conversion, some much more stressful than the eol issue > (e.g. the branching discussion). Then these issues should have been included in the initial PEP for choosing a DVCS since the issues could have driven the choice. PEP 374 implies that win32text effectively solves the Windows eol issue which no longer appears to be correct. Neil
Re: [Python-Dev] PEP 385: the eol-type issue
Martin v. Löwis: > Is it really that you don't *understand*? It's fairly easy: there was > a PEP ... The PEP process is straightforward. However, a PEP may produce an outcome that proves after more experience to be wrong. ISTM a prerequisite to choosing a DVCS is that it should support the full range of development platforms and thus the PEP was accepted prematurely. At some point the PEP should be reexamined and, if necessary, rescinded. What I don't understand is why the plan is still to move to hg despite, after several months, there not being a known good way to include Windows eol support. Neil
Re: [Python-Dev] PEP 385: the eol-type issue
Mark Hammond: > Thanks Nick; I didn't want to be the only one saying that. There is a fine > line between asserting reasonable requirements for Windows users and being > obstructionist and unhelpful, and I'm trying to stay on the former side :) I haven't commented on this issue before because I can't really be helpful. I just don't understand why hg is being considered before its Windows support is roughly equivalent to svn and cvs. There has been some similar experience with the main repository for the Cocoa port of Scintilla which is in bzr on launchpad. Several times in that repository, files were checked in with wrong line ends making every line appear changed when looking through history. There are several causes for this including user error but bzr (and hg) should default to more helpful behaviour on text files. Neil
Re: [Python-Dev] mingw32 and gc-header weirdness
Martin v. Löwis: > Yes: alignof(PyGC_HEAD) would be specified as being the maximum > alignment on a platform; sizeof(PyGC_HEAD) would be frozen. Maximum alignment currently on x86 is 16 bytes for SSE vector types. Next year AVX will add 32 byte types and while they are supposed to work OK with 16 byte alignment, performance will be better with 32 byte alignment. It is possible that some use could be found for vector instructions in core Python but it is more likely that they will only be used in specialized extensions that can take care of alignment issues for their own cases. http://en.wikipedia.org/wiki/Advanced_Vector_Extensions http://software.intel.com/en-us/forums/intel-avx-and-cpu-instructions/topic/61891/ Neil
Re: [Python-Dev] mingw32 and gc-header weirdness
Martin v. Löwis: > I propose to add another (regular) double into the union. Adding a regular double as a second dummy gives the same sizes and alignments with Mingw or MSVC as the original definition with MSVC:

    typedef union _gc_head {
        struct {
            union _gc_head *gc_next;
            union _gc_head *gc_prev;
            Py_ssize_t gc_refs;
        } gc;
        long double dummy;  /* force worst-case alignment */
        double dummy2;      /* in case long double doesn't trigger worst-case */
    } PyGC_Head;

In regard to alignment penalties, a simple copy loop for doubles runs about 20% slower when misaligned on my AMD processor. Other x86 processors can be much worse. As much as 2 to 3.25 times according to http://msdn.microsoft.com/en-us/library/aa290049%28VS.71%29.aspx Neil
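The size and alignment that a given platform assigns to such a union can be inspected from Python with ctypes. This is only a sketch: the class names are invented for illustration, and it assumes ctypes follows the layout rules of the C compiler Python was built with, so the numbers will differ between Mingw, MSVC and other builds.

```python
import ctypes

class GCFields(ctypes.Structure):
    # Mirrors the inner struct: two pointers and a Py_ssize_t.
    _fields_ = [
        ("gc_next", ctypes.c_void_p),
        ("gc_prev", ctypes.c_void_p),
        ("gc_refs", ctypes.c_ssize_t),
    ]

class GCHead(ctypes.Union):
    # Mirrors the union with both dummy members.
    _fields_ = [
        ("gc", GCFields),
        ("dummy", ctypes.c_longdouble),
        ("dummy2", ctypes.c_double),
    ]

for name, ctype in [("double", ctypes.c_double),
                    ("long double", ctypes.c_longdouble),
                    ("GCHead", GCHead)]:
    print(name, "size", ctypes.sizeof(ctype),
          "alignment", ctypes.alignment(ctype))
```

The union's alignment is at least that of its most strictly aligned member, which is the property the dummy members are there to force.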
Re: [Python-Dev] command line attachable debugger
Glyph Lefkowitz: > Sounds like this is moving into hypothetical territory better-suited to > python-ideas. (Although I'm sure that if you wanted to contribute polished, > tested code for a standard remote debugger interface, few people would > complain.) There is a remote debugger protocol called DBGP for different languages (including Python) and debuggers (such as Komodo) http://xdebug.org/docs-dbgp.php Neil
Re: [Python-Dev] Support for Python/Windows
Curt Hagenlocher: > Ah, you're right -- the PGO bits probably need VS Pro. The 64-bit > compilers should be in the Windows SDK, but it wouldn't surprise me if > they were not included in Express. 64-bit isn't in Express and merging the 64 bit compiler from the SDK into Express may be possible but certainly isn't easy. I just use the command line compiler to check 64 bit issues. Neil
Re: [Python-Dev] Evaluated cmake as an autoconf replacement
cmake does not produce relative paths in its generated make and project files. There is an option CMAKE_USE_RELATIVE_PATHS which appears to do this but the documentation says: """This option does not work for more complicated projects, and relative paths are used when possible. In general, it is not possible to move CMake generated makefiles to a different location regardless of the value of this variable.""" This means that generated Visual Studio project files will not work for other people unless a particular absolute build location is specified for everyone which will not suit most. Each person that wants to build Python will have to run cmake before starting Visual Studio thus increasing the prerequisites. Neil
Re: [Python-Dev] Evaluated cmake as an autoconf replacement
Jeffrey Yasskin: > 1. It can autogenerate the Visual Studio project files instead of > needing them to be maintained separately I have looked at a couple of build tools (scons was probably one) that generate Visual Studio project files in the past and they produced fairly poor project files, which would compile the code but wouldn't be as capable as project files created by hand. It's been a while so I can't remember the details. The current Python project files are hierarchical, building several DLLs and an EXE and I think this was outside the scope of the tools I looked at. Neil
Re: [Python-Dev] Ext4 data loss
Antoine Pitrou: > It depends on what you call "ACLs". It does copy the chmod permission bits. Access Control Lists are fine grained permissions. Perhaps you want to allow Sam to read a file and for Ted to both read and write it. These permissions should not need to be reset every time you modify the file. > As for owner and group, I think there is a very good reason that it doesn't > copy > them: under Linux, only root can change these properties. Since I am a member of both "staff" and "everyone", I can set group on one of my files from "staff" to "everyone" or back again:

    $ chown :everyone x.pl
    $ ls -la x.pl
    -rwxrwxrwx 1 nyamatongwe everyone 269 Mar 11 2008 x.pl
    $ chown :staff x.pl
    $ ls -la x.pl
    -rwxrwxrwx 1 nyamatongwe staff 269 Mar 11 2008 x.pl

Neil
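The same operation is available from Python on POSIX systems through `os.chown`, where passing -1 as the user id leaves the owner alone. A minimal sketch; `set_group` and `my_groups` are hypothetical helper names, and the demonstration only reassigns a temporary file to the process's current effective group so that it runs without special privileges:

```python
import os
import tempfile

def my_groups():
    """Group ids this process belongs to, including the effective group."""
    return sorted(set(os.getgroups()) | {os.getegid()})

def set_group(path, gid):
    """Change only the group of path; uid -1 keeps the current owner."""
    os.chown(path, -1, gid)

with tempfile.NamedTemporaryFile() as tmp:
    target = os.getegid()
    set_group(tmp.name, target)
    new_gid = os.stat(tmp.name).st_gid

print(my_groups())
print(new_gid == target)
```

An unprivileged owner can normally pick any group returned by `my_groups()`; choosing a group outside that list is what requires root.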
Re: [Python-Dev] Ext4 data loss
Antoine Pitrou: > How about shutil.copystat()? shutil.copystat does not copy over the owner, group or ACLs. Modeling a copymetadata method on copystat would provide an easy to understand API and should be implementable on Windows and POSIX. Reading the OS X documentation shows a set of low-level POSIX functions for ACLs. Since there are multiple pieces of metadata and an application may not want to copy all pieces there could be multiple methods (copygroup ...) or one method with options: shutil.copymetadata(src, dst, group=True, resource_fork=False) Neil
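A rough POSIX-only sketch of what the single-method form might look like. `copy_metadata` here is a hypothetical function, not part of shutil, and it covers only owner, group, permission bits and timestamps; ACLs and resource forks are left out because they need platform-specific calls.

```python
import os
import shutil
import tempfile

def copy_metadata(src, dst, group=True, owner=False):
    """Copy stat information from src to dst, optionally with group/owner.

    Returns True if the requested ownership change succeeded (or none was
    requested), False if the process lacked permission for it.
    """
    ownership_ok = True
    if group or owner:
        st = os.stat(src)
        uid = st.st_uid if owner else -1
        gid = st.st_gid if group else -1
        try:
            # Done before copystat because chown can clear setuid/setgid bits.
            os.chown(dst, uid, gid)
        except PermissionError:
            ownership_ok = False
    shutil.copystat(src, dst)  # permission bits and timestamps
    return ownership_ok

with tempfile.TemporaryDirectory() as d:
    src, dst = os.path.join(d, "src"), os.path.join(d, "dst")
    for p in (src, dst):
        open(p, "w").close()
    os.chmod(src, 0o640)
    result = copy_metadata(src, dst)
    modes_match = (os.stat(src).st_mode == os.stat(dst).st_mode)
    groups_match = (os.stat(src).st_gid == os.stat(dst).st_gid)

print(result, modes_match, groups_match)
```

Keyword options keep the call readable when more pieces of metadata are added later, which is the argument for one method over many.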
Re: [Python-Dev] Ext4 data loss
The technique advocated by Theodore Ts'o (save to temporary then rename) discards metadata. What would be useful is a simple, generic way in Python to copy all the appropriate metadata (ownership, ACLs, ...) to another file so the temporary-and-rename technique could be used. On Windows, there is a hack in the file system that tries to track the use of temporary-and-rename and reapply ACLs and on OS X there is a function FSPathReplaceObject but I don't know how to do this correctly on Linux. Neil
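A sketch of temporary-and-rename with a partial metadata copy, to make the gap concrete. `save_preserving_mode` is a hypothetical helper; it carries over only the permission bits and, where permitted, the group. ACLs and other extended metadata are still lost, which is exactly the problem described above.

```python
import os
import shutil
import tempfile

def save_preserving_mode(path, data):
    """Write data to path via a temporary file in the same directory."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # make sure the data is on disk before renaming
        if os.path.exists(path):
            if hasattr(os, "chown"):
                try:
                    os.chown(tmp, -1, os.stat(path).st_gid)
                except PermissionError:
                    pass  # keep the default group if we may not change it
            shutil.copymode(path, tmp)  # permission bits only
        os.replace(tmp, path)  # atomic when both names are on one file system
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise

with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "settings.txt")
    with open(target, "wb") as f:
        f.write(b"old")
    os.chmod(target, 0o640)
    save_preserving_mode(target, b"new")
    with open(target, "rb") as f:
        content = f.read()
    mode = os.stat(target).st_mode & 0o777

print(content, oct(mode))
```

`shutil.copymode` is used rather than `shutil.copystat` so that the saved file gets a fresh modification time.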
Re: [Python-Dev] PEP 365 (Adding the pkg_resources module)
zooko: > Um, isn't this tool called "unzip"? I have done this -- accessed the > source code -- many times, and unzip suffices. The type of issue I ran into with eggs is when you get an exception with a trace that includes an egg, you can't use the normal means to look at the code. Instead you have to understand that it's an egg, unzip the code, manually translate the path, open the file and go to the line number. Similarly, you can't easily grep the code in its egg state. If there was a global flag where I could say 'install eggs as directories of source' then I'd be much happier. Just reread the EasyInstall documentation and '--always-unzip' is portrayed as a 'don't do this' option. As it is I just avoid eggs. They may make sense for installing applications but for development they get in the way. Neil
Re: [Python-Dev] PEP 3118: Extended buffer protocol (new version)
Travis Oliphant: > PEP: 3118 > ... I'd like to see the PEP include discussion of what to do when an incompatible request is received while locked. Should there be a standard "Can't do that: my buffer has been got" exception? Neil
Re: [Python-Dev] Extended Buffer Interface/Protocol
I have developed a split vector type that implements the buffer protocol at http://scintilla.sourceforge.net/splitvector-1.0.zip It acts as a mutable string implementing most of the sequence protocol as well as the buffer protocol. splitvector.SplitVector('c') creates a vector containing 8 bit characters and splitvector.SplitVector('u') is for Unicode. A writable attribute bufferAppearence can be set to 0 (default) to respond to buffer protocol calls by moving the gap to the end and returning the address of all of the data. Setting bufferAppearence to 1 responds as a two segment buffer. I haven't found any code that understands responding with two segments. sre and file.write handle SplitVector fine when it responds as a single segment:

    import re, splitvector
    x = splitvector.SplitVector("c")
    x[:] = "The life of brian"
    r = re.compile("l[a-z]*", re.M)
    print x
    y = r.search(x)
    print y.group(0)
    x.bufferAppearence = 1
    y = r.search(x)
    print y.group(0)

produces

    The life of brian
    life
    Traceback (most recent call last):
      File "qt.py", line 9, in
        y = r.search(x)
    TypeError: expected string or buffer

It is likely that adding multi-segment ability to sre would complexify and slow it down. OTOH multi-segment buffers may be well-suited to scatter/gather I/O calls like writev. Neil
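To illustrate the writev point: a two-segment buffer can be handed to the operating system without first joining the pieces. The sketch below uses `os.writev`, which is available on Unix in current Python versions (it did not exist when this message was written), with two plain bytes objects standing in for the two halves of a split vector.

```python
import os

def write_segments(fd, segments):
    """Write several buffers with one gather-write call.

    Returns the number of bytes written, which may be less than the total
    for large writes; a production version would loop until done.
    """
    return os.writev(fd, segments)

# Two bytes objects stand in for the data before and after the gap.
before_gap = b"The life "
after_gap = b"of brian"

read_end, write_end = os.pipe()
written = write_segments(write_end, [before_gap, after_gap])
os.close(write_end)
received = os.read(read_end, 100)
os.close(read_end)

print(written, received)
```

The reader sees one contiguous stream, so the split is invisible outside the process that owns the buffer.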
Re: [Python-Dev] Extended Buffer Interface/Protocol
Greg Ewing: > So an array-of-pointers interface wouldn't be a direct > substitute for the existing multi-segment buffer > interface. Providing an array of (pointer,length) wouldn't be too much extra work for a split vector implementation. Guido van Rossum: > But there's always a call to remove the gap (or move it to the end). Yes, although it's something you try to avoid. I'm not saying that this is an important use-case since no one seems to have produced a split vector implementation that provides the buffer protocol. Numeric-style array handling is much more common so deserves priority. Neil
Re: [Python-Dev] Extended Buffer Interface/Protocol
Travis Oliphant: > 3) information about discontiguous memory segments > > > Number 3 is where I could use feedback --- especially from PIL users and > developers. Strides are a common way to think about a possibly > discontiguous chunk of memory (which appear in NumPy when you select a > sub-region from a larger array). The strides vector tells you how many > bytes to skip in each dimension to get to the next memory location for > that dimension. I think one of the motivations for discontiguous segments was for split buffers which are commonly used in text editors. A split buffer has a gap in the middle where insertions and deletions can often occur without moving much memory. When an insertion or deletion is required elsewhere then the gap is first moved to that position. I have long intended to implement a good split buffer extension for Python but the best I have currently is an extension written using Boost.Python which doesn't implement the buffer interface. Here is a description of split buffers: http://www.cs.cmu.edu/~wjh/papers/byte.html Neil
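A toy pure-Python split buffer shows the mechanism and why it naturally presents as two segments. This is a teaching sketch with invented names (`GapBuffer`, `segments`), not the Boost.Python extension mentioned above; a real implementation would use a contiguous byte array rather than a list.

```python
class GapBuffer:
    """Sequence of characters stored with a movable gap of free slots."""

    def __init__(self, text="", gap=16):
        self._buf = list(text) + [None] * gap
        self._gap_start = len(text)
        self._gap_end = len(self._buf)

    def __len__(self):
        return len(self._buf) - (self._gap_end - self._gap_start)

    def _move_gap(self, pos):
        # Shift characters across the gap one at a time until it sits at pos.
        while self._gap_start > pos:
            self._gap_start -= 1
            self._gap_end -= 1
            self._buf[self._gap_end] = self._buf[self._gap_start]
        while self._gap_start < pos:
            self._buf[self._gap_start] = self._buf[self._gap_end]
            self._gap_start += 1
            self._gap_end += 1

    def _grow(self, needed):
        gap_len = self._gap_end - self._gap_start
        if gap_len < needed:
            extra = max(needed - gap_len, len(self._buf))
            self._buf[self._gap_end:self._gap_end] = [None] * extra
            self._gap_end += extra

    def insert(self, pos, text):
        if not 0 <= pos <= len(self):
            raise IndexError("insert position out of range")
        self._move_gap(pos)
        self._grow(len(text))
        for ch in text:
            self._buf[self._gap_start] = ch
            self._gap_start += 1

    def delete(self, pos, count):
        if pos < 0 or count < 0 or pos + count > len(self):
            raise IndexError("delete range out of range")
        self._move_gap(pos)
        self._gap_end += count  # deleted characters simply join the gap

    def segments(self):
        """The text before and after the gap, as two separate strings."""
        return ("".join(self._buf[:self._gap_start]),
                "".join(self._buf[self._gap_end:]))

    def __str__(self):
        return "".join(self.segments())


b = GapBuffer("The life brian")
b.insert(9, "of ")
after_insert = str(b)
insert_segments = b.segments()
b.delete(0, 4)
after_delete = str(b)
print(after_insert, insert_segments, after_delete)
```

Repeated edits near the same position only touch the gap boundaries, while `segments()` is the natural two-part view that a multi-segment buffer interface would expose.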
Re: [Python-Dev] More tracker demos online
Martin v. Löwis:
> Currently, we have two running tracker demos online:

After playing with them for 30 minutes, Jira seems to have too busy an interface and finicky behaviour: it sometimes dislikes the back button (similar to SF), and clicking on diffs downloads them rather than displaying them.

It's disappointing that Jira and Launchpad use different bug IDs, as continuity should be maintained with the SF bug IDs, which will be referred to in other areas such as commit messages. They do include the SF bug ID (as a field in Jira and a nickname in Launchpad) but this makes it harder to navigate between related bugs.

I mostly looked at "os.startfile() still doesn't work with Unicode filenames" and I would have tagged the patch on SF with a "looks OK to me" if SF was working.

The text in Launchpad was a bit sparsely formatted for me, so I would like to see if individual users can choose a different style. The others are OK although Roundup is clearer.

Neil
Re: [Python-Dev] Python 2.4, VS 2005 & Profile Guided Optmization
Trent Nelson:
> I ended up playing around with Profile Guided Optimization, running
> ``python.exe pystones.py'' to collect call-graph data after
> python.exe/Python24.dll had been instrumented, then recompiling with the
> optimizations fed back in.

It'd be an idea to build a larger body of Python code to run the profiling pass on, so it doesn't just optimize the sort of code in pystone, which is not very representative. The test suite could be used, as it has good coverage, but it would hit exceptional cases too heavily. Other compilers (Intel?) support profile directed optimization so would also benefit from such a body of code.

Neil
Re: [Python-Dev] first draft of bug guidelines for www.python.org/dev/
Brett Cannon:
> But SourceForge does not support anonymous reporting.

SourceForge does support anonymous reporting. A large proportion of the fault reports I receive for Scintilla are anonymous as indicated by "nobody" in the "Submitted By" column.

https://sourceforge.net/tracker/?group_id=2439&atid=102439

Neil
Re: [Python-Dev] unicode imports
Kristján V. Jónsson:
> Although python has had full unicode support for filenames for a long time
> on selected platforms (e.g. Windows), there is one glaring deficiency: It
> cannot import from paths containing unicode. I've tried creating folders
> with chinese characters and adding them to path, to no avail.
> The standard install path in chinese distributions can be with a non-ANSI
> path, and installing an embedded python application there will break it.

It should be unusual for a Chinese installation to use an install path that cannot be represented in MBCS. Try encoding the install directory into MBCS before adding it to sys.path.

Neil
Re: [Python-Dev] Visual studio 2005 express now free
Martin v. Löwis:
> Apparently, the status of this changed right now: it seems that
> the 2003 compiler is not available anymore; the page now says
> that it was replaced with the 2005 compiler.
>
> Should we reconsider?

I expect Microsoft means that Visual Studio Express will be available free forever, not that you will always be able to download Visual Studio 2005 Express. They normally only provide a particular product version for a limited time after it has been superseded.

Neil
Re: [Python-Dev] Linking with mscvrt
Martin v. Löwis:
> > Visual Basic never forced
> > use of a particular compiler or runtime library for extensions so why
> > should Python?
>
> Do you really not know? Because of API that happens to be defined
> the way it is.

It was rhetorical: Why should Python be inferior to VB?

I suppose the answer (hmm, am I allowed to answer my own rhetorical questions?) is that it was originally developed on other operating systems and the Windows port has never been as much of a focus for most contributors.

Neil
Re: [Python-Dev] Linking with mscvrt
Paul Moore:
> This has all been thrashed out before, but the issue is passing
> CRT-allocated objects across DLL boundaries.

Yes, that was the first point I addressed through wrapping CRT objects.

> At first glance, this is a minor issue - passing FILE* pointers across
> DLL boundaries isn't something I'd normally expect people to do - but
> look further and you find you're opening a real can of worms. For
> example, Python has public APIs which take FILE* parameters.

So convert them to taking PyWrappedFile * parameters.

> Further,
> memory allocation is CRT-managed - allocate memory with one CRT's
> malloc, and deallocate it elsewhere, and you have issues. So *any*
> pointer could be CRT-managed, to some extent. Etc, etc...

I thought PyMem_Malloc was the correct call to use for memory allocation now and avoided direct links to the CRT for memory management.

Neil
Re: [Python-Dev] Linking with mscvrt
Martin v. Löwis:
> Not sure whether this was a serious suggestion.

Yes it is.

> If pythonxy.dll
> was statically linked, you would get all the CRT duplication
> already in extension modules. Given that there are APIs in Python
> where you have to do malloc/free across the python.dll
> boundary, you get memory leaks.

Memory allocations across DLL boundaries will have to use wrapper functions.

Neil
Re: [Python-Dev] Linking with mscvrt
Martin v. Löwis:
> COM really solves all problems people might have on Windows.

COM was partly just a continuation of the practices used for controls, VBXs and other forms of extension. Visual Basic never forced use of a particular compiler or runtime library for extensions so why should Python? It was also easy to debug an extension DLL inside release-mode VB (I can't recall if debug versions of VB were ever readily available) which is something that is more difficult than it should be for Python.

> Alas, it is not a cross-platform API. Standard C is cross-platform,
> so Python uses it in its own APIs.

The old (pre-XPCOM) Netscape plugin interface was cross-platform and worked with any compiler on Windows.

Neil
Re: [Python-Dev] Linking with mscvrt
Greg Ewing:
> But that won't help when you need to deal with third-party
> code that knows nothing about Python or its wrapped file
> objects, and calls the CRT (or one of the myriad extant
> CRTs, chosen at random:-) directly.

Can you explain exactly why there is a problem here? It's fairly normal under Windows to build applications that provide a generic plugin interface (think Netscape plugins or COM) that allows the plugins to be built with any compiler and runtime.

Neil
Re: [Python-Dev] Linking with mscvrt
Martin v. Löwis:
> I don't think this would be good enough. I then also need a way to
> provide extension authors with an API that looks like the CRT, but
> isn't: they cannot realistically change all their code to use the
> wrapped objects. In a recent case, somebody tried to pass a FILE*
> to a postgres DLL linked with a different CRT; he shouldn't need
> to change the entire postgres code to use the modified API.

The postgres example is strange to me as I'd never consider passing a FILE* over a DLL boundary. Maybe this is a Unix/Windows cultural thing due to such practices being more dangerous on Windows.

> Also, there is still the redistribution issue: to redistribute
> msvcr71.dll, you need to own a MSVC license. People that want to
> use py2exe (or some such) are in trouble: they need to distribute
> both python25.dll, and msvcr71.dll. They are allowed to distribute
> the former, but (formally) not allowed to distribute the latter.

Link statically.

Neil
Re: [Python-Dev] Linking with mscvrt
Martin v. Löwis:
> So ideally, Python should drop usage of the CRT entirely (but getting
> there will be a long process). Hopefully, P3k will drop usage of
> stdio for file objects, which will be a big step forward.

You don't need to drop the CRT, just encapsulate it so there is one copy controlled by Python that hands out wrapped objects (file handles, file pointers, memory blocks, others?). These wrappers can only be manipulated through calls back to that owning code that then calls the CRT. Unfortunately this change would itself be incompatible with current extensions.

Neil
Re: [Python-Dev] / as path join operator
Stephen J. Turnbull:
> Jason> Filesystem paths are in fact strings on all operating
> Jason> systems I'm aware of.
>
> I have no idea what you could mean by that. The data structure used
> to represent a filesystem on all OS filesystems I've used is a graph
> of directories and files. A filesystem object is located by
> traversing a path in that graph.
>
> Of course there's a string representation, especially for human use,

Not always. IIRC very old MacOS used an integer directory ID and a string file name. The directory ID was a cookie that you received from the UI and passed through to the file system and there was little support for manipulating the directory ID. Textualized paths were never supposed to be shown to users.

Neil
Re: [Python-Dev] Include ctypes into core Python?
Won't ctypes completely replace dl? dl provides only a small subset of the functionality of ctypes and is very restricted in the set of argument types allowed.

Neil
Re: [Python-Dev] ast status, memory leaks, etc
Neal Norwitz:
> I think a bigger bang for the buck would be to buy a Windows box with
> Purify. Rational was a real pain to deal with, maybe it's better now
> that IBM bought them. Parasoft (Insure++) was even worse to deal
> with.

My experience with the other Windows option, BoundsChecker, is similarly negative and I haven't bothered upgrading for a couple of versions (so can only use it with VC++ 6). The original developer, NuMega, were great but they were absorbed into Compuware which seems to see it more as a source of consulting income than as a product.

I'm fairly experienced with BoundsChecker and related programs (like their profiler) so could run it over a test suite if a license was provided. A demonstration license probably cannot be installed on my machine due to earlier installs.

Neil
Re: [Python-Dev] Building Python with Visual C++ 2005 ExpressEdition
Martin v. Löwis:
> The problem (for me, at least) is that VC is so much more convenient to
> work with.

In my experience Visual C++ has always produced faster, more compact code than Mingw. While this may not be true with current releases, I'd want to ensure that the normal Python download for Windows didn't become slower. Visual C++ 2005 includes profile guided optimization (although this is not included in the Express Edition) and it would be interesting to see how much of a difference this makes.

Microsoft was willing to give some copies of VS to Python developers before so I expect they'd be willing to give some copies of VS Professional or Team System.

Tim Delaney:
> There was a considerable amount of angst with the 2.4 release that can be
> blamed solely on the CRT change (and hence different DLLs to link to). And
> with them deprecating ISO standard functions ...

One solution to CRT change is to drop direct linking of modules to the CRT and vector them through the core DLL. The core PythonXX.DLL would expose an array of functions (malloc, strdup, getcwd, ...) that would be called by all modules indirectly. Then, it no longer matters which compiler version or compiler you build extension modules with. It's quite a lot of work to do this, as each CRT call site needs to change or a well thought through macro scheme needs to be developed.

Paul Moore:
> The project file conversions seemed to go fine, and the debug builds
> were OK, although the deprecation warnings for all the "insecure" CRT
> functions was a pain. It might be worth adding
> _CRT_SECURE_NO_DEPRECATE to the project defines somehow.

I haven't tried to build Python with VC++ 2005 yet, but other code has also required _CRT_NONSTDC_NO_DEPRECATE for some of the file system calls.

Neil
Re: [Python-Dev] Divorcing str and unicode (no more implicitconversions).
Josiah Carlson:
> According to wikipedia (http://en.wikipedia.org/wiki/Latin_alphabet),
> various languages have adopted a transliteration of their language
> and/or former alphabets into latin. They don't purport to know all of
> the reasons why, and I'm not going to speculate.

I used to work on software written by Japanese and English speakers at Fujitsu with most developers being Japanese. The rules were that comments could be in Japanese but identifiers were only allowed to contain ASCII characters. Most variable names were poorly chosen with s, p, q, fla (boolean=flag) and flafla being popular. When I asked some Japanese coders why they didn't use Japanese words expressed in ASCII (Romaji), their response was that it was a really weird idea.

This is anecdotal but it appears to me that transliterations are not commonly used apart from learning languages and some minimal help for foreigners such as including transliterated names on railway station name boards.

Neil
Re: [Python-Dev] Divorcing str and unicode (no more implicit conversions).
M.-A. Lemburg:
> You mean a slice that slices out the next ?

Yes.

> This sounds a lot like you'd want iterators for the various
> index types. Should be possible to implement on top of the
> proposed APIs, e.g. itergraphemes(u), itercodepoints(u), etc.

Iterators may be helpful, but can also be too restrictive when the processing is not completely iterative, such as peeking ahead or looking behind to wrap at a word boundary in the display example. There should be

It was more that it may leave less scope for error if there was a move away from indexes to slices. The PEP provides ways to specify what you want to examine or modify but it looks to me like returning indexes will see code repetition or additional variables with an increase in fragility.

> Note that what most people refer to as "character" is a
> grapheme in Unicode speak.

A grapheme-oriented string type may be worthwhile although you'd probably have to choose a particular normalisation form to ease processing.

> Given that interpretation,
> "breaking" Unicode "characters" is something you won't
> ever work around with by using larger code units such
> as UCS4 compatible ones.

I still think we can reduce the scope for errors.

> Furthermore, you should also note that surrogates (two
> code units encoding one code point) are part of Unicode
> life. While you don't need them when storing Unicode
> in UCS4 code units, they can still be part of the
> Unicode data and the programmer has to be aware of
> these.

Many programmers can and will ignore surrogates. One day that may bite them but we can't close off text processing to those who have no idea of what surrogates are, or directional marks, or that sorting is locale dependent, or have no understanding of the difference between NFC and NFKD normalization forms.

Neil
Re: [Python-Dev] Divorcing str and unicode (no more implicitconversions).
Martin v. Löwis:
> This aspect of rendering is often not implemented, though. Web browsers
> do it correctly, see
> ...
> GUI frameworks sometimes do it correctly, sometimes don't; most
> notably, Tk has no good support for RTL text.

Scintilla does a rough job with this. RTL text is displayed correctly as the underlying platform libraries (Windows or GTK+/Pango) handle this aspect when called to draw text. However editing is not performed correctly with the caret not being placed correctly within RTL text and other visual glitches. There is interest in the area and even a funding proposal this week.

Neil
Re: [Python-Dev] Divorcing str and unicode (no more implicit conversions).
M.-A. Lemburg:
> Unicode has the concept of combining code points, e.g. you can
> store an "é" (e with a accent) as "e" + "'". Now if you slice
> off the accent, you'll break the character that you encoded
> using combining code points.
> ...
> next_(u, index) -> integer
>
> Returns the Unicode object index for the start of the next
> found after u[index] or -1 in case no next element
> of this type exists.

Should entity breakage be further discouraged by returning a slice here rather than an object index? Something like:

    i = first_grapheme(u)
    x = 0
    while x < width and u[i] != "\n":
        x, _ = draw(u[i], (x, y))
        i = next_grapheme(u, i)

Neil
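As a rough illustration of the slice-returning idea, here is a minimal sketch in present-day Python. The name `grapheme_slices` is hypothetical and is not part of the proposed API. It also assumes a simplified notion of a grapheme, a base code point followed by any combining marks, rather than the full Unicode segmentation rules.

```python
import unicodedata


def grapheme_slices(u):
    """Yield one slice per simplified grapheme in the string u.

    A grapheme is approximated here as a code point followed by any
    code points whose canonical combining class is non-zero.
    """
    start = 0
    for i in range(1, len(u)):
        if unicodedata.combining(u[i]) == 0:
            # u[i] starts a new grapheme, so close off the previous one.
            yield slice(start, i)
            start = i
    if u:
        yield slice(start, len(u))


text = "e\u0301a"  # "e" + combining acute accent, then "a"
pieces = [text[s] for s in grapheme_slices(text)]
```

Because each step hands back a slice, `text[s]` always yields a whole unit and the caller never holds an index that points between a base character and its accent.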
Re: [Python-Dev] Divorcing str and unicode (no more implicit conversions).
Martin v. Löwis:
> That's very tricky. If you have multiple implementations, you make
> usage at the C API difficult. If you make it either UTF-8 or UTF-32,
> you make PythonWin difficult. If you make it UTF-16, you make indexing
> difficult.

For Windows, the code will get a little uglier, needing to perform an allocation/encoding and deallocation more often than at present but I don't think there will be a speed degradation as Windows is currently performing a conversion from 8 bit to UTF-16 inside many system calls.

To minimize the cost of allocation, Python could copy Windows in keeping a small number of commonly sized preallocated buffers handy.

For indexing UTF-16, a flag could be set to show if the string is all in the base plane and if not, an index could be constructed when and if needed. It'd be good to get some feel for what proportion of string operations performed require indexing. Many, such as startswith, split, and concatenation don't require indexing. The proportion of operations that use indexing to scan strings would also be interesting as adding a (currentIndex, currentOffset) cursor to string objects would be another approach.

Neil
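The flag-plus-lazy-index scheme for UTF-16 can be sketched in pure Python as follows. The class `Utf16String` and its attributes are hypothetical, invented for illustration; a real implementation would live in C inside the string object. The index here is simply a sorted list of the code point positions that need a surrogate pair, built only the first time it is needed.

```python
import struct
from bisect import bisect_left


class Utf16String:
    """UTF-16 storage that still indexes by code point (sketch only)."""

    def __init__(self, text):
        data = text.encode("utf-16-le")
        self._units = struct.unpack("<%dH" % (len(data) // 2), data)
        # Flag: true when no surrogate pairs are present (all base plane).
        self._all_bmp = not any(0xD800 <= u <= 0xDBFF for u in self._units)
        self._astral = None  # lazy index, built only when required
        self._length = len(self._units) if self._all_bmp else None

    def _build_index(self):
        # Record the code point position of every surrogate pair.
        astral, code_point, offset = [], 0, 0
        while offset < len(self._units):
            if 0xD800 <= self._units[offset] <= 0xDBFF:
                astral.append(code_point)
                offset += 2
            else:
                offset += 1
            code_point += 1
        self._astral, self._length = astral, code_point

    def __len__(self):
        if self._length is None:
            self._build_index()
        return self._length

    def __getitem__(self, index):
        length = len(self)
        if index < 0:
            index += length
        if not 0 <= index < length:
            raise IndexError("string index out of range")
        if self._all_bmp:
            return chr(self._units[index])  # fast path: direct indexing
        # Each earlier surrogate pair shifts the code unit offset by one.
        offset = index + bisect_left(self._astral, index)
        high = self._units[offset]
        if 0xD800 <= high <= 0xDBFF:
            low = self._units[offset + 1]
            return chr(0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00))
        return chr(high)
```

Strings that stay in the base plane never pay for the index, and operations that do not index at all never trigger its construction.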
Re: [Python-Dev] Divorcing str and unicode (no more implicit conversions).
Guido van Rossum:
> Folks, please focus on what Python 3000 should do.
>
> I'm thinking about making all character strings Unicode (possibly with
> different internal representations a la NSString in Apple's Objective
> C) and introduce a separate mutable bytes array data type. But I could
> use some validation or feedback on this idea from actual
> practitioners.

I'd like to more tightly define Unicode strings for Python 3000. Currently, Unicode strings may be implemented with either 2 byte (UCS-2) or 4 byte (UTF-32) elements. Python should allow strings to contain any Unicode character and should be indexable yielding characters rather than half characters. Therefore Python strings should appear to be UTF-32. There could still be multiple implementations (using UTF-16 or UTF-8) to preserve space but all implementations should appear to be the same apart from speed and memory use.

Neil
Re: [Python-Dev] Pythonic concurrency
Bruce Eckel:
> I would say that the troublesome meme is that "threads are easy." I
> posted an earlier, rather longish message about this. The gist of
> which was: "when someone says that threads are easy, I have no idea
> what they mean by it."

I think you are overcomplicating the issue by looking at too many levels at once. The memory model is something that implementers of threading support need to understand. Users of that threading support just need to know that concurrent access to variables is dangerous and that they should use locks to access shared variables or use other forms of packaged inter-thread communication.

Double Checked Locking is an optimization (removal of a lock) of an attempt to better modularize code (by automating the helper object creation). I'd either just leave the lock in or, if benchmarking revealed an unacceptable performance problem, allocate the helper object before the resource is accessible to more than one thread. For statics, expose an Init method that gets called when the application is in the initial one user thread state.

> But I just finished a 150-page chapter on Concurrency in Java which
> took many months to write, based on a large chapter on Concurrency in
> C++ which probably took longer to write. I keep in reasonably good
> touch with some of the threading experts. I can't get any of them to
> say that it's easy, even though they really do understand the issues
> and think about it all the time. *Because* of that, they say that it's
> hard.

Implementing threading is hard. Using threading is not that hard. It's a source of complexity but so are many aspects of development. I get scared by reentrance in UI code.

Neil
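The two alternatives to Double Checked Locking mentioned above can be sketched with Python's `threading` module. The class names `Helper`, `Resource` and `EagerResource` are hypothetical stand-ins, not taken from the discussion. The first variant simply leaves the lock in; the second creates the helper before the object can be seen by any other thread.

```python
import threading


class Helper:
    """Stand-in for an expensive helper object (hypothetical)."""


class Resource:
    """Lazy creation with the lock left in on every access."""

    def __init__(self):
        self._lock = threading.Lock()
        self._helper = None

    def helper(self):
        # Always take the lock: the check and the creation are atomic,
        # so no reasoning about the memory model is needed.
        with self._lock:
            if self._helper is None:
                self._helper = Helper()
            return self._helper


class EagerResource:
    """Create the helper up front, while only one thread can see us."""

    def __init__(self):
        self.helper = Helper()
```

If profiling ever showed the lock in `Resource.helper` to be a bottleneck, switching to the eager form removes it without introducing the subtle ordering problems of the double-checked idiom.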
Re: [Python-Dev] international python
Antoine Pitrou:
> I don't have a Windows machine at hand right now to test it, but, even
> if this solution works, it breaks the principle of least astonishment:

Astonishment is subjective and so a poor tool to measure by. At one stage Ruby tried to follow the more common formulation "principle of least surprise" (POLS) but this produced arguments of the following form:

    I am surprised by X.
    Therefore, X contradicts POLS.
    Therefore, X must be fixed.

POLS was then abandoned.

> os.path.abspath() should do the Right Thing regardless of what the
> current locale is.

This was discussed recently and the consensus position was for functions that can not return a value in the default encoding to instead return a unicode value. Correct implementation of this would require not only changing the behaviour of functions returning strings but also those receiving strings (which should treat byte strings as being in the default encoding). This would require a large amount of work, and is unlikely to be performed in the near future.

Neil
Re: [Python-Dev] international python
Antoine Pitrou:
> As for seamless unicode support, there are also problems sometimes with
> filenames and filepaths: see e.g.
> https://sourceforge.net/tracker/?func=detail&aid=1283895&group_id=5470&atid=105470

This bug report is using byte string arguments causing byte string processing rather than unicode calls with unicode processing. Windows code that may encounter file paths outside the default locale should stick to unicode for paths. Try converting os.curdir to unicode before calling other functions:

    os.path.abspath(unicode(os.curdir))

Neil
Re: [Python-Dev] Replacement for print in Python 3.0
Gareth McCaughan:
> >Interactive use is its own mode and works differently to the base
> > language. To print the value of something, just type an expression.
>
> Doesn't do the same thing.

In interactive mode, you are normally interested in the values of things, not their formatting so it does the right thing. If you need particular formatting or interpretation, you can always achieve this.

> Do you have any suggestion that's as practically usable
> as "print"?

The print function proposal is already as usable as the print statement. When I write a print statement, I'd like to be able to redirect that to a log or GUI easily. If print is a function then its interface can be reimplemented but users can't add new statements to Python.

Creation of strings containing values could be simplified as that would be applicable in many cases. I actually like being able to append to strings in Java with the second operand being stringified. Perhaps a stringify and catenate operator could be included in Python. Like this:

    MessageBox("a=" ° a ° "pos=" ° x°","°y)

Neil
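The point that a function's interface can be reimplemented is easy to show with the print function as it eventually shipped in Python 3. The helper `make_print` and the names `log_print` and `log_lines` below are hypothetical, used only for illustration; the sink could equally be a logger method or a GUI text widget's append call.

```python
import io


def make_print(write):
    """Return a print-like function that sends its text to write(text)."""
    def redirected_print(*args, sep=" ", end="\n"):
        # Same stringify-and-join behaviour as the built-in print.
        write(sep.join(str(arg) for arg in args) + end)
    return redirected_print


# Route output to an in-memory log instead of standard output.
log_lines = []
log_print = make_print(log_lines.append)
log_print("a=", 1, " pos=", 2, ",", 3, sep="")

# The built-in function can also be pointed at any file-like object.
buffer = io.StringIO()
print("a=", 1, sep="", file=buffer)
```

A module that wants all of its output redirected could bind such a replacement to the name `print`, which has no equivalent for a statement.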
Re: [Python-Dev] Replacement for print in Python 3.0
Gareth McCaughan:
> 3. It's convenient for debugging, interactive use, simple scripts,
>and various other things.

Interactive use is its own mode and works differently to the base language. To print the value of something, just type an expression. Python will evaluate and print the value of the expression. Much easier than adding 'print '. Extended interactive modes like ipython include other conveniences that don't belong in the python language.

The problem with print is it becomes a barrier to extending a script into something more ambitious. This then leads to ugly 'features' like '>>' and trailing commas. By all means provide a simple syntax for i/o with the standard streams but ensure it is something that is a firm basis for extension.

Neil
Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)
Martin v. Löwis:
> This appears to be based on the usedDefault return value of
> WideCharToMultiByte. I believe this is insufficient:
> WideCharToMultiByte might convert Unicode characters to
> codepage characters in a lossy way, without using the default
> character. For example, it converts U+0308 (combining diaeresis)
> to U+00A8 (diaeresis) (or something like that, I forgot the
> exact details). So if you have, say, "p-umlaut" (i.e. U+0070
> U+0308), it converts it to U+0070 U+00A8 (in the local code page).
> Trying to use this as a filename later fails.

There is WC_NO_BEST_FIT_CHARS to defeat that. It says that it will use the default character if the translation can't be round-tripped. Available on Windows 2000 and XP but not NT4.

We could compare the original against the round-tripped as described at
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/unicode_2bj9.asp

Neil
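The round-trip comparison can be illustrated without the Windows API at all. The sketch below uses Python's own codecs with a deliberately lossy encoder (`errors="replace"`) standing in for a lossy WideCharToMultiByte call; the function name `round_trips` is hypothetical, and Python's codecs do not perform Windows-style best-fit mapping, so this only demonstrates the comparison step.

```python
def round_trips(name, encoding):
    """Return True if name survives encoding and decoding unchanged.

    The encoder is allowed to be lossy (unencodable characters become
    "?"), so the loss is detected by comparing against the original
    rather than by relying on the encoder to report it.
    """
    encoded = name.encode(encoding, "replace")
    return encoded.decode(encoding) == name


ok = round_trips("caf\u00e9.txt", "cp1252")   # é exists in cp1252
lossy = round_trips("p\u0308.txt", "cp1252")  # combining diaeresis does not
```

The same comparison catches any substitution the encoder makes, whether it uses the default character or a look-alike.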
Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)
Martin v. Löwis:
> - But then, the wide API gives all results as Unicode. If you want to
> promote only those entries that need it, it really means that you
> only want to "demote" those that don't need it. But how can you tell
> whether an entry needs it? There is no API to find out.

I wrote a patch for os.listdir at http://www.scintilla.org/difft.txt that uses WideCharToMultiByte to check if a wide name can be represented in a particular code page and only uses that representation if it fits. This is good for Windows code pages including ASCII and "mbcs" but since Python's sys.getdefaultencoding() can be something that has no code page equivalent, it would have to try converting using strict mode and interpret failure as leaving the name as unicode.

> You could declare that anything with characters >128 needs it,
> but that would be an incompatible change: If a character >128 in
> the system code page is in a file name, listdir currently returns
> it in the system code page. It then would return a Unicode string.

I now quite like returning unicode for anything non-ASCII on Windows as there is no ambiguity in what the result means and there will be no need to change all the system calls to translate from the default encoding. It is a change to the API which can lead to code breaking but it should break with an exception. Assuming that byte string arguments are using Python's default encoding looks more dangerous with a behavioural change but no notification.

Neil
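The strict-mode fallback described above might look roughly like this. It is a sketch in present-day Python, not the patch at the URL in the message: the function name `demote_names` is hypothetical, and byte strings stand in for the 8-bit names that listdir returned at the time.

```python
def demote_names(names, encoding):
    """Return each name as bytes when it fits the encoding, else unchanged.

    A strict encode either succeeds exactly or raises, so failure is
    read as "this entry needs to stay unicode".
    """
    result = []
    for name in names:
        try:
            result.append(name.encode(encoding, "strict"))
        except UnicodeEncodeError:
            result.append(name)
    return result


mixed = demote_names(["readme.txt", "p\u0308.txt"], "cp1252")
```

Passing `"ascii"` as the encoding gives the simpler policy favoured at the end of the message, where anything non-ASCII stays unicode.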