Re: [Python-Dev] Multilingual programming article on the Red Hat Developer blog
On Fri, 12 Sep 2014 07:54:56 +0100
Jeff Allen wrote:
> Simply having a block "for private use" seems to create an unmanaged
> space for conflict, reminiscent of the "other 128 characters" in
> bilingual programming. I wondered if the way to respect use by
> applications might be to make it private to a particular sub-class of
> str, idly however.

It's not private from Python's point of view, it's actually specified in
a PEP. So all Python 3 code has to follow the rule, and there's no
conflict internally. The characters shouldn't leak out to other
applications, unless the user's code does its I/O very badly :-)

Regards

Antoine.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Multilingual programming article on the Red Hat Developer blog
On September 11, 2014, Jeff Allen wrote:
> ... the area of code point
> space used for the smuggling of bytes under PEP-383 is not a
> "Unicode Private Use Area", but a portion of the trailing surrogate
> range. This is a code violation, which I imagine is why
> "surrogateescape" is an error handler, not a codec.

True, but I believe that is a CPython implementation detail.

Other implementations (including Jython) should implement the
surrogateescape API, but I don't think it is important to use the
same internal representation for the invalid bytes.

(Well, unless you want to communicate with external tools (GUIs?) that
try to use that particular internal encoding directly, effectively as
bytes rather than strings, when communicating with Python.)

> lone surrogates preclude a naive use of the platform string library

Invalid input often causes problems. Are you saying that there are
situations where the platform string library could easily handle
invalid characters in general, but has a problem with the specific
case of lone surrogates?

-jJ

--
If there are still threading problems with my replies, please
email me with details, so that I can try to resolve them. -jJ
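For readers following the thread, here is a minimal sketch of what the
"surrogateescape" error handler discussed above does in CPython 3. The
helper names (smuggle, unsmuggle, leaks_with_strict_io) are made up for
this illustration; only the errors="surrogateescape" argument is the
real API. It shows the byte smuggling into U+DC80..U+DCFF, the round
trip, and that a strict encode refuses to let the escapes out.

```python
def smuggle(raw, encoding="utf-8"):
    """Decode bytes, mapping each undecodable byte 0xNN to U+DCNN."""
    return raw.decode(encoding, errors="surrogateescape")


def unsmuggle(text, encoding="utf-8"):
    """Encode str, turning each U+DCNN escape back into byte 0xNN."""
    return text.encode(encoding, errors="surrogateescape")


def leaks_with_strict_io(text, encoding="utf-8"):
    """Return True if a default (strict) encode lets the text through."""
    try:
        text.encode(encoding)  # errors="strict" is the default
    except UnicodeEncodeError:
        return False
    return True


raw = b"abc\xff\x80"   # 0xFF and 0x80 are not valid UTF-8 at these positions
text = smuggle(raw)

# ascii() is used because printing lone surrogates directly may fail.
print(ascii(text))                  # 'abc\udcff\udc80'
print(unsmuggle(text) == raw)       # True
print(leaks_with_strict_io(text))   # False
```

The escapes are lone trailing surrogates, not Private Use Area
characters, which is the distinction Jeff draws in the quoted text.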
[Python-Dev] Summary of Python tracker Issues
ACTIVITY SUMMARY (2014-09-05 - 2014-09-12)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.

Issues counts and deltas:
  open    4652 (+12)
  closed 29509 (+38)
  total  34161 (+50)

Open issues with patches: 2196


Issues opened (39)
==================

#16662: load_tests not invoked in package/__init__.py
http://bugs.python.org/issue16662  reopened by haypo

#22343: Install bash activate script on Windows when using venv
http://bugs.python.org/issue22343  opened by marfire

#22344: Reorganize unittest.mock docs into linear manner
http://bugs.python.org/issue22344  opened by py.user

#22347: mimetypes.guess_type("//example.com") misinterprets host name
http://bugs.python.org/issue22347  opened by vadmium

#22348: Documentation of asyncio.StreamWriter.drain()
http://bugs.python.org/issue22348  opened by martius

#22350: nntplib file write failure causes exception from QUIT command
http://bugs.python.org/issue22350  opened by vadmium

#22351: NNTP constructor exception leaves socket for garbage collector
http://bugs.python.org/issue22351  opened by vadmium

#22352: Ensure opcode names and args fit in disassembly output
http://bugs.python.org/issue22352  opened by ncoghlan

#22354: Highlite tabs in the IDLE
http://bugs.python.org/issue22354  opened by Christian.Kleineidam

#22355: inconsistent results with inspect.getsource() / inspect.getsou
http://bugs.python.org/issue22355  opened by isedev

#22356: mention explicitly that stdlib assumes gmtime(0) epoch is 1970
http://bugs.python.org/issue22356  opened by akira

#22357: inspect module documentation makes no reference to __qualname_
http://bugs.python.org/issue22357  opened by isedev

#22359: Remove incorrect uses of recursive make
http://bugs.python.org/issue22359  opened by Sjlver

#22360: Adding manually offset parameter to str/bytes split function
http://bugs.python.org/issue22360  opened by cwr

#22361: Ability to join() threads in concurrent.futures.ThreadPoolExec
http://bugs.python.org/issue22361  opened by dktrkranz

#22362: Warn about octal escapes > 0o377 in re
http://bugs.python.org/issue22362  opened by serhiy.storchaka

#22363: argparse AssertionError with add_mutually_exclusive_group and
http://bugs.python.org/issue22363  opened by Zacrath

#22364: Unify error messages of re and regex
http://bugs.python.org/issue22364  opened by serhiy.storchaka

#22365: SSLContext.load_verify_locations(cadata) does not accept CRLs
http://bugs.python.org/issue22365  opened by Ralph.Broenink

#22366: urllib.request.urlopen shoudl take a "context" (SSLContext) ar
http://bugs.python.org/issue22366  opened by alex

#22367: Please add F_OFD_SETLK, etc support to fcntl.lockf
http://bugs.python.org/issue22367  opened by Andrew.Lutomirski

#22370: pathlib OS detection
http://bugs.python.org/issue22370  opened by Antony.Lee

#22371: tests failing with -uall and http_proxy and https_proxy set
http://bugs.python.org/issue22371  opened by doko

#22374: Replace contextmanager example and improve explanation
http://bugs.python.org/issue22374  opened by terry.reedy

#22376: urllib2.urlopen().read().splitlines() opening a directory in a
http://bugs.python.org/issue22376  opened by alanoe

#22377: %Z in strptime doesn't match EST and others
http://bugs.python.org/issue22377  opened by cool-RR

#22378: SO_MARK support for Linux
http://bugs.python.org/issue22378  opened by jpv

#22379: Empty exception message of str.join
http://bugs.python.org/issue22379  opened by fossilet

#22382: sqlite3 connection built from apsw connection should raise Int
http://bugs.python.org/issue22382  opened by wtonkin

#22384: Tk.report_callback_exception kills process when run with pytho
http://bugs.python.org/issue22384  opened by Aivar.Annamaa

#22385: Allow 'x' and 'X' to accept bytes-like objects in string forma
http://bugs.python.org/issue22385  opened by ncoghlan

#22387: Making tempfile.NamedTemporaryFile a class
http://bugs.python.org/issue22387  opened by Antony.Lee

#22388: Unify style of "Contributed by" notes
http://bugs.python.org/issue22388  opened by serhiy.storchaka

#22389: Generalize contextlib.redirect_stdout
http://bugs.python.org/issue22389  opened by barry

#22390: test.regrtest should complain if a test doesn't remove tempora
http://bugs.python.org/issue22390  opened by haypo

#22391: MSILIB truncates last character in summary information stream
http://bugs.python.org/issue22391  opened by Kevin.Phillips

#22392: Clarify documentation of __getinitargs__
http://bugs.python.org/issue22392  opened by David.Gilman

#22393: multiprocessing.Pool shouldn't hang forever if a worker proces
http://bugs.python.org/issue22393  opened by dan.oreilly

#22394: Update documentation building to use venv and pip
http://bugs.python.org/issue22394  opened by brett.cannon


Most recent 15 issues with no replies (15)
==========================================

#22394: Update documentation build
Re: [Python-Dev] Multilingual programming article on the Red Hat Developer blog
Jeff Allen writes:

> Simply having a block "for private use" seems to create an unmanaged
> space for conflict,

No. The uncharted range of human language (including recently-invented
nonsense like "emoticons" and the annual "design a character" contest
run by a newspaper in Taipei, with the grand prize being your character
gets added to the national standard IIRC, but maybe it's just that
newspaper's collection of private space characters) already contains
those conflicts. Believe me, "private use space, manage it yourself"
was the best they could do.

I've been working with the bureaucratic insanity of the Japanese
national standard -- it took almost 3 decades before every Japanese
citizen could store their names in a computer using government-approved
codes -- and the chaos of the Taiwanese national standard -- which
contains hordes of characters with one known use and no known meaning,
many of them duplicates -- for twenty years now. Neither approach works
as well as Unicode's, despite its design-by-committee flaws overlaid
with national animosities that can flare into linguicidal vetoes and
code-space-stuffing logrolling.

> reminiscent of the "other 128 characters" in bilingual
> programming. I wondered if the way to respect use by applications
> might be to make it private to a particular sub-class of str, idly
> however.

If I understand your suggestion, that's precisely the intent of PEP
383, to make undecodable bytes in a coded character stream private.
But they need to be in the stream one way or another. So PEP 383 chose
to use a non-Unicode encoding (based on the "lone surrogate" device
invented by Markus Kuhn for utf-8b) to deal with that, and that does
effectively make those elements private to Python (but of course not
in the Unicode sense, as they're not even characters in Unicode).

But I gather the "native" Unicode type in Java doesn't allow you to use
that dodge because it checks for malformed Unicode internally (i.e., at
a level not controllable by Jython). So you have to embed such stream
elements in the space of Unicode characters. You have the option of
the private space or unallocated (reserved) space. The latter seems
like asking for trouble, and the only way to avoid it would be to be
prepared to move that data around in case of collision. But that's
precisely what I'm suggesting doing in private space. Same issue,
either way. Private space with a local registry seems saner.
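To make the private-space alternative concrete, here is a rough sketch
of what embedding undecodable bytes in the Private Use Area could look
like, written in Python for illustration only. Everything named here is
an assumption of the sketch, not PEP 383 and not anything Jython does:
the base code point PUA_BASE, the handler name "pua_escape_demo", and
the two helper functions are all hypothetical. Only
codecs.register_error and the attributes of UnicodeDecodeError are real
API. The last lines show the collision problem discussed above: a
genuine private-use character in the input is indistinguishable from a
smuggled byte unless the application keeps a registry and moves data
around.

```python
import codecs

# Hypothetical mapping: undecodable byte 0xNN becomes U+F7NN, which lies
# in the BMP Private Use Area (U+E000..U+F8FF).
PUA_BASE = 0xF700


def pua_escape(exc):
    """Decode-time error handler: replace bad bytes with PUA characters."""
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    return "".join(chr(PUA_BASE + b) for b in bad), exc.end


codecs.register_error("pua_escape_demo", pua_escape)


def decode_with_pua(data, encoding="utf-8"):
    return data.decode(encoding, errors="pua_escape_demo")


def encode_with_pua(text, encoding="utf-8"):
    """Reverse the mapping by hand; PUA characters encode without error,
    so an encode-time error handler would never be called for them."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if PUA_BASE + 0x80 <= cp <= PUA_BASE + 0xFF:
            out.append(cp - PUA_BASE)
        else:
            out += ch.encode(encoding)
    return bytes(out)


raw = b"caf\xe9 \xff"                    # Latin-1 style bytes, invalid UTF-8
smuggled = decode_with_pua(raw)
round_trip_ok = encode_with_pua(smuggled) == raw

# Collision: real private-use text that happens to use U+F7E9.
genuine = "\uf7e9".encode("utf-8")
collision = encode_with_pua(decode_with_pua(genuine)) != genuine

print(ascii(smuggled), round_trip_ok, collision)
```

With a fixed mapping like this the round trip works for undecodable
bytes but silently corrupts input that already uses the same private
code points, which is why a local registry (or relocation on collision)
would be needed in practice.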
Re: [Python-Dev] Multilingual programming article on the Red Hat Developer blog
Jim, Stephen:

It seems like we're off topic here, but to answer all as briefly as
possible:

1. Java does not really have a Unicode type, therefore not one that
   validates. It has a String type that is a sequence of UTF-16 code
   units. There are some String methods and Character methods that deal
   with code points represented as int. I can put any 16-bit values I
   like in a String.

2. With proper accounting for indices, and as long as surrogates appear
   in pairs, I believe operations like find or endswith give correct
   answers about the Unicode text when applied to the UTF-16. This is
   an attractive implementation option, and mostly what we do.

3. I'm fixing some bugs where we get it wrong beyond the BMP, and the
   fix involves banning lone surrogates (completely). At present you
   can't type them in literals but you can sneak them in from Java.

4. I think (with Antoine) that if Jython supported PEP-383 byte
   smuggling, it would have to do it the same way as CPython, as it is
   visible. It's not impossible (I think), but it is messy. Some are
   strongly against.

Jeff Allen

On 12/09/2014 16:37, Jim J. Jewett wrote:
> On September 11, 2014, Jeff Allen wrote:
> > ... "surrogateescape" is an error handler, not a codec.
>
> True, but I believe that is a CPython implementation detail.
>
> Other implementations (including jython) should implement the
> surrogateescape API, but I don't think it is important to use the
> same internal representation for the invalid bytes.
>
> > lone surrogates preclude a naive use of the platform string library
>
> Invalid input often causes problems. Are you saying that there are
> situations where the platform string library could easily handle
> invalid characters in general, but has a problem with the specific
> case of lone surrogates?
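Points 2 and 3 above can be illustrated with a small model, written in
Python rather than Java and not taken from Jython's source. It treats a
string as a list of UTF-16 code units, searches at the code-unit level,
and then converts the code-unit index to a code-point index by skipping
trailing surrogates. The function names are invented for this sketch,
and it assumes the inputs are well-formed (no lone surrogates), which is
exactly the condition point 2 relies on.

```python
import struct


def utf16_units(text):
    """Return the UTF-16 code units of a well-formed str as a list of ints."""
    data = text.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))


def is_well_formed(units):
    """True if every surrogate code unit belongs to a lead/trail pair."""
    i = 0
    while i < len(units):
        u = units[i]
        if 0xD800 <= u <= 0xDBFF:            # lead surrogate needs a trail
            if i + 1 >= len(units) or not 0xDC00 <= units[i + 1] <= 0xDFFF:
                return False
            i += 2
        elif 0xDC00 <= u <= 0xDFFF:          # trail without a lead
            return False
        else:
            i += 1
    return True


def find_code_points(haystack, needle):
    """Model of str.find done on UTF-16 code units.

    The search itself works on code units; only the returned index is
    adjusted so that it counts code points, not code units.
    """
    hay = utf16_units(haystack)
    ndl = utf16_units(needle)
    for i in range(len(hay) - len(ndl) + 1):
        if hay[i:i + len(ndl)] == ndl:
            # Each trail surrogate before the match is the second half of
            # a pair, so it does not start a new code point.
            return sum(1 for u in hay[:i] if not 0xDC00 <= u <= 0xDFFF)
    return -1


text = "a\U0001F600bc"                       # one character beyond the BMP
print(utf16_units(text))
print(find_code_points(text, "bc"), text.find("bc"))
print(is_well_formed([0x61, 0xDC80]))        # lone trail surrogate
```

A well-formed needle can only match at a code-point boundary of a
well-formed haystack, which is why the naive code-unit search gives the
right answer once the index is corrected; a lone surrogate breaks that
guarantee.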
[Python-Dev] new hg.python.org server
I just switched hg.python.org from an OSUOSL VM to a Rackspace VM. The
new VM is a bit beefier and has what I think is better network
connectivity, so hopefully that will improve the speed of repository
operations.

We also now support HTTPS for repository browsing and cloning, so update
all your links to https://hg.python.org! IPv6 support has also returned
for those who like that sort of thing.

Note the host keys changed, so you'll probably have to futz with
known_hosts to quiet ssh down. I apologize, but I noticed that the
previous RSA host key was 1024 bits, so I decided to upgrade it to 2048
during the transition.

Thanks to Donald Stufft for helping me set this up.
Re: [Python-Dev] [python-committers] new hg.python.org server
On Sep 12, 2014, at 5:34 PM, Benjamin Peterson wrote:

> The new VM is a bit beefier and has what I think is better network
> connectivity, so hopefully that will improve the speed of repository
> operations.

Thanks Benjamin, the repo is noticeably faster.

Raymond
Re: [Python-Dev] [python-committers] new hg.python.org server
Just wondering - are there any sys-adminy sort of tasks that could be
completed? I mean, I have some (note, some) experience doing this, and I
wouldn't mind helping out (I inquired in the buildbot thread as well, but
there wasn't much of a response).

Thanks
Shorya Raj

On Sat, Sep 13, 2014 at 1:02 PM, Raymond Hettinger <
raymond.hettin...@gmail.com> wrote:

> On Sep 12, 2014, at 5:34 PM, Benjamin Peterson wrote:
>
> > The new VM is a bit beefier and has what I think is better network
> > connectivity, so hopefully that will improve the speed of repository
> > operations.
>
> Thanks Benjamin, the repo is noticeably faster.
>
> Raymond
Re: [Python-Dev] [python-committers] new hg.python.org server
On Fri, Sep 12, 2014, at 21:52, Shorya Raj wrote:
> Just wondering - are there any sys-adminy sort of tasks that could be
> completed? I mean, I have some (note, some) experience doing this, and I
> wouldn't mind helping out (I inquired in the buildbot thread as well, but
> there wasn't much of a response).

Well, hg.python.org is basically done now. The main thing now is
understanding how other services (planet.python.org, bugs.python.org)
are set up and moving them to config management.