Re: PEP 393 vs UTF-8 Everywhere

2017-01-21 Thread Steven D'Aprano
On Sunday 22 January 2017 06:58, Tim Chase wrote: > Right. It gets even weirder (edge-case'ier) when dealing with > combining characters: > > s = "man\N{COMBINING TILDE}ana" for i, c in enumerate(s): print("%i: %s" % (i, c)) > ... > 0: m > 1: a > 2: n > 3:˜ > 4: a > 5: n > 6: a '
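The combining-character case quoted above can be reproduced with the standard library. This is only a small sketch: the variable names are illustrative, and NFC normalization is shown as one possible way to fold the tilde into a single code point, not as something the thread prescribes.

```python
import unicodedata

# "n" followed by U+0303 COMBINING TILDE renders as one glyph
# but is stored as two code points.
s = "man\N{COMBINING TILDE}ana"
decomposed_len = len(s)

# NFC composes the pair into the single code point U+00F1.
nfc = unicodedata.normalize("NFC", s)
composed_len = len(nfc)

# Name each code point to make the extra element at index 3 visible.
names = [unicodedata.name(c) for c in s]
```

Indexing by code point is therefore not the same as indexing by user-perceived character, whichever internal representation the interpreter uses.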

Re: PEP 393 vs UTF-8 Everywhere

2017-01-21 Thread Steve D'Aprano
On Sun, 22 Jan 2017 07:21 am, Pete Forman wrote: > Marko Rauhamaa writes: > >>> py> low = '\uDC37' >> >> That should raise a SyntaxError exception. > > Quite. My point was that with older Python on a narrow build (Windows > and Mac) you need to understand that you are using UTF-16 rather than >

Re: PEP 393 vs UTF-8 Everywhere

2017-01-21 Thread Steve D'Aprano
On Sun, 22 Jan 2017 06:52 am, Marko Rauhamaa wrote: > Pete Forman : > >> Surrogates only exist in UTF-16. They are expressly forbidden in UTF-8 >> and UTF-32. > > Also, they don't exist as Unicode code points. Python shouldn't allow > surrogate characters in strings. Not quite. This is where it

Re: PEP 393 vs UTF-8 Everywhere

2017-01-21 Thread Tim Chase
On 2017-01-22 01:44, Steve D'Aprano wrote: > On Sat, 21 Jan 2017 11:45 pm, Tim Chase wrote: > > > but I'm hard-pressed to come up with any use case where direct > > indexing into a (non-byte)string makes sense unless you've already > > processed/searched up to that point and can use a recorded ind

Re: PEP 393 vs UTF-8 Everywhere

2017-01-21 Thread Matt Ruffalo
On 2017-01-21 10:50, Pete Forman wrote: > Thanks for a very thorough reply, most useful. I'm going to pick you up > on the above, though. > > Surrogates only exist in UTF-16. They are expressly forbidden in UTF-8 > and UTF-32. The rules for UTF-8 were tightened up in Unicode 4 and RFC > 3629 (2003)

Re: How to create a socket.socket() object from a socket fd?

2017-01-21 Thread Grant Edwards
I'm still baffled why the standard library fromfd() code dup()s the descriptor. According to the comment in the CPython sources, the au
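The dup() behaviour mentioned here can be observed directly. The sketch below assumes Python 3 on a Unix-like system and uses socket.socketpair() only as a convenient source of an open descriptor; the variable names are illustrative.

```python
import socket

# A connected pair gives us an open socket descriptor to experiment with.
left, right = socket.socketpair()

# fromfd() duplicates the descriptor, so the new object has its own fd.
copy = socket.fromfd(left.fileno(), left.family, left.type)
different_fd = copy.fileno() != left.fileno()

# Closing the duplicate leaves the original socket usable.
copy.close()
left.sendall(b"ok")
reply = right.recv(2)

left.close()
right.close()
```

A consequence of the duplication is that the caller still owns the original descriptor and has to close it separately.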

Re: How to create a socket.socket() object from a socket fd?

2017-01-21 Thread Grant Edwards
On 2017-01-21, Christian Heimes wrote: > You might be interested in my small module > https://pypi.python.org/pypi/socketfromfd/ . I just released a new > version with a fix for Python 2. Thanks for the hint! :) > > The module correctly detects address family, socket type and proto from > a fd. I

Re: How to create a socket.socket() object from a socket fd?

2017-01-21 Thread Grant Edwards
On 2017-01-21, Chris Angelico wrote: > On Sun, Jan 22, 2017 at 9:41 AM, Grant Edwards > wrote: >> | __init__(self, family=2, type=1, proto=0, _sock=None) >> | >> >> Ah! There's a keyword argument that doesn't appear in the docs, so >> let's try that... > > That's marginally better than my monke

Re: How to create a socket.socket() object from a socket fd?

2017-01-21 Thread Peter Otten
Grant Edwards wrote: > Given a Unix file descriptor for an open TCP socket, I can't figure > out how to create a Python 2.7 socket object like those returned by > socket.socket() > > Based on the docs, one might think that socket.fromfd() would do that > (since the docs say that's what it does):

Re: How to create a socket.socket() object from a socket fd?

2017-01-21 Thread Christian Heimes
On 2017-01-21 23:41, Grant Edwards wrote: > On 2017-01-21, Grant Edwards wrote: > >> Given a Unix file descriptor for an open TCP socket, I can't figure >> out how to create a Python 2.7 socket object like those returned by >> socket.socket() >> >> Based on the docs, one might think that socket.f

Re: How to create a socket.socket() object from a socket fd?

2017-01-21 Thread Chris Angelico
On Sun, Jan 22, 2017 at 9:41 AM, Grant Edwards wrote: > | __init__(self, family=2, type=1, proto=0, _sock=None) > | > > Ah! There's a keyword argument that doesn't appear in the docs, so > let's try that... That's marginally better than my monkeypatch-after-creation suggestion, but still broad

Re: How to create a socket.socket() object from a socket fd?

2017-01-21 Thread Grant Edwards
On 2017-01-21, Grant Edwards wrote: > Given a Unix file descriptor for an open TCP socket, I can't figure > out how to create a Python 2.7 socket object like those returned by > socket.socket() > > Based on the docs, one might think that socket.fromfd() would do that > (since the docs say that's

Re: How to create a socket.socket() object from a socket fd?

2017-01-21 Thread Chris Angelico
On Sun, Jan 22, 2017 at 9:28 AM, Grant Edwards wrote: > Given a Unix file descriptor for an open TCP socket, I can't figure > out how to create a Python 2.7 socket object like those returned by > socket.socket() I suspect you can't easily do it. In more recent Pythons, you can socket.socket(filen
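For the "more recent Pythons" route, a minimal sketch follows. It assumes Python 3 on a Unix-like system and that the truncated call above refers to the fileno keyword of socket.socket(); the family, type and proto are passed explicitly here rather than relying on auto-detection, and os.dup() is used only so that the wrapper owns its own descriptor.

```python
import os
import socket

# Obtain a raw descriptor for an already-open socket.
left, right = socket.socketpair()
fd = os.dup(left.fileno())

# Wrap the existing descriptor; no new socket is created by the OS.
wrapped = socket.socket(left.family, left.type, left.proto, fileno=fd)
same_fd = wrapped.fileno() == fd

# The wrapper behaves like any other socket object.
wrapped.sendall(b"ping")
received = right.recv(4)

wrapped.close()
left.close()
right.close()
```

Unlike socket.fromfd(), this form takes over the descriptor it is given, so closing the wrapper closes that descriptor.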

How to create a socket.socket() object from a socket fd?

2017-01-21 Thread Grant Edwards
Given a Unix file descriptor for an open TCP socket, I can't figure out how to create a Python 2.7 socket object like those returned by socket.socket() Based on the docs, one might think that socket.fromfd() would do that (since the docs say that's what it does): Quoting https://docs.python.org/

Re: PEP 393 vs UTF-8 Everywhere

2017-01-21 Thread eryk sun
On Sat, Jan 21, 2017 at 8:21 PM, Pete Forman wrote: > Marko Rauhamaa writes: > >>> py> low = '\uDC37' >> >> That should raise a SyntaxError exception. > > Quite. My point was that with older Python on a narrow build (Windows > and Mac) you need to understand that you are using UTF-16 rather than

Re: Adding colormaps?

2017-01-21 Thread Martin Schöön
On 2017-01-21, Gilmeh Serda wrote: > On Wed, 18 Jan 2017 21:41:34 +0000, Martin Schöön wrote: > >> What I would like to do is to add the perceptually uniform sequential >> colormaps introduced in version 1.5.something. I would like to do this >> without breaking my Debian system in which Matplotli

Re: PEP 393 vs UTF-8 Everywhere

2017-01-21 Thread Pete Forman
Marko Rauhamaa writes: >> py> low = '\uDC37' > > That should raise a SyntaxError exception. Quite. My point was that with older Python on a narrow build (Windows and Mac) you need to understand that you are using UTF-16 rather than Unicode. On a wide build or Python 3.3+ then all is rosy. (At th

Re: PEP 393 vs UTF-8 Everywhere

2017-01-21 Thread Marko Rauhamaa
Pete Forman : > Surrogates only exist in UTF-16. They are expressly forbidden in UTF-8 > and UTF-32. Also, they don't exist as Unicode code points. Python shouldn't allow surrogate characters in strings. Thus the range of code points that are available for use as characters is U+0000–U+D7F

Re: PEP 393 vs UTF-8 Everywhere

2017-01-21 Thread Jussi Piitulainen
Chris Angelico writes: > On Sun, Jan 22, 2017 at 2:56 AM, Jussi Piitulainen wrote: >> Steve D'Aprano writes: >> >> [snip] >> >>> You could avoid that error by increasing the offset by the right >>> amount: >>> >>> stuff = text[offset + len("ф".encode('utf-8')):] >>> >>> which is awful. I believe th

Re: PEP 393 vs UTF-8 Everywhere

2017-01-21 Thread Chris Angelico
On Sun, Jan 22, 2017 at 2:56 AM, Jussi Piitulainen wrote: > Steve D'Aprano writes: > > [snip] > >> You could avoid that error by increasing the offset by the right >> amount: >> >> stuff = text[offset + len("ф".encode('utf-8')):] >> >> which is awful. I believe that's what Go and Julia expect you t

Re: PEP 393 vs UTF-8 Everywhere

2017-01-21 Thread Jussi Piitulainen
Steve D'Aprano writes: [snip] > You could avoid that error by increasing the offset by the right > amount: > > stuff = text[offset + len("ф".encode('utf-8')):] > > which is awful. I believe that's what Go and Julia expect you to do. Julia provides a method to get the next index. let text = "ἐπὶ
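The offset arithmetic being discussed can be illustrated in Python by treating an encoded bytes object as a stand-in for a UTF-8-indexed string. This is only a sketch: the sample text and variable names are made up, and Python's own str does not index this way.

```python
text = "abcф def"
needle = "ф"

# Code point indexing, as Python's str works today:
# skipping past the match means adding len(needle) == 1.
i = text.index(needle)
rest = text[i + len(needle):]

# Byte-offset indexing, as a UTF-8 based string type might expose it:
# the offset has to advance by the encoded length (2 bytes for "ф").
raw = text.encode("utf-8")
encoded_needle = needle.encode("utf-8")
offset = raw.index(encoded_needle)
rest_bytes = raw[offset + len(encoded_needle):]

# Advancing by just 1 would land in the middle of the character.
try:
    raw[offset + 1:].decode("utf-8")
    mid_char_ok = True
except UnicodeDecodeError:
    mid_char_ok = False
```

A "next index" helper of the kind Julia is said to provide hides exactly this addition of the encoded length.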

Re: PEP 393 vs UTF-8 Everywhere

2017-01-21 Thread Pete Forman
Steve D'Aprano writes: > [...] > Another factor which I didn't see discussed anywhere is that Python > strings treat surrogates as normal code points. I believe that would > be troublesome for a UTF-8 implementation: > > py> '\uDC37'.encode('utf-8') > Traceback (most recent call last): > File "
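The behaviour quoted above is easy to check on Python 3.3 or later. The sketch below also shows the standard surrogatepass error handler, which the quoted snippet does not mention and is included here only for contrast; the variable names are illustrative.

```python
# A lone low surrogate is accepted as an ordinary element of a str.
low = "\udc37"
length = len(low)

# The strict UTF-8 codec refuses to encode it.
try:
    low.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# The "surrogatepass" handler emits the three-byte form anyway;
# the result is not valid UTF-8 for interchange.
passed = low.encode("utf-8", errors="surrogatepass")
```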

Re: PEP 393 vs UTF-8 Everywhere

2017-01-21 Thread Steve D'Aprano
On Sat, 21 Jan 2017 11:45 pm, Tim Chase wrote: > but I'm hard-pressed to come up with any use case where direct > indexing into a (non-byte)string makes sense unless you've already > processed/searched up to that point and can use a recorded index > from that processing/search. Let's take a simp

Re: PEP 393 vs UTF-8 Everywhere

2017-01-21 Thread Steve D'Aprano
On Sat, 21 Jan 2017 09:35 am, Pete Forman wrote: > Can anyone point me at a rationale for PEP 393 being incorporated in > Python 3.3 over using UTF-8 as an internal string representation? I've read over the PEP, and the email discussion, and there is very little mention of UTF-8, and as far as I

Re: PEP 393 vs UTF-8 Everywhere

2017-01-21 Thread Tim Chase
On 2017-01-21 11:58, Chris Angelico wrote: > So, how could you implement this function? The current > implementation maintains an index - an integer position through the > string. It repeatedly requests the next character as string[idx], > and can also slice the string (to check for keywords like "

Problems with python3.6 on one system, but OK on another

2017-01-21 Thread Cecil Westerhof
I built python3.6 on two systems. On one system everything is OK: Python 3.6.0 (default, Jan 21 2017, 11:19:56) [GCC 4.9.2] on linux Type "help", "copyright", "credits" or "license" for more information. But on another I get: Could not find platform dependent libraries Consider setting $PYTHONH

Re: Let ipython3 use the latest python3

2017-01-21 Thread Chris Warrick
On 21 January 2017 at 12:30, Cecil Westerhof wrote: > I built python3.6, but ipython3 is still using the old one (3.4.5). > How can I make ipython3 use 3.6? All packages you have installed are tied to a specific Python version. If you want to use IPython with Python 3.6, you need to install it fo

Let ipython3 use the latest python3

2017-01-21 Thread Cecil Westerhof
I built python3.6, but ipython3 is still using the old one (3.4.5). How can I make ipython3 use 3.6?

Re: PEP 393 vs UTF-8 Everywhere

2017-01-21 Thread Paul Rubin
Chris Angelico writes: > You can't do a look-ahead with a vanilla string iterator. That's > necessary for a lot of parsers. For JSON? For other parsers you usually have a tokenizer that reads characters with maybe 1 char of lookahead. > Yes, which gives a two-level indexing (first find the stra
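One character of lookahead over a plain string iterator can be had with a small wrapper. The following is a sketch only: the Peekable class and read_digits helper are hypothetical names, not something proposed in the thread, and a real tokenizer would need more than this.

```python
class Peekable:
    """Wrap an iterator and allow looking at the next item without consuming it."""

    _END = object()  # sentinel meaning "nothing buffered"

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._buf = self._END

    def peek(self, default=None):
        # Fill the one-item buffer on demand.
        if self._buf is self._END:
            self._buf = next(self._it, self._END)
        return default if self._buf is self._END else self._buf

    def __iter__(self):
        return self

    def __next__(self):
        # Hand out the buffered item first, if there is one.
        if self._buf is not self._END:
            item, self._buf = self._buf, self._END
            return item
        return next(self._it)


def read_digits(chars):
    """Consume a run of digits, leaving the following character unread."""
    out = []
    while True:
        c = chars.peek()
        if c is None or not c.isdigit():
            break
        out.append(next(chars))
    return "".join(out)


chars = Peekable("123,4")
number = read_digits(chars)
separator = next(chars)
```

No integer index into the string is needed here, which is the property that matters if indexing were not constant-time.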