Re: Python 3.7+ cannot print unicode characters when output is redirected to file - is this a bug?

2022-11-13 Thread Eryk Sun
On 11/13/22, Jessica Smith <12jessicasmit...@gmail.com> wrote: > Consider the following code run in PowerShell or cmd.exe: > > $ python -c "print('└')" > └ > > $ python -c "print('└')" > test_file.txt > Traceback (most recent call last): > File "<string>", line 1, in <module> > File "C:\Program Files\Python38\

Re: Python 3.7+ cannot print unicode characters when output is redirected to file - is this a bug?

2022-11-13 Thread Thomas Passin
"C:\Program Files\Python38\lib\encodings\cp1252.py", line 19, in encode return codecs.charmap_encode(input,self.errors,encoding_table)[0] UnicodeEncodeError: 'charmap' codec can't encode character '\u2514' in position 0: character maps to <undefined> Is this a known limit

Re: Python 3.7+ cannot print unicode characters when output is redirected to file - is this a bug?

2022-11-13 Thread Barry
t > Traceback (most recent call last): > File "<string>", line 1, in <module> > File "C:\Program Files\Python38\lib\encodings\cp1252.py", line 19, in encode >return codecs.charmap_encode(input,self.errors,encoding_table)[0] > UnicodeEncodeError: 'charmap' codec can'

Python 3.7+ cannot print unicode characters when output is redirected to file - is this a bug?

2022-11-13 Thread Jessica Smith
2.py", line 19, in encode return codecs.charmap_encode(input,self.errors,encoding_table)[0] UnicodeEncodeError: 'charmap' codec can't encode character '\u2514' in position 0: character maps to <undefined> Is this a known limitation of Windows + Unicode? I understand th
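
The traceback above is what one would expect when output is redirected on Windows and Python falls back to a legacy code page (cp1252 here) that has no mapping for U+2514. The following is only a sketch that reproduces the failure with an in-memory stream instead of a real redirect; the `make_stream` helper is hypothetical and not part of the original post.

```python
import io


def make_stream(encoding):
    """Return (text_stream, raw_buffer), a stand-in for stdout redirected to a file."""
    raw = io.BytesIO()
    text = io.TextIOWrapper(raw, encoding=encoding, newline="\n", write_through=True)
    return text, raw


# cp1252 cannot represent U+2514, so the write fails as in the traceback.
stream, raw = make_stream("cp1252")
try:
    print("\u2514", file=stream)
    failed = False
except UnicodeEncodeError:
    failed = True

# With UTF-8 the same character is written as three bytes plus the newline.
stream, raw = make_stream("utf-8")
print("\u2514", file=stream)
utf8_bytes = raw.getvalue()
```

In a real script, calling `sys.stdout.reconfigure(encoding="utf-8")` (Python 3.7+) before printing, or setting the `PYTHONIOENCODING=utf-8` or `PYTHONUTF8=1` environment variable, are the usual ways to get the second behaviour.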

[Python-announce] ANN: unicode 2.9

2022-06-03 Thread garabik-news-2005-05
unicode is a simple python command line utility that displays properties for a given unicode character, or searches unicode database for a given name. It was written with Linux in mind, but should work almost everywhere (including MS Windows and MacOSX), UTF-8 console is recommended. ˙pɹɐpuɐʇs

Re: Printing Unicode strings in a list

2022-04-30 Thread Chris Angelico
On Sun, 1 May 2022 at 00:03, Vlastimil Brom wrote: > (Even the redundant u prefix from your python2 sample is apparently > accepted, maybe for compatibility reasons.) Yes, for compatibility reasons. It wasn't accepted in Python 3.0, but 3.3 re-added it to make porting easier. It doesn't do anythi

Re: Printing Unicode strings in a list

2022-04-30 Thread Vlastimil Brom
have good reasons for doing so and > will be moving to Python 3.x in due course. > > I have the following questions arising from the log: > > 1. Why does the second print statement not produce [ ║] or ["║"] ? > > 2. Should the second print statement produce [ ║] or [

Re: Printing Unicode strings in a list

2022-04-28 Thread Rob Cliffe via Python-list
On 28/04/2022 14:27, Stephen Tucker wrote: To Cameron Simpson, Thanks for your in-depth and helpful reply. I have noted it and will be giving it close attention when I can. The main reason why I am still using Python 2.x is that my colleagues are still using a GIS system that has a Python pro

Re: Printing Unicode strings in a list

2022-04-28 Thread Jon Ribbens via Python-list
don't have their own str converter, so fall back to repr instead, which outputs '[', followed by the repr of each list item separated by ', ', followed by ']'. > 2. Should the second print statement produce [ ║] or ["║"] ? There's certainly n

Re: Printing Unicode strings in a list

2022-04-28 Thread Stephen Tucker
tement not produce [ ║] or ["║"] ? > > Because print() prints the str() of each of its arguments, and str() of > a list is the same as its repr(), which is a list of the repr()s of > every item in the list. Repr of a Unicode string looks like what you > have in Python 2.

Re: Printing Unicode strings in a list

2022-04-28 Thread Cameron Simpson
course. Love to hear those reasons. Not suggesting that they are invalid. >I have the following questions arising from the log: >1. Why does the second print statement not produce [ ║] or ["║"] ? Because print() prints the str() of each of its arguments, and str() of a list
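
The point about str() of a list falling back to repr() of its items can be sketched as follows. This is a Python 3 illustration (the thread concerns Python 2, where the repr of a Unicode string shows escapes instead); the variable names are made up for the example.

```python
items = ["\u2551", "abc"]

# str() of a list is built from the repr() of each item, quotes included.
as_list = str(items)

# To show the characters themselves, join the strings and add brackets by hand.
formatted = "[" + ", ".join(items) + "]"

print(as_list)
print(formatted)
```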

Printing Unicode strings in a list

2022-04-28 Thread Stephen Tucker
oes the second print statement not produce [ ║] or ["║"] ? 2. Should the second print statement produce [ ║] or ["║"] ? 3. Given that I want to print a list of Unicode strings so that their characters are displayed (instead of their Unicode codepoint definitions), is there a more P

Re: 'äÄöÖüÜ' in Unicode (utf-8)

2022-04-07 Thread Anssi Saari
Dennis Lee Bieber writes: > On Fri, 1 Apr 2022 03:59:32 +1100, Chris Angelico > declaimed the following: > > >>That's jmf. Ignore him. He knows nothing about Unicode and is >>determined to make everyone aware of that fact. >> >>He got blocked from the

Re: 'äÄöÖüÜ' in Unicode (utf-8)

2022-04-01 Thread Chris Angelico
On Fri, 1 Apr 2022 at 11:16, Dennis Lee Bieber wrote: > > On Fri, 1 Apr 2022 03:59:32 +1100, Chris Angelico > declaimed the following: > > > >That's jmf. Ignore him. He knows nothing about Unicode and is > >determined to make everyone aware of that fact. > >

Re: 'äÄöÖüÜ' in Unicode (utf-8)

2022-03-31 Thread Dennis Lee Bieber
On Fri, 1 Apr 2022 03:59:32 +1100, Chris Angelico declaimed the following: >That's jmf. Ignore him. He knows nothing about Unicode and is >determined to make everyone aware of that fact. > >He got blocked from the mailing list ages ago, and I don't think >anyone's

Re: 'äÄöÖüÜ' in Unicode (utf-8)

2022-03-31 Thread Chris Angelico
gt;>> len('äÄöÖüÜ'.encode('utf-8')) > >12 > >>>> > >>>> ? > > Is there a question in there somewhere? > > Crystal ball is hazy... > > However... Note that once you encode the Unicode literal, you h

Re: 'äÄöÖüÜ' in Unicode (utf-8)

2022-03-31 Thread Dennis Lee Bieber
>> >>>> ? Is there a question in there somewhere? Crystal ball is hazy... However... Note that once you encode the Unicode literal, you have a BYTE string. There are 12 bytes in that binary -- it is NOT considered Unicode at that point (only when you decode it with th
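
A short sketch of the distinction drawn here between the six-character string and the 12-byte result of encoding it. The escapes spell the same six letters as in the subject line; the names `text` and `data` are illustrative only.

```python
# The six letters ä Ä ö Ö ü Ü, written as escapes to avoid source-encoding issues.
text = "\u00e4\u00c4\u00f6\u00d6\u00fc\u00dc"

data = text.encode("utf-8")      # bytes, no longer a str
char_count = len(text)           # counts code points
byte_count = len(data)           # counts bytes: two per letter in UTF-8
restored = data.decode("utf-8")  # decoding yields the original str again

print(char_count, byte_count, type(data).__name__)
```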

Re: ANN: unicode 2.8

2021-01-02 Thread Chris Angelico
On Sun, Jan 3, 2021 at 10:28 AM Terry Reedy wrote: > > And when implementing this, it was a no-brainer to include also the > > brexit varian (verbatim). > > I assume you meant 'variation' and not Varian, the maker of scientific > instruments. I assumed simple typo for "variant" ChrisA -- https:

Re: ANN: unicode 2.8

2021-01-02 Thread Terry Reedy
On 1/1/2021 3:48 PM, garabik-news-2005...@kassiopeia.juls.savba.sk wrote: Terry Reedy wrote: On 12/31/2020 9:36 AM, garabik-news-2005...@kassiopeia.juls.savba.sk wrote: unicode is a simple python command line utility that displays properties for a given unicode character, or searches unicode

Re: ANN: unicode 2.8

2021-01-01 Thread garabik-news-2005-05
Terry Reedy wrote: > On 12/31/2020 9:36 AM, garabik-news-2005...@kassiopeia.juls.savba.sk wrote: >> unicode is a simple python command line utility that displays >> properties for a given unicode character, or searches >> unicode database for a given name. > ... >> C

Re: ANN: unicode 2.8

2020-12-31 Thread Terry Reedy
On 12/31/2020 9:36 AM, garabik-news-2005...@kassiopeia.juls.savba.sk wrote: unicode is a simple python command line utility that displays properties for a given unicode character, or searches unicode database for a given name. ... Changes since previous versions: * display ASCII table

ANN: unicode 2.8

2020-12-31 Thread garabik-news-2005-05
unicode is a simple python command line utility that displays properties for a given unicode character, or searches unicode database for a given name. It was written with Linux in mind, but should work almost everywhere (including MS Windows and MacOSX), UTF-8 console is recommended. ˙pɹɐpuɐʇs

Re: Friday Finking: Beyond implementing Unicode

2020-06-17 Thread Terry Reedy
On 6/16/2020 7:45 PM, DL Neil via Python-list wrote: On 13/06/20 4:47 AM, Terry Reedy wrote: There was a recent thread on python-ideas discussing this.  It started with arrow characters.  There have been others. Am pleased to hear that it's neither 'new' nor 'way out there'... The idea has b

Re: Friday Finking: Beyond implementing Unicode

2020-06-16 Thread DL Neil via Python-list
There was a recent thread on python-ideas discussing this.  It started with arrow characters.  There have been others. Am pleased to hear that it's neither 'new' nor 'way out there'... Am not subscribed to that list. Went looking for its archives, but failed - there's no "ideas" on (https://

Re: Friday Finking: Beyond implementing Unicode

2020-06-16 Thread DL Neil via Python-list
On 13/06/20 5:11 AM, Dennis Lee Bieber wrote: On Fri, 12 Jun 2020 18:03:55 +1200, DL Neil via Python-list declaimed the following: There is/was a language called "APL" (and yes the acronym means "A Programming Language", and yes it started the craze, through "B" (and BCPL), and yes, that brou

Re: Friday Finking: Beyond implementing Unicode

2020-06-16 Thread DL Neil via Python-list
On 13/06/20 4:47 AM, Terry Reedy wrote: On 6/12/2020 2:03 AM, DL Neil via Python-list wrote: Unicode has given us access to a wealth of mathematical and other symbols. Hardware and soft-/firm-ware flexibility enable us to move beyond and develop new 'standards'. Do we have opport

Re: Friday Finking: Beyond implementing Unicode

2020-06-12 Thread Terry Reedy
On 6/12/2020 2:03 AM, DL Neil via Python-list wrote: Unicode has given us access to a wealth of mathematical and other symbols. Hardware and soft-/firm-ware flexibility enable us to move beyond and develop new 'standards'. Do we have opportunities to make computer programming

Re: Friday Finking: Beyond implementing Unicode

2020-06-12 Thread Chris Angelico
On Fri, Jun 12, 2020 at 9:11 PM Elliott Roper wrote: > > On 12 Jun 2020 at 09:47:04 BST, "moi" wrote: > i) Who cares? Don't bother responding to him. He's somehow gotten the idea that Python's Unicode support is broken, and he spews his vomit out onto the ne

Re: Friday Finking: Beyond implementing Unicode

2020-06-12 Thread Elliott Roper
On 12 Jun 2020 at 09:47:04 BST, "moi" wrote: > i) Today there are people who still do not understand this: > 'Å'.encode('utf-8') > b'\xc3\x85' 'Å'.encode('utf-16-le') > b'\xc5\x00' 'Å'.encode('utf-32-le') > b'\xc5\x00\x00\x00' > > ii) On a Western European Windows, Py 3 is not ev

Friday Finking: Beyond implementing Unicode

2020-06-11 Thread DL Neil via Python-list
Unicode has given us access to a wealth of mathematical and other symbols. Hardware and soft-/firm-ware flexibility enable us to move beyond and develop new 'standards'. Do we have opportunities to make computer programming more math-familiar and/or more logically-expressive, and t

Re: ÿ in Unicode

2020-03-07 Thread Grant Edwards
On 2020-03-07, Jon Ribbens via Python-list wrote: > On 2020-03-06, Jon Ribbens wrote: >> What's the bug, or source of amusement? > > Oh, that's fun. There's a Russian Fidonet gateway, that somehow > still exists, that's re-injecting usenet posts back into the group. Last time I think it was one

Re: ÿ in Unicode

2020-03-07 Thread Richard Damon
On 3/7/20 12:52 PM, Ben Bacarisse wrote: > moi writes: > >> On Saturday 7 March 2020 at 16:41:10 UTC+1, R.Wieser wrote: >>> Moi, >>> Fortunately, UTF-8 has not been created by the Python devs. >>> >>> And there we go again, making vague statements/accusations - without >>> /anything/ to back it u

Re: ÿ in Unicode

2020-03-07 Thread Ben Bacarisse
moi writes: > On Saturday 7 March 2020 at 16:41:10 UTC+1, R.Wieser wrote: >> Moi, >> >> > Fortunately, UTF-8 has not been created by the Python devs. >> >> And there we go again, making vague statements/accusations - without >> /anything/ to back it up of course >> >> Kiddo, you have posted a couple

Re: ÿ in Unicode

2020-03-07 Thread R.Wieser
Moi, > Fortunately, UTF-8 has not been created by the Python devs. And there we go again, making vague statements/accusations - without /anything/ to back it up of course Kiddo, you have posted a couple of messages now, but have said exactly nothing. Are you sure you do not want to go into polit

Re: ÿ in Unicode

2020-03-07 Thread R.Wieser
Moi, > - Today, there are still people who do not understand a > 'ÿ' can not be *safely* encoded with a single byte. It can (and has been done for ages), just not in the character encoding method you've chosen to use. > - Python == Latin-1 mess (as somebody wrote on a mailing list). Putting b
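
The reply above, that 'ÿ' fits in a single byte in some encodings but not in others, can be checked directly. A minimal sketch using only the standard codecs; the variable names are illustrative.

```python
ch = "\u00ff"  # ÿ, code point U+00FF

latin1 = ch.encode("latin-1")    # one byte: the code point value itself
utf8 = ch.encode("utf-8")        # two bytes
utf16 = ch.encode("utf-16-le")   # two bytes
utf32 = ch.encode("utf-32-le")   # four bytes

# The single Latin-1 byte is not valid UTF-8 on its own.
try:
    latin1.decode("utf-8")
    valid_as_utf8 = True
except UnicodeDecodeError:
    valid_as_utf8 = False

print(latin1, utf8, utf16, utf32, valid_as_utf8)
```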

Re: ÿ in Unicode

2020-03-06 Thread Jon Ribbens via Python-list
On 2020-03-06, Jon Ribbens wrote: > What's the bug, or source of amusement? Oh, that's fun. There's a Russian Fidonet gateway, that somehow still exists, that's re-injecting usenet posts back into the group. -- https://mail.python.org/mailman/listinfo/python-list

Re: ÿ in Unicode

2020-03-06 Thread Chris Angelico
On Fri, Mar 6, 2020 at 9:31 PM Ben Bacarisse wrote: > > moi writes: > > > On Thursday 5 March 2020 at 13:20:38 UTC+1, Ben Bacarisse wrote: > >> moi writes: > >> > >> 'ÿ'.encode('utf-8') > >> > b'\xc3\xbf' > >> 'ÿ'.encode('utf-16-le') > >> > b'\xff\x00' > >> 'ÿ'.encode('utf-

Re: ÿ in Unicode

2020-03-06 Thread Ben Bacarisse
moi writes: > On Thursday 5 March 2020 at 13:20:38 UTC+1, Ben Bacarisse wrote: >> moi writes: >> >> 'ÿ'.encode('utf-8') >> > b'\xc3\xbf' >> 'ÿ'.encode('utf-16-le') >> > b'\xff\x00' >> 'ÿ'.encode('utf-32-le') >> > b'\xff\x00\x00\x00' >> > >> That all looks as expected. > Yes > >>Is

Re: ÿ in Unicode

2020-03-06 Thread Jon Ribbens via Python-list
>>> >>>> That all looks as expected. >>> Yes >>> >>>>Is there something about the output that puzzles you? >>> No >>> >>>>Did you have a question? >>> No, only a comment >>> >>> This buggy language

Re: ÿ in Unicode

2020-03-06 Thread Pieter van Oostrum
Jon Ribbens writes: > On 2020-03-06, moi wrote: >> On Thursday 5 March 2020 at 13:20:38 UTC+1, Ben Bacarisse wrote: >>> moi writes: >>> 'ÿ'.encode('utf-8') >>> > b'\xc3\xbf' >>> 'ÿ'.encode('utf-16-le') >>> > b'\xff\x00' >>> 'ÿ'.encode('utf-32-le') >>> > b'\xff\x00\x00\x0

Re: ÿ in Unicode

2020-03-06 Thread Jon Ribbens via Python-list
On 2020-03-06, moi wrote: > On Thursday 5 March 2020 at 13:20:38 UTC+1, Ben Bacarisse wrote: >> moi writes: >> 'ÿ'.encode('utf-8') >> > b'\xc3\xbf' >> 'ÿ'.encode('utf-16-le') >> > b'\xff\x00' >> 'ÿ'.encode('utf-32-le') >> > b'\xff\x00\x00\x00' > >> That all looks as expect

Re: ÿ in Unicode

2020-03-06 Thread Pieter van Oostrum
Jon Ribbens writes: > On 2020-03-06, moi wrote: >> On Thursday 5 March 2020 at 13:20:38 UTC+1, Ben Bacarisse wrote: >>> moi writes: >>> 'ÿ'.encode('utf-8') >>> > b'\xc3\xbf' >>> 'ÿ'.encode('utf-16-le') >>> > b'\xff\x00' >>> 'ÿ'.encode('utf-32-le') >>> > b'\xff\x00\x00\x00' >> >>

Re: ÿ in Unicode

2020-03-06 Thread Ben Bacarisse
moi writes: > On Thursday 5 March 2020 at 13:20:38 UTC+1, Ben Bacarisse wrote: >> moi writes: >> >> 'ÿ'.encode('utf-8') >> > b'\xc3\xbf' >> 'ÿ'.encode('utf-16-le') >> > b'\xff\x00' >> 'ÿ'.encode('utf-32-le') >> > b'\xff\x00\x00\x00' >> > >> That all looks as expected. > Yes > >>Is the

Re: ÿ in Unicode

2020-03-06 Thread Chris Angelico
On Fri, Mar 6, 2020 at 9:31 PM Ben Bacarisse wrote: > > moi writes: > > > On Thursday 5 March 2020 at 13:20:38 UTC+1, Ben Bacarisse wrote: > >> moi writes: > >> > >> 'ÿ'.encode('utf-8') > >> > b'\xc3\xbf' > >> 'ÿ'.encode('utf-16-le') > >> > b'\xff\x00' > >> 'ÿ'.encode('utf-32-le')

Re: ÿ in Unicode

2020-03-06 Thread Jon Ribbens via Python-list
On 2020-03-06, moi wrote: > On Thursday 5 March 2020 at 13:20:38 UTC+1, Ben Bacarisse wrote: >> moi writes: >> 'ÿ'.encode('utf-8') >> > b'\xc3\xbf' >> 'ÿ'.encode('utf-16-le') >> > b'\xff\x00' >> 'ÿ'.encode('utf-32-le') >> > b'\xff\x00\x00\x00' > >> That all looks as expected. > Ye

Re: ÿ in Unicode

2020-03-06 Thread Jon Ribbens via Python-list
>>>> That all looks as expected. >>> Yes >>> >>>>Is there something about the output that puzzles you? >>> No >>> >>>>Did you have a question? >>> No, only a comment >>> >>> This buggy language is ver

Re: ÿ in Unicode

2020-03-06 Thread Jon Ribbens via Python-list
>>>> That all looks as expected. >>> Yes >>> >>>>Is there something about the output that puzzles you? >>> No >>> >>>>Did you have a question? >>> No, only a comment >>> >>> This buggy language is very amus

Re: ÿ in Unicode

2020-03-06 Thread Pieter van Oostrum
Jon Ribbens writes: > On 2020-03-06, moi wrote: >> On Thursday 5 March 2020 at 13:20:38 UTC+1, Ben Bacarisse wrote: >>> moi writes: >>> 'ÿ'.encode('utf-8') >>> > b'\xc3\xbf' >>> 'ÿ'.encode('utf-16-le') >>> > b'\xff\x00' >>> 'ÿ'.encode('utf-32-le') >>> > b'\xff\x00\x00\x00' >> >>> Tha

Re: ÿ in Unicode

2020-03-06 Thread Jon Ribbens via Python-list
On 2020-03-06, moi wrote: > On Thursday 5 March 2020 at 13:20:38 UTC+1, Ben Bacarisse wrote: >> moi writes: >> 'ÿ'.encode('utf-8') >> > b'\xc3\xbf' >> 'ÿ'.encode('utf-16-le') >> > b'\xff\x00' >> 'ÿ'.encode('utf-32-le') >> > b'\xff\x00\x00\x00' > >> That all looks as expected. > Yes > >

Re: ÿ in Unicode

2020-03-06 Thread Chris Angelico
On Fri, Mar 6, 2020 at 9:31 PM Ben Bacarisse wrote: > > moi writes: > > > On Thursday 5 March 2020 at 13:20:38 UTC+1, Ben Bacarisse wrote: > >> moi writes: > >> > >> 'ÿ'.encode('utf-8') > >> > b'\xc3\xbf' > >> 'ÿ'.encode('utf-16-le') > >> > b'\xff\x00' > >> 'ÿ'.encode('utf-32-le') > >

Re: ÿ in Unicode

2020-03-06 Thread Ben Bacarisse
moi writes: > On Thursday 5 March 2020 at 13:20:38 UTC+1, Ben Bacarisse wrote: >> moi writes: >> >> 'ÿ'.encode('utf-8') >> > b'\xc3\xbf' >> 'ÿ'.encode('utf-16-le') >> > b'\xff\x00' >> 'ÿ'.encode('utf-32-le') >> > b'\xff\x00\x00\x00' >> > >> That all looks as expected. > Yes > >>Is t

Re: ÿ in Unicode

2020-03-05 Thread Ben Bacarisse
moi writes: 'ÿ'.encode('utf-8') > b'\xc3\xbf' 'ÿ'.encode('utf-16-le') > b'\xff\x00' 'ÿ'.encode('utf-32-le') > b'\xff\x00\x00\x00' That all looks as expected. Is there something about the output that puzzles you? Did you have a question? -- Ben. -- https://mail.python.org/mail

Re: Unicode filenames

2019-12-07 Thread Chris Angelico
> many, many years! If they're that short and people are depending on them, it won't be too much work to port them. And you gain a huge measure of reliability: you no longer have to worry about "Unicode filenames" - or, to be more precise, "non-ASCII filenames" -

Re: Unicode filenames

2019-12-07 Thread Bob van der Poel
> >>> I have some files which came off the net with, I'm assuming, unicode > >>> characters in the names. I have a very short program which takes the > >>> filename and puts into an emacs buffer, and then lets me add > information > >> to > >

Re: Unicode filenames

2019-12-07 Thread DL Neil via Python-list
On 8/12/19 5:50 AM, Bob van der Poel wrote: On Sat, Dec 7, 2019 at 4:00 AM Barry Scott wrote: On 6 Dec 2019, at 18:17, Bob van der Poel wrote: I have some files which came off the net with, I'm assuming, unicode characters in the names. I have a very short program which takes the fil

Re: Unicode filenames

2019-12-07 Thread Bob van der Poel
On Sat, Dec 7, 2019 at 4:00 AM Barry Scott wrote: > > > > On 6 Dec 2019, at 18:17, Bob van der Poel wrote: > > > > I have some files which came off the net with, I'm assuming, unicode > > characters in the names. I have a very short program which takes the

Re: Unicode filenames

2019-12-07 Thread Barry Scott
> On 6 Dec 2019, at 18:17, Bob van der Poel wrote: > > I have some files which came off the net with, I'm assuming, unicode > characters in the names. I have a very short program which takes the > filename and puts into an emacs buffer, and then lets me add informatio

Re: Unicode filenames

2019-12-07 Thread Peter Otten
Bob van der Poel wrote: > I have some files which came off the net with, I'm assuming, unicode > characters in the names. I have a very short program which takes the > filename and puts into an emacs buffer, and then lets me add information > to that new file (it's a poor m

Re: Unicode filenames

2019-12-06 Thread Terry Reedy
On 12/6/2019 1:17 PM, Bob van der Poel wrote: I have some files which came off the net with, I'm assuming, unicode characters in the names. I have a very short program which takes the filename and puts into an emacs buffer, and then lets me add information to that new file (it's a poo

Re: Unicode filenames

2019-12-06 Thread DL Neil via Python-list
On 7/12/19 7:17 AM, Bob van der Poel wrote: I have some files which came off the net with, I'm assuming, unicode characters in the names. I have a very short program which takes the filename and puts into an emacs buffer, and then lets me add information to that new file (it's a poo

Unicode filenames

2019-12-06 Thread Bob van der Poel
I have some files which came off the net with, I'm assuming, unicode characters in the names. I have a very short program which takes the filename and puts into an emacs buffer, and then lets me add information to that new file (it's a poor man's DB). Next, I can look up text in th

Re: Unicode UCS2, UCS4 and ... UCS1

2019-09-19 Thread MRAB
On 2019-09-19 09:55, Gregory Ewing wrote: Eli the Bearded wrote: There isn't anything called UCS1. Apparently there is, but it's not a character set, it's a loudspeaker. https://www.bhphotovideo.com/c/product/1205978-REG/yorkville_sound_ucs1_1200w_15_horn_loaded.html The OP might mean Py_UCS

Re: Unicode UCS2, UCS4 and ... UCS1

2019-09-19 Thread Gregory Ewing
Eli the Bearded wrote: There isn't anything called UCS1. Apparently there is, but it's not a character set, it's a loudspeaker. https://www.bhphotovideo.com/c/product/1205978-REG/yorkville_sound_ucs1_1200w_15_horn_loaded.html -- Greg -- https://mail.python.org/mailman/listinfo/python-list

Re: Unicode UCS2, UCS4 and ... UCS1

2019-09-17 Thread Chris Angelico
On Wed, Sep 18, 2019 at 6:51 AM Eli the Bearded <*@eli.users.panix.com> wrote: > > In comp.lang.python, moi wrote: > > I hope, one day, for those who are interested in Unicode, > > they find a book, publication, ... which will explain > > what is UCS1. > > The

Re: Unicode UCS2, UCS4 and ... UCS1

2019-09-17 Thread Eli the Bearded
In comp.lang.python, moi wrote: > I hope, one day, for those who are interested in Unicode, > they find a book, publication, ... which will explain > what is UCS1. There isn't anything called UCS1. There is a UTF-1, but don't use it. UTF-8 is better in every way. https://en

Re: unicode mail list archeology

2019-04-20 Thread Luuk
On 20-4-2019 12:47, Luuk wrote: On 20-4-2019 11:26, wxjmfa...@gmail.com wrote: http://unicode.org/mail-arch/unicode-ml/Archives-Old/UML018/0594.html [quote] > It is simple to make a more compact version of UTF-8 using the base > 256 character codes where possible (more compact for many lan

Re: unicode mail list archeology

2019-04-20 Thread Luuk
On 20-4-2019 11:26, wxjmfa...@gmail.com wrote: http://unicode.org/mail-arch/unicode-ml/Archives-Old/UML018/0594.html [quote] > It is simple to make a more compact version of UTF-8 using the base > 256 character codes where possible (more compact for many languages). No. If you think otherwis

Re: Python2.7 unicode conundrum

2018-11-26 Thread Robert Latest via Python-list
Richard Damon wrote: > Why do you say it has been convert to 'Latin'. The string prints as > being Unicode. Internally Python doesn't store strings as UTF-8, but as > plain Unicode (UCS-2 or UCS-4 as needed), and code-point E4 is the > character you want. You'r
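
The distinction made here, between the code point E4 and the UTF-8 bytes c3 a4 seen in the hexdump, can be sketched in Python 3 (the thread itself is about Python 2.7, where `repr(u'ä')` shows the escape). `ascii()` is used below as the closest Python 3 analogue of that escaped repr; this is an illustration, not the original poster's script.

```python
s = "\u00e4"  # ä

code_point = ord(s)            # the abstract character number
utf8_bytes = s.encode("utf-8") # how the source file stores it on disk
escaped = ascii(s)             # escaped repr names the code point, not the UTF-8 bytes

print(hex(code_point), utf8_bytes, escaped)
```

Seeing `\xe4` in a repr therefore does not mean the text was converted to Latin-1; the escape happens to coincide with the Latin-1 byte because the first 256 code points match that encoding.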

Re: Python2.7 unicode conundrum

2018-11-25 Thread Richard Damon
8 20 2d 2a 2d 0a 0a 73 20 3d 20 75 27 |utf8 -*-..s = u'| > 0020 c3 a4 27 0a 0a 70 72 69 6e 74 28 73 29 0a 70 72 |..'..print(s).pr| > 0030 69 6e 74 28 28 73 2c 20 29 29 0a 0a |int((s,))..| > 003c > dh@jenna:~/python$ python unicode.py > ä >

Re: Python2.7 unicode conundrum

2018-11-25 Thread Thomas Jollans
4' in the > third line of the hexdump). When just printed, the string "s" is > displayed correctly as 'ä' (a umlaut), but the string representation > shows that it seems to have been converted to latin-1 'e4' somewhere on > the way. It's not being con

Python2.7 unicode conundrum

2018-11-25 Thread Robert Latest via Python-list
Hi folks, what seemingly started out as a weird database character encoding mix-up could be boiled down to a few lines of pure Python. The source-code below is real utf8 (as evidenced by the UTF code point 'c3 a4' in the third line of the hexdump). When just printed, the string "s" is displayed cor

Re: Email parsing and unicode/utf8

2018-10-15 Thread dieter
Thomas Jollans writes: > I just stumbled over some curious behaviour of the stdlib email parsing > APIs which accept strings rather than bytes. It appears that you can't > parse an 8-bit UTF-8 message you have as a str without first encoding it. The primary purpose of an email parser is likely th

Email parsing and unicode/utf8

2018-10-15 Thread Thomas Jollans
Hi, I just stumbled over some curious behaviour of the stdlib email parsing APIs which accept strings rather than bytes. It appears that you can't parse an 8-bit UTF-8 message you have as a str without first encoding it. The docs

Re: Non-unicode file names

2018-08-09 Thread Thomas Jollans
On 09/08/18 05:13, INADA Naoki wrote: > Please use Python 3.7. > > Python 3.7 has several improvements on this area. Thanks! Darkly remembering something about UTF-8 mode, I suspected it might... > > * When PEP 538 or 540 is used, default error handler for stdio is > surrogateescape > * You can

Re: Non-unicode file names

2018-08-08 Thread Marko Rauhamaa
INADA Naoki : > For Python 3.6, I think best way to allow arbitrary bytes on stdout is > using `PYTHONIOENCODING=utf-8:surrogateescape` environment variable. Good info! Marko -- https://mail.python.org/mailman/listinfo/python-list

Re: Non-unicode file names

2018-08-08 Thread INADA Naoki
Please use Python 3.7. Python 3.7 has several improvements on this area. * When PEP 538 or 540 is used, default error handler for stdio is surrogateescape * You can sys.stdout.reconfigure(errors='surrogateescape') For Python 3.6, I think best way to allow arbitrary bytes on stdout is using `PYTH
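
A minimal sketch of what the surrogateescape error handler does for a name that is not valid UTF-8. It uses an in-memory stream as a stand-in for stdout; the sample file name is invented for the example. On a real Python 3.7+ stdout the equivalent call would be `sys.stdout.reconfigure(errors='surrogateescape')`, as mentioned above.

```python
import io

raw_name = b"caf\xe9.txt"  # Latin-1 bytes; 0xE9 alone is not valid UTF-8

# Undecodable bytes are smuggled through as lone surrogates (0xE9 -> U+DCE9).
name = raw_name.decode("utf-8", errors="surrogateescape")

# A strict encoder refuses the lone surrogate...
try:
    name.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# ...but a stream using surrogateescape writes the original byte back out.
buf = io.BytesIO()
out = io.TextIOWrapper(buf, encoding="utf-8", errors="surrogateescape",
                       newline="\n", write_through=True)
print(name, file=out)
written = buf.getvalue()
```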

Re: Non-unicode file names

2018-08-08 Thread Cameron Simpson
On 09Aug2018 03:14, MRAB wrote: [...] Is it true that Unix filenames can contain control characters, e.g. \x07? Yep. They're just byte strings. You can't have \0 (NUL) because the API uses NUL terminated strings, and you can't use slash '/' in the filename components because that is the comp

Re: Non-unicode file names

2018-08-08 Thread MRAB
On 2018-08-09 01:14, Thomas Jollans wrote: On 09/08/18 01:48, MRAB wrote: On 2018-08-08 23:16, Thomas Jollans wrote: On *nix, file names are bytes. In real life, we prefer to think of file names as strings. How non-ASCII file names are created is determined by the locale, and on most systems th

Re: Non-unicode file names

2018-08-08 Thread Thomas Jollans
On 09/08/18 01:48, MRAB wrote: > On 2018-08-08 23:16, Thomas Jollans wrote: >> On *nix, file names are bytes. In real life, we prefer to think of file >> names as strings. How non-ASCII file names are created is determined by >> the locale, and on most systems these days, every locale uses UTF-8 an

Re: Non-unicode file names

2018-08-08 Thread MRAB
On 2018-08-08 23:16, Thomas Jollans wrote: On *nix, file names are bytes. In real life, we prefer to think of file names as strings. How non-ASCII file names are created is determined by the locale, and on most systems these days, every locale uses UTF-8 and everybody's happy. Of course this does

Non-unicode file names

2018-08-08 Thread Thomas Jollans
On *nix, file names are bytes. In real life, we prefer to think of file names as strings. How non-ASCII file names are created is determined by the locale, and on most systems these days, every locale uses UTF-8 and everybody's happy. Of course this doesn't mean you'll never run into an old direct

Re: Unicode [was Re: Cult-like behaviour]

2018-07-17 Thread Tim Chase
On 2018-07-17 08:37, Marko Rauhamaa wrote: > Tim Chase : > > Wait, but now you're talking about vendors. Much of the crux of > > this discussion has been about personal scripts that don't need to > > marshal Unicode strings in and out of various functions/objec

Re: Unicode [was Re: Cult-like behaviour]

2018-07-16 Thread Marko Rauhamaa
Unless a consortium is >> erected to support Python2, no vendor will be able to use it in the >> medium term. > > Wait, but now you're talking about vendors. Much of the crux of this > discussion has been about personal scripts that don't need to > marshal Unicode strin

Re: Unicode [was Re: Cult-like behaviour]

2018-07-16 Thread Tim Chase
rt Python2, no vendor will be able to use it in the > medium term. Wait, but now you're talking about vendors. Much of the crux of this discussion has been about personal scripts that don't need to marshal Unicode strings in and out of various functions/objects. If you have a py2 scri

Unicode is not UTF-32 [was Re: Cult-like behaviour]

2018-07-16 Thread Steven D'Aprano
UTF-32 is implementation, not semantics: it specifies how to represent Unicode code points as bytes in memory, not what Unicode code points are. Python 3 strings are sequences of abstract characters ("code points") with no mandatory implementation. In CPython, some string objects are
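
To illustrate the distinction between code points and their byte encodings, here is a small sketch. The sample string is arbitrary; the byte counts follow from the encoding definitions and say nothing about how CPython stores the string internally.

```python
s = "a\U0001F600"  # one ASCII letter plus one code point outside the BMP

code_points = [hex(ord(c)) for c in s]  # what the str contains
n_chars = len(s)                        # length in code points

# The same two code points occupy different numbers of bytes per encoding.
n_utf8 = len(s.encode("utf-8"))       # 1 + 4
n_utf16 = len(s.encode("utf-16-le"))  # 2 + 4 (surrogate pair)
n_utf32 = len(s.encode("utf-32-le"))  # 4 + 4

print(code_points, n_chars, n_utf8, n_utf16, n_utf32)
```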

Re: Unicode [was Re: Cult-like behaviour]

2018-07-16 Thread Mark Lawrence
y read my words with *intent* rather than *reaction*, you would notice that I suggested the *option* of turning off Unicode.  I didn't say get *rid* of Unicode.  I didn't say make it *harder* to use Unicode.  Once again - reaction rather than reading. Obviously, the most vocal represent

Re: Unicode [was Re: Cult-like behaviour]

2018-07-16 Thread MRAB
On 2018-07-16 21:59, Marko Rauhamaa wrote: Tim Chase : While the python world has moved its efforts into improving Python3, Python2 hasn't suddenly stopped working. The sword of Damocles is hanging on its head. Unless a consortium is erected to support Python2, no vendor will be able to use it

Re: Unicode [was Re: Cult-like behaviour]

2018-07-16 Thread Chris Angelico
On Tue, Jul 17, 2018 at 6:32 AM, Tim Chase wrote: > On 2018-07-16 18:31, Steven D'Aprano wrote: >> You say that all you want is a switch to turn off Unicode (and >> replace it with what? Kanji strings? Cyrillic? Shift_JIS? no of >> course not, I'm being absurd -- r

Re: Unicode [was Re: Cult-like behaviour]

2018-07-16 Thread Marko Rauhamaa
Tim Chase : > While the python world has moved its efforts into improving Python3, > Python2 hasn't suddenly stopped working. The sword of Damocles is hanging on its head. Unless a consortium is erected to support Python2, no vendor will be able to use it in the medium term. Given the recent even

Re: Unicode [was Re: Cult-like behaviour]

2018-07-16 Thread Tim Chase
On 2018-07-16 18:31, Steven D'Aprano wrote: > You say that all you want is a switch to turn off Unicode (and > replace it with what? Kanji strings? Cyrillic? Shift_JIS? no of > course not, I'm being absurd -- replace it with ASCII, what else > could any right-thinking person

Re: Unicode [was Re: Cult-like behaviour]

2018-07-16 Thread Chris Angelico
't care if that's not PC enough for you. >>> >>> Had you actually read my words with *intent* rather than *reaction*, you >>> would notice that I suggested the *option* of turning off Unicode. I didn't >>> say get *rid* of Unicode. I didn't s

Re: Unicode [was Re: Cult-like behaviour]

2018-07-16 Thread Rhodri James
eaction*, you would notice that I suggested the *option* of turning off Unicode.  I didn't say get *rid* of Unicode.  I didn't say make it *harder* to use Unicode.  Once again - reaction rather than reading. Obviously, the most vocal representatives of the Python community are too

Re: Unicode [was Re: Cult-like behaviour]

2018-07-16 Thread Anders Wegge Keller
On Mon, 16 Jul 2018 11:33:46 -0700, Jim Lee wrote: > Go right ahead.  I find it surprising that Stephen isn't banned, > considering the fact that he ridicules anyone he doesn't agree with.  > But I guess he's one of the 'good 'ol boys', and so exempt from the code > of conduct. Well said! --

Re: Unicode [was Re: Cult-like behaviour]

2018-07-16 Thread Terry Reedy
uggested the *option* of turning off Unicode.  I didn't say get *rid* of Unicode.  I didn't say make it *harder* to use Unicode.  Once again - reaction rather than reading. Obviously, the most vocal representatives of the Python community are too sensitive about their language to en

Re: Unicode [was Re: Cult-like behaviour]

2018-07-16 Thread Terry Reedy
On 7/16/2018 1:13 PM, Jim Lee wrote: I just think that a language should allow one to bypass Unicode handling easily *when it's not needed*. Both for patching IDLE and for my currently private work, I usually only use Ascii, and no unicode escapes. When I do, it does not matter wh

Re: Unicode [was Re: Cult-like behaviour]

2018-07-16 Thread Rhodri James
On 16/07/18 18:38, Rhodri James wrote: Actually having an option of turning off Unicode *does* make it harder to use, because you end up coming across programs that have Unicode and surprise you when they misbehave.  And yes I saw that 90% of your programs aren't intended to get out int

Re: Unicode [was Re: Cult-like behaviour]

2018-07-16 Thread Jim Lee
On 07/16/18 11:31, Steven D'Aprano wrote: On Mon, 16 Jul 2018 10:27:18 -0700, Jim Lee wrote: Had you actually read my words with *intent* rather than *reaction*, you would notice that I suggested the *option* of turning off Unicode. Yes, I know what you wrote, and I read it with i

Re: Unicode [was Re: Cult-like behaviour]

2018-07-16 Thread Jim Lee
On 07/16/18 10:40, Mark Lawrence wrote: On 16/07/18 18:27, Jim Lee wrote: Obviously, the most vocal representatives of the Python community are too sensitive about their language to enable rational discussion. Please moderators ban this person as he's going down the same line as bartc and s

Re: Unicode [was Re: Cult-like behaviour]

2018-07-16 Thread Rhodri James
On 16/07/18 19:31, Steven D'Aprano wrote: I'm simply not seeing the advantage of: from __future__ import no_unicode print("Hello World!") # stand in for any string handling on ASCII Sure this should be "from __past__ import no_unicode"? gd&r -- Rhodri James *-* Kynesim Ltd -- http

Re: Unicode [was Re: Cult-like behaviour]

2018-07-16 Thread Steven D'Aprano
On Mon, 16 Jul 2018 10:27:18 -0700, Jim Lee wrote: > Had you actually read my words with *intent* rather than *reaction*, you > would notice that I suggested the *option* of turning off Unicode. Yes, I know what you wrote, and I read it with intent. Jim, you seem to be labouring und
