Re: mailbox misbehavior with non-ASCII

2022-07-30 Thread Peter J. Holzer
On 2022-07-29 23:24:57 +, Peter Pearson wrote: > The following code produces a nonsense result with the input > described below: > > import mailbox > box = mailbox.Maildir("/home/peter/Temp/temp",create=False) > x = box.values()[0] > h = x.get("X-DSPAM-Factors") > print(type(h)) > # > > The

Re: mailbox misbehavior with non-ASCII

2022-07-30 Thread Barry
he exact bytes that are in the header. It may not be UTF-8 encoded; it may be Windows cp1252, etc. Repr of the bytes header will show this. Barry > > I realize that one should not put non-ASCII characters in > message headers, but of course I didn't put it there, it > just showed u

Re: mailbox misbehavior with non-ASCII

2022-07-29 Thread 2QdxY4RzWzUUiLuE
X-DSPAM-Factors: a'b > > xxx > > ... but if the apostrophe in "a'b" is replaced with a > RIGHT SINGLE QUOTATION MARK, the returned h is of type > "email.header.Header", and seems to contain inscrutable garbage. > > I realize that one should no

Re: mailbox misbehavior with non-ASCII

2022-07-29 Thread Ethan Furman
On 7/29/22 16:24, Peter Pearson wrote: > ... but if the apostrophe in "a'b" is replaced with a > RIGHT SINGLE QUOTATION MARK, the returned h is of type > "email.header.Header", and seems to contain inscrutable garbage. > > I'd think an exception would be the right answer. > > Is this worth a bug

mailbox misbehavior with non-ASCII

2022-07-29 Thread Peter Pearson
is of type "email.header.Header", and seems to contain inscrutable garbage. I realize that one should not put non-ASCII characters in message headers, but of course I didn't put it there, it just showed up, pretty much beyond my control. And I realize that when software is given
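
The behaviour described in this thread can be reproduced without a Maildir. The sketch below is an assumption-laden stand-in: it parses two hand-written messages with email.message_from_bytes (which uses the same legacy compat32 policy by default) instead of going through mailbox.Maildir, and it assumes the RIGHT SINGLE QUOTATION MARK arrived as raw UTF-8 bytes in the header.

```python
import email
import email.header

# Two minimal messages; only the header value differs.
raw_ascii = b"X-DSPAM-Factors: a'b\n\nxxx\n"
raw_quote = b"X-DSPAM-Factors: a\xe2\x80\x99b\n\nxxx\n"  # U+2019 as raw UTF-8 bytes

h_ascii = email.message_from_bytes(raw_ascii).get("X-DSPAM-Factors")
h_quote = email.message_from_bytes(raw_quote).get("X-DSPAM-Factors")

print(type(h_ascii))  # a plain str for the pure-ASCII header
print(type(h_quote))  # an email.header.Header for the header with raw 8-bit bytes
```

Under these assumptions the legacy policy does not raise; it wraps the undecodable value in a Header object, which is what the original post observed.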

Re: Jinja and non-ASCII characters (was Re: Prepare accented characters for HTML)

2019-03-29 Thread Peter Otten
.access.log: >> >> So it's running in apache! >> >> Now the question is what apache is doing. Is it running it as a CGI >> script? Is it doing something clever for Python files (maybe involving >> Python 2?) >> >> ... wild guess: if the script is r

Re: Jinja and non-ASCII characters (was Re: Prepare accented characters for HTML)

2019-03-29 Thread Peter J. Holzer
! > > Now the question is what apache is doing. Is it running it as a CGI > script? Is it doing something clever for Python files (maybe involving > Python 2?) > > ... wild guess: if the script is running as CGI in an environment with an > ASCII-using "C" locale, wit

Re: Jinja and non-ASCII characters (was Re: Prepare accented characters for HTML)

2019-03-29 Thread Thomas Jollans
xt/html\n\n") >> >> Try: text/html; charset=utf-8 >> > No difference > >> That might be all you need to make the browser understand it >> correctly. Otherwise, as Thomas says, you will need to figure out >> where the traceback is, which can proba

Re: Jinja and non-ASCII characters (was Re: Prepare accented characters for HTML)

2019-03-29 Thread Tony van der Hoff
On 29/03/2019 11:08, Chris Angelico wrote: > On Fri, Mar 29, 2019 at 9:12 PM Tony van der Hoff > wrote: >> >> Hello Chris. >> Thanks for your interest. >> >> On 28/03/2019 18:04, Chris Angelico wrote: >>> On Fri, Mar 29, 2019 at 4:10 AM Tony van der Hoff >>> wrote: This'll probably wo

Re: Jinja and non-ASCII characters (was Re: Prepare accented characters for HTML)

2019-03-29 Thread Chris Angelico
On Fri, Mar 29, 2019 at 9:12 PM Tony van der Hoff wrote: > > Hello Chris. > Thanks for your interest. > > On 28/03/2019 18:04, Chris Angelico wrote: > > On Fri, Mar 29, 2019 at 4:10 AM Tony van der Hoff > > wrote: > >> > >> This'll probably work: > > > > You have a python3 shebang, but are you d

Re: Jinja and non-ASCII characters (was Re: Prepare accented characters for HTML)

2019-03-29 Thread Thomas Jollans
On 29/03/2019 11.10, Tony van der Hoff wrote: > and running it in a browser (tried both chrome and Firefox), How? > it fails as before: blank web page. No traceback? There must be a traceback somewhere. In a log file perhaps. -- https://mail.python.org/mailman/listinfo/python-list

Re: Jinja and non-ASCII characters (was Re: Prepare accented characters for HTML)

2019-03-29 Thread Tony van der Hoff
Hello Chris. Thanks for your interest. On 28/03/2019 18:04, Chris Angelico wrote: > On Fri, Mar 29, 2019 at 4:10 AM Tony van der Hoff > wrote: >> >> This'll probably work: > > You have a python3 shebang, but are you definitely running this under Python > 3? > Absolutely. > Here's a much more

Jinja and non-ASCII characters (was Re: Prepare accented characters for HTML)

2019-03-28 Thread Chris Angelico
On Fri, Mar 29, 2019 at 4:10 AM Tony van der Hoff wrote: > > This'll probably work: > > accent-test/accent-test.py: > # > #!/usr/bin/env python3 > > import os > from jinja2 import Environment, FileSystemLoader > > PATH = os.path.d

Re: To ASCII Or Not To ASCII? (Posting On Python-List Prohibited)

2017-11-16 Thread Christian Gollwitzer
Am 16.11.17 um 02:45 schrieb Lawrence D’Oliveiro: From : def raıse(self) : "raises this exception." libm.feraiseexcept(self.mask) #end raıse raiise = raıse # if you prefer you do this to annoy people? Christian -- https

To ASCII Or Not To ASCII? (Posting On Python-List Prohibited)

2017-11-15 Thread Lawrence D’Oliveiro
From : def raıse(self) : "raises this exception." libm.feraiseexcept(self.mask) #end raıse raiise = raıse # if you prefer -- https://mail.python.org/mailman/listinfo/python-list

Re: What extended ASCII character set uses 0x9D?

2017-08-22 Thread Chris Angelico
On Tue, Aug 22, 2017 at 5:15 PM, Gregory Ewing wrote: > Chris Angelico wrote: >> >> a naive ASCII upper-casing wouldn't produce 0x81 either - if it did, it >> would also convert 0x21 ("!") into 0x01 (SOH, a control character). So >> this one's s

Re: What extended ASCII character set uses 0x9D?

2017-08-22 Thread Gregory Ewing
Chris Angelico wrote: a naive ASCII upper-casing wouldn't produce 0x81 either - if it did, it would also convert 0x21 ("!") into 0x01 (SOH, a control character). So this one's still a mystery. It's unlikely that even a naive ascii upper/lower casing algorithm would

Re: What extended ASCII character set uses 0x9D?

2017-08-19 Thread Gregory Ewing
Ian Kelly wrote: One possibility is that it's the same two bytes. That would make it 0xE2 0x80 0x9D which is "right double quotation mark". Since it keeps appearing after ending double quotes that seems plausible, although one has to wonder why it appears *in addition to* the ASCI
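
The byte sequence mentioned here can be checked directly. This is a small sketch using only the standard library; the variable names are illustrative. It also shows why a lone 0x9D is puzzling: Python's cp1252 codec leaves that byte undefined.

```python
import unicodedata

seq = bytes([0xE2, 0x80, 0x9D])
ch = seq.decode("utf-8")
name = unicodedata.name(ch)
print(hex(ord(ch)), name)

# A lone 0x9D has no meaning in Python's cp1252 codec.
try:
    b"\x9d".decode("cp1252")
    lone_9d_decodes = True
except UnicodeDecodeError:
    lone_9d_decodes = False
```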

Re: What extended ASCII character set uses 0x9D?

2017-08-18 Thread John Nagle
k') This one has to be Polish, and the first character should be the letter Ł U+0141 or ł U+0142. In UTF-8, U+0141 becomes C5 81, which is very similar to the E5 81 that you have. So here's an insane theory: something attempted to lower-case the byte stream as if it were ASCII. If you

Re: What extended ASCII character set uses 0x9D?

2017-08-18 Thread Piet van Oostrum
Marko Rauhamaa writes: > Chris Angelico : > >> Ohh. We have no evidence that uppercasing is going on here, and a >> naive ASCII upper-casing wouldn't produce 0x81 either - if it did, it >> would also convert 0x21 ("!") into 0x01 (SOH, a control chara

Re: What extended ASCII character set uses 0x9D?

2017-08-18 Thread Random832
On Fri, Aug 18, 2017, at 03:39, Marko Rauhamaa wrote: > BTW, I was reading up on the history of ASCII control characters. Quite > fascinating. > > For example, have you ever wondered why DEL is the odd control character > out at the code point 127? The reason turns out to be p

Re: What extended ASCII character set uses 0x9D?

2017-08-18 Thread MRAB
o the E5 81 that you have. > > So here's an insane theory: something attempted to lower-case the byte > stream as if it were ASCII. If you ignore the high bit, 0xC5 looks > like 0x45 or "E", which lower-cases by having 32 added to it, yielding > 0xE5. Reve

Re: What extended ASCII character set uses 0x9D?

2017-08-18 Thread Chris Angelico
On Fri, Aug 18, 2017 at 5:39 PM, Marko Rauhamaa wrote: > Chris Angelico : > >> Ohh. We have no evidence that uppercasing is going on here, and a >> naive ASCII upper-casing wouldn't produce 0x81 either - if it did, it >> would also convert 0x21 ("!") i

Re: What extended ASCII character set uses 0x9D?

2017-08-18 Thread Marko Rauhamaa
Chris Angelico : > Ohh. We have no evidence that uppercasing is going on here, and a > naive ASCII upper-casing wouldn't produce 0x81 either - if it did, it > would also convert 0x21 ("!") into 0x01 (SOH, a control character). So > this one's still a mystery. BTW,

Re: What extended ASCII character set uses 0x9D?

2017-08-18 Thread Chris Angelico
John Nagle writes: >>>>>> Since, as someone pointed out, there was UTF-8 which had been >>>>>> run through an ASCII-type lower casing algorithm >>>>> >>>>> I spent a few minutes figuring out if some of the mysterious 0x81's

Re: What extended ASCII character set uses 0x9D?

2017-08-18 Thread Marko Rauhamaa
Chris Angelico : > On Fri, Aug 18, 2017 at 4:57 PM, Marko Rauhamaa wrote: >> Chris Angelico : >> >>> On Fri, Aug 18, 2017 at 4:38 PM, Paul Rubin wrote: >>>> John Nagle writes: >>>>> Since, as someone pointed out, there was UTF-8 which ha

Re: What extended ASCII character set uses 0x9D?

2017-08-18 Thread Chris Angelico
On Fri, Aug 18, 2017 at 4:57 PM, Marko Rauhamaa wrote: > Chris Angelico : > >> On Fri, Aug 18, 2017 at 4:38 PM, Paul Rubin wrote: >>> John Nagle writes: >>>> Since, as someone pointed out, there was UTF-8 which had been >>>> run through an ASCII-type

Re: What extended ASCII character set uses 0x9D?

2017-08-18 Thread Marko Rauhamaa
Chris Angelico : > On Fri, Aug 18, 2017 at 4:38 PM, Paul Rubin wrote: >> John Nagle writes: >>> Since, as someone pointed out, there was UTF-8 which had been >>> run through an ASCII-type lower casing algorithm >> >> I spent a few minutes figuring out if

Re: What extended ASCII character set uses 0x9D?

2017-08-17 Thread Chris Angelico
On Fri, Aug 18, 2017 at 4:38 PM, Paul Rubin wrote: > John Nagle writes: >> Since, as someone pointed out, there was UTF-8 which had been >> run through an ASCII-type lower casing algorithm > > I spent a few minutes figuring out if some of the mysterious 0x81's > cou

Re: What extended ASCII character set uses 0x9D?

2017-08-17 Thread Paul Rubin
John Nagle writes: > Since, as someone pointed out, there was UTF-8 which had been > run through an ASCII-type lower casing algorithm I spent a few minutes figuring out if some of the mysterious 0x81's could be from ASCII-lower-casing some Unicode combining characters, but the num

Re: What extended ASCII character set uses 0x9D?

2017-08-17 Thread Chris Angelico
On Fri, Aug 18, 2017 at 4:24 PM, John Nagle wrote: >I'm coming around to the idea that some of these snippets > have been previously mis-converted, which is why they make no sense. > Since, as someone pointed out, there was UTF-8 which had been > run through an ASCII-

Re: What extended ASCII character set uses 0x9D?

2017-08-17 Thread John Nagle
quotation mark". Since it keeps appearing after ending double quotes that seems plausible, although one has to wonder why it appears *in addition to* the ASCII double quotes. I was wondering if it was a signal to some word processor to apply smart quote handling. This has me puzzled.

Re: What extended ASCII character set uses 0x9D?

2017-08-17 Thread Steve D'Aprano
'LATIN SMALL LETTER U WITH GRAVE' Doesn't seem too likely. This may help: http://i18nqa.com/debug/bug-double-conversion.html There's always the possibility that it's just junk, or moji-bake from some other source, so it might not be anything sensible in any extended ASCII

Re: What extended ASCII character set uses 0x9D?

2017-08-17 Thread Ian Kelly
On Thu, Aug 17, 2017 at 9:46 PM, John Nagle wrote: >The 0x9d thing seems unrelated to the Polish names thing. 0x9d > shows up in the middle of English text that's otherwise ASCII. > Is this something that can appear as a result of cutting and > pasting from Microsoft Word?

Re: What extended ASCII character set uses 0x9D?

2017-08-17 Thread John Nagle
cases: >> >> bytearray(b'\xe5\x81ukasz zmywaczyk') > > This one has to be Polish, and the first character should be the > letter Ł U+0141 or ł U+0142. In UTF-8, U+0141 becomes C5 81, which is > very similar to the E5 81 that you have. > > So here's an ins

Re: What extended ASCII character set uses 0x9D?

2017-08-17 Thread Ian Kelly
On Thu, Aug 17, 2017 at 8:15 PM, MRAB wrote: > On 2017-08-18 01:53, Chris Angelico wrote: >> So here's an insane theory: something attempted to lower-case the byte >> stream as if it were ASCII. If you ignore the high bit, 0xC5 looks >> like 0x45 or "E", which

Re: What extended ASCII character set uses 0x9D?

2017-08-17 Thread MRAB
On 2017-08-18 01:30, John Nagle wrote: On 08/17/2017 05:14 PM, John Nagle wrote: > I'm cleaning up some data which has text description fields from > multiple sources. A few more cases: bytearray(b'miguel \xe3\x81ngel santos') bytearray(b'lidija kmeti\xe4\x8d') bytearray(b'\xe5\x81ukasz

Re: What extended ASCII character set uses 0x9D?

2017-08-17 Thread MRAB
This one has to be Polish, and the first character should be the letter Ł U+0141 or ł U+0142. In UTF-8, U+0141 becomes C5 81, which is very similar to the E5 81 that you have. So here's an insane theory: something attempted to lower-case the byte stream as if it were ASCII. If you ignor

Re: What extended ASCII character set uses 0x9D?

2017-08-17 Thread MRAB
On 2017-08-18 01:14, John Nagle wrote: I'm cleaning up some data which has text description fields from multiple sources. Some are in UTF-8. Some are in WINDOWS-1252. And some are in some other character set. So I have to examine and sanity check each field in a database dump, deciding

Re: What extended ASCII character set uses 0x9D?

2017-08-17 Thread Ben Bacarisse
John Nagle writes: > I'm cleaning up some data which has text description fields from > multiple sources. Some are in UTF-8. Some are in WINDOWS-1252. > And some are in some other character set. So I have to examine and > sanity check each field in a database dump, deciding which characte

Re: What extended ASCII character set uses 0x9D?

2017-08-17 Thread Ian Kelly
On Thu, Aug 17, 2017 at 6:53 PM, Chris Angelico wrote: > That doesn't work for everything, though. The 0x81 0x81 and 0x9d ones > are still a puzzle. I'm fairly sure that b'M\x81\x81\xfcnster' is 'Münster'. It decodes to that in Latin-1 if you remove the \x81 bytes. The question then is what those
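
The claim about the Münster sample is easy to verify. This is a minimal sketch; dropping the \x81 bytes is the hypothesis from the post, not an established repair rule.

```python
raw = b"M\x81\x81\xfcnster"

# Hypothesis from the thread: the 0x81 bytes are stray and the rest is Latin-1.
cleaned = raw.replace(b"\x81", b"")
text = cleaned.decode("latin-1")
print(text)
```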

Re: What extended ASCII character set uses 0x9D?

2017-08-17 Thread Chris Angelico
e. >> >> I suspect the others contain similar errors. I don't know if it's the >> result of some form of Mojibake or maybe just transcription errors. > > Oh shit, I think I know what happened. In ASCII you can lower-case > letters by just adding 32 (0x20) to them. Somebody

Re: What extended ASCII character set uses 0x9D?

2017-08-17 Thread Ian Kelly
rest of the name. > >> bytearray(b'\xe5\x81ukasz zmywaczyk') > > If that were b'\xc5\x81' it would be Ł in UTF-8 which would fit the > rest of the name. > > I suspect the others contain similar errors. I don't know if it's the > result of some

Re: What extended ASCII character set uses 0x9D?

2017-08-17 Thread Ian Kelly
On Thu, Aug 17, 2017 at 6:30 PM, John Nagle wrote: > A few more cases: > > bytearray(b'miguel \xe3\x81ngel santos') If that were b'\xc3\x81' it would be Á in UTF-8 which would fit the rest of the name. > bytearray(b'\xe5\x81ukasz zmywaczyk') If that were b'\xc5\x81' it would be Ł in UTF-8 which

Re: What extended ASCII character set uses 0x9D?

2017-08-17 Thread Chris Angelico
his one has to be Polish, and the first character should be the letter Ł U+0141 or ł U+0142. In UTF-8, U+0141 becomes C5 81, which is very similar to the E5 81 that you have. So here's an insane theory: something attempted to lower-case the byte stream as if it were ASCII. If you ignore the high b
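
The lower-casing theory can be tried out on the samples quoted in the thread. The function below is a hypothetical reconstruction of such a broken routine (its name and exact rule are assumptions): it ignores the high bit, and if the remaining seven bits look like 'A'..'Z' it sets bit 0x20, which is what turns C5 into E5.

```python
def naive_ascii_lower(data: bytes) -> bytes:
    """Lower-case a byte stream as if it were ASCII, ignoring the high bit."""
    out = bytearray()
    for b in data:
        # With the high bit masked off, 0xC5 looks like 0x45 ('E').
        if 0x41 <= (b & 0x7F) <= 0x5A:
            b |= 0x20  # same as adding 32 to an upper-case ASCII letter
        out.append(b)
    return bytes(out)

# Plausible originals, encoded as UTF-8 before the damage.
lukasz = naive_ascii_lower("Łukasz Zmywaczyk".encode("utf-8"))
miguel = naive_ascii_lower("Miguel Ángel Santos".encode("utf-8"))
lidija = naive_ascii_lower("Lidija Kmetič".encode("utf-8"))
print(lukasz, miguel, lidija)
```

Under this assumption all three damaged samples from the thread come out byte for byte, while the 0x81 continuation byte is left alone because 0x01 is not a letter.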

Re: What extended ASCII character set uses 0x9D?

2017-08-17 Thread Ian Kelly
On Thu, Aug 17, 2017 at 6:27 PM, Chris Angelico wrote: > On Fri, Aug 18, 2017 at 10:14 AM, John Nagle wrote: >> I'm cleaning up some data which has text description fields from >> multiple sources. Some are in UTF-8. Some are in WINDOWS-1252. >> And some are in some other character set. S

Re: What extended ASCII character set uses 0x9D?

2017-08-17 Thread John Nagle
On 08/17/2017 05:14 PM, John Nagle wrote: > I'm cleaning up some data which has text description fields from > multiple sources. A few more cases: bytearray(b'miguel \xe3\x81ngel santos') bytearray(b'lidija kmeti\xe4\x8d') bytearray(b'\xe5\x81ukasz zmywaczyk') bytearray(b'M\x81\x81\xfcnster'

Re: What extended ASCII character set uses 0x9D?

2017-08-17 Thread Chris Angelico
On Fri, Aug 18, 2017 at 10:14 AM, John Nagle wrote: > I'm cleaning up some data which has text description fields from > multiple sources. Some are in UTF-8. Some are in WINDOWS-1252. > And some are in some other character set. So I have to examine and > sanity check each field in a databa

What extended ASCII character set uses 0x9D?

2017-08-17 Thread John Nagle
I'm cleaning up some data which has text description fields from multiple sources. Some are in UTF-8. Some are in WINDOWS-1252. And some are in some other character set. So I have to examine and sanity check each field in a database dump, deciding which character set best represents what's

Re: How to make sure the result of Pandas.to_csv does not have non-ASCII code?

2017-05-31 Thread MRAB
On 2017-05-31 17:52, David Shi via Python-list wrote: How to make sure the result of Pandas.to_csv does not have non-ASCII code? Specify the encoding as 'ascii': df.to_csv(path, encoding='ascii') If there's a non-ASCII character that it can't write, it
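
The effect of forcing an ASCII encoding on output can be illustrated without pandas. The sketch below deliberately swaps in the standard-library csv module, so it is an analogy for the suggestion above, not a demonstration of pandas itself; rows_to_ascii_csv is a hypothetical helper name.

```python
import csv
import io

def rows_to_ascii_csv(rows, errors="strict"):
    """Write rows as CSV through an ASCII-encoding text layer and return the bytes."""
    buf = io.BytesIO()
    text = io.TextIOWrapper(buf, encoding="ascii", errors=errors, newline="")
    csv.writer(text).writerows(rows)
    text.flush()
    data = buf.getvalue()
    text.detach()  # keep the wrapper from closing buf behind our back
    return data

plain = rows_to_ascii_csv([["a", "b"]])
escaped = rows_to_ascii_csv([["\u00e9"]], errors="backslashreplace")

try:
    rows_to_ascii_csv([["\u00e9"]])  # strict mode refuses non-ASCII text
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True
```

With strict error handling a non-ASCII character stops the write, which guarantees the output is pure ASCII; an error handler such as backslashreplace trades that failure for an escaped representation.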

How to make sure the result of Pandas.to_csv does not have non-ASCII code?

2017-05-31 Thread David Shi via Python-list
How to make sure the result of Pandas.to_csv does not have non-ASCII code? Regards, David -- https://mail.python.org/mailman/listinfo/python-list

Re: python script Non-ASCII character

2017-03-19 Thread MRAB
On 2017-03-20 02:50, eryk sun wrote: On Sun, Mar 19, 2017 at 11:06 PM, MRAB wrote: If you're using Unicode string literals, your choices are: 1. Raw string literals: var1 = ur"C:\Users\username\Desktop\η γλωσσα μου\mylanguage\myfile" Raw unicode literals are practically useless in Pyth

Re: python script Non-ASCII character

2017-03-19 Thread eryk sun
On Sun, Mar 19, 2017 at 11:06 PM, MRAB wrote: > > If you're using Unicode string literals, your choices are: > > 1. Raw string literals: > > var1 = ur"C:\Users\username\Desktop\η γλωσσα μου\mylanguage\myfile" Raw unicode literals are practically useless in Python 2. They're not actually raw b

Re: python script Non-ASCII character

2017-03-19 Thread Steve D'Aprano
On Mon, 20 Mar 2017 06:48 am, Xristos Xristoou wrote: > Τη Κυριακή, 19 Μαρτίου 2017 - 7:38:19 μ.μ. UTC+2, ο χρήστης Xristos > Xristoou έγραψε: > > how to define my script with encoding of ISO-8859-7 or UTF-8?and for the > blanks ? First you need to know whether your editor is saving the file

Re: python script Non-ASCII character

2017-03-19 Thread MRAB
On 2017-03-19 20:10, Xristos Xristoou wrote: Τη Κυριακή, 19 Μαρτίου 2017 - 7:38:19 μ.μ. UTC+2, ο χρήστης Xristos Xristoou έγραψε: @Terry non-ascii in pathnames i need for ex :var1="C:\Users\username\Desktop\my language\mylanguage\myfile" and for the blank ? Your choices are: 1.

Re: python script Non-ASCII character

2017-03-19 Thread Xristos Xristoou
Τη Κυριακή, 19 Μαρτίου 2017 - 7:38:19 μ.μ. UTC+2, ο χρήστης Xristos Xristoou έγραψε: @Terry non-ascii in pathnames i need for ex :var1="C:\Users\username\Desktop\my language\mylanguage\myfile" and for the blank ? -- https://mail.python.org/mailman/listinfo/python-list

Re: python script Non-ASCII character

2017-03-19 Thread Xristos Xristoou
Τη Κυριακή, 19 Μαρτίου 2017 - 7:38:19 μ.μ. UTC+2, ο χρήστης Xristos Xristoou έγραψε: yes that i know but i need python 2.7 for my task -- https://mail.python.org/mailman/listinfo/python-list

Re: python script Non-ASCII character

2017-03-19 Thread Terry Reedy
uage Non-ascii in a pathname and non-ascii within a file are different issues. On Windows, non-ascii in pathnames did not work consistently until 3.2 or maybe 3.3. > or the path have some blank character then not working This should not be a problem. With 2.7.13: >>> f = open(

Re: python script Non-ASCII character

2017-03-19 Thread Chris Angelico
On Mon, Mar 20, 2017 at 6:48 AM, Xristos Xristoou wrote: > Τη Κυριακή, 19 Μαρτίου 2017 - 7:38:19 μ.μ. UTC+2, ο χρήστης Xristos Xristoou > έγραψε: > > how to define my script with encoding of ISO-8859-7 or UTF-8?and for the > blanks ? First, try using Python 3. Most of the time, that will be t

Re: python script Non-ASCII character

2017-03-19 Thread Xristos Xristoou
Τη Κυριακή, 19 Μαρτίου 2017 - 7:38:19 μ.μ. UTC+2, ο χρήστης Xristos Xristoou έγραψε: how to define my script with encoding of ISO-8859-7 or UTF-8?and for the blanks ? -- https://mail.python.org/mailman/listinfo/python-list

Re: python script Non-ASCII character

2017-03-19 Thread Chris Angelico
if that paths is in my main language or the path > have some blank character then not working and i take that error : > > SyntaxError: Non-ASCII character '\xce' in file Untitled_.py on line 15, but > no encoding declared; > > can i fix that in python 2.7.13 ? can i fin

python script Non-ASCII character

2017-03-19 Thread Xristos Xristoou
r then not working and i take that error : SyntaxError: Non-ASCII character '\xce' in file Untitled_.py on line 15, but no encoding declared; can i fix that in python 2.7.13 ? can i find some solution to python read paths in my main language or paths with blanks? -- https://mail.python.
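
The error message itself points at PEP 263: under Python 2 a source file with non-ASCII bytes needs an encoding declaration on its first or second line. A minimal sketch follows, assuming the editor really saves the file as UTF-8; the path is the illustrative one from the thread, with backslashes doubled so that none is read as an escape.

```python
# -*- coding: utf-8 -*-
# The declaration above must be on line 1 or 2 of the file (PEP 263).
# Python 3 already defaults to UTF-8, so there it is harmless but redundant.

# A unicode literal; blanks in the path need no special treatment.
path = u"C:\\Users\\username\\Desktop\\η γλωσσα μου\\mylanguage\\myfile"
parts = path.split("\\")
print(parts)
```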

Re: Extended ASCII [solved]

2017-01-13 Thread D'Arcy Cain
nt(ln) UnicodeEncodeError: 'ascii' codec can't encode character '\xe9' in position 132: ordinal not in range(128) I don't understand why the error says "ascii" when I told it to use "latin-1". That can't be the failing code, since it
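
The traceback in this thread points at print(ln), which suggests the read with latin-1 succeeded and the failure happened when the text was encoded again for output. The sketch below imitates that second step with an explicit encode to ASCII; that the real stdout was using ASCII is an assumption based on the error message.

```python
data = b"due to the Qu\xe9bec government"

# Decoding as Latin-1 works: every byte maps to a code point.
text = data.decode("latin-1")

# Encoding the result as ASCII, as an ASCII-only stdout would, fails.
message = None
try:
    text.encode("ascii")
except UnicodeEncodeError as exc:
    message = str(exc)
print(message)
```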

Re: Extended ASCII

2017-01-13 Thread Jon Ribbens
On 2017-01-13, D'Arcy Cain wrote: > I thought I was done with this crap once I moved to 3.x but some > Winblows machines are still sending what some circles call "Extended > ASCII". I have a file that I am trying to read and it is barfing on > some characters. For

Re: Extended ASCII

2017-01-13 Thread Grant Edwards
On 2017-01-13, D'Arcy Cain wrote: > Here is the failing code: > > with open(sys.argv[1], encoding="latin-1") as fp: >for ln in fp: > print(ln) > > Traceback (most recent call last): >File "./load_iff", line 11, in > print(

Re: Extended ASCII

2017-01-13 Thread Random832
On Fri, Jan 13, 2017, at 17:24, D'Arcy Cain wrote: > I thought I was done with this crap once I moved to 3.x but some > Winblows machines are still sending what some circles call "Extended > ASCII". I have a file that I am trying to read and it is barfing on > so

Extended ASCII

2017-01-13 Thread D'Arcy Cain
I thought I was done with this crap once I moved to 3.x but some Winblows machines are still sending what some circles call "Extended ASCII". I have a file that I am trying to read and it is barfing on some characters. For example: due to the Qu\xe9bec government Obviously shou

Re: best way to read a huge ascii file.

2016-11-30 Thread Rolando Espinoza
do On Wed, Nov 30, 2016 at 1:16 PM, Heli wrote: > Hi all, > > Writing my ASCII file once to either of pickle or npy or hdf data types > and then working afterwards on the result binary file reduced the read time > from 80(min) to 2 seconds. > > Thanks everyone for your help

Re: best way to read a huge ascii file.

2016-11-30 Thread Chris Angelico
On Thu, Dec 1, 2016 at 3:26 AM, BartC wrote: > On 30/11/2016 16:16, Heli wrote: >> >> Hi all, >> >> Writing my ASCII file once to either of pickle or npy or hdf data types >> and then working afterwards on the result binary file reduced the read time >> f

Re: best way to read a huge ascii file.

2016-11-30 Thread BartC
On 30/11/2016 16:16, Heli wrote: Hi all, Writing my ASCII file once to either of pickle or npy or hdf data types and then working afterwards on the result binary file reduced the read time from 80(min) to 2 seconds. 240,000% faster? Something doesn't sound quite right! How big is the

Re: best way to read a huge ascii file.

2016-11-30 Thread Heli
Hi all, Writing my ASCII file once to either of pickle or npy or hdf data types and then working afterwards on the result binary file reduced the read time from 80(min) to 2 seconds. Thanks everyone for your help. -- https://mail.python.org/mailman/listinfo/python-list

Re: best way to read a huge ascii file.

2016-11-29 Thread Steve D'Aprano
On Wed, 30 Nov 2016 01:17 am, Heli wrote: > The following line which reads the entire 7.4 GB file increments the > memory usage by 3206.898 MiB (3.36 GB). First question is Why it does not > increment the memory usage by 7.4 GB? > > f=np.loadtxt(os.path.join(dir,myfile),delimiter=None,skiprows=0)

Re: best way to read a huge ascii file.

2016-11-29 Thread BartC
On 29/11/2016 14:17, Heli wrote: Hi all, Let me update my question, I have an ascii file(7G) which has around 100M lines. I read this file using : f=np.loadtxt(os.path.join(dir,myfile),delimiter=None,skiprows=0) x=f[:,1] y=f[:,2] z=f[:,3] id=f[:,0] I will need the x,y,z and id arrays later

Re: best way to read a huge ascii file.

2016-11-29 Thread marco . nawijn
On Tuesday, November 29, 2016 at 3:18:29 PM UTC+1, Heli wrote: > Hi all, > > Let me update my question, I have an ascii file(7G) which has around 100M > lines. I read this file using : > > f=np.loadtxt(os.path.join(dir,myfile),delimiter=None,skiprows=0) > > x=f[:,1

Re: best way to read a huge ascii file.

2016-11-29 Thread Jussi Piitulainen
Heli writes: > Hi all, > > Let me update my question, I have an ascii file(7G) which has around > 100M lines. I read this file using : > > f=np.loadtxt(os.path.join(dir,myfile),delimiter=None,skiprows=0) > > x=f[:,1] > y=f[:,2] > z=f[:,3] > id=f[:,0] >

Re: best way to read a huge ascii file.

2016-11-29 Thread Heli
Hi all, Let me update my question, I have an ascii file(7G) which has around 100M lines. I read this file using : f=np.loadtxt(os.path.join(dir,myfile),delimiter=None,skiprows=0) x=f[:,1] y=f[:,2] z=f[:,3] id=f[:,0] I will need the x,y,z and id arrays later for interpolations. The

Re: best way to read a huge ascii file.

2016-11-25 Thread Steve D'Aprano
On Sat, 26 Nov 2016 02:17 am, Heli wrote: > Hi, > > I have a huge ascii file(40G) and I have around 100M lines. I read this > file using : > > f=np.loadtxt(os.path.join(dir,myfile),delimiter=None,skiprows=0) [...] > I will need the x,y,z and id arrays later for interpolati

Re: best way to read a huge ascii file.

2016-11-25 Thread BartC
On 25/11/2016 15:17, Heli wrote: I have a huge ascii file(40G) and I have around 100M lines. I read this file using : f=np.loadtxt(os.path.join(dir,myfile),delimiter=None,skiprows=0) x=f1[:,1] y=f1[:,2] z=f1[:,3] id=f1[:,0] I will need the x,y,z and id arrays later for interpolations. The

Re: best way to read a huge ascii file.

2016-11-25 Thread Marko Rauhamaa
Heli : > I have a huge ascii file(40G) and I have around 100M lines. I read this > file using : > > [...] > > The problem is reading the file takes around 80 min while the > interpolation only takes 15 mins. > > I was wondering if there is a more optimized way to read t

best way to read a huge ascii file.

2016-11-25 Thread Heli
Hi, I have a huge ascii file(40G) and I have around 100M lines. I read this file using : f=np.loadtxt(os.path.join(dir,myfile),delimiter=None,skiprows=0) x=f1[:,1] y=f1[:,2] z=f1[:,3] id=f1[:,0] I will need the x,y,z and id arrays later for interpolations. The problem is reading the file

Re: Reading Fortran Ascii output using python

2016-11-03 Thread Heli
On Monday, October 31, 2016 at 8:03:53 PM UTC+1, MRAB wrote: > On 2016-10-31 17:46, Heli wrote: > > On Monday, October 31, 2016 at 6:30:12 PM UTC+1, Irmen de Jong wrote: > >> On 31-10-2016 18:20, Heli wrote: > >> > Hi all, > >> > > >> > I am t

Re: Reading Fortran Ascii output using python

2016-10-31 Thread MRAB
On 2016-10-31 17:46, Heli wrote: On Monday, October 31, 2016 at 6:30:12 PM UTC+1, Irmen de Jong wrote: On 31-10-2016 18:20, Heli wrote: > Hi all, > > I am trying to read an ascii file written in Fortran90 using python. I am reading this file by opening the input file and then read

Re: Reading Fortran Ascii output using python

2016-10-31 Thread Irmen de Jong
On 31-10-2016 19:24, Irmen de Jong wrote: > So there must be something in that line in your file that it considers an EOF. I meant to type EOL there. (end-of-line/newline). Irmen -- https://mail.python.org/mailman/listinfo/python-list

Re: Reading Fortran Ascii output using python

2016-10-31 Thread Irmen de Jong
On 31-10-2016 18:46, Heli wrote: > Thanks Irmen, > > I tried with "rU" but that did not make a difference. The problem is a line > that with one single write statement in my fortran code : > > write(UNIT=9,FMT="(99g20.8)") value > > seems to be read in two python inputfile.readline(). > >

Re: Reading Fortran Ascii output using python

2016-10-31 Thread Heli
On Monday, October 31, 2016 at 6:30:12 PM UTC+1, Irmen de Jong wrote: > On 31-10-2016 18:20, Heli wrote: > > Hi all, > > > > I am trying to read an ascii file written in Fortran90 using python. I am > > reading this file by opening the input

Re: Reading Fortran Ascii output using python

2016-10-31 Thread Irmen de Jong
On 31-10-2016 18:20, Heli wrote: > Hi all, > > I am trying to read an ascii file written in Fortran90 using python. I am > reading this file by opening the input file and then reading using: > > inputfile.readline() > > On each line of the ascii file I have a few numb

Reading Fortran Ascii output using python

2016-10-31 Thread Heli
Hi all, I am trying to read an ascii file written in Fortran90 using python. I am reading this file by opening the input file and then reading using: inputfile.readline() On each line of the ascii file I have a few numbers like this: line 1: 1 line 2: 1000.834739 2000.38473 3000.349798 line

Re: SyntaxError: Non-ASCII character

2016-07-17 Thread ldompeling
Op zondag 17 juli 2016 11:19:32 UTC+2 schreef ldomp...@casema.nl: > I copy this script from the magpi but when I run this script I get this > error: > SyntaxError: Non-ASCII character '\xe2' in file sound.py on line 32, but no > encoding declared; see http://python.org/

Re: SyntaxError: Non-ASCII character

2016-07-17 Thread Wildman via Python-list
On Sun, 17 Jul 2016 05:01:21 -0700, ldompeling wrote: > I installed python 3.4 and set my python path to PYTONPATH:/usr/bin/python3.4 > > When I try to import pyaudio then I get this error: > Python 3.4.2 (default, Oct 19 2014, 13:31:11) > [GCC 4.9.1] on linux > Type "help", "copyright", "credits

Re: SyntaxError: Non-ASCII character

2016-07-17 Thread Steven D'Aprano
On Sun, 17 Jul 2016 11:35 pm, ldompel...@casema.nl wrote: > I also get a lot off alsa errors on my screen so I don't no if that result > in this error: > > Traceback (most recent call last): > File "sound.py", line 19, in > rate=RATE, input=TRUE, frames_per_buffer=CHUNK) > NameError: name 'TRUE'

Re: SyntaxError: Non-ASCII character

2016-07-17 Thread Steven D'Aprano
On Sun, 17 Jul 2016 10:01 pm, ldompel...@casema.nl wrote: > I installed python 3.4 and set my python path to > PYTONPATH:/usr/bin/python3.4 > > When I try to import pyaudio then I get this error: > Python 3.4.2 (default, Oct 19 2014, 13:31:11) > [GCC 4.9.1] on linux > Type "help", "copyright", "c

Re: SyntaxError: Non-ASCII character

2016-07-17 Thread ldompeling
Op zondag 17 juli 2016 11:19:32 UTC+2 schreef ldomp...@casema.nl: > I copy this script from the magpi but when I run this script I get this > error: > SyntaxError: Non-ASCII character '\xe2' in file sound.py on line 32, but no > encoding declared; see http://python.org/

Re: SyntaxError: Non-ASCII character

2016-07-17 Thread Chris Angelico
On Sun, Jul 17, 2016 at 10:01 PM, wrote: > I installed python 3.4 and set my python path to PYTONPATH:/usr/bin/python3.4 > > When I try to import pyaudio then I get this error: > Python 3.4.2 (default, Oct 19 2014, 13:31:11) > [GCC 4.9.1] on linux > Type "help", "copyright", "credits" or "license

Re: SyntaxError: Non-ASCII character

2016-07-17 Thread ldompeling
Op zondag 17 juli 2016 11:19:32 UTC+2 schreef ldomp...@casema.nl: > I copy this script from the magpi but when I run this script I get this > error: > SyntaxError: Non-ASCII character '\xe2' in file sound.py on line 32, but no > encoding declared; see http://python.org/

Re: SyntaxError: Non-ASCII character

2016-07-17 Thread Steven D'Aprano
> is probably worse than Python2’s > > Traceback (most recent call last): > File "", line 1, in > File "foo.py", line 31 > SyntaxError: Non-ASCII character '\xe2' in file foo.py on line 31, but no > encoding declared; see http://python.org

Re: SyntaxError: Non-ASCII character

2016-07-17 Thread Rustom Mody
ll last): File "", line 1, in File "/home/ariston/foo.py", line 31 wf = wave.open(“test.wav”, “rb”) ^ SyntaxError: invalid character in identifier is probably worse than Python2’s Traceback (most recent call last): File "", line 1, in

Re: SyntaxError: Non-ASCII character

2016-07-17 Thread Chris Angelico
On Sun, Jul 17, 2016 at 7:19 PM, wrote: > wf = wave.open(“test.wav”, “rb”) Watch your quotes. They want to be flat quotes, U+0022 "this sort", not any sort of typographical quote. Recommendation: Use a programmer's editor, not a word processor, for working with code. As well as not mangling it,
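
The '\xe2' in the error message is consistent with this diagnosis: typographical quotes encode in UTF-8 as three bytes starting with 0xE2. As a rough sketch, a pasted line could be repaired like this; straighten_quotes and QUOTE_MAP are hypothetical names, and retyping the quotes in a programmer's editor is the simpler fix.

```python
# Map the common typographical quotes to their flat ASCII equivalents.
QUOTE_MAP = str.maketrans({
    "\u201c": '"',  # LEFT DOUBLE QUOTATION MARK
    "\u201d": '"',  # RIGHT DOUBLE QUOTATION MARK
    "\u2018": "'",  # LEFT SINGLE QUOTATION MARK
    "\u2019": "'",  # RIGHT SINGLE QUOTATION MARK
})

def straighten_quotes(source: str) -> str:
    """Return source with typographical quotes replaced by ASCII quotes."""
    return source.translate(QUOTE_MAP)

broken = "wf = wave.open(\u201ctest.wav\u201d, \u201crb\u201d)"
fixed = straighten_quotes(broken)
first_byte = "\u201c".encode("utf-8")[0]
print(fixed, hex(first_byte))
```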

SyntaxError: Non-ASCII character

2016-07-17 Thread ldompeling
I copy this script from the magpi but when I run this script I get this error: SyntaxError: Non-ASCII character '\xe2' in file sound.py on line 32, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details Below is the example script. What is wrong with this scrip

Re: ASCII or Unicode? (was best text editor for programming Python on a Mac)

2016-06-22 Thread Lawrence D’Oliveiro
On Thursday, June 23, 2016 at 2:02:18 PM UTC+12, Rustom Mody wrote: > So remembered that there is one method -- yes clunky -- that I use most -- > forgot to mention -- C-x 8 RET > ie insert-char¹ > > Which takes the name (or hex) of the unicode char. A handy tool for looking up names and codes

Re: ASCII or Unicode? (was best text editor for programming Python on a Mac)

2016-06-22 Thread Rustom Mody
On Tuesday, June 21, 2016 at 7:27:00 PM UTC+5:30, Rustom Mody wrote: > >https://wiki.archlinux.org/index.php/Keyboard_configuration_i > >n_Xorg> -- no good You probably want this: https://wiki.archlinux.org/index.php/X_KeyBoard_extension#Editing_the_layout > > So Rustom, how do *you* prod
