Re: How do I automate the removal of all non-ascii characters from my code?

2011-09-13 Thread Vlastimil Brom
2011/9/13 Alec Taylor : > Hmm, nothing mentioned so far works for me... > > Here's a very small test case: > python -u "Convert to Creole.py" >  File "Convert to Creole.py", line 1 > SyntaxError: Non-ASCII character '\xe2' in file Convert to Creole.py > on line 1, but no encoding declared; see

Re: How do I automate the removal of all non-ascii characters from my code?

2011-09-13 Thread Jussi Piitulainen
Alec Taylor writes: > Hmm, nothing mentioned so far works for me... > > Here's a very small test case: > > >>> python -u "Convert to Creole.py" > File "Convert to Creole.py", line 1 > SyntaxError: Non-ASCII character '\xe2' in file Convert to Creole.py > on line 1, but no encoding declared; se

Re: How do I automate the removal of all non-ascii characters from my code?

2011-09-13 Thread Alec Taylor
Hmm, nothing mentioned so far works for me... Here's a very small test case: >>> python -u "Convert to Creole.py" File "Convert to Creole.py", line 1 SyntaxError: Non-ASCII character '\xe2' in file Convert to Creole.py on line 1, but no encoding declared; see http://www.python.org/peps/pep-0263
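
The SyntaxError above names only the first offending byte ('\xe2'). As a rough sketch for locating every such byte before deciding what to do with it, something like the following could be used. The function name `find_non_ascii` and the sample bytes are made up for illustration and are not from the thread; the sample assumes the file is UTF-8 and contains curly quotes, which is only a guess at what a word processor would produce.

```python
def find_non_ascii(data):
    """Return a list of (line_number, column, byte_value) for each byte > 127.

    `data` is the raw file content as bytes; positions are 1-based.
    """
    hits = []
    for line_no, line in enumerate(data.split(b"\n"), start=1):
        # bytearray yields integers on both Python 2 and Python 3
        for col, byte in enumerate(bytearray(line), start=1):
            if byte > 127:
                hits.append((line_no, col, byte))
    return hits


# Hypothetical sample: a line containing UTF-8 encoded curly quotes.
sample = b"x = \xe2\x80\x9chello\xe2\x80\x9d\n"
for line_no, col, byte in find_non_ascii(sample):
    print("line %d, column %d: byte 0x%02x" % (line_no, col, byte))
```

For a real file, the bytes would come from `open(path, "rb").read()`.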

Re: How do I automate the removal of all non-ascii characters from my code?

2011-09-13 Thread Vlastimil Brom
2011/9/13 ron : > > Depending on the load, you can do something like: > > "".join([x for x in string if ord(x) < 128]) > > It's worked great for me in cleaning input on webapps where there's a > lot of copy/paste from varied sources. > -- > http://mail.python.org/mailman/listinfo/python-list > Well
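
A minimal sketch of ron's quoted one-liner wrapped in a function, for anyone who wants to try it. The name `strip_non_ascii` and the sample string are illustrative only. Note that this simply drops characters, so a curly quote that was delimiting a string literal disappears rather than turning into a plain quote.

```python
def strip_non_ascii(text):
    """Drop every character whose code point is 128 or above."""
    return "".join(ch for ch in text if ord(ch) < 128)


# Hypothetical input: curly quotes and an accented letter.
cleaned = strip_non_ascii(u"\u201cquoted\u201d caf\u00e9")
print(cleaned)
```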

Re: How do I automate the removal of all non-ascii characters from my code?

2011-09-13 Thread ron
On Sep 12, 4:49 am, Steven D'Aprano wrote: > On Mon, 12 Sep 2011 06:43 pm Stefan Behnel wrote: > > > I'm not sure what you are trying to say with the above code, but if it's > > the code that fails for you with the exception you posted, I would guess > > that the problem is in the "[more stuff her

Re: How do I automate the removal of all non-ascii characters from my code?

2011-09-13 Thread jmfauth
On 13 sep, 10:15, Steven D'Aprano wrote: The intrinsic coding of the characters is one thing; the use of a byte stream that is supposed to represent text is another. jmf -- http://mail.python.org/mailman/listinfo/python-list

Re: How do I automate the removal of all non-ascii characters from my code?

2011-09-13 Thread Steven D'Aprano
On Tue, 13 Sep 2011 05:49 pm jmfauth wrote: > On 12 sep, 23:39, "Rhodri James" wrote: > > >> Now read what Steven wrote again.  The issue is that the program contains >> characters that are syntactically illegal.  The "engine" can be perfectly >> correctly translating a character as a smart quo

Re: How do I automate the removal of all non-ascii characters from my code?

2011-09-13 Thread jmfauth
On 12 sep, 23:39, "Rhodri James" wrote: > Now read what Steven wrote again.  The issue is that the program contains   > characters that are syntactically illegal.  The "engine" can be perfectly   > correctly translating a character as a smart quote or a non breaking space   > or an e-umlaut or w

Re: How do I automate the removal of all non-ascii characters from my code?

2011-09-12 Thread Vlastimil Brom
e you'll get the same result if you write up a > document in LibreOffice Writer and add some End Notes. > > How do I automate the removal of all non-ascii characters from my code? > > Thanks for all suggestions, > > Alec Taylor > -- > http://mail.python.org/mailman/list

Re: How do I automate the removal of all non-ascii characters from my code?

2011-09-12 Thread Rhodri James
On Mon, 12 Sep 2011 15:47:00 +0100, jmfauth wrote: On 12 sep, 10:49, Steven D'Aprano wrote: Even with a source code encoding, you will probably have problems with source files including \xe2 and other "bad" chars. Unless they happen to fall inside a quoted string literal, I would expect to g

Re: How do I automate the removal of all non-ascii characters from my code?

2011-09-12 Thread jmfauth
On 12 sep, 10:49, Steven D'Aprano wrote: > > Even with a source code encoding, you will probably have problems with > source files including \xe2 and other "bad" chars. Unless they happen to > fall inside a quoted string literal, I would expect to get a SyntaxError. > This is absurd and a complet

Re: How do I automate the removal of all non-ascii characters from my code?

2011-09-12 Thread Eric Snow
king on), but I'm sure you'll get the same result if you write up a > document in LibreOffice Writer and add some End Notes. > > How do I automate the removal of all non-ascii characters from my code? Perhaps try "The Unicode Hammer". http://code.activestate.com/re

Re: How do I automate the removal of all non-ascii characters from my code?

2011-09-12 Thread jmfauth
ortunately I can't post my document yet (it's a research paper I'm > > working on), but I'm sure you'll get the same result if you write up a > > document in LibreOffice Writer and add some End Notes. > > > How do I automate the removal of all non-ascii char

Re: How do I automate the removal of all non-ascii characters from my code?

2011-09-12 Thread Miki Tebeka
You can add "# coding=UTF8" to the top of your file (see http://www.python.org/dev/peps/pep-0263/). If you want to remove unicode, there are several options; one of them is passing the file through "iconv --to ascii".
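
For the second option, a pure-Python sketch along the same lines as the iconv suggestion might look like this. It is not what iconv does internally; the function name `to_ascii`, the `REPLACEMENTS` table, and the assumption that the source file is UTF-8 are all mine, not from the thread. The table maps a few typographic characters to ASCII look-alikes first, so that quotes survive instead of being dropped.

```python
# Assumed mapping of common typographic characters to ASCII equivalents.
REPLACEMENTS = {
    u"\u2018": u"'",   # left single quotation mark
    u"\u2019": u"'",   # right single quotation mark
    u"\u201c": u'"',   # left double quotation mark
    u"\u201d": u'"',   # right double quotation mark
    u"\u00a0": u" ",   # no-break space
}


def to_ascii(data, source_encoding="utf-8"):
    """Decode bytes, swap known typographic characters, drop the rest."""
    text = data.decode(source_encoding)
    for fancy, plain in REPLACEMENTS.items():
        text = text.replace(fancy, plain)
    # "ignore" silently discards anything still outside ASCII
    return text.encode("ascii", "ignore")


# Hypothetical input: curly quotes around a word with an accented letter.
result = to_ascii(b"s = \xe2\x80\x9ccaf\xc3\xa9\xe2\x80\x9d")
print(result)
```

To apply it to a file, read the file in binary mode, pass the bytes through `to_ascii`, and write the result back in binary mode. Keeping a backup is wise, since dropped characters cannot be recovered.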

Re: How do I automate the removal of all non-ascii characters from my code?

2011-09-12 Thread Dave Angel
On 01/-10/-28163 02:59 PM, Steven D'Aprano wrote: On Mon, 12 Sep 2011 06:43 pm Stefan Behnel wrote: I'm not sure what you are trying to say with the above code, but if it's the code that fails for you with the exception you posted, I would guess that the problem is in the "[more stuff here]" pa

Re: How do I automate the removal of all non-ascii characters from my code?

2011-09-12 Thread Steven D'Aprano
On Mon, 12 Sep 2011 06:43 pm Stefan Behnel wrote: > I'm not sure what you are trying to say with the above code, but if it's > the code that fails for you with the exception you posted, I would guess > that the problem is in the "[more stuff here]" part, which likely contains > a non-ASCII charact

Re: How do I automate the removal of all non-ascii characters from my code?

2011-09-12 Thread Stefan Behnel
Alec Taylor, 12.09.2011 10:33: from creole import html2creole from BeautifulSoup import BeautifulSoup VALID_TAGS = ['strong', 'em', 'p', 'ul', 'li', 'br', 'b', 'i', 'a', 'h1', 'h2'] def sanitize_html(value): soup = BeautifulSoup(value) for tag in soup.findAll(True): if tag.na

Re: How do I automate the removal of all non-ascii characters from my code?

2011-09-12 Thread John-John Tedro
0263.html>for >> details". >> >> Unfortunately I can't post my document yet (it's a research paper I'm >> working on), but I'm sure you'll get the same result if you write up a >> document in LibreOffice Writer and add some End Notes.

Re: How do I automate the removal of all non-ascii characters from my code?

2011-09-12 Thread Alec Taylor
nvert to Creole.py", line 17 >> SyntaxError: Non-ASCII character '\xe2' in file Convert to Creole.py >> on line 18, but no encoding declared; see >> http://www.python.org/peps/pep-0263.html for details". >> >> Unfortunately I can't post my docu

Re: How do I automate the removal of all non-ascii characters from my code?

2011-09-12 Thread Gary Herron
f you write up a document in LibreOffice Writer and add some End Notes. How do I automate the removal of all non-ascii characters from my code? Thanks for all suggestions, Alec Taylor This question does not quite make sense. The error message is complaining about a python file. What doe

How do I automate the removal of all non-ascii characters from my code?

2011-09-12 Thread Alec Taylor
e Writer and add some End Notes. How do I automate the removal of all non-ascii characters from my code? Thanks for all suggestions, Alec Taylor