Re: raw_input() and utf-8 formatted chars

2007-11-02 Thread Marc 'BlackJack' Rintsch
On Thu, 01 Nov 2007 19:21:03 -0700, 7stud wrote: BeautifulSoup can convert an HTML entity representing an 'A' with umlaut, e.g. &Auml;, into an Ä without ever touching my keyboard. How does BeautifulSoup do it? It maps the HTML entity names to Unicode characters. Take a look at the
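
The name-to-character mapping described in this reply can be sketched with the standard library alone. This is a Python 3 illustration, not BeautifulSoup's actual code: the helper name `entity_to_char` is hypothetical, and `html.entities` is the Python 3 name of the module that Python 2 called `htmlentitydefs`.

```python
import html
from html.entities import name2codepoint


def entity_to_char(name):
    """Look up an HTML entity name (without '&' and ';') and return its character."""
    # name2codepoint maps entity names such as 'Auml' to Unicode code points.
    return chr(name2codepoint[name])


# Direct table lookup for the entity discussed in the thread.
auml = entity_to_char("Auml")

# html.unescape() applies the same kind of mapping to a whole string.
unescaped = html.unescape("&Auml;")

print(auml, unescaped)
```

Note that this produces the single precomposed character U+00C4, whereas the bytes in the original question, A\xcc\x88, encode an 'A' followed by a combining diaeresis.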

Re: raw_input() and utf-8 formatted chars

2007-11-01 Thread 7stud
On Oct 13, 12:42 pm, MRAB [EMAIL PROTECTED] wrote: You can decode that into the actual UTF-8 string with decode('string_escape'): s = raw_input('Enter: ') #A\xcc\x88 s = s.decode('string_escape') Ahh. Thanks for that. On Oct 12, 2:43 pm, Marc 'BlackJack' Rintsch [EMAIL PROTECTED] wrote:
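
The 'string_escape' codec suggested here exists only in Python 2. A rough Python 3 equivalent is sketched below; it assumes the typed text contains only \xNN-style escapes and Latin-1 characters, and the function name `unescape_typed_bytes` is made up for illustration.

```python
def unescape_typed_bytes(text):
    """Turn typed escape sequences such as \\xcc\\x88 into the text they encode."""
    # Step 1: interpret the backslash escapes. 'unicode_escape' yields one
    # character per \xNN escape, with code points in the range 0-255.
    interpreted = text.encode("latin-1").decode("unicode_escape")
    # Step 2: map those characters back to single bytes, then decode the
    # resulting byte string as UTF-8.
    return interpreted.encode("latin-1").decode("utf-8")


# What the user types at the prompt: nine literal characters.
typed = r"A\xcc\x88"
result = unescape_typed_bytes(typed)

print(len(typed), len(result))
```

Because Python 3 strings hold Unicode code points rather than bytes, the result here has length 2 ('A' plus a combining diaeresis), not the 3 bytes the Python 2 version would report.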

Re: raw_input() and utf-8 formatted chars

2007-10-13 Thread Marc 'BlackJack' Rintsch
On Fri, 12 Oct 2007 19:09:46 -0700, 7stud wrote: On Oct 12, 2:43 pm, Marc 'BlackJack' Rintsch [EMAIL PROTECTED] wrote: You mean literally!? Then of course I get A\xcc\x88 because that's what I entered. In string literals in source code the backslash has a special meaning but `raw_input()`

Re: raw_input() and utf-8 formatted chars

2007-10-13 Thread MRAB
On Oct 13, 3:09 am, 7stud [EMAIL PROTECTED] wrote: On Oct 12, 2:43 pm, Marc 'BlackJack' Rintsch [EMAIL PROTECTED] wrote: You mean literally!? Then of course I get A\xcc\x88 because that's what I entered. In string literals in source code the backslash has a special meaning but

Re: raw_input() and utf-8 formatted chars

2007-10-12 Thread kyosohma
On Oct 12, 1:53 pm, 7stud [EMAIL PROTECTED] wrote: s = 'A\xcc\x88' #capital A with umlaut print s #displays capital A with umlaut s = raw_input('Enter: ') #A\xcc\x88 print s #displays A\xcc\x88 print len(s) #9 It looks like every character of

raw_input() and utf-8 formatted chars

2007-10-12 Thread 7stud
s = 'A\xcc\x88' #capital A with umlaut print s #displays capital A with umlaut s = raw_input('Enter: ') #A\xcc\x88 print s #displays A\xcc\x88 print len(s) #9 It looks like every character of the string I enter in utf-8 is being interpreted

Re: raw_input() and utf-8 formatted chars

2007-10-12 Thread 7stud
On Oct 12, 1:18 pm, [EMAIL PROTECTED] wrote: On Oct 12, 1:53 pm, 7stud [EMAIL PROTECTED] wrote: s = 'A\xcc\x88' #capital A with umlaut print s #displays capital A with umlaut s = raw_input('Enter: ') #A\xcc\x88 print s #displays A\xcc\x88 print

Re: raw_input() and utf-8 formatted chars

2007-10-12 Thread Marc 'BlackJack' Rintsch
On Fri, 12 Oct 2007 13:18:35 -0700, 7stud wrote: On Oct 12, 1:18 pm, [EMAIL PROTECTED] wrote: On Oct 12, 1:53 pm, 7stud [EMAIL PROTECTED] wrote: s = 'A\xcc\x88' #capital A with umlaut print s #displays capital A with umlaut s = raw_input('Enter: ') #A\xcc\x88 print s

Re: raw_input() and utf-8 formatted chars

2007-10-12 Thread 7stud
On Oct 12, 2:43 pm, Marc 'BlackJack' Rintsch [EMAIL PROTECTED] wrote: You mean literally!? Then of course I get A\xcc\x88 because that's what I entered. In string literals in source code the backslash has a special meaning but `raw_input()` does not interpret the input in any way. Then why
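
The distinction drawn in this exchange, that escapes are processed in source-code literals but not in typed input, can be illustrated without an interactive prompt. In this Python 3 sketch a raw bytes literal stands in for what `raw_input()` would have returned; that substitution is an assumption made so the example runs unattended.

```python
# In source code the compiler interprets the escapes: \xcc and \x88 each
# become a single byte, so this literal holds 3 bytes.
literal = b"A\xcc\x88"

# Typed input is never run through the compiler. A raw literal keeps every
# backslash, x, and hex digit as its own byte, which is 9 bytes in total.
typed = rb"A\xcc\x88"

literal_len = len(literal)
typed_len = len(typed)

# Only the 3-byte version is the UTF-8 encoding of 'A' plus a combining
# diaeresis; the typed version decodes to the escape text itself.
decoded_literal = literal.decode("utf-8")
decoded_typed = typed.decode("utf-8")

print(literal_len, typed_len)
```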