On Thu, 01 Nov 2007 19:21:03 -0700, 7stud wrote:
BeautifulSoup can convert an HTML entity representing an 'A' with
umlaut, e.g.:
&Auml;
into an Ä without ever touching my keyboard. How does BeautifulSoup
do it?
It maps the HTML entity names to unicode characters. Take a look at the
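A minimal sketch of that kind of name-to-character lookup, not BeautifulSoup's actual code. It assumes Python 3, where the standard-library table is html.entities.name2codepoint (in the Python 2 used in this thread the same table lives in the htmlentitydefs module). The helper name entity_to_char is made up for illustration.

```python
from html.entities import name2codepoint


def entity_to_char(entity):
    """Map a named entity such as '&Auml;' to its character (hypothetical helper)."""
    name = entity.strip('&;')           # '&Auml;' -> 'Auml'
    return chr(name2codepoint[name])    # look up the code point, then build the character


# Print the code point rather than the character itself, so the output
# does not depend on the terminal's encoding.
print(hex(ord(entity_to_char('&Auml;'))))   # 0xc4, LATIN CAPITAL LETTER A WITH DIAERESIS
```

Note that this yields the single precomposed character U+00C4, which is a different byte sequence from the 'A\xcc\x88' discussed elsewhere in the thread (an 'A' followed by a combining diaeresis).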
On Oct 13, 12:42 pm, MRAB [EMAIL PROTECTED] wrote:
You can
decode that into the actual UTF-8 string with decode('string_escape'):
s = raw_input('Enter: ') #A\xcc\x88
s = s.decode('string_escape')
Ahh. Thanks for that.
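For anyone reading this on Python 3, where the 'string_escape' codec no longer exists, here is a rough equivalent. It is only a sketch under assumptions: the typed text is plain ASCII containing \x escapes, the escaped bytes are meant as UTF-8, and the helper name unescape_typed is made up for illustration.

```python
typed = r'A\xcc\x88'   # what the prompt hands back: 9 literal characters
print(len(typed))      # 9


def unescape_typed(text):
    """Interpret backslash escapes in typed text, then decode the bytes as UTF-8 (hypothetical helper)."""
    # unicode_escape turns the four characters '\xcc' into the code point U+00CC;
    # latin-1 then maps each such code point back to the single byte of the same value.
    raw_bytes = text.encode('ascii').decode('unicode_escape').encode('latin-1')
    return raw_bytes.decode('utf-8')


decoded = unescape_typed(typed)
print(len(decoded))    # 2: 'A' followed by U+0308 COMBINING DIAERESIS
```

The length drops from 9 to 2 because the two escaped bytes \xcc\x88 together encode one code point, the combining diaeresis that a terminal renders on top of the 'A'.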
On Oct 12, 2:43 pm, Marc 'BlackJack' Rintsch [EMAIL PROTECTED] wrote:
On Fri, 12 Oct 2007 19:09:46 -0700, 7stud wrote:
On Oct 12, 2:43 pm, Marc 'BlackJack' Rintsch [EMAIL PROTECTED] wrote:
You mean literally!? Then of course I get A\xcc\x88 because that's what I
entered. In string literals in source code the backslash has a special
meaning but `raw_input()`
On Oct 13, 3:09 am, 7stud [EMAIL PROTECTED] wrote:
On Oct 12, 2:43 pm, Marc 'BlackJack' Rintsch [EMAIL PROTECTED] wrote:
You mean literally!? Then of course I get A\xcc\x88 because that's what I
entered. In string literals in source code the backslash has a special
meaning but
On Oct 12, 1:53 pm, 7stud [EMAIL PROTECTED] wrote:
s = 'A\xcc\x88' #capital A with umlaut
print s #displays capital A with umlaut
s = raw_input('Enter: ') #A\xcc\x88
print s #displays A\xcc\x88
print len(s) #9
It looks like every character of the string I enter in utf-8 is being
interpreted
On Oct 12, 1:18 pm, [EMAIL PROTECTED] wrote:
On Oct 12, 1:53 pm, 7stud [EMAIL PROTECTED] wrote:
s = 'A\xcc\x88' #capital A with umlaut
print s #displays capital A with umlaut
s = raw_input('Enter: ') #A\xcc\x88
print s #displays A\xcc\x88
print
On Fri, 12 Oct 2007 13:18:35 -0700, 7stud wrote:
On Oct 12, 1:18 pm, [EMAIL PROTECTED] wrote:
On Oct 12, 1:53 pm, 7stud [EMAIL PROTECTED] wrote:
s = 'A\xcc\x88' #capital A with umlaut
print s #displays capital A with umlaut
s = raw_input('Enter: ') #A\xcc\x88
print s
On Oct 12, 2:43 pm, Marc 'BlackJack' Rintsch [EMAIL PROTECTED] wrote:
You mean literally!? Then of course I get A\xcc\x88 because that's what I
entered. In string literals in source code the backslash has a special
meaning but `raw_input()` does not interpret the input in any way.
Then why