Re: a simple unicode question

Mark Tolonen Tue, 20 Oct 2009 23:18:47 -0700

"George Trojan" <[email protected]> wrote in message news:[email protected]...

Thanks for all suggestions. It took me a while to find out how to
configure my keyboard to be able to type the degree sign. I prefer to
stick with pure ASCII if possible.
Where are the literals (i.e. u'\N{DEGREE SIGN}') defined? I found
http://www.unicode.org/Public/5.1.0/ucd/UnicodeData.txt
Is that the place to look?


George

Scott David Daniels wrote:

Mark Tolonen wrote:

Is there a better way of getting the degrees?
It seems your string is UTF-8. \xc2\xb0 is UTF-8 for DEGREE SIGN. If you type non-ASCII characters in source code, make sure to declare the encoding the file is *actually* saved in:
# coding: utf-8

s = '''48° 13' 16.80" N'''
q = s.decode('utf-8')

# next line equivalent to previous two
q = u'''48° 13' 16.80" N'''

# couple ways to find the degrees
print int(q[:q.find(u'°')])
import re
print re.search(ur'(\d+)°',q).group(1)
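
The snippet above is Python 2 (print statement, ur'' literals, str.decode). As a rough sketch only, and not part of the original post, the same two approaches might look like this in Python 3, where str is already Unicode and source files default to UTF-8. The variable names are illustrative.

```python
import re

# In Python 3 a plain string literal is Unicode, so no decode() or u'' prefix
# is needed.
q = '''48° 13' 16.80" N'''

# Approach 1: slice everything before the degree sign and convert it.
degrees_by_find = int(q[:q.find('°')])

# Approach 2: capture the digits that immediately precede the degree sign.
degrees_by_regex = int(re.search(r'(\d+)°', q).group(1))

print(degrees_by_find, degrees_by_regex)
```

Note that q.find() returns -1 when the degree sign is missing, which would silently drop the last character instead of failing, so the regex form is the easier one to guard with a None check.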


Mark is right about the source, but you needn't write unicode source
to process unicode data.  Since nobody else mentioned my favorite way
of writing unicode in ASCII, try:

IDLE 2.6.3
 >>> s = '''48\xc2\xb0 13' 16.80" N'''
 >>> q = s.decode('utf-8')
 >>> degrees, rest = q.split(u'\N{DEGREE SIGN}')
 >>> print degrees
48
 >>> print rest
 13' 16.80" N

And if you are unsure of the name to use:
 >>> import unicodedata
 >>> unicodedata.name(u'\xb0')
'DEGREE SIGN'
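
On the question of where the names come from: the names accepted by \N{...} are the character names of the Unicode Character Database, and UnicodeData.txt is the semicolon-separated file that lists them (code point in the first field, name in the second). The unicodedata module also works in the other direction, from name to character. A small sketch in Python 3 syntax, added here for illustration:

```python
import unicodedata

# Name -> character: the same names that \N{...} accepts.
ch = unicodedata.lookup('DEGREE SIGN')

# Character -> name, as in the session above.
name = unicodedata.name(ch)

# The code point, in the U+XXXX form used by UnicodeData.txt.
code_point = 'U+%04X' % ord(ch)

print(name, code_point)
```

unicodedata.lookup() raises KeyError for a name it does not know, which makes it a quick way to check a name before putting it in a \N{...} escape.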


It wouldn't be your favorite way if you were typing Chinese:

x = u'我是美国人。'

vs.

x = u'\N{CJK UNIFIED IDEOGRAPH-6211}\N{CJK UNIFIED IDEOGRAPH-662F}\N{CJK UNIFIED IDEOGRAPH-7F8E}\N{CJK UNIFIED IDEOGRAPH-56FD}\N{CJK UNIFIED IDEOGRAPH-4EBA}\N{IDEOGRAPHIC FULL STOP}'
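
Nobody would type that by hand, but if ASCII-only source is a requirement the escapes can be generated. The following is a minimal sketch in Python 3 syntax; to_named_escapes is a hypothetical helper written for this illustration, not something from the standard library.

```python
import unicodedata

def to_named_escapes(text):
    """Spell each character of text as a \\N{NAME} escape (ASCII only).

    Hypothetical helper for illustration. unicodedata.name() raises
    ValueError for characters that have no name, such as control characters.
    """
    return ''.join('\\N{%s}' % unicodedata.name(ch) for ch in text)

# The sentence from the post, written with \u escapes to stay ASCII here.
x = '\u6211\u662f\u7f8e\u56fd\u4eba\u3002'

named = to_named_escapes(x)

# The built-in unicode_escape codec gives the shorter \uXXXX form.
short = x.encode('unicode_escape').decode('ascii')

print(named)
print(short)
```

Either output can be pasted between quotes in a source file. The \uXXXX form is far more compact for CJK text, while the named form is more readable for the occasional symbol such as the degree sign.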


;^) Mark





--
http://mail.python.org/mailman/listinfo/python-list
