"Wijaya Edward" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
>
> Hi all,
>
> I was trying to split a string that
> represent chinese characters below:
>
>
>>>> str = '\xc5\xeb\xc7\xd5\xbc'
>>>> print str2,
> ???
>>>> fields2 = split(r'\\',str)
>>>> print fields2,
> ['\xc5\xeb\xc7\xd5\xbc']
>
> But why the split function here doesn't seem
> to do the job for obtaining the desired result:
>
> ['\xc5','\xeb','\xc7','\xd5','\xbc']
>

There are no backslash characters in the string str, so split finds nothing 
to split on.  I know it looks like there are, but the backslashes shown are 
part of the \x escape sequence for defining characters when you can't or 
don't want to use plain ASCII characters (such as in your example in which 
the characters are all in the range 0x80 to 0xff).  Look at this example:

>>> s = "\x40"
>>> print s
@

I defined s using the escaped \x notation, but s does not contain any 
backslashes; it contains the single character '@', whose ordinal value is 
64 decimal, or 0x40 in hex.
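
If what you want is the list of individual characters, you probably don't need split at all. Here is a rough sketch, not the only way to do it: the names data, has_backslash and chars are just ones I made up for illustration, and I have assumed a bytes literal plus slicing so that it behaves the same on Python 2 and Python 3.

```python
# The five bytes from the original post; each \xNN escape is ONE byte.
data = b'\xc5\xeb\xc7\xd5\xbc'

# No backslash is actually stored in the data.
has_backslash = b'\\' in data

# Slice one byte at a time to get a list of single-byte strings.
chars = [data[i:i + 1] for i in range(len(data))]

print(len(data))
print(has_backslash)
print(chars)
```

On Python 2, list(data) gives the same list more directly; on Python 3, list(data) would give integers instead, which is why the sketch uses slicing.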

Also, str is not the best name for a string variable, since this masks the 
built-in str type.
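
To show what the masking looks like in practice, here is a small sketch. The function shadowing_demo is hypothetical and exists only for this illustration; it rebinds the name str locally and then tries to use it as the built-in type.

```python
def shadowing_demo():
    # Hypothetical demo: rebinding the name 'str' hides the built-in
    # type for the rest of this function.
    str = 'abc'
    try:
        # This is no longer the built-in; it is the string 'abc',
        # and a string cannot be called.
        str(42)
    except TypeError as exc:
        return type(exc).__name__
    return None

result = shadowing_demo()
print(result)
```

Outside the function the built-in is untouched, but at the interactive prompt an assignment like the one in your session masks it until you del the name or restart.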

-- Paul 


-- 
http://mail.python.org/mailman/listinfo/python-list
