>>>>> akhil1988 <akhilan...@gmail.com> (a) wrote:

>a> Chris,

>a> Using 

>a> print (u'line: %s' % line).encode('utf-8')

>a> the 'line' gets printed, but actually this print statement I was using just
>a> for testing, actually my code operates on 'line', on which I use line =
>a> line.decode('utf-8') as 'line' is read as bytes from a stream.

>a> And if I use line = line.encode('utf-8'), 

>a> I start getting other error like
>a> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 4561:
>a> ordinal not in range(128)
>a> at line = line.replace('<<', u'«').replace('>>', u'»')

You do a Unicode replace here, so line should be a unicode string.
Therefore you have to do this before the line.encode('utf-8'), but after
the decode('utf-8'). 
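
To illustrate where that error comes from: in Python 2, mixing a byte string with a unicode argument makes the interpreter decode the byte string implicitly with the default 'ascii' codec. The sketch below does that decode explicitly, so the failure is visible. The sample bytes in 'raw' are made up for illustration and are not taken from your data.

```python
# Hypothetical sample: UTF-8 encoded bytes containing a non-ASCII character.
raw = b'caf\xc3\xa9 <<bonjour>>'

try:
    # Python 2 performs this decode implicitly when a byte string
    # is combined with a unicode string, e.g. in str.replace().
    raw.decode('ascii')
    failed = False
    message = ''
except UnicodeDecodeError as exc:
    failed = True
    message = str(exc)

print(failed)
print(message)
```

Decoding explicitly with 'utf-8' first avoids the implicit 'ascii' step altogether.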

It might be better to use different variables for Unicode strings and
byte strings to prevent confusion, like:

# 'line' is read as bytes from a stream
uline = line.decode('utf-8')
uline = uline.replace('<<', u'«').replace('>>', u'»')
line = uline.encode('utf-8')
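
The same steps could be wrapped in a small function, roughly like this. It is only a sketch: the name replace_guillemets and the sample input are my own inventions, and it assumes the stream really delivers UTF-8 encoded bytes.

```python
# -*- coding: utf-8 -*-

def replace_guillemets(raw_line):
    """Hypothetical helper: raw_line is a UTF-8 encoded byte string."""
    uline = raw_line.decode('utf-8')      # bytes -> Unicode
    # Do all text manipulation on the Unicode string.
    uline = uline.replace(u'<<', u'\xab').replace(u'>>', u'\xbb')
    return uline.encode('utf-8')          # Unicode -> bytes

# Made-up sample input, standing in for a line read from the stream.
raw = b'caf\xc3\xa9 <<bonjour>>'
result = replace_guillemets(raw)
print(repr(result))
```

The point of the design is that decoding happens once on the way in, encoding once on the way out, and everything in between works on Unicode only.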
-- 
Piet van Oostrum <p...@cs.uu.nl>
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: p...@vanoostrum.org