Where did the string come from? At first glance it looks like you have two bytes for each character instead of the one you expect. Is this perhaps a Unicode string (UTF-16 encoded, for instance) rather than ASCII?
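If that guess is right, decoding the data before splitting should make the \x00 bytes disappear. Here is a minimal sketch, assuming the file is UTF-16 little-endian. The sample bytes and the variable names are made up for illustration, and your file may use a different byte order or start with a byte order mark:

```python
import re

# Hypothetical stand-in for the raw contents of the HTML file,
# assumed here to be UTF-16 little-endian (two bytes per character).
raw = "<p>Trial</p>".encode("utf-16-le")

# Splitting the undecoded bytes reproduces the symptom: every
# character comes back padded with a \x00 byte.
broken_parts = re.split(b"<.*?>", raw)
print(broken_parts)

# Decode to real text first, then split; the padding is gone.
html = raw.decode("utf-16-le")
decoded_parts = re.split("<.*?>", html)
print(decoded_parts)

# Searching the decoded list now finds the string.
print(decoded_parts.index("Trial"))
```

If the file starts with a byte order mark, decoding with "utf-16" instead lets Python pick the byte order for you.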
Sent from my iPad

On 2011/11/20, at 10:28, dave selby <dave6...@gmail.com> wrote:

> Hi All,
>
> I have a long string which is an HTML file, I strip the HTML tags away
> and make a list with
>
> text = re.split('<.*?>', HTML)
>
> I then tried to search for a string with text.index(...) but it was
> not found, printing HTML to a terminal I get what I expect, a block of
> tags and text, I split the HTML and print text and I get loads of
>
> \x00T\x00r\x00i\x00a\x00 ie I get \x00 breaking up every character.
>
> Any idea what is happening and how to get back to a list of ascii strings ?
>
> Cheers
>
> Dave
>
> --
>
> Please avoid sending me Word or PowerPoint attachments.
> See http://www.gnu.org/philosophy/no-word-attachments.html
> _______________________________________________
> Tutor maillist - Tutor@python.org
> To unsubscribe or change subscription options:
> http://mail.python.org/mailman/listinfo/tutor