Fabian López wrote:
Thanks Mark, the code is like this. The attrib name is the problem:
from lxml import etree
context = etree.iterparse('file.xml')
for action, elem in context:
    if elem.tag == 'weblog':
        print action, elem.tag, elem.attrib['name'], elem.attrib['url'],
The problem is
On 10/23/07, Fabian López wrote:
Hi,
I am parsing an XML file that includes Chinese characters, like
^uu啖啖才是w.扉L锍才是 or ヘアアイロン... The problem is that I get an error like:
UnicodeEncodeError: 'charmap' codec can't encode characters in position
The thing is that I
On 10/23/07, Stefan Behnel wrote:
Fabian López wrote:
Thanks Mark, the code is like this. The attrib name is the problem:
from lxml import etree
context = etree.iterparse('file.xml')
for action, elem in context:
    if elem.tag == 'weblog':
        print action,
Thanks, I have tried everything you suggested. The error was in the print
statement, so I decided to catch the UnicodeEncodeError that is raised whenever
the text contains Chinese/Japanese characters, since those don't interest me,
and it worked.
Ryan's strip_asian function didn't work well here, but
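
For reference, the catch-and-skip approach described above could be sketched roughly as follows. This is only an illustration, not Fabian's actual code: the helper names can_encode and printable_names, the sample values, and the assumption that the console codec is cp1252 (one of the 'charmap' codecs) are all hypothetical.

```python
# -*- coding: utf-8 -*-
# Sketch only: helper names, sample data and the cp1252 codec are assumptions.


def can_encode(text, encoding):
    """Return True if `text` can be written using `encoding`."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        # Same exception that print raises on a console using this codec.
        return False
    return True


def printable_names(names, encoding="cp1252"):
    """Keep only the names that the target codec can represent."""
    return [name for name in names if can_encode(name, encoding)]


# Hypothetical attribute values: one plain ASCII, one Japanese (katakana).
names = [u"Example Blog", u"\u30d8\u30a2\u30a2\u30a4\u30ed\u30f3"]
kept = printable_names(names)
print(kept)
```

Checking against the codec directly, rather than wrapping print itself, keeps the test independent of whatever encoding the current terminal happens to use.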
Hi,
I am parsing an XML file that includes Chinese characters, like
^uu啖啖才是w.扉L锍才是 or ヘアアイロン... The problem is that I get an error like:
UnicodeEncodeError: 'charmap' codec can't encode characters in position
The thing is that I would like to ignore it and parse all the characters
On Mon, 22 Oct 2007 21:24:40 +0200, Fabian López wrote:
I am parsing an XML file that includes Chinese characters, like
^uu啖啖才是w.扉L锍才是 or ヘアアイロン... The problem is that I get an error like:
UnicodeEncodeError: 'charmap' codec can't encode characters in
position..
You say you are *parsing*
Thanks Mark, the code is like this. The attrib name is the problem:
from lxml import etree
context = etree.iterparse('file.xml')
for action, elem in context:
    if elem.tag == 'weblog':
        print action, elem.tag, elem.attrib['name'], elem.attrib['url'], elem.attrib['rssUrl']
And the XML file looks like:
On Behalf Of Fabian Lopez
like ^uu啖啖才是w.扉L锍才是 or ヘアアイロン... The problem is that
I get
Just thought I'd point out here that the second string is Japanese, not
Chinese.
From your second post, it appears that you've parsed the text without
problems -- it's when you go to print them out
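
A minimal sketch of that point, assuming the console uses cp1252 (the helper name for_console and the sample string are made up for illustration): instead of skipping such text, it can be encoded with the "replace" error handler, so that characters the codec cannot represent come out as question marks and print no longer fails.

```python
# -*- coding: utf-8 -*-
# Sketch only: for_console, the sample string and cp1252 are assumptions.


def for_console(text, encoding="cp1252"):
    """Round-trip `text` through the console codec.

    Characters the codec cannot represent are replaced with '?'
    instead of raising UnicodeEncodeError.
    """
    return text.encode(encoding, "replace").decode(encoding)


# 'e' with acute accent exists in cp1252; the two katakana characters do not.
sample = u"caf\u00e9 \u30d8\u30a2"
shown = for_console(sample)
print(shown)
```

The parsed data itself stays untouched; only the copy sent to the terminal is degraded.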