This problem can be worked around by passing errors='ignore' to
codecs.open, which skips any bytes that are not valid UTF-8. It may be
that some other codec would decode the characters coming from the Excel
spreadsheet correctly, but I could not find the correct one.
import codecs
store.load(codecs.open("Test.rdf",'r','utf8',errors='ignore'))
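
A note on what errors='ignore' does to the data. The sketch below is for
Python 3 and uses a made-up byte string, not the real spreadsheet export.
It assumes the bytes are Windows-1252 (cp1252), where 0xA0 is a no-break
space and 0x92 is a right single quotation mark. That guess fits the two
hex values mentioned in the original question, but it has not been checked
against the actual file.

```python
# Hypothetical sample standing in for text exported from Excel.
# Assumption: the bytes are Windows-1252, so 0xA0 is a no-break space
# and 0x92 is a right single quotation mark.
raw = b"Tool\xa0A: the user\x92s guide"

# errors="ignore" silently drops every byte that is not valid UTF-8,
# so the parse succeeds but the two characters are lost.
ignored = raw.decode("utf-8", errors="ignore")

# Decoding with the codec the bytes were (assumed to be) written in
# keeps both characters.
kept = raw.decode("cp1252")

print(repr(ignored))
print(repr(kept))
```

If the cp1252 guess is right for the real file, the same codec name can be
passed to codecs.open in place of 'utf8', and errors='ignore' is then no
longer needed.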
Dave J
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Tuesday, October 16, 2007 9:00 AM
To: [email protected]
Subject: Dev Digest, Vol 19, Issue 4
Today's Topics:
1. SAX invalid token error when parsing property values outside
ASCII range. (Jones, David H)
----------------------------------------------------------------------
Message: 1
Date: Mon, 15 Oct 2007 13:10:09 -0700
From: "Jones, David H" <[EMAIL PROTECTED]>
Subject: [rdflib-dev] SAX invalid token error when parsing property
values outside ASCII range.
To: <[email protected]>
Message-ID:
<[EMAIL PROTECTED]>
Content-Type: text/plain; charset="us-ascii"
I am attempting to process RDF that has characters outside the ASCII
range, and am getting a SAXParseException: not well-formed (invalid
token).
Call:
store = ConjunctiveGraph()
store.load("ToolsTestA0Removed.rdf")
I thought this might be corrected by adding the encoding to the top of
the file:
<?xml version='1.0' encoding='UTF-8'?>
But this did not correct the problem.
Is there a parsing option that I've missed, or some other error I'm
making? Will utf-8 encoding work for characters like hex A0 or hex 92?
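
On the UTF-8 question: a lone 0xA0 or 0x92 byte is not valid UTF-8, and
the XML declaration only labels the bytes without changing them. The
Python 3 sketch below reproduces the error with a made-up document and the
standard library SAX parser (rdflib is not involved). The element name,
sample text, and the TextCollector and parse_text helpers are all
hypothetical, and cp1252 as the real source encoding is an assumption.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class TextCollector(ContentHandler):
    """Hypothetical helper: collects character data from the document."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_text(xml_bytes):
    """Parse XML bytes with the stdlib SAX parser and return the text."""
    handler = TextCollector()
    xml.sax.parseString(xml_bytes, handler)
    return "".join(handler.chunks)


declaration = b"<?xml version='1.0' encoding='UTF-8'?>"
# Made-up content containing raw 0xA0 and 0x92 bytes (assumed cp1252).
body = b"<label>Tool\xa0A: the user\x92s guide</label>"

# Labelling the bytes as UTF-8 does not make them valid UTF-8.
try:
    parse_text(declaration + body)
    utf8_label_failed = False
except xml.sax.SAXParseException:
    utf8_label_failed = True

# Re-encoding the bytes as real UTF-8 lets the same declaration parse.
fixed = declaration + body.decode("cp1252").encode("utf-8")
text = parse_text(fixed)

print(utf8_label_failed)
print(repr(text))
```

So UTF-8 can carry those characters, but only once the file is actually
written as UTF-8. Changing the declaration alone is not enough.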
Thanks in advance for help
Dave J
Trace:
Traceback (most recent call last):
  File "C:\nbo\rdf2Forms.py", line 18, in <module>
    store.load("endpoint/ToolsTestA0Removed.rdf") # Saved by makeTriples.py.
  File "build\bdist.win32\egg\rdflib\Graph.py", line 665, in load
    self.parse(source, publicID, format)
  File "build\bdist.win32\egg\rdflib\Graph.py", line 828, in parse
    context.parse(source, publicID=publicID, format=format, **args)
  File "build\bdist.win32\egg\rdflib\Graph.py", line 661, in parse
    parser.parse(source, self, **args)
  File "build\bdist.win32\egg\rdflib\syntax\parsers\RDFXMLParser.py", line 37, in parse
    self._parser.parse(source)
  File "c:\python25\lib\xml\sax\expatreader.py", line 107, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "c:\python25\lib\xml\sax\xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "c:\python25\lib\xml\sax\expatreader.py", line 211, in feed
    self._err_handler.fatalError(exc)
  File "c:\python25\lib\xml\sax\handler.py", line 38, in fatalError
    raise exception
SAXParseException: file:///C|/ToolsTestA0Removed.rdf:373:684: not well-formed (invalid token)
------------------------------
_______________________________________________
Dev mailing list
[email protected]
http://rdflib.net/mailman/listinfo/dev
End of Dev Digest, Vol 19, Issue 4
**********************************