Nick Lothian created JENA-806:
---------------------------------
Summary: illegal escape sequence value exception on legal characters
Key: JENA-806
URL: https://issues.apache.org/jira/browse/JENA-806
Project: Apache Jena
Issue Type: Bug
Components: Cmd line tools
Affects Versions: Jena 2.12.1
Environment: Ubuntu 14.04, Java 8
Reporter: Nick Lothian
When loading the Wikidata data dump using tdbloader2, I received the following
error:
{{ERROR [line: 142128, col: 121] illegal escape sequence value: " (0x22)
org.apache.jena.riot.RiotException: [line: 142128, col: 121] illegal escape sequence value: " (0x22)
at org.apache.jena.riot.system.ErrorHandlerFactory$ErrorHandlerStd.fatal(ErrorHandlerFactory.java:136)
at org.apache.jena.riot.lang.LangEngine.raiseException(LangEngine.java:163)
at org.apache.jena.riot.lang.LangEngine.nextToken(LangEngine.java:106)
at org.apache.jena.riot.lang.LangNTriples.parseOne(LangNTriples.java:67)
at org.apache.jena.riot.lang.LangNTriples.runParser(LangNTriples.java:54)
at org.apache.jena.riot.lang.LangBase.parse(LangBase.java:42)
at org.apache.jena.riot.RiotReader.parse(RiotReader.java:119)
at org.apache.jena.riot.RiotReader.parse(RiotReader.java:96)
at org.apache.jena.riot.RiotReader.parse(RiotReader.java:69)
at com.hp.hpl.jena.tdb.store.bulkloader2.CmdNodeTableBuilder.exec(CmdNodeTableBuilder.java:162)
at arq.cmdline.CmdMain.mainMethod(CmdMain.java:102)
at arq.cmdline.CmdMain.mainRun(CmdMain.java:63)
at arq.cmdline.CmdMain.mainRun(CmdMain.java:50)
at com.hp.hpl.jena.tdb.store.bulkloader2.CmdNodeTableBuilder.main(CmdNodeTableBuilder.java:80)
}}
Looking at that line:
{{sed '142128!d' uncompressed/wikidata-simple-statements.nt}}
{{<http://www.wikidata.org/entity/Q16873> <http://www.wikidata.org/entity/P18c> <http://commons.wikimedia.org/wiki/File:\"Retrat_de_l'escriptor_Juan_Carlos_Onetti_(1909-1994)\".png> .}}
Column 121 is the "R" after the ".
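A quick way to confirm which characters sit around column 121 is sketched below. It assumes the reported column is 1-based and counts characters, which is an assumption about the error message rather than something checked in RIOT. `char_at` is a helper made up for this sketch, and the string is the triple as printed by sed, on a single line.

```python
# The triple from line 142128 as a single line (raw strings keep the backslashes).
line = (
    r"<http://www.wikidata.org/entity/Q16873> "
    r"<http://www.wikidata.org/entity/P18c> "
    r'''<http://commons.wikimedia.org/wiki/File:\"Retrat_de_l'escriptor_Juan_Carlos_Onetti_(1909-1994)\".png> .'''
)


def char_at(text, col):
    """Return the character at a 1-based column (assumed column convention)."""
    return text[col - 1]


# Print the characters at and just before the reported column.
for col in (119, 120, 121):
    print(col, repr(char_at(line, col)))
```

Under that assumption, columns 119 and 120 hold the backslash and the double quote, and column 121 is the "R", so the error position is just past the \" sequence.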
Looking at http://www.w3.org/TR/n-triples/#n-triples-grammar, it appears that
the " character is allowed.
Should tdbloader2 load this or am I missing something?
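To double-check my reading of the grammar, here is a rough sketch that tests the object IRI against the IRIREF production. The regular expressions are my own transcription of IRIREF and UCHAR from the N-Triples grammar, and `first_illegal_offset` is a hypothetical helper; none of this is Jena's tokenizer.

```python
import re

# Transcribed from the N-Triples grammar (my transcription, not Jena code):
#   IRIREF ::= '<' ([^#x00-#x20<>"{}|^`\] | UCHAR)* '>'
#   UCHAR  ::= '\u' HEX HEX HEX HEX | '\U' HEX HEX HEX HEX HEX HEX HEX HEX
UCHAR = r"\\u[0-9A-Fa-f]{4}|\\U[0-9A-Fa-f]{8}"
BODY = re.compile(r'(?:[^\x00-\x20<>"{}|^`\\]|' + UCHAR + r")*")
IRIREF = re.compile(r"<" + BODY.pattern + r">")


def first_illegal_offset(iri):
    """Return the 0-based offset where IRIREF matching stops, or None if it matches."""
    if IRIREF.fullmatch(iri):
        return None
    if not iri.startswith("<"):
        return 0
    # Consume as much of the body as the production allows after '<'.
    return BODY.match(iri, 1).end()


obj = r'''<http://commons.wikimedia.org/wiki/File:\"Retrat_de_l'escriptor_Juan_Carlos_Onetti_(1909-1994)\".png>'''
print(first_illegal_offset(obj))
print(first_illegal_offset(r"<http://example.org/File:\u0022x\u0022.png>"))
```

With this transcription the object IRI does not match: matching stops at the backslash right after "File:", because inside an IRIREF a backslash may only start a UCHAR (\uXXXX or \UXXXXXXXX). The \" escape appears under ECHAR, which the grammar uses for string literals, so this may be what the parser is objecting to.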