On 22/10/13 11:50, Philippe Genoud wrote:
I'm having trouble reading RDF data from Freebase.

The following program

public class ExampleFreeBase {

     static public void main(String... argv) {
         try {
             Model fbModel = ModelFactory.createDefaultModel();
             fbModel.read("http://rdf.freebase.com/rdf/en/en/phillip_glasser", "TURTLE");
             fbModel.write(System.out, "TURTLE");
         } catch (Exception e) {
             e.printStackTrace();
         }
     }
}

fails to execute. The exception is

org.apache.jena.riot.RiotException: [line: 12, col: 179] illegal escape sequence value: x (0x78)
     at org.apache.jena.riot.system.ErrorHandlerFactory$ErrorHandlerStd.fatal(ErrorHandlerFactory.java:136)
     at org.apache.jena.riot.lang.LangEngine.raiseException(LangEngine.java:163)
     at org.apache.jena.riot.lang.LangEngine.nextToken(LangEngine.java:106)
....
     at org.apache.jena.riot.RDFDataMgr.read(RDFDataMgr.java:113)
     at org.apache.jena.riot.adapters.RDFReaderRIOT.read(RDFReaderRIOT.java:77)
     at com.hp.hpl.jena.rdf.model.impl.ModelCom.read(ModelCom.java:247)
     at test.ExampleFreeBase.main(ExampleFreeBase.java:11)

Line 12 of the RDF data is
     ns:common.topic.description    "Phillip Glasser \u2013 ameryka\u0144ski .... film\xf3w animowany  ....""@pl;

So while there is no problem with the Unicode-escaped characters (for
example \u2013 here), RIOT does not seem to support characters escaped in
hexadecimal (here \xf3).

\xf3 is not a legal Turtle escape sequence. The Unicode escapes are \uXXXX and \UXXXXXXXX; the only other escapes are a few single-character ones such as \" and \n.

RIOT does not accept it.
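
To make the rule concrete, here is a rough Java sketch that scans the body of a string literal for the first backslash escape Turtle does not allow. It is only an illustration: the class and method names are made up, it is not how RIOT's tokenizer works, and it assumes the legal set is \uXXXX, \UXXXXXXXX and the single-character escapes \t \b \n \r \f \" \' \\.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Jena or RIOT.
class TurtleEscapeCheck {
    // One legal escape: backslash-u plus four hex digits, backslash-U plus
    // eight hex digits, or a backslash followed by one of t b n r f " ' or
    // another backslash.
    private static final Pattern LEGAL_ESCAPE = Pattern.compile(
        "\\\\(?:u[0-9A-Fa-f]{4}|U[0-9A-Fa-f]{8}|[tbnrf\"'\\\\])");

    /**
     * Returns the index of the first illegal backslash escape in the body of
     * a string literal, or -1 if every escape is legal.
     */
    static int firstIllegalEscape(String literalBody) {
        Matcher m = LEGAL_ESCAPE.matcher(literalBody);
        int i = 0;
        while (i < literalBody.length()) {
            if (literalBody.charAt(i) != '\\') {
                i++;
                continue;
            }
            // Try to match a legal escape starting exactly at position i.
            m.region(i, literalBody.length());
            if (!m.lookingAt()) {
                return i; // backslash that does not start a legal escape
            }
            i = m.end(); // skip the whole escape
        }
        return -1;
    }
}
```

Run over the description text above, this should stop at the backslash in front of xf3, while the \u2013 and \u0144 escapes pass.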

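Not sure where \xf3 comes from -- it looks like a hex byte escape left over from some other language. 0xF3 is 'ó' in Latin-1 (U+00F3), which fits the Polish word "filmów", so in Turtle it should be written \u00F3 or as the character itself.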


Any suggestions for fixing this problem?

You need to fix the data - I have found it is usually necessary to fix up Freebase data to make it legal Turtle. I used perl to fix the text.
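
If you would rather stay in Java, the same kind of text fix-up could look roughly like the sketch below. This is not the perl script mentioned above; the class and method names are made up, and it assumes that each \xHH is meant as the Latin-1 code point U+00HH (so \xf3 becomes \u00F3). Check that assumption against your data before relying on it. Other escapes, including an escaped backslash in front of "xf3", are left untouched.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical pre-processing filter; not part of Jena or RIOT.
class FreebaseEscapeFix {
    // Matches every backslash escape so that pairs are consumed in order.
    // Group 1 is set only for backslash-x followed by two hex digits.
    private static final Pattern ESCAPE =
        Pattern.compile("\\\\(?:x([0-9A-Fa-f]{2})|.)");

    /** Rewrites backslash-x hex escapes as four-digit Unicode escapes. */
    static String fixHexEscapes(String line) {
        Matcher m = ESCAPE.matcher(line);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String hex = m.group(1);
            // ASSUMPTION: the two hex digits are a Latin-1 code point.
            String replacement =
                (hex != null) ? "\\u00" + hex.toUpperCase() : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Filters standard input to standard output, one line at a time.
    public static void main(String[] args) throws IOException {
        BufferedReader in = new BufferedReader(
            new InputStreamReader(System.in, StandardCharsets.UTF_8));
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        String line;
        while ((line = in.readLine()) != null) {
            out.println(fixHexEscapes(line));
        }
    }
}
```

The idea is to save the Freebase response to a file, pass it through the filter, and read the cleaned file into the model instead of reading the URL directly.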

The Freebase releases don't seem to check the Turtle output for strict legality, maybe because they rely on the conversion process. They appreciate feedback and will make corrections for their next revision.

        Andy


Thanks

Philippe

