Hello,
I have large N-Triples files that contain corrupt triples. They should be imported into a TDB database, but the importer always aborts because of an invalid IRI. I suspect the best way to handle this would be a "pre-validation" step that excludes the invalid triples. Is there a script that can do this, or maybe a simple mechanism in the Jena API?

The invalid triples, which should be excluded, look like this:
<http://res_id.de> <http://prop.de> <http://a.de <c,d>>.

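To make clear what I mean by "pre-validation", here is a rough sketch in plain Java. It does not use the Jena API and is not a full N-Triples parser: it only checks, line by line, that there are three plausible terms followed by a dot and that no IRI contains raw whitespace or characters such as '<'. The class and method names (NTriplesPreFilter, looksValid, filter) are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Jena: a line-based pre-filter for N-Triples.
class NTriplesPreFilter {

    // Characters that must not appear unescaped between '<' and '>'.
    private static final String FORBIDDEN_IN_IRI = "<\"{}|^`";

    /** Rough check: blank lines and comments pass; otherwise expect three terms and a '.'. */
    public static boolean looksValid(String line) {
        String s = line.trim();
        if (s.isEmpty() || s.startsWith("#")) return true;
        int pos = 0;
        for (int term = 0; term < 3; term++) {
            pos = readTerm(s, skipSpaces(s, pos), term);
            if (pos < 0) return false;
        }
        pos = skipSpaces(s, pos);
        if (pos >= s.length() || s.charAt(pos) != '.') return false;
        String rest = s.substring(pos + 1).trim();
        return rest.isEmpty() || rest.startsWith("#");
    }

    private static int skipSpaces(String s, int pos) {
        while (pos < s.length() && (s.charAt(pos) == ' ' || s.charAt(pos) == '\t')) pos++;
        return pos;
    }

    /** Returns the index just past the term at pos, or -1 if the term is broken. */
    private static int readTerm(String s, int pos, int term) {
        if (pos >= s.length()) return -1;
        char c = s.charAt(pos);
        if (c == '<') return readIri(s, pos);
        if (c == '_' && term != 1) {               // blank node, not allowed as predicate
            if (pos + 1 >= s.length() || s.charAt(pos + 1) != ':') return -1;
            int i = pos + 2;
            while (i < s.length() && !Character.isWhitespace(s.charAt(i))) i++;
            if (s.charAt(i - 1) == '.') i--;       // "_:b1." : leave the final dot
            return i > pos + 2 ? i : -1;
        }
        if (c == '"' && term == 2) return readLiteral(s, pos);  // literals only as object
        return -1;
    }

    private static int readIri(String s, int pos) {
        for (int i = pos + 1; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '>') return i + 1;
            if (c <= ' ' || FORBIDDEN_IN_IRI.indexOf(c) >= 0) return -1;
            if (c == '\\') {                       // only \uXXXX and \UXXXXXXXX escapes
                if (i + 1 >= s.length()) return -1;
                char n = s.charAt(i + 1);
                if (n != 'u' && n != 'U') return -1;
            }
        }
        return -1;                                 // no closing '>'
    }

    private static int readLiteral(String s, int pos) {
        int i = pos + 1;
        while (i < s.length() && s.charAt(i) != '"') {
            i += (s.charAt(i) == '\\') ? 2 : 1;    // skip over an escaped character
        }
        if (i >= s.length()) return -1;            // unterminated string
        i++;                                       // closing quote
        if (s.startsWith("^^<", i)) return readIri(s, i + 2);
        if (i < s.length() && s.charAt(i) == '@') {
            i++;
            while (i < s.length()
                    && (Character.isLetterOrDigit(s.charAt(i)) || s.charAt(i) == '-')) i++;
        }
        return i;
    }

    /** Streams 'in' to 'out', dropping suspicious lines; returns how many were dropped. */
    public static long filter(Path in, Path out) throws IOException {
        long skipped = 0;
        // Note: reading as UTF-8 throws an exception on malformed byte sequences.
        try (BufferedReader r = Files.newBufferedReader(in, StandardCharsets.UTF_8);
             BufferedWriter w = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            String line;
            while ((line = r.readLine()) != null) {
                if (looksValid(line)) { w.write(line); w.newLine(); }
                else skipped++;
            }
        }
        return skipped;
    }
}
```

With something like this, the example triple above would be dropped (its object IRI contains a space and a '<'), and the cleaned file could then be passed to the loader. It is deliberately stricter than necessary in places, so it may also drop some lines a real parser would accept.
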
The exception which aborts the import algorithm is:
Exception in thread "main" org.openjena.riot.RiotException: [line: 8766228, col: 89] Broken IRI (bad character: '<'): http://www.kirchen.net/portal/pfarre.asp?Iid=%7BADBE0FCB-F59B-4388-BAA3-8E22D450AFB6%7DPfarre
    at org.openjena.riot.ErrorHandlerFactory$ErrorHandlerStd.fatal(ErrorHandlerFactory.java:130)
    at org.openjena.riot.lang.LangEngine.raiseException(LangEngine.java:169)
    at org.openjena.riot.lang.LangEngine.nextToken(LangEngine.java:116)
    at org.openjena.riot.lang.LangNTriples.parseOne(LangNTriples.java:57)
    at org.openjena.riot.lang.LangNTriples.parseOne(LangNTriples.java:33)
    at org.openjena.riot.lang.LangNTuple.runParser(LangNTuple.java:69)
    at org.openjena.riot.lang.LangBase.parse(LangBase.java:43)
    at org.openjena.riot.RiotReader.parseTriples(RiotReader.java:97)
    at org.openjena.riot.RiotReader.parseTriples(RiotReader.java:83)
    at org.openjena.riot.RiotReader.parseTriples(RiotReader.java:56)
    at com.hp.hpl.jena.tdb.store.bulkloader.BulkLoader.loadTriples$(BulkLoader.java:139)
    at com.hp.hpl.jena.tdb.store.bulkloader.BulkLoader.loadNamedGraph(BulkLoader.java:107)
    at com.hp.hpl.jena.tdb.TDBLoader.loadNamedGraph$(TDBLoader.java:271)
    at com.hp.hpl.jena.tdb.TDBLoader.loadGraph$(TDBLoader.java:246)
    at com.hp.hpl.jena.tdb.TDBLoader.loadGraph(TDBLoader.java:177)
    at com.hp.hpl.jena.tdb.TDBLoader.load(TDBLoader.java:112)
    at tdb.tdbloader.loadNamedGraph(tdbloader.java:157)
    at tdb.tdbloader.exec(tdbloader.java:142)
    at arq.cmdline.CmdMain.mainMethod(CmdMain.java:101)
    at arq.cmdline.CmdMain.mainRun(CmdMain.java:63)
    at arq.cmdline.CmdMain.mainRun(CmdMain.java:50)
    at tdb.tdbloader.main(tdbloader.java:53)


Thanks in advance
Stefan Scheffler
