Please don't reply to messages about other issues - it makes them hard to find.
On 20/02/13 07:16, Stefan Scheffler wrote:
> Hello,
> I face the problem that I have large N-Triples files which contain
> corrupt triples. They should be imported into a TDB database, but the
> importer always aborts because of an invalid IRI. I suspect the best way
> to handle this would be a "pre-validation" and excluding the invalid
> triples. Is there a script which can do this, or maybe a simple
> mechanism in the Jena API?

Yes, prevalidation is the way to go.

> The invalid triples look like this and should be excluded:
> <http://res_id.de> <http://prop.de> <http://a.de <c,d>>.

I use perl to fix up files. You need to decide what to do - %-encode, reject, or whatever.

> The exception which aborts the import algorithm is:
> Exception in thread "main" org.openjena.riot.RiotException: [line:
> 8766228, col: 89] Broken IRI (bad character: '<'):

You can run the parser separately (in checking mode) with

riot --validate NTFILE.nt

Andy
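
As a rough illustration of the "reject" option, here is a minimal pre-filter sketch in Python. It is not Jena's own check: the regular expressions only approximate the N-Triples grammar (the blank node pattern in particular is simplified), and the function names `is_plausible_triple` and `split_ntriples` are made up for this example. Anything it lets through should still be run through riot --validate.

```python
import re

# Approximation of the N-Triples terms; NOT a full validator.
# IRIs may not contain control chars, space, or any of < > " { } | ^ ` \
# except as \uXXXX / \UXXXXXXXX escapes.
_UCHAR = r'\\u[0-9A-Fa-f]{4}|\\U[0-9A-Fa-f]{8}'
_IRI = r'<(?:[^\x00-\x20<>"{}|^`\\]|' + _UCHAR + r')*>'
_BNODE = r'_:[A-Za-z0-9_-]+'  # simplified blank node label (assumption)
_LITERAL = (r'"(?:[^"\\\n\r]|\\.)*"'
            r'(?:\^\^' + _IRI + r'|@[A-Za-z]+(?:-[A-Za-z0-9]+)*)?')

_TRIPLE = re.compile(
    r'\s*(?:' + _IRI + '|' + _BNODE + r')'                    # subject
    r'\s*' + _IRI +                                           # predicate
    r'\s*(?:' + _IRI + '|' + _BNODE + '|' + _LITERAL + r')'   # object
    r'\s*\.\s*(?:#.*)?'                                       # dot, comment
)


def is_plausible_triple(line):
    """Return True if the line looks like a well-formed triple,
    a blank line, or a comment line."""
    text = line.rstrip('\r\n')
    stripped = text.strip()
    if not stripped or stripped.startswith('#'):
        return True
    return _TRIPLE.fullmatch(text) is not None


def split_ntriples(lines):
    """Split lines into (good, bad); bad holds (line_number, line) pairs,
    with line numbers starting at 1."""
    good, bad = [], []
    for number, line in enumerate(lines, start=1):
        if is_plausible_triple(line):
            good.append(line)
        else:
            bad.append((number, line))
    return good, bad
```

Passing an open file object to `split_ntriples` works, since it only iterates over lines. For very large files it is probably better to write the good and bad lines out as you go rather than collect them in lists.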
