Hi Steve,

This code was added to the RDF/XML parser to make its behavior similar to Turtle etc. parsing.
In Turtle, this happens in the tokenizer, which is right next to the input stream, so the errors do have the line/column number.
In RDF/XML, in Jena up to 4.7.0, the checking is layered on top of the output of the RDF/XML parser, so the check happens later and the code has not passed the line and column information through.
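
To illustrate the difference, here is a minimal sketch in plain Java. It is not Jena's tokenizer or RDF/XML reader: the class IriPositionSketch, the exception PositionedError and both method names are hypothetical, and the "IRI" detection is deliberately naive (anything between '<' and '>'). It only shows why a check that runs while the characters are being read can report a position, while a check that runs on an already-extracted string can only report -1, -1.

```java
/** Illustrative only; none of these names come from Jena. */
class IriPositionSketch {

    /** Hypothetical error type carrying a position; -1 means "unknown". */
    static final class PositionedError extends RuntimeException {
        final long line;
        final long col;

        PositionedError(String message, long line, long col) {
            super(message);
            this.line = line;
            this.col = col;
        }
    }

    /**
     * Tokenizer-style check: walks the input one character at a time,
     * counting lines and columns (both 1-based), and fails at the first
     * space seen between '<' and '>'.
     */
    static void checkWhileReading(CharSequence input) {
        long line = 1;
        long col = 0;
        boolean inIri = false;
        for (int i = 0; i < input.length(); i++) {
            char ch = input.charAt(i);
            if (ch == '\n') {
                line++;
                col = 0;
                continue;
            }
            col++;
            if (ch == '<') {
                inIri = true;
            } else if (ch == '>') {
                inIri = false;
            } else if (ch == ' ' && inIri) {
                throw new PositionedError("Bad character in IRI (space)", line, col);
            }
        }
    }

    /**
     * Layered check: only the extracted URI string is available here,
     * so there is no position to report.
     */
    static void checkAfterParsing(String uriStr) {
        if (uriStr.contains(" ")) {
            throw new PositionedError("Bad character in IRI (space)", -1, -1);
        }
    }

    public static void main(String[] args) {
        String data = "<http://example/s> <http://example/p>\n  <http://example/abc def> .\n";
        try {
            checkWhileReading(data);
        } catch (PositionedError e) {
            System.out.println("while reading: line " + e.line + ", col " + e.col);
        }
        try {
            checkAfterParsing("http://example/abc def");
        } catch (PositionedError e) {
            System.out.println("after parsing: line " + e.line + ", col " + e.col);
        }
    }
}
```

With the sample data in main, the first check reports line 2, col 22 and the second reports -1 for both.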
As of Jena 4.8.0 this has been addressed. The RDF/XML parser checks earlier, and the error message includes the line and column numbers.
Example with 4.8.0:

08:07:45 ERROR riot :: [line: 5, col: 55] {W002} <http://example/abc def> Code: 17/WHITESPACE in PATH: A single whitespace character. These match no grammar rules of URIs/IRIs.
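
If all that is available is the formatted message string, one possible way to recover the two numbers for reporting to users is to parse the "[line: N, col: M]" prefix. The sketch below is plain Java and does not call any Jena API; the class name RiotMessagePosition and the method extract are hypothetical, and it assumes the prefix format shown in the example above, which could differ between versions.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative helper; not part of Jena. */
class RiotMessagePosition {

    // Assumed prefix format, taken from the example log line: "[line: 5, col: 55]".
    // Optional whitespace before the closing bracket is tolerated.
    private static final Pattern POSITION =
            Pattern.compile("\\[line: (\\d+), col: (\\d+)\\s*\\]");

    /**
     * Returns {line, col} if the message contains a position prefix,
     * or an empty Optional if it does not (for example, when the
     * parser had no position information to include).
     */
    static Optional<long[]> extract(String message) {
        if (message == null) {
            return Optional.empty();
        }
        Matcher m = POSITION.matcher(message);
        if (!m.find()) {
            return Optional.empty();
        }
        long line = Long.parseLong(m.group(1));
        long col = Long.parseLong(m.group(2));
        return Optional.of(new long[] { line, col });
    }

    public static void main(String[] args) {
        String msg = "[line: 5, col: 55] {W002} <http://example/abc def> "
                + "Code: 17/WHITESPACE in PATH";
        extract(msg).ifPresent(p ->
                System.out.println("line=" + p[0] + " col=" + p[1]));
    }
}
```

A message without the prefix, such as the one produced when line and col are -1, yields an empty Optional, so the caller can fall back to showing the message alone.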
Andy
On 27/06/2023 18:46, Steve Vestal wrote:
RDF/XML.

The call to ErrorHandlingFactory.error has -1 for both line and col, which explains why fmtMessage doesn't report them. These come from ReaderRIOTRDFXML.convert, which does this when uriStr.contains(" "). Philippe was right about a space in an IRI being the problem.

    private Node convert(AResource r) {
        if (!r.isAnonymous()) {
            // URI.
            String uriStr = r.getURI() ;
            if ( errorForSpaceInURI ) {
                // Special check for spaces in a URI.
                // Convert to an error like TokernizerText.
                if ( uriStr.contains(" ") ) {
                    int i = uriStr.indexOf(' ');
                    String s = uriStr.substring(0,i);
                    String msg = String.format("Bad character in IRI (space): <%s[space]...>", s);
                    riotErrorHandler.error(msg, -1, -1);
                    throw new RiotParseException(msg, -1, -1);
                }
            }
            return NodeFactory.createURI(uriStr);
        }
        // String id = r.getAnonymousID();
        Node rr = (Node) r.getUserData();
        if (rr == null) {
            rr = NodeFactory.createBlankNode();
            r.setUserData(rr);
        }
        return rr;
    }

On 6/27/2023 11:53 AM, Andy Seaborne wrote:

Steve,

Is this RDF/XML, N-Triples or Turtle data?

It is easiest to reconstruct if you have a small example program to run on various versions or small example data to feed into "riot". This area may have changed a bit and it's hard to remember all the details.

Andy

On 27/06/2023 15:19, Steve Vestal wrote:

I am getting an org.apache.jena.riot.RiotException: Bad character in IRI (space): <http://galois.com/cameo/CubeSat[space]...>, but I can't figure out how to get the line and column.

The above is the complete getMessage() string. Looking at a stack dump, the ErrorHandlingFactory code

    /** report an error */
    @Override
    public void error(String message, long line, long col) {
        logError(message, line, col) ;
        throw new RiotException(fmtMessage(message, line, col)) ;
    }

makes it look like the line and column number should be displayed. IIRC that used to be the case, but it is not here.

How do I get those two pieces of information to report to users? I can't find a way to configure the org.apache.jena.riot.SysRIOT.fmtMessage method.

I am using Jena 4.5.0.
