[ https://issues.apache.org/jira/browse/JENA-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17432177#comment-17432177 ]
Holger Knublauch commented on JENA-2179:
----------------------------------------

BTW the same seems to happen using RDF Delta:
{code:java}
[line: 1276, col: 437] Unicode replacement character U+FFFD.
org.apache.jena.riot.RiotParseException: [line: 1276, col: 428] Unicode replacement character U+FFFD in string
    at org.apache.jena.riot.system.ErrorHandlerFactory$ErrorHandlerRiotParseException.warning(ErrorHandlerFactory.java:367)
    at org.apache.jena.riot.tokens.TokenizerText.warning(TokenizerText.java:1332)
    at org.apache.jena.riot.tokens.TokenizerText.readString(TokenizerText.java:768)
    at org.apache.jena.riot.tokens.TokenizerText.parseToken(TokenizerText.java:238)
    at org.apache.jena.riot.tokens.TokenizerText.hasNext(TokenizerText.java:89)
    at org.seaborne.patch.text.RDFPatchReaderText.nextToken(RDFPatchReaderText.java:243)
    at org.seaborne.patch.text.RDFPatchReaderText.nextNode(RDFPatchReaderText.java:254)
    at org.seaborne.patch.text.RDFPatchReaderText.doOneLine(RDFPatchReaderText.java:104)
    at org.seaborne.patch.text.RDFPatchReaderText.apply1(RDFPatchReaderText.java:72)
    at org.seaborne.patch.text.RDFPatchReaderText.read(RDFPatchReaderText.java:49)
    at org.seaborne.patch.text.RDFPatchReaderText.apply(RDFPatchReaderText.java:59)
    at org.seaborne.delta.client.DeltaLinkHTTP.lambda$fetchCommon$8(DeltaLinkHTTP.java:211)
    at org.seaborne.delta.client.DeltaLinkHTTP.retry(DeltaLinkHTTP.java:125)
    at org.seaborne.delta.client.DeltaLinkHTTP.fetchCommon(DeltaLinkHTTP.java:204)
    at org.seaborne.delta.client.DeltaLinkHTTP.fetch(DeltaLinkHTTP.java:184)
    at org.topbraidlive.edg.backup.BackupUtils.getPatch(BackupUtils.java:368)
{code}

> TDB throws Unicode Replacement Character exception while fetching data
> ----------------------------------------------------------------------
>
>                 Key: JENA-2179
>                 URL: https://issues.apache.org/jira/browse/JENA-2179
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: TDB
>    Affects Versions: Jena 4.2.0
>            Reporter: Holger Knublauch
>            Assignee: Andy Seaborne
>            Priority: Major
>             Fix For: Jena 4.3.0
>
>         Attachments: TBS4190_Test.java
>
>
> This seems to have been introduced with
> https://issues.apache.org/jira/browse/JENA-2120
> With TDB databases that contain the replacement character in a literal, the
> warnings are reported as Exceptions. We have seen this:
> {code:java}
> WARN [http-nio-8083-exec-10] g.e.SimpleDataFetcherExceptionHandler - Exception while fetching data (/resources[0]/turtleSourceCode) : [line: 1, col: 318] Unicode replacement character U+FFFD in string
> org.apache.jena.riot.RiotParseException: [line: 1, col: 318] Unicode replacement character U+FFFD in string
>     at org.apache.jena.riot.system.ErrorHandlerFactory$ErrorHandlerRiotParseException.warning(ErrorHandlerFactory.java:367) ~[jena-arq-4.2.0.jar:4.2.0]
>     at org.apache.jena.riot.tokens.TokenizerText.warning(TokenizerText.java:1332) ~[jena-arq-4.2.0.jar:4.2.0]
>     at org.apache.jena.riot.tokens.TokenizerText.readString(TokenizerText.java:768) ~[jena-arq-4.2.0.jar:4.2.0]
>     at org.apache.jena.riot.tokens.TokenizerText.parseToken(TokenizerText.java:238) ~[jena-arq-4.2.0.jar:4.2.0]
>     at org.apache.jena.riot.tokens.TokenizerText.hasNext(TokenizerText.java:89) ~[jena-arq-4.2.0.jar:4.2.0]
>     at org.apache.jena.tdb.store.nodetable.NodecSSE.decode(NodecSSE.java:119) ~[jena-tdb-4.2.0.jar:4.2.0]
>     at org.apache.jena.tdb.lib.NodeLib.decode(NodeLib.java:118) ~[jena-tdb-4.2.0.jar:4.2.0]
> {code}
> TDB seems to use the fallback error handler, causing an exception to be thrown
> instead of just printing the warning (to the log).
> Richard says he believes a fix would be to change NodecSSE.createTokenizer():
> {code:java}
> return TokenizerText.create()
>         .fromString(string)
>         .errorHandler(ErrorHandlerFactory.errorHandlerDetailed())
>         .build();
> {code}
> Is there any known work-around in 4.2.0? We cannot even query those triples
> from the offending TDBs at the moment.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)