[ https://issues.apache.org/jira/browse/JENA-1553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16683496#comment-16683496 ]
Rob Vesse commented on JENA-1553:
---------------------------------
Any of the above. The NodeTable level doesn't really care very much, since
it is simply a mapping of internal 64-bit identifiers to a byte sequence
containing the full representation of the node. The node table is offset-based,
so there is a lookup file of IDs to offsets and then an actual table file with
nodes encoded at the relevant offsets.
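
As a rough illustration of that layout, here is a minimal Java sketch. It is
not TDB's actual on-disk format: the class name ToyNodeTable, the in-memory map
standing in for the lookup file, and the length-prefixed entry encoding are all
assumptions made up for this example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Toy offset-based node table (hypothetical; NOT TDB's real file format). */
public class ToyNodeTable {
    // Stand-in for the lookup file: node id -> byte offset into the table.
    private final Map<Long, Integer> offsets = new HashMap<>();
    // Stand-in for the table file: length-prefixed UTF-8 entries (assumed encoding).
    private final ByteBuffer table = ByteBuffer.allocate(1024);
    private long nextId = 0;

    /** Appends the node's string form to the table and returns its new id. */
    public long add(String node) {
        byte[] bytes = node.getBytes(StandardCharsets.UTF_8);
        long id = nextId++;
        offsets.put(id, table.position());
        table.putInt(bytes.length);
        table.put(bytes);
        return id;
    }

    /** Looks up the offset for an id, then decodes the entry stored there. */
    public String get(long id) {
        Integer offset = offsets.get(id);
        if (offset == null) {
            throw new IllegalArgumentException("Unknown node id: " + id);
        }
        int length = table.getInt(offset); // absolute read of the length prefix
        byte[] bytes = new byte[length];
        for (int i = 0; i < length; i++) {
            bytes[i] = table.get(offset + 4 + i); // absolute read of each payload byte
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ToyNodeTable t = new ToyNodeTable();
        long a = t.add("<http://example.org/s>");
        long b = t.add("\"hello\"@en");
        System.out.println(a + " -> " + t.get(a));
        System.out.println(b + " -> " + t.get(b));
    }
}
```

The point of the sketch is that the table itself never interprets what kind of
node it holds; it only hands back whatever bytes sit at the recorded offset.
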
The corruption is at the file level, i.e. there has been an
incomplete or overwritten write, leaving an unreadable node entry.
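
To see why a damaged entry surfaces as a decoding error, the following Java
sketch runs a strict UTF-8 decode over a good entry and over one with a single
overwritten byte. It uses only the standard java.nio.charset API, not Jena's
BlockUTF8; the class name StrictUtf8Check and the sample bytes are made up for
illustration. Incidentally, 0xFFFFFFB1 is what the byte 0xB1 looks like once it
is sign-extended to a 32-bit int, and 0xB1 is a UTF-8 continuation byte, which
cannot start a character.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

/** Strict UTF-8 check using the JDK decoder (illustrative; not Jena code). */
public class StrictUtf8Check {
    /** Decodes bytes as UTF-8, throwing on bad input instead of substituting U+FFFD. */
    public static String decodeStrict(byte[] bytes) throws CharacterCodingException {
        return StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes))
                .toString();
    }

    /** Returns true if the bytes are well-formed UTF-8. */
    public static boolean isValidUtf8(byte[] bytes) {
        try {
            decodeStrict(bytes);
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] good = "<http://example.org/s>".getBytes(StandardCharsets.UTF_8);
        byte[] bad = good.clone();
        bad[0] = (byte) 0xB1; // simulate one byte clobbered by a bad write

        System.out.println(isValidUtf8(good));            // prints "true"
        System.out.println(isValidUtf8(bad));             // prints "false"
        System.out.println(Integer.toHexString(bad[0]));  // prints "ffffffb1"
    }
}
```

A check like this only detects the damage; it cannot recover the bytes that
were lost.
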
You aren't solving the corruption with your approach; you are just avoiding
it, which is only going to cause you more problems in the long run. As has
been repeated multiple times, the only real solution is to reload the data
into a fresh database.
> Can't Backup data - java.io.IOException: Illegal UTF-8: 0xFFFFFFB1
> ------------------------------------------------------------------
>
> Key: JENA-1553
> URL: https://issues.apache.org/jira/browse/JENA-1553
> Project: Apache Jena
> Issue Type: Bug
> Components: Jena
> Environment: Ubuntu 16.04 running Docker. Running stain/jena-fuseki
> from the official Docker Hub.
> Reporter: Brian Mullen
> Assignee: Andy Seaborne
> Priority: Major
> Fix For: Jena 3.9.0
>
>
> Attempting to back up through Fuseki, TDB with 500M+ triples, fails with this error:
>
> {code:java}
> [2018-06-01 13:25:46] Log4jLoggerAdapter WARN Exception in backup
> org.apache.jena.atlas.RuntimeIOException: java.io.IOException: Illegal UTF-8: 0xFFFFFFB1
> at org.apache.jena.atlas.io.IO.exception(IO.java:233)
> at org.apache.jena.atlas.io.BlockUTF8.exception(BlockUTF8.java:275)
> at org.apache.jena.atlas.io.BlockUTF8.toCharsBuffer(BlockUTF8.java:150)
> at org.apache.jena.atlas.io.BlockUTF8.toChars(BlockUTF8.java:73)
> at org.apache.jena.atlas.io.BlockUTF8.toString(BlockUTF8.java:95)
> at org.apache.jena.tdb.store.nodetable.NodecSSE.decode(NodecSSE.java:101)
> at org.apache.jena.tdb.lib.NodeLib.decode(NodeLib.java:105)
> at org.apache.jena.tdb.lib.NodeLib.fetchDecode(NodeLib.java:81)
> at org.apache.jena.tdb.store.nodetable.NodeTableNative.readNodeFromTable(NodeTableNative.java:186)
> at org.apache.jena.tdb.store.nodetable.NodeTableNative._retrieveNodeByNodeId(NodeTableNative.java:111)
> at org.apache.jena.tdb.store.nodetable.NodeTableNative.getNodeForNodeId(NodeTableNative.java:70)
> at org.apache.jena.tdb.store.nodetable.NodeTableCache._retrieveNodeByNodeId(NodeTableCache.java:128)
> at org.apache.jena.tdb.store.nodetable.NodeTableCache.getNodeForNodeId(NodeTableCache.java:82)
> at org.apache.jena.tdb.store.nodetable.NodeTableWrapper.getNodeForNodeId(NodeTableWrapper.java:50)
> at org.apache.jena.tdb.store.nodetable.NodeTableInline.getNodeForNodeId(NodeTableInline.java:67)
> at org.apache.jena.tdb.lib.TupleLib.triple(TupleLib.java:107)
> at org.apache.jena.tdb.lib.TupleLib.triple(TupleLib.java:84)
> at org.apache.jena.tdb.lib.TupleLib.lambda$convertToTriples$2(TupleLib.java:54)
> at org.apache.jena.atlas.iterator.Iter$2.next(Iter.java:270)
> at org.apache.jena.atlas.iterator.Iter$2.next(Iter.java:270)
> at org.apache.jena.atlas.iterator.Iter.next(Iter.java:891)
> at org.apache.jena.riot.system.StreamOps.sendQuadsToStream(StreamOps.java:140)
> at org.apache.jena.riot.writer.NQuadsWriter.write$(NQuadsWriter.java:62)
> at org.apache.jena.riot.writer.NQuadsWriter.write(NQuadsWriter.java:45)
> at org.apache.jena.riot.writer.NQuadsWriter.write(NQuadsWriter.java:91)
> at org.apache.jena.riot.RDFWriter.write$(RDFWriter.java:208)
> at org.apache.jena.riot.RDFWriter.output(RDFWriter.java:165)
> at org.apache.jena.riot.RDFWriter.output(RDFWriter.java:112)
> at org.apache.jena.riot.RDFWriterBuilder.output(RDFWriterBuilder.java:149)
> at org.apache.jena.riot.RDFDataMgr.write$(RDFDataMgr.java:1269)
> at org.apache.jena.riot.RDFDataMgr.write(RDFDataMgr.java:1162)
> at org.apache.jena.riot.RDFDataMgr.write(RDFDataMgr.java:1153)
> at org.apache.jena.fuseki.mgt.Backup.backup(Backup.java:115)
> at org.apache.jena.fuseki.mgt.Backup.backup(Backup.java:75)
> at org.apache.jena.fuseki.mgt.ActionBackup$BackupTask.run(ActionBackup.java:58)
> at org.apache.jena.fuseki.async.AsyncPool.lambda$submit$0(AsyncPool.java:55)
> at org.apache.jena.fuseki.async.AsyncTask.call(AsyncTask.java:100)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Illegal UTF-8: 0xFFFFFFB1
> ... 40 more
> [2018-06-01 13:25:46] Log4jLoggerAdapter INFO Backup(/fuseki/backups/PDE_PROD_2018-06-01_13-24-00):2
> {code}