[ 
https://issues.apache.org/jira/browse/JENA-1553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16683496#comment-16683496
 ] 

Rob Vesse commented on JENA-1553:
---------------------------------

Any of the above. At the NodeTable level it doesn't really matter, since the 
node table is simply a mapping of internal 64-bit identifiers to a byte sequence 
containing the full representation of the node.  The node table is offset based, 
so there is a lookup file of IDs to offsets and then an actual table file with 
nodes encoded at the relevant offsets.
The corruption is at the file level, i.e. there has been an 
incomplete/overwritten write leaving an unreadable node entry.
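
To make the layout described above concrete, here is a minimal in-memory sketch, 
not TDB's actual on-disk format.  The class name NodeTableSketch, the 4-byte 
length prefix and the corrupt() helper are all assumptions made for 
illustration.  It shows an ID-to-offset lookup in front of a byte table, and 
how a single overwritten byte makes one entry undecodable while the others stay 
readable.  The 0xFFFFFFB1 in the report is consistent with the byte 0xB1 being 
widened to a signed int; that reading is an inference, not something confirmed 
against the Jena source.

```java
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an offset-based node table (illustration only).
class NodeTableSketch {
    // Plays the role of the lookup file: node id -> offset into the table.
    private final List<Integer> offsets = new ArrayList<>();
    // Plays the role of the table file: length-prefixed UTF-8 entries.
    private byte[] data = new byte[0];

    // Appends a node and returns its id (its position in the lookup).
    long add(String node) {
        byte[] bytes = node.getBytes(StandardCharsets.UTF_8);
        ByteBuffer entry = ByteBuffer.allocate(4 + bytes.length);
        entry.putInt(bytes.length).put(bytes);
        offsets.add(data.length);
        byte[] grown = Arrays.copyOf(data, data.length + entry.capacity());
        System.arraycopy(entry.array(), 0, grown, data.length, entry.capacity());
        data = grown;
        return offsets.size() - 1;
    }

    // Looks up the offset, reads the length, then strictly decodes the bytes.
    String get(long id) {
        int offset = offsets.get((int) id);
        int length = ByteBuffer.wrap(data).getInt(offset);
        ByteBuffer slice = ByteBuffer.wrap(data, offset + 4, length);
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(slice)
                    .toString();
        } catch (CharacterCodingException e) {
            // Wrapped as an unchecked exception, similar in spirit to the trace.
            throw new UncheckedIOException("Illegal UTF-8 in node " + id, e);
        }
    }

    // Simulates an overwritten write: clobbers one byte inside an entry.
    void corrupt(long id, int pos, byte b) {
        data[offsets.get((int) id) + 4 + pos] = b;
    }

    public static void main(String[] args) {
        NodeTableSketch table = new NodeTableSketch();
        long id = table.add("<http://example/s>");
        System.out.println(table.get(id));

        // 0xB1 is a UTF-8 continuation byte, so it is illegal as a first byte.
        table.corrupt(id, 0, (byte) 0xB1);
        try {
            table.get(id);
        } catch (UncheckedIOException e) {
            System.out.println("unreadable node entry: " + e.getCause());
        }

        // A byte widened to int is sign-extended: prints 0xFFFFFFB1
        System.out.println(String.format("0x%08X", (int) (byte) 0xB1));
    }
}
```

In this sketch the lookup still resolves the ID to an offset; only decoding the 
bytes at that offset fails.  That matches the trace in the issue, where the 
error surfaces in the node decode step rather than in the index lookup.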

You aren't solving the corruption with your approach; you are just avoiding 
it, which is only going to cause you more problems in the long run.  As has 
been repeated multiple times, the only real solution is to reload the data 
into a fresh database.

> Can't Backup data - java.io.IOException: Illegal UTF-8: 0xFFFFFFB1
> ------------------------------------------------------------------
>
>                 Key: JENA-1553
>                 URL: https://issues.apache.org/jira/browse/JENA-1553
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: Jena
>         Environment: Ubuntu 16.04 running Docker.  Running stain/jena-fuseki 
> from the official Docker Hub.
>            Reporter: Brian Mullen
>            Assignee: Andy Seaborne
>            Priority: Major
>             Fix For: Jena 3.9.0
>
>
> Attempting to backup through Fuseki, TDB 500M+ triples, breaking with error:  
>  
> {code:java}
> [2018-06-01 13:25:46] Log4jLoggerAdapter WARN  Exception in backup
> org.apache.jena.atlas.RuntimeIOException: java.io.IOException: Illegal UTF-8: 0xFFFFFFB1
>         at org.apache.jena.atlas.io.IO.exception(IO.java:233)
>         at org.apache.jena.atlas.io.BlockUTF8.exception(BlockUTF8.java:275)
>         at org.apache.jena.atlas.io.BlockUTF8.toCharsBuffer(BlockUTF8.java:150)
>         at org.apache.jena.atlas.io.BlockUTF8.toChars(BlockUTF8.java:73)
>         at org.apache.jena.atlas.io.BlockUTF8.toString(BlockUTF8.java:95)
>         at org.apache.jena.tdb.store.nodetable.NodecSSE.decode(NodecSSE.java:101)
>         at org.apache.jena.tdb.lib.NodeLib.decode(NodeLib.java:105)
>         at org.apache.jena.tdb.lib.NodeLib.fetchDecode(NodeLib.java:81)
>         at org.apache.jena.tdb.store.nodetable.NodeTableNative.readNodeFromTable(NodeTableNative.java:186)
>         at org.apache.jena.tdb.store.nodetable.NodeTableNative._retrieveNodeByNodeId(NodeTableNative.java:111)
>         at org.apache.jena.tdb.store.nodetable.NodeTableNative.getNodeForNodeId(NodeTableNative.java:70)
>         at org.apache.jena.tdb.store.nodetable.NodeTableCache._retrieveNodeByNodeId(NodeTableCache.java:128)
>         at org.apache.jena.tdb.store.nodetable.NodeTableCache.getNodeForNodeId(NodeTableCache.java:82)
>         at org.apache.jena.tdb.store.nodetable.NodeTableWrapper.getNodeForNodeId(NodeTableWrapper.java:50)
>         at org.apache.jena.tdb.store.nodetable.NodeTableInline.getNodeForNodeId(NodeTableInline.java:67)
>         at org.apache.jena.tdb.lib.TupleLib.triple(TupleLib.java:107)
>         at org.apache.jena.tdb.lib.TupleLib.triple(TupleLib.java:84)
>         at org.apache.jena.tdb.lib.TupleLib.lambda$convertToTriples$2(TupleLib.java:54)
>         at org.apache.jena.atlas.iterator.Iter$2.next(Iter.java:270)
>         at org.apache.jena.atlas.iterator.Iter$2.next(Iter.java:270)
>         at org.apache.jena.atlas.iterator.Iter.next(Iter.java:891)
>         at org.apache.jena.riot.system.StreamOps.sendQuadsToStream(StreamOps.java:140)
>         at org.apache.jena.riot.writer.NQuadsWriter.write$(NQuadsWriter.java:62)
>         at org.apache.jena.riot.writer.NQuadsWriter.write(NQuadsWriter.java:45)
>         at org.apache.jena.riot.writer.NQuadsWriter.write(NQuadsWriter.java:91)
>         at org.apache.jena.riot.RDFWriter.write$(RDFWriter.java:208)
>         at org.apache.jena.riot.RDFWriter.output(RDFWriter.java:165)
>         at org.apache.jena.riot.RDFWriter.output(RDFWriter.java:112)
>         at org.apache.jena.riot.RDFWriterBuilder.output(RDFWriterBuilder.java:149)
>         at org.apache.jena.riot.RDFDataMgr.write$(RDFDataMgr.java:1269)
>         at org.apache.jena.riot.RDFDataMgr.write(RDFDataMgr.java:1162)
>         at org.apache.jena.riot.RDFDataMgr.write(RDFDataMgr.java:1153)
>         at org.apache.jena.fuseki.mgt.Backup.backup(Backup.java:115)
>         at org.apache.jena.fuseki.mgt.Backup.backup(Backup.java:75)
>         at org.apache.jena.fuseki.mgt.ActionBackup$BackupTask.run(ActionBackup.java:58)
>         at org.apache.jena.fuseki.async.AsyncPool.lambda$submit$0(AsyncPool.java:55)
>         at org.apache.jena.fuseki.async.AsyncTask.call(AsyncTask.java:100)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Illegal UTF-8: 0xFFFFFFB1
>         ... 40 more
> [2018-06-01 13:25:46] Log4jLoggerAdapter INFO  Backup(/fuseki/backups/PDE_PROD_2018-06-01_13-24-00):2{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
