[ 
https://issues.apache.org/jira/browse/JENA-1553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682451#comment-16682451
 ] 

Andy Seaborne commented on JENA-1553:
-------------------------------------

The node recovery code does not recover the database and that is what people 
will expect. It also has to be hand-tuned the the nature of the errors. I 
specially wrote it for recovering this data. It is based on a TDB2 program.

I haven't tested it, nor done any work on matching the nodes to the triples.  
I'm not even sure it does not permanently modify the node table it is 
recovering. I have had to run on a fresh copy each time.

You'll notice, as before, the last node is broken.  This is because the node 
tabel can get ahead of what is in the database. Those nodes can (correctly this 
time) get overwritten.

I have learnt through bitter experience that providing code as a starting point 
for people to adapt to their needs does not work. They expect perfection and 
will skip over any description that says "you need to adapt ths code". 

The code does not attempt to fix the database.



> Can't Backup data - java.io.IOException: Illegal UTF-8: 0xFFFFFFB1
> ------------------------------------------------------------------
>
>                 Key: JENA-1553
>                 URL: https://issues.apache.org/jira/browse/JENA-1553
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: Jena
>         Environment: Ubuntu 16.04 running Docker.  Running stain/jena-fuseki 
> from the official Docker Hub.
>            Reporter: Brian Mullen
>            Assignee: Andy Seaborne
>            Priority: Major
>             Fix For: Jena 3.9.0
>
>
> Attempting to backup through Fuseki, TDB 500M+ triples, breaking with error:  
>  
> {code:java}
> [2018-06-01 13:25:46] Log4jLoggerAdapter WARN  Exception in backup
> org.apache.jena.atlas.RuntimeIOException: java.io.IOException: Illegal UTF-8: 
> 0xFFFFFFB1
>         at org.apache.jena.atlas.io.IO.exception(IO.java:233)
>         at org.apache.jena.atlas.io.BlockUTF8.exception(BlockUTF8.java:275)
>         at 
> org.apache.jena.atlas.io.BlockUTF8.toCharsBuffer(BlockUTF8.java:150)
>         at org.apache.jena.atlas.io.BlockUTF8.toChars(BlockUTF8.java:73)
>         at org.apache.jena.atlas.io.BlockUTF8.toString(BlockUTF8.java:95)
>         at 
> org.apache.jena.tdb.store.nodetable.NodecSSE.decode(NodecSSE.java:101)
>         at org.apache.jena.tdb.lib.NodeLib.decode(NodeLib.java:105)
>         at org.apache.jena.tdb.lib.NodeLib.fetchDecode(NodeLib.java:81)
>         at 
> org.apache.jena.tdb.store.nodetable.NodeTableNative.readNodeFromTable(NodeTableNative.java:186)
>         at 
> org.apache.jena.tdb.store.nodetable.NodeTableNative._retrieveNodeByNodeId(NodeTableNative.java:111)
>         at 
> org.apache.jena.tdb.store.nodetable.NodeTableNative.getNodeForNodeId(NodeTableNative.java:70)
>         at 
> org.apache.jena.tdb.store.nodetable.NodeTableCache._retrieveNodeByNodeId(NodeTableCache.java:128)
>         at 
> org.apache.jena.tdb.store.nodetable.NodeTableCache.getNodeForNodeId(NodeTableCache.java:82)
>         at 
> org.apache.jena.tdb.store.nodetable.NodeTableWrapper.getNodeForNodeId(NodeTableWrapper.java:50)
>         at 
> org.apache.jena.tdb.store.nodetable.NodeTableInline.getNodeForNodeId(NodeTableInline.java:67)
>         at org.apache.jena.tdb.lib.TupleLib.triple(TupleLib.java:107)
>         at org.apache.jena.tdb.lib.TupleLib.triple(TupleLib.java:84)
>         at 
> org.apache.jena.tdb.lib.TupleLib.lambda$convertToTriples$2(TupleLib.java:54)
>         at org.apache.jena.atlas.iterator.Iter$2.next(Iter.java:270)
>         at org.apache.jena.atlas.iterator.Iter$2.next(Iter.java:270)
>         at org.apache.jena.atlas.iterator.Iter.next(Iter.java:891)
>         at 
> org.apache.jena.riot.system.StreamOps.sendQuadsToStream(StreamOps.java:140)
>         at 
> org.apache.jena.riot.writer.NQuadsWriter.write$(NQuadsWriter.java:62)
>         at 
> org.apache.jena.riot.writer.NQuadsWriter.write(NQuadsWriter.java:45)
>         at 
> org.apache.jena.riot.writer.NQuadsWriter.write(NQuadsWriter.java:91)
>         at org.apache.jena.riot.RDFWriter.write$(RDFWriter.java:208)
>         at org.apache.jena.riot.RDFWriter.output(RDFWriter.java:165)
>         at org.apache.jena.riot.RDFWriter.output(RDFWriter.java:112)
>         at 
> org.apache.jena.riot.RDFWriterBuilder.output(RDFWriterBuilder.java:149)
>         at org.apache.jena.riot.RDFDataMgr.write$(RDFDataMgr.java:1269)
>         at org.apache.jena.riot.RDFDataMgr.write(RDFDataMgr.java:1162)
>         at org.apache.jena.riot.RDFDataMgr.write(RDFDataMgr.java:1153)
>         at org.apache.jena.fuseki.mgt.Backup.backup(Backup.java:115)
>         at org.apache.jena.fuseki.mgt.Backup.backup(Backup.java:75)
>         at 
> org.apache.jena.fuseki.mgt.ActionBackup$BackupTask.run(ActionBackup.java:58)
>         at 
> org.apache.jena.fuseki.async.AsyncPool.lambda$submit$0(AsyncPool.java:55)
>         at org.apache.jena.fuseki.async.AsyncTask.call(AsyncTask.java:100)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Illegal UTF-8: 0xFFFFFFB1
>         ... 40 more
> [2018-06-01 13:25:46] Log4jLoggerAdapter INFO  
> Backup(/fuseki/backups/PDE_PROD_2018-06-01_13-24-00):2{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to