[ https://issues.apache.org/jira/browse/JENA-1553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682451#comment-16682451 ]
Andy Seaborne commented on JENA-1553:
-------------------------------------

The node recovery code does not recover the database, and that is what people will expect. It also has to be hand-tuned to the nature of the errors. I wrote it specifically for recovering this data.

It is based on a TDB2 program. I haven't tested it, nor done any work on matching the nodes to the triples. I'm not even sure it does not permanently modify the node table it is recovering. I have had to run it on a fresh copy each time.

You'll notice, as before, the last node is broken. This is because the node table can get ahead of what is in the database. Those nodes can (correctly this time) get overwritten.

I have learnt through bitter experience that providing code as a starting point for people to adapt to their needs does not work. They expect perfection and will skip over any description that says "you need to adapt this code".

The code does not attempt to fix the database.

> Can't Backup data - java.io.IOException: Illegal UTF-8: 0xFFFFFFB1
> ------------------------------------------------------------------
>
>                 Key: JENA-1553
>                 URL: https://issues.apache.org/jira/browse/JENA-1553
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: Jena
>         Environment: Ubuntu 16.04 running Docker. Running stain/jena-fuseki from the official Docker Hub.
>            Reporter: Brian Mullen
>            Assignee: Andy Seaborne
>            Priority: Major
>             Fix For: Jena 3.9.0
>
>
> Attempting to backup through Fuseki, TDB 500M+ triples, breaking with error:
>
> {code:java}
> [2018-06-01 13:25:46] Log4jLoggerAdapter WARN Exception in backup
> org.apache.jena.atlas.RuntimeIOException: java.io.IOException: Illegal UTF-8: 0xFFFFFFB1
>     at org.apache.jena.atlas.io.IO.exception(IO.java:233)
>     at org.apache.jena.atlas.io.BlockUTF8.exception(BlockUTF8.java:275)
>     at org.apache.jena.atlas.io.BlockUTF8.toCharsBuffer(BlockUTF8.java:150)
>     at org.apache.jena.atlas.io.BlockUTF8.toChars(BlockUTF8.java:73)
>     at org.apache.jena.atlas.io.BlockUTF8.toString(BlockUTF8.java:95)
>     at org.apache.jena.tdb.store.nodetable.NodecSSE.decode(NodecSSE.java:101)
>     at org.apache.jena.tdb.lib.NodeLib.decode(NodeLib.java:105)
>     at org.apache.jena.tdb.lib.NodeLib.fetchDecode(NodeLib.java:81)
>     at org.apache.jena.tdb.store.nodetable.NodeTableNative.readNodeFromTable(NodeTableNative.java:186)
>     at org.apache.jena.tdb.store.nodetable.NodeTableNative._retrieveNodeByNodeId(NodeTableNative.java:111)
>     at org.apache.jena.tdb.store.nodetable.NodeTableNative.getNodeForNodeId(NodeTableNative.java:70)
>     at org.apache.jena.tdb.store.nodetable.NodeTableCache._retrieveNodeByNodeId(NodeTableCache.java:128)
>     at org.apache.jena.tdb.store.nodetable.NodeTableCache.getNodeForNodeId(NodeTableCache.java:82)
>     at org.apache.jena.tdb.store.nodetable.NodeTableWrapper.getNodeForNodeId(NodeTableWrapper.java:50)
>     at org.apache.jena.tdb.store.nodetable.NodeTableInline.getNodeForNodeId(NodeTableInline.java:67)
>     at org.apache.jena.tdb.lib.TupleLib.triple(TupleLib.java:107)
>     at org.apache.jena.tdb.lib.TupleLib.triple(TupleLib.java:84)
>     at org.apache.jena.tdb.lib.TupleLib.lambda$convertToTriples$2(TupleLib.java:54)
>     at org.apache.jena.atlas.iterator.Iter$2.next(Iter.java:270)
>     at org.apache.jena.atlas.iterator.Iter$2.next(Iter.java:270)
>     at org.apache.jena.atlas.iterator.Iter.next(Iter.java:891)
>     at org.apache.jena.riot.system.StreamOps.sendQuadsToStream(StreamOps.java:140)
>     at org.apache.jena.riot.writer.NQuadsWriter.write$(NQuadsWriter.java:62)
>     at org.apache.jena.riot.writer.NQuadsWriter.write(NQuadsWriter.java:45)
>     at org.apache.jena.riot.writer.NQuadsWriter.write(NQuadsWriter.java:91)
>     at org.apache.jena.riot.RDFWriter.write$(RDFWriter.java:208)
>     at org.apache.jena.riot.RDFWriter.output(RDFWriter.java:165)
>     at org.apache.jena.riot.RDFWriter.output(RDFWriter.java:112)
>     at org.apache.jena.riot.RDFWriterBuilder.output(RDFWriterBuilder.java:149)
>     at org.apache.jena.riot.RDFDataMgr.write$(RDFDataMgr.java:1269)
>     at org.apache.jena.riot.RDFDataMgr.write(RDFDataMgr.java:1162)
>     at org.apache.jena.riot.RDFDataMgr.write(RDFDataMgr.java:1153)
>     at org.apache.jena.fuseki.mgt.Backup.backup(Backup.java:115)
>     at org.apache.jena.fuseki.mgt.Backup.backup(Backup.java:75)
>     at org.apache.jena.fuseki.mgt.ActionBackup$BackupTask.run(ActionBackup.java:58)
>     at org.apache.jena.fuseki.async.AsyncPool.lambda$submit$0(AsyncPool.java:55)
>     at org.apache.jena.fuseki.async.AsyncTask.call(AsyncTask.java:100)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Illegal UTF-8: 0xFFFFFFB1
>     ... 40 more
> [2018-06-01 13:25:46] Log4jLoggerAdapter INFO Backup(/fuseki/backups/PDE_PROD_2018-06-01_13-24-00):2
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)