[
https://issues.apache.org/jira/browse/JENA-1553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16677797#comment-16677797
]
Jean-Marc Vanel commented on JENA-1553:
---------------------------------------
I hit the same kind of problem on my other site (dedicated to nature, botany,
etc.).
I made a dump with tdb.tdbdump and got the same stack trace already reported
in this issue.
On the main TDB:
{noformat}
org.apache.jena.tdb.TDBException: Failed to tokenise:
    at org.apache.jena.tdb.store.nodetable.NodecSSE.decode(NodecSSE.java:127)
    at org.apache.jena.tdb.lib.NodeLib.decode(NodeLib.java:120)
    at org.apache.jena.tdb.lib.NodeLib.fetchDecode(NodeLib.java:97)
    at org.apache.jena.tdb.store.nodetable.NodeTableNative.readNodeFromTable(NodeTableNative.java:182)
    at org.apache.jena.tdb.store.nodetable.NodeTableNative._retrieveNodeByNodeId(NodeTableNative.java:108)
    at org.apache.jena.tdb.store.nodetable.NodeTableNative.getNodeForNodeId(NodeTableNative.java:67)
    at org.apache.jena.tdb.store.nodetable.NodeTableCache._retrieveNodeByNodeId(NodeTableCache.java:128)
    at org.apache.jena.tdb.store.nodetable.NodeTableCache.getNodeForNodeId(NodeTableCache.java:82)
    at org.apache.jena.tdb.store.nodetable.NodeTableWrapper.getNodeForNodeId(NodeTableWrapper.java:50)
    at org.apache.jena.tdb.store.nodetable.NodeTableInline.getNodeForNodeId(NodeTableInline.java:67)
    at org.apache.jena.tdb.lib.TupleLib.quad(TupleLib.java:128)
    at org.apache.jena.tdb.lib.TupleLib.quad(TupleLib.java:120)
    at org.apache.jena.tdb.lib.TupleLib.lambda$convertToQuads$3(TupleLib.java:59)
    at org.apache.jena.atlas.iterator.Iter$2.next(Iter.java:270)
    at org.apache.jena.riot.system.StreamOps.sendQuadsToStream(StreamOps.java:140)
    at org.apache.jena.riot.writer.NQuadsWriter.write$(NQuadsWriter.java:62)
{noformat}
On the history TDB:
{noformat}
org.apache.jena.tdb.TDBException: Not a node: urce/Sillans-la-Cascade>
    at org.apache.jena.tdb.store.nodetable.NodecSSE.decode(NodecSSE.java:132)
    at org.apache.jena.tdb.lib.NodeLib.decode(NodeLib.java:120)
    at org.apache.jena.tdb.lib.NodeLib.fetchDecode(NodeLib.java:97)
    at org.apache.jena.tdb.store.nodetable.NodeTableNative.readNodeFromTable(NodeTableNative.java:182)
    at org.apache.jena.tdb.store.nodetable.NodeTableNative._retrieveNodeByNodeId(NodeTableNative.java:108)
    at org.apache.jena.tdb.store.nodetable.NodeTableNative.getNodeForNodeId(NodeTableNative.java:67)
    at org.apache.jena.tdb.store.nodetable.NodeTableCache._retrieveNodeByNodeId(NodeTableCache.java:128)
    at org.apache.jena.tdb.store.nodetable.NodeTableCache.getNodeForNodeId(NodeTableCache.java:82)
    at org.apache.jena.tdb.store.nodetable.NodeTableWrapper.getNodeForNodeId(NodeTableWrapper.java:50)
    at org.apache.jena.tdb.store.nodetable.NodeTableInline.getNodeForNodeId(NodeTableInline.java:67)
    at org.apache.jena.tdb.lib.TupleLib.quad(TupleLib.java:128)
    at org.apache.jena.tdb.lib.TupleLib.quad(TupleLib.java:120)
    at org.apache.jena.tdb.lib.TupleLib.lambda$convertToQuads$3(TupleLib.java:59)
    at org.apache.jena.atlas.iterator.Iter$2.next(Iter.java:270)
    at org.apache.jena.riot.system.StreamOps.sendQuadsToStream(StreamOps.java:140)
    at org.apache.jena.riot.writer.NQuadsWriter.write$(NQuadsWriter.java:62)
{noformat}
Nothing bad has been reported by the web application yet.
QUESTIONS
* What should I do?
* Is it feasible to have a (possibly partial) recovery program?
It would catch TDBException somewhere, remove the bad binary data, and continue.
That could even be the default mode of tdbdump.
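
To make the recovery idea concrete, here is a minimal sketch of a "skip what cannot be decoded" loop. It does not use the Jena API: LenientDump, DecodeException, decodeIri and dumpSkippingBad are hypothetical names, DecodeException merely stands in for TDBException, and the decoder is a toy check. Whether tdbdump could iterate over raw node ids this way is an assumption, not something verified against the TDB code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

public class LenientDump {

    /** Hypothetical stand-in for TDBException so the sketch compiles without Jena. */
    public static class DecodeException extends RuntimeException {
        public DecodeException(String message) {
            super(message);
        }
    }

    /** Toy decoder: accepts only strings of the form <...>, otherwise fails. */
    public static String decodeIri(String raw) {
        if (raw.length() >= 2 && raw.startsWith("<") && raw.endsWith(">")) {
            return raw;
        }
        throw new DecodeException("Not a node: " + raw);
    }

    /**
     * Decodes every raw record and hands the result to the sink.
     * A record whose decoding throws DecodeException is logged and skipped.
     * Returns the number of skipped records.
     */
    public static <R, T> int dumpSkippingBad(Iterator<R> rawRecords,
                                             Function<R, T> decoder,
                                             Consumer<T> sink) {
        int skipped = 0;
        while (rawRecords.hasNext()) {
            R raw = rawRecords.next();
            T decoded;
            try {
                decoded = decoder.apply(raw);
            } catch (DecodeException e) {
                skipped++;
                System.err.println("Skipping record: " + e.getMessage());
                continue;
            }
            sink.accept(decoded);
        }
        return skipped;
    }

    public static void main(String[] args) {
        // The middle entry mimics the truncated node from the history TDB trace.
        List<String> raw = Arrays.asList(
                "<http://example.org/a>",
                "urce/Sillans-la-Cascade>",
                "<http://example.org/b>");
        List<String> out = new ArrayList<>();
        int skipped = dumpSkippingBad(raw.iterator(), LenientDump::decodeIri, out::add);
        System.out.println("written=" + out.size() + " skipped=" + skipped);
    }
}
```

The point of the design is that decoding happens per record inside the try block, so one corrupt node costs one record rather than the whole dump. A real tool would also need to report which records were dropped.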
> Can't Backup data - java.io.IOException: Illegal UTF-8: 0xFFFFFFB1
> ------------------------------------------------------------------
>
> Key: JENA-1553
> URL: https://issues.apache.org/jira/browse/JENA-1553
> Project: Apache Jena
> Issue Type: Bug
> Components: Jena
> Environment: Ubuntu 16.04 running Docker. Running stain/jena-fuseki
> from the official Docker Hub.
> Reporter: Brian Mullen
> Assignee: Andy Seaborne
> Priority: Major
> Fix For: Jena 3.9.0
>
>
> Attempting to back up through Fuseki (TDB, 500M+ triples) fails with this error:
>
> {code:java}
> [2018-06-01 13:25:46] Log4jLoggerAdapter WARN Exception in backup
> org.apache.jena.atlas.RuntimeIOException: java.io.IOException: Illegal UTF-8: 0xFFFFFFB1
>     at org.apache.jena.atlas.io.IO.exception(IO.java:233)
>     at org.apache.jena.atlas.io.BlockUTF8.exception(BlockUTF8.java:275)
>     at org.apache.jena.atlas.io.BlockUTF8.toCharsBuffer(BlockUTF8.java:150)
>     at org.apache.jena.atlas.io.BlockUTF8.toChars(BlockUTF8.java:73)
>     at org.apache.jena.atlas.io.BlockUTF8.toString(BlockUTF8.java:95)
>     at org.apache.jena.tdb.store.nodetable.NodecSSE.decode(NodecSSE.java:101)
>     at org.apache.jena.tdb.lib.NodeLib.decode(NodeLib.java:105)
>     at org.apache.jena.tdb.lib.NodeLib.fetchDecode(NodeLib.java:81)
>     at org.apache.jena.tdb.store.nodetable.NodeTableNative.readNodeFromTable(NodeTableNative.java:186)
>     at org.apache.jena.tdb.store.nodetable.NodeTableNative._retrieveNodeByNodeId(NodeTableNative.java:111)
>     at org.apache.jena.tdb.store.nodetable.NodeTableNative.getNodeForNodeId(NodeTableNative.java:70)
>     at org.apache.jena.tdb.store.nodetable.NodeTableCache._retrieveNodeByNodeId(NodeTableCache.java:128)
>     at org.apache.jena.tdb.store.nodetable.NodeTableCache.getNodeForNodeId(NodeTableCache.java:82)
>     at org.apache.jena.tdb.store.nodetable.NodeTableWrapper.getNodeForNodeId(NodeTableWrapper.java:50)
>     at org.apache.jena.tdb.store.nodetable.NodeTableInline.getNodeForNodeId(NodeTableInline.java:67)
>     at org.apache.jena.tdb.lib.TupleLib.triple(TupleLib.java:107)
>     at org.apache.jena.tdb.lib.TupleLib.triple(TupleLib.java:84)
>     at org.apache.jena.tdb.lib.TupleLib.lambda$convertToTriples$2(TupleLib.java:54)
>     at org.apache.jena.atlas.iterator.Iter$2.next(Iter.java:270)
>     at org.apache.jena.atlas.iterator.Iter$2.next(Iter.java:270)
>     at org.apache.jena.atlas.iterator.Iter.next(Iter.java:891)
>     at org.apache.jena.riot.system.StreamOps.sendQuadsToStream(StreamOps.java:140)
>     at org.apache.jena.riot.writer.NQuadsWriter.write$(NQuadsWriter.java:62)
>     at org.apache.jena.riot.writer.NQuadsWriter.write(NQuadsWriter.java:45)
>     at org.apache.jena.riot.writer.NQuadsWriter.write(NQuadsWriter.java:91)
>     at org.apache.jena.riot.RDFWriter.write$(RDFWriter.java:208)
>     at org.apache.jena.riot.RDFWriter.output(RDFWriter.java:165)
>     at org.apache.jena.riot.RDFWriter.output(RDFWriter.java:112)
>     at org.apache.jena.riot.RDFWriterBuilder.output(RDFWriterBuilder.java:149)
>     at org.apache.jena.riot.RDFDataMgr.write$(RDFDataMgr.java:1269)
>     at org.apache.jena.riot.RDFDataMgr.write(RDFDataMgr.java:1162)
>     at org.apache.jena.riot.RDFDataMgr.write(RDFDataMgr.java:1153)
>     at org.apache.jena.fuseki.mgt.Backup.backup(Backup.java:115)
>     at org.apache.jena.fuseki.mgt.Backup.backup(Backup.java:75)
>     at org.apache.jena.fuseki.mgt.ActionBackup$BackupTask.run(ActionBackup.java:58)
>     at org.apache.jena.fuseki.async.AsyncPool.lambda$submit$0(AsyncPool.java:55)
>     at org.apache.jena.fuseki.async.AsyncTask.call(AsyncTask.java:100)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: Illegal UTF-8: 0xFFFFFFB1
>     ... 40 more
> [2018-06-01 13:25:46] Log4jLoggerAdapter INFO Backup(/fuseki/backups/PDE_PROD_2018-06-01_13-24-00):2
> {code}