[ https://issues.apache.org/jira/browse/JENA-1909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17125780#comment-17125780 ]
Andy Seaborne commented on JENA-1909:
-------------------------------------

It seems that there is a letter {{p}} (ASCII code 112) in the working space files being read (each file is text: lines of hex-encoded binary). That suggests the file has been corrupted in some way.

Could you check the files please? They are {{data-triples.tmp}} and {{data-quads.tmp}}.

> TDB1: tdbloader2 crashes
> ------------------------
>
>                 Key: JENA-1909
>                 URL: https://issues.apache.org/jira/browse/JENA-1909
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: TDB
>    Affects Versions: Jena 3.15.0
>            Reporter: Jonas Sourlier
>            Priority: Major
>         Attachments: tdb2.log
>
>
> This might be related to JENA-1908, but since the stack trace is different, I
> opened a second ticket.
> Tried to import the latest Wikidata dump into Apache Jena, using the
> following setup:
> * Ubuntu 20.04 on Windows 10 Subsystem for Linux
> * Apache Jena 3.15.0
> * Intel i7 4770K, 32GB RAM
> *
> {code:java}
> openjdk 11.0.7 2020-04-14
> OpenJDK Runtime Environment (build 11.0.7+10-post-Ubuntu-3ubuntu1)
> OpenJDK 64-Bit Server VM (build 11.0.7+10-post-Ubuntu-3ubuntu1, mixed mode, sharing){code}
> These are the commands I have run:
> {code:java}
> wget -c http://mirror.easyname.ch/apache/jena/binaries/apache-jena-3.15.0.tar.gz
> tar -xvzf apache-jena-3.15.0.tar.gz
> mkdir data
> apache-jena-3.15.0/bin/tdbloader2 --phase data --loc data/ ../latest-all.ttl > tdb1.log 2> tdb2.log &
> apache-jena-3.15.0/bin/tdbloader2 --phase index --loc data/ > tdb1.log 2> tdb2.log &
> {code}
> The data phase ran fine, but the index phase crashed after about 10 hours.
> The stack trace is attached to this ticket (tdb2.log).
> Here's the standard output:
> {code:java}
> 08:47:57 INFO -- TDB Bulk Loader Start
> 08:47:57 INFO Index Building Phase
> 08:47:57 INFO Creating Index SPO
> 08:47:58 INFO Sort SPO
> 18:26:19 INFO Sort SPO Completed
> 18:26:19 INFO Build SPO
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
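
One way to carry out the file check requested in the comment is to scan each working file for bytes that cannot belong to hex text. The following is a minimal sketch, not part of Jena: the function name {{find_bad_lines}} is hypothetical, and it assumes each line holds only hex digits and whitespace, which is inferred from the description "text lines of hex binary" rather than from a documented file format.

```python
# Assumption: a valid working-file line contains only hex digits and
# whitespace. Adjust ALLOWED if the real format uses other separators.
ALLOWED = frozenset(b"0123456789abcdefABCDEF \t\r\n")


def find_bad_lines(path, limit=10):
    """Return up to `limit` (line_number, bad_byte_values) pairs.

    A line is reported when it contains any byte outside ALLOWED.
    Byte values are decimal integers, so a stray letter 'p' shows up as 112.
    """
    problems = []
    # Read as bytes so that corrupted, non-text content cannot raise
    # a decoding error; the file is streamed line by line.
    with open(path, "rb") as handle:
        for number, line in enumerate(handle, start=1):
            bad = sorted(set(line) - ALLOWED)
            if bad:
                problems.append((number, bad))
                if len(problems) >= limit:
                    break
    return problems
```

Calling {{find_bad_lines}} on {{data-triples.tmp}} and on {{data-quads.tmp}} (in whichever directory the loader wrote them) returns an empty list if every line looks like hex text; otherwise it gives the first few line numbers to inspect. Reading in binary mode and stopping after {{limit}} hits keeps the check cheap even on very large files.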