Thanks, Andy, the TDB2 assembler fixed it, and everything worked well. I then tried to load wikidata-truthy, but apparently the bzip file was damaged at line 4052914959, so I have to try again.
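
For anyone hitting the same error: the fix amounts to describing the dataset with the TDB2 vocabulary instead of the TDB1 one. Below is a minimal sketch of such a description, not the actual contents of /tmp/temp.ttl. The resource names are taken from the error output further down and the database location from the lock file path; the Lucene directory and the entity map (field names, rdfs:label) are illustrative placeholders.

```turtle
# Sketch of a text dataset description over a TDB2 database.
# Assumption: index directory and entity map are placeholders, adjust to your data.
@prefix :     <http://localhost/jena_example/#> .
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb2: <http://jena.apache.org/2016/tdb#> .
@prefix text: <http://jena.apache.org/text#> .

:text_dataset rdf:type text:TextDataset ;
    text:dataset <#dataset> ;
    text:index   <#indexLucene> .

# TDB2 type and location property, matching a database built by tdb2.xloader.
<#dataset> rdf:type tdb2:DatasetTDB2 ;
    tdb2:location "/var/lib/fuseki/databases/temp" .

<#indexLucene> rdf:type text:TextIndexLucene ;
    text:directory <file:///tmp/lucene-index> ;   # placeholder path
    text:entityMap <#entMap> .

<#entMap> rdf:type text:EntityMap ;
    text:entityField  "uri" ;
    text:defaultField "label" ;
    text:map ( [ text:field "label" ; text:predicate rdfs:label ] ) .
```

The relevant difference from a TDB1 description is the dataset type and namespace: tdb2:DatasetTDB2 with tdb2:location, rather than http://jena.hpl.hp.com/2008/tdb#DatasetTDB with its tdb:location.
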
Cheers, Joachim

> -----Original Message-----
> From: Andy Seaborne <a...@apache.org>
> Sent: Saturday, 12 February 2022 11:15
> To: users@jena.apache.org
> Subject: Re: AW: AW: AW: xloader "Can't find gzip program"
>
> Hi Joachim,
>
> Aside: I've realised why the timestamps are fixed at "2022-01-30 15:03".
>
> The build setup is for repeatable builds of releases. Any build from the X.Y.Z
> release source, with the same JDK, will generate the byte-wise same jar files.
>
> Each release build fixes the timestamp and uses that, and it gets in the POM
> as property <project.build.outputTimestamp>. It only gets updated when a
> release happens; otherwise the POM file is going to get modified several
> times a week.
>
> Thankfully, we have --version on most commands as well.
>
> That's timestamps explained.
>
> ----
>
> You seem to have run the TDB2 xloader, then given the text index builder an
> assembler description for TDB1.
>
> Fuseki with --loc determines the database type by looking at the file layout,
> but assemblers don't.
>
> The version output can be changed to say "TDB1" without too much
> disruption. Small tweak that might have helped show this up earlier.
>
> Andy
>
> On 11/02/2022 23:06, Neubert, Joachim wrote:
> > Sorry, my fault: I've actually had jena-4.4.0 active, not 4.5.0-SNAPSHOT.
> >
> > Now the loading works smoothly:
> >
> > 22:50:10 INFO Load node table = 62 seconds
> > 22:50:10 INFO Load ingest data = 37 seconds
> > 22:50:10 INFO Build index SPO = 7 seconds
> > 22:50:10 INFO Build index POS = 12 seconds
> > 22:50:10 INFO Build index OSP = 9 seconds
> > 22:50:10 INFO Overall 127 seconds
> > 22:50:10 INFO Overall 00h 02m 07s
> > 22:50:10 INFO Triples loaded = 10000000
> > 22:50:10 INFO Quads loaded = 0
> > 22:50:10 INFO Overall Rate 78740 tuples per second
>
> That's output from tdb2.xloader.
>
> At 10m up to 500m (laptop) or maybe 1B (server), triples, also try
> "tdb2.tdbloader --loader=parallel"
>
> > However, the text indexing crashes, when called like that:
> >
> > java -cp $FUSEKI_HOME/fuseki-server.jar jena.textindexer --debug --desc=/tmp/temp.ttl
> >
> > org.apache.jena.assembler.exceptions.AssemblerException: caught: Unable to check TDB lock owner, the lock file contents appear to be for a TDB2 database. Please try loading this location as a TDB2 database. See https://jena.apache.org/documentation/tdb/faqs.html for more information.
> > doing:
> > root: file:///tmp/temp.ttl#dataset with type: http://jena.hpl.hp.com/2008/tdb#DatasetTDB assembler class: class org.apache.jena.tdb.assembler.DatasetAssemblerTDB1
>
> But that is TDB1
>
> > root: http://localhost/jena_example/#text_dataset with type: http://jena.apache.org/text#TextDataset assembler class: class org.apache.jena.query.text.assembler.TextDatasetAssembler
> > ...
> > Caused by: org.apache.jena.tdb.base.file.FileException: Unable to check TDB lock owner, the lock file contents appear to be for a TDB2 database. Please try loading this location as a TDB2 database. See https://jena.apache.org/documentation/tdb/faqs.html for more information.
> >     at org.apache.jena.tdb.base.file.LocationLock.getOwner(LocationLock.java:110)
>
> org.apache.jena.tdb == TDB1
>
> >     at org.apache.jena.tdb.base.file.LocationLock.canObtain(LocationLock.java:139)
> >     at org.apache.jena.tdb.StoreConnection._makeAndCache(StoreConnection.java:262)
> >     at org.apache.jena.tdb.StoreConnection.make(StoreConnection.java:226)
> >     at org.apache.jena.tdb.StoreConnection.make(StoreConnection.java:240)
> >     at org.apache.jena.tdb.transaction.DatasetGraphTransaction.<init>(DatasetGraphTransaction.java:72)
> >     at org.apache.jena.tdb.sys.TDBMaker.createDirect(TDBMaker.java:114)
> ...
>
> >     ... 23 more
> > 2022-02-11 22:50:12 ABORTED
> >
> > cat /var/lib/fuseki/databases/temp/tdb.lock
> > 32907
> >
> > Cheers, Joachim