Sorry, my fault: I actually had jena-4.4.0 active, not 4.5.0-SNAPSHOT.

Now the loading works smoothly:

22:50:10 INFO  Load node table  = 62 seconds
22:50:10 INFO  Load ingest data = 37 seconds
22:50:10 INFO  Build index SPO  = 7 seconds
22:50:10 INFO  Build index POS  = 12 seconds
22:50:10 INFO  Build index OSP  = 9 seconds
22:50:10 INFO  Overall          127 seconds
22:50:10 INFO  Overall          00h 02m 07s
22:50:10 INFO  Triples loaded   = 10000000
22:50:10 INFO  Quads loaded     = 0
22:50:10 INFO  Overall Rate     78740 tuples per second

However, text indexing crashes when called like this:

java -cp $FUSEKI_HOME/fuseki-server.jar jena.textindexer --debug --desc=/tmp/temp.ttl

org.apache.jena.assembler.exceptions.AssemblerException: caught: Unable to check TDB lock owner, the lock file contents appear to be for a TDB2 database.  Please try loading this location as a TDB2 database. See https://jena.apache.org/documentation/tdb/faqs.html for more information.
  doing:
    root: file:///tmp/temp.ttl#dataset with type: http://jena.hpl.hp.com/2008/tdb#DatasetTDB assembler class: class org.apache.jena.tdb.assembler.DatasetAssemblerTDB1
    root: http://localhost/jena_example/#text_dataset with type: http://jena.apache.org/text#TextDataset assembler class: class org.apache.jena.query.text.assembler.TextDatasetAssembler

        at org.apache.jena.assembler.assemblers.AssemblerGroup$PlainAssemblerGroup.openBySpecificType(AssemblerGroup.java:165)
        at org.apache.jena.assembler.assemblers.AssemblerGroup$PlainAssemblerGroup.open(AssemblerGroup.java:144)
        at org.apache.jena.assembler.assemblers.AssemblerGroup$ExpandingAssemblerGroup.open(AssemblerGroup.java:93)
        at org.apache.jena.assembler.assemblers.AssemblerBase.open(AssemblerBase.java:39)
        at org.apache.jena.assembler.assemblers.AssemblerBase.open(AssemblerBase.java:35)
        at org.apache.jena.query.text.assembler.TextDatasetAssembler.open(TextDatasetAssembler.java:67)
        at org.apache.jena.query.text.assembler.TextDatasetAssembler.open(TextDatasetAssembler.java:42)
        at org.apache.jena.assembler.assemblers.AssemblerGroup$PlainAssemblerGroup.openBySpecificType(AssemblerGroup.java:157)
        at org.apache.jena.assembler.assemblers.AssemblerGroup$PlainAssemblerGroup.open(AssemblerGroup.java:144)
        at org.apache.jena.assembler.assemblers.AssemblerGroup$ExpandingAssemblerGroup.open(AssemblerGroup.java:93)
        at org.apache.jena.assembler.assemblers.AssemblerBase.open(AssemblerBase.java:39)
        at org.apache.jena.assembler.assemblers.AssemblerBase.open(AssemblerBase.java:35)
        at org.apache.jena.sparql.core.assembler.AssemblerUtils.build(AssemblerUtils.java:144)
        at org.apache.jena.sparql.core.assembler.AssemblerUtils.build(AssemblerUtils.java:132)
        at org.apache.jena.query.text.TextDatasetFactory.create(TextDatasetFactory.java:38)
        at org.apache.jena.query.text.cmd.textindexer.processModulesAndArgs(textindexer.java:90)
        at org.apache.jena.cmd.CmdArgModule.process(CmdArgModule.java:39)
        at org.apache.jena.cmd.CmdMain.mainMethod(CmdMain.java:86)
        at org.apache.jena.cmd.CmdMain.mainRun(CmdMain.java:56)
        at org.apache.jena.cmd.CmdMain.mainRun(CmdMain.java:43)
        at org.apache.jena.query.text.cmd.textindexer.main(textindexer.java:52)
        at org.apache.jena.query.text.cmd.InitTextCmds.lambda$cmds$1(InitTextCmds.java:26)
        at org.apache.jena.cmd.Cmds.exec(Cmds.java:65)
        at jena.textindexer.main(textindexer.java:25)
Caused by: org.apache.jena.tdb.base.file.FileException: Unable to check TDB lock owner, the lock file contents appear to be for a TDB2 database.  Please try loading this location as a TDB2 database. See https://jena.apache.org/documentation/tdb/faqs.html for more information.
        at org.apache.jena.tdb.base.file.LocationLock.getOwner(LocationLock.java:110)
        at org.apache.jena.tdb.base.file.LocationLock.canObtain(LocationLock.java:139)
        at org.apache.jena.tdb.StoreConnection._makeAndCache(StoreConnection.java:262)
        at org.apache.jena.tdb.StoreConnection.make(StoreConnection.java:226)
        at org.apache.jena.tdb.StoreConnection.make(StoreConnection.java:240)
        at org.apache.jena.tdb.transaction.DatasetGraphTransaction.<init>(DatasetGraphTransaction.java:72)
        at org.apache.jena.tdb.sys.TDBMaker.createDirect(TDBMaker.java:114)
        at java.base/java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1705)
        at org.apache.jena.tdb.sys.TDBMaker._create(TDBMaker.java:100)
        at org.apache.jena.tdb.sys.TDBMaker.createDatasetGraphTransaction(TDBMaker.java:43)
        at org.apache.jena.tdb.TDBFactory._createDatasetGraph(TDBFactory.java:93)
        at org.apache.jena.tdb.TDBFactory.createDatasetGraph(TDBFactory.java:71)
        at org.apache.jena.tdb.assembler.DatasetAssemblerTDB1.make(DatasetAssemblerTDB1.java:55)
        at org.apache.jena.tdb.assembler.DatasetAssemblerTDB1.createDataset(DatasetAssemblerTDB1.java:46)
        at org.apache.jena.sparql.core.assembler.DatasetAssembler.open(DatasetAssembler.java:40)
        at org.apache.jena.sparql.core.assembler.DatasetAssembler.open(DatasetAssembler.java:33)
        at org.apache.jena.assembler.assemblers.AssemblerGroup$PlainAssemblerGroup.openBySpecificType(AssemblerGroup.java:157)
        ... 23 more
2022-02-11 22:50:12 ABORTED

cat /var/lib/fuseki/databases/temp/tdb.lock
32907
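
The error names the root file:///tmp/temp.ttl#dataset with type http://jena.hpl.hp.com/2008/tdb#DatasetTDB, i.e. a TDB1 dataset, while the database was built with the TDB2 loader. A minimal sketch of what a TDB2 declaration for that base dataset might look like follows. The actual contents of temp.ttl are not shown in this mail, so the subject name is taken from the error message and the location is assumed from the tdb.lock path above; the text:TextDataset part of the file is assumed to stay as it is.

```turtle
# Sketch only: the real /tmp/temp.ttl is not reproduced in this thread.
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX tdb2: <http://jena.apache.org/2016/tdb#>

# Base dataset declared as TDB2 instead of tdb:DatasetTDB (TDB1).
# <#dataset> matches the root named in the error message;
# the location is an assumption based on the tdb.lock path above.
<#dataset> rdf:type tdb2:DatasetTDB2 ;
    tdb2:location "/var/lib/fuseki/databases/temp" .
```

Whether jena.textindexer then runs cleanly against this database has not been tested here.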

Cheers, Joachim

> -----Original Message-----
> From: Andy Seaborne <a...@apache.org>
> Sent: Friday, 11 February 2022 23:06
> To: users@jena.apache.org
> Subject: Re: AW: AW: xloader "Can't find gzip program"
> 
> 
> 
> On 11/02/2022 21:38, Neubert, Joachim wrote:
> > Strange - I should have the same version:
> >
> > sudo tar xzvf /usr/local/src/apache-jena-fuseki-4.5.0-20220209.180144-12.tar.gz
> 
> Different jar file : apache-jena-4.5.0-20220209.180144-12 (no Fuseki) but
> weird anyway.
> 
> wget https://repository.apache.org/content/groups/snapshots/org/apache/jena/apache-jena/4.5.0-SNAPSHOT/apache-jena-4.5.0-20220209.180144-12.zip
> 
> then the zip file is:
> 
> 27372309 Feb  9 18:26 apache-jena-4.5.0-20220209.180144-12.zip
> 
> 
> apache-jena-4.5.0-SNAPSHOT/bin/tdb2.tdbloader --version
> 
> Jena:       VERSION: 4.5.0-SNAPSHOT
> Jena:       BUILD_DATE: 2022-02-09T18:01:44Z
> ARQ:        VERSION: 4.5.0-SNAPSHOT
> ARQ:        BUILD_DATE: 2022-02-09T18:01:44Z
> TDB2:       VERSION: 4.5.0-SNAPSHOT
> TDB2:       BUILD_DATE: 2022-02-09T18:01:44Z
> 
> yet the TDB2 jar is dated 30th Jan, as are the files inside it -- can't 
> explain that.
> 
> 294846 Jan 30 15:03
> apache-jena-4.5.0-SNAPSHOT/lib/jena-tdb2-4.5.0-SNAPSHOT.jar
> 
> The tdb2.xloader script is 10485 bytes and has
> 
> SORT_THREADS="2"
> 
> in it.  Is that what your copy of the script has in it?
> 
> I'll clear the Jenkins workspace and schedule a new build.
> 
>       Andy
> 
> >
> > but the jar file is dated Jan 30:
> >
> > ll apache-jena-fuseki-4.5.0-SNAPSHOT/
> > total 35868
> > -rw-r--r-- 1 root root    36975 Jan 30 15:02 LICENSE
> > -rw-r--r-- 1 root root     8914 Jan 30 15:02 NOTICE
> > -rw-r--r-- 1 root root     1151 Jan 30 15:02 README
> > drwxr-xr-x 2 root root      179 Feb 11 20:47 bin
> > -rwxr-xr-x 1 root root    12339 Jan 30 15:02 fuseki
> > -rwxr-xr-x 1 root root     1241 Jan 30 15:02 fuseki-backup
> > -rwxr-xr-x 1 root root     3370 Jan 30 15:02 fuseki-server
> > -rw-r--r-- 1 root root     1264 Jan 30 15:02 fuseki-server.bat
> > -rw-r--r-- 1 root root 36631864 Jan 30 15:02 fuseki-server.jar
> > -rw-r--r-- 1 root root     2217 Jan 30 15:02 fuseki.service
> > -rw-r--r-- 1 root root     2124 Jan 30 15:02 log4j2.properties
> > drwxr-xr-x 4 root root      121 Jan 30 15:02 webapp
> >
> > Cheers, Joachim
> >
> >> -----Original Message-----
> >> From: Andy Seaborne <a...@apache.org>
> >> Sent: Friday, 11 February 2022 22:30
> >> To: users@jena.apache.org
> >> Subject: Re: AW: xloader "Can't find gzip program"
> >>
> >> Works for me - make sure it is the latest dev build (the one down the
> >> bottom)
> >>
> >> I just grabbed apache-jena-4.5.0-20220209.180144-12.zip (2022-02-09)
> >>
> >> and loaded a few million triples with no problems.
> >>
> >> rm -rf DB2
> >> apache-jena-4.5.0-SNAPSHOT/bin/tdb2.xloader --loc DB2 ~/Datasets/BSBM/bsbm-5m.nt.gz
> >>
> >>       Andy
> >>
> >> On 11/02/2022 21:20, Neubert, Joachim wrote:
> >>> Hi Andy,
> >>>
> >>> Thanks! The code of 4.5.0-SNAPSHOT seems to run significantly faster;
> >>> however, the same error occurs at SPO start.
> >>>
> >>> Please let me know if I can help with tracing/reproducing the error.
> >>>
> >>> Cheers, Joachim
> >>>
> >>>> -----Original Message-----
> >>>> From: Andy Seaborne <a...@apache.org>
> >>>> Sent: Friday, 11 February 2022 21:07
> >>>> To: users@jena.apache.org
> >>>> Subject: Re: xloader "Can't find gzip program"
> >>>>
> >>>> Hi Joachim,
> >>>>
> >>>> https://issues.apache.org/jira/browse/JENA-2277
> >>>> https://issues.apache.org/jira/browse/JENA-2279
> >>>>
> >>>> There are two fixes for tdb2.xloader which are now in the
> >>>> development builds:
> >>>>
> >>>> https://repository.apache.org/content/groups/snapshots/org/apache/jena/
> >>>>
> >>>> (these are not official releases and have not been voted on by the
> >>>> PMC)
> >>>>
> >>>> If you could test them and let us know if they work or whether
> >>>> there are further problems, that would be great.
> >>>>
> >>>>        Andy
> >>>>
> >>>>
> >>>> On 11/02/2022 17:53, Neubert, Joachim wrote:
> >>>>> I've just started tests with xloader. It aborts with
> >>>>>
> >>>>> 17:21:56 INFO  Data            :: Triples = 10,000,000 ; Quads = 0
> >>>>> 17:21:57 INFO  =-=-=-=-=-=-=-=
> >>>>> 17:21:57 INFO
> >>>>> 17:21:57 INFO  Build SPO
> >>>>> 17:21:57 INFO  (Very long pause likely at this point)
> >>>>> 17:21:58 INFO  Index           :: Build index SPO
> >>>>> java.lang.RuntimeException: org.apache.jena.tdb2.TDBException: Can't find gzip program
> >>>>>      at org.apache.jena.tdb2.xloader.ProcBuildIndexX.sort_build_index(ProcBuildIndexX.java:207)
> >>>>>      at org.apache.jena.tdb2.xloader.ProcBuildIndexX.buildIndex(ProcBuildIndexX.java:121)
> >>>>>      at org.apache.jena.tdb2.xloader.ProcBuildIndexX.exec2(ProcBuildIndexX.java:106)
> >>>>>      at org.apache.jena.tdb2.xloader.ProcBuildIndexX.exec(ProcBuildIndexX.java:94)
> >>>>>      at tdb2.xloader.CmdxBuildIndex.exec(CmdxBuildIndex.java:80)
> >>>>>      at org.apache.jena.cmd.CmdMain.mainMethod(CmdMain.java:92)
> >>>>>      at org.apache.jena.cmd.CmdMain.mainRun(CmdMain.java:58)
> >>>>>      at org.apache.jena.cmd.CmdMain.mainRun(CmdMain.java:45)
> >>>>>      at tdb2.xloader.CmdxBuildIndex.main(CmdxBuildIndex.java:28)
> >>>>> Caused by: org.apache.jena.tdb2.TDBException: Can't find gzip program
> >>>>>      at org.apache.jena.tdb2.xloader.BulkLoaderX.gzipProgram(BulkLoaderX.java:67)
> >>>>>      at org.apache.jena.tdb2.xloader.ProcBuildIndexX.sort_build_index(ProcBuildIndexX.java:183)
> >>>>>      ... 8 more
> >>>>>
> >>>>> Of course, /usr/bin/gzip is in the path. My configuration is below,
> >>>>> tdb2.xloader was called with --threads=12.
> >>>>>
> >>>>> Any idea what could be wrong?
> >>>>>
> >>>>> Cheers, Joachim
> >>>>>
> >>>>>
> >>>>> Configuration:
> >>>>> openjdk version "11.0.13" 2021-10-19 LTS
> >>>>> OpenJDK Runtime Environment 18.9 (build 11.0.13+8-LTS)
> >>>>> OpenJDK 64-Bit Server VM 18.9 (build 11.0.13+8-LTS, mixed mode, sharing)
> >>>>> JAVA_OPTS: -d64 -Xmx12G
> >>>>> Loader: tdb2.xloader
> >>>>> Jena:       VERSION: 4.4.0
> >>>>> Jena:       BUILD_DATE: 2022-01-30T15:09:41Z
> >>>>> ARQ:        VERSION: 4.4.0
> >>>>> ARQ:        BUILD_DATE: 2022-01-30T15:09:41Z
> >>>>> TDB:        VERSION: 4.4.0
> >>>>> TDB:        BUILD_DATE: 2022-01-30T15:09:41Z
> >>>>>
> >>>>> Use fuseki tdb2.xloader on file /zbw/var/wikidata/2022-02-03/rdf/test.nt.gz
> >>>>> 17:20:13 INFO  Setup:
> >>>>> 17:20:13 INFO    Database: /zbw/var/lib/fuseki/databases/temp
> >>>>> 17:20:13 INFO    Data:     /zbw/var/wikidata/2022-02-03/rdf/test.nt.gz
> >>>>> 17:20:13 INFO    TMPDIR:   /zbw/var/lib/fuseki/databases/temp
> >>>>> 17:20:13 INFO
> >>>>> 17:20:13 INFO  Load node table
> >>>>>
> >>>>>
> >>>>> --
> >>>>> Joachim Neubert
> >>>>>
> >>>>> ZBW - Leibniz Information Centre for Economics
> >>>>> Neuer Jungfernstieg 21
> >>>>> 20354 Hamburg
> >>>>> Phone +49-40-42834-462
> >>>>>
> >>>>>
