Works for me. Make sure you have the latest dev build (the one at the bottom).

I just grabbed apache-jena-4.5.0-20220209.180144-12.zip (2022-02-09)

and loaded a few million triples with no problems.

rm -rf DB2
apache-jena-4.5.0-SNAPSHOT/bin/tdb2.xloader --loc DB2 ~/Datasets/BSBM/bsbm-5m.nt.gz

    Andy

On 11/02/2022 21:20, Neubert, Joachim wrote:
Hi Andy,

Thanks! The 4.5.0-SNAPSHOT code seems to run significantly faster. However, I
still get the same error at the start of the SPO index build.

Please let me know if I can help with tracing/reproducing the error.

Cheers, Joachim

-----Original Message-----
From: Andy Seaborne <a...@apache.org>
Sent: Friday, 11 February 2022 21:07
To: users@jena.apache.org
Subject: Re: xloader "Can't find gzip program"

Hi Joachim,

https://issues.apache.org/jira/browse/JENA-2277
https://issues.apache.org/jira/browse/JENA-2279

There are two fixes for tdb2.xloader which are now in the development
builds:

https://repository.apache.org/content/groups/snapshots/org/apache/jena/

(these are not official releases and have not been voted on by the PMC)

If you could test them and let us know if they work or whether there are
further problems, that would be great.

      Andy


On 11/02/2022 17:53, Neubert, Joachim wrote:
I've just started tests with xloader. It aborts with

17:21:56 INFO  Data            :: Triples = 10,000,000 ; Quads = 0
17:21:57 INFO  =-=-=-=-=-=-=-=
17:21:57 INFO
17:21:57 INFO  Build SPO
17:21:57 INFO  (Very long pause likely at this point)
17:21:58 INFO  Index           :: Build index SPO
java.lang.RuntimeException: org.apache.jena.tdb2.TDBException: Can't find gzip program
    at org.apache.jena.tdb2.xloader.ProcBuildIndexX.sort_build_index(ProcBuildIndexX.java:207)
    at org.apache.jena.tdb2.xloader.ProcBuildIndexX.buildIndex(ProcBuildIndexX.java:121)
    at org.apache.jena.tdb2.xloader.ProcBuildIndexX.exec2(ProcBuildIndexX.java:106)
    at org.apache.jena.tdb2.xloader.ProcBuildIndexX.exec(ProcBuildIndexX.java:94)
    at tdb2.xloader.CmdxBuildIndex.exec(CmdxBuildIndex.java:80)
    at org.apache.jena.cmd.CmdMain.mainMethod(CmdMain.java:92)
    at org.apache.jena.cmd.CmdMain.mainRun(CmdMain.java:58)
    at org.apache.jena.cmd.CmdMain.mainRun(CmdMain.java:45)
    at tdb2.xloader.CmdxBuildIndex.main(CmdxBuildIndex.java:28)
Caused by: org.apache.jena.tdb2.TDBException: Can't find gzip program
    at org.apache.jena.tdb2.xloader.BulkLoaderX.gzipProgram(BulkLoaderX.java:67)
    at org.apache.jena.tdb2.xloader.ProcBuildIndexX.sort_build_index(ProcBuildIndexX.java:183)
    ... 8 more

Of course, /usr/bin/gzip is in the path. My configuration is below,
tdb2.xloader was called with --threads=12.
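
For reference, a minimal sketch of how the PATH claim can be double-checked
from the same shell and user account that launches the loader. The helper name
find_prog is hypothetical, and this only shows what the shell resolves; whether
tdb2.xloader looks up gzip via PATH at all is not something this thread
establishes.

```shell
#!/bin/sh
# find_prog is a hypothetical helper, not part of Jena.
# It prints the resolved path of a program and returns non-zero if the
# shell cannot find it on PATH.
find_prog() {
    command -v "$1" 2>/dev/null
}

# Report what this shell resolves for gzip, plus the PATH it searched.
if path=$(find_prog gzip); then
    echo "gzip -> $path"
else
    echo "gzip -> NOT FOUND on PATH"
fi
echo "PATH = $PATH"
```

Running this under the account and environment used for the load (for example a
service user or a cron job) may give a different answer than an interactive
login shell.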

Any idea what could be wrong?

Cheers, Joachim


Configuration:
openjdk version "11.0.13" 2021-10-19 LTS
OpenJDK Runtime Environment 18.9 (build 11.0.13+8-LTS)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.13+8-LTS, mixed mode, sharing)
JAVA_OPTS: -d64 -Xmx12G
Loader: tdb2.xloader
Jena:       VERSION: 4.4.0
Jena:       BUILD_DATE: 2022-01-30T15:09:41Z
ARQ:        VERSION: 4.4.0
ARQ:        BUILD_DATE: 2022-01-30T15:09:41Z
TDB:        VERSION: 4.4.0
TDB:        BUILD_DATE: 2022-01-30T15:09:41Z

Use fuseki tdb2.xloader on file /zbw/var/wikidata/2022-02-03/rdf/test.nt.gz
17:20:13 INFO  Setup:
17:20:13 INFO    Database: /zbw/var/lib/fuseki/databases/temp
17:20:13 INFO    Data:     /zbw/var/wikidata/2022-02-03/rdf/test.nt.gz
17:20:13 INFO    TMPDIR:   /zbw/var/lib/fuseki/databases/temp
17:20:13 INFO
17:20:13 INFO  Load node table


--
Joachim Neubert

ZBW - Leibniz Information Centre for Economics
Neuer Jungfernstieg 21
20354 Hamburg
Phone +49-40-42834-462

