This is an automated email from the ASF dual-hosted git repository.

andy pushed a commit to branch xloader-threads
in repository https://gitbox.apache.org/repos/asf/jena-site.git

commit 25206246fd27f2a0b443e43c528113a82f641a42
Author: Andy Seaborne <a...@apache.org>
AuthorDate: Fri Jan 14 16:17:46 2022 +0000

    xloader --thread argument
---
 source/documentation/tdb/tdb-xloader.md | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/source/documentation/tdb/tdb-xloader.md b/source/documentation/tdb/tdb-xloader.md
index 82c8878..f23056d 100644
--- a/source/documentation/tdb/tdb-xloader.md
+++ b/source/documentation/tdb/tdb-xloader.md
@@ -6,11 +6,12 @@ TDB xloader ("x" for external) is a bulkloader for very large datasets. The goal
 is stability and reliability for long running loading, running on modest
 hardware and can be use to load a database on rotating disk or SSD.

-xloader is not a replacement for regular TDB1 and TDB2 loaders.
+`xloader` is not a replacement for regular TDB1 and TDB2 loaders. It is for very
+large datasets.

 There are two scripts to load data using the xloader subsystem.

-"tdb1.xloader", which was called "tdbloader2" and has some improvements.
+"tdb1.xloader", which was called "tdbloader2", has some improvements.

 It is not as fast as other TDB loaders on dataset where the general loaders work
 without encountering progressive slowdown.
@@ -40,6 +41,11 @@ temporary files.
 `FILE` is any RDF syntax supported by Jena. Syntax is determined by the file
 extension and can include an addtional ".gz" or ".bz2" for compressed files.

+`tdb2.xloader also supports `--threads` to set the number of threads to use with
+`sort(1)`. The default is 2. The recommendation for an inital setting is to set
+it to the number of cores (not hardware threads) minus 1. This is sensitive to
+the hardware environment. Experimentation may show a different best setting.
+
 ### Advice

 To avoid a load failing due to a syntax or other data error, it is advisable to
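
The `--threads` guidance added in this commit (number of cores, not hardware threads, minus 1) could be turned into a starting value along the following lines. This is only a sketch: `xloader_threads` and `physical_cores` are hypothetical helper names, not part of Jena, and the core count assumes a Linux system with `lscpu` available.

```shell
# Sketch only: derive a starting value for tdb2.xloader --threads.
# xloader_threads and physical_cores are hypothetical helpers, not part of Jena.

# Print "cores minus 1" for the given core count, never less than 1.
xloader_threads() {
    cores=$1
    if [ "$cores" -gt 1 ]; then
        echo $((cores - 1))
    else
        echo 1
    fi
}

# Assumption: Linux with lscpu. Counts unique (core, socket) pairs, which
# gives physical cores rather than hardware threads. Prints 0 if lscpu
# is not available.
physical_cores() {
    lscpu -p=Core,Socket 2>/dev/null | grep -v '^#' | sort -u | wc -l
}
```

With these definitions, `tdb2.xloader --threads "$(xloader_threads "$(physical_cores)")" --loc DB data.nt.gz` would be one way to start a load, where `DB` and `data.nt.gz` are placeholder names. As the text notes, the best setting depends on the hardware, so this value is a first guess to experiment from.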