Did you try to point the wdqs copy to your tdb/fuseki endpoint?

On Thu, 7 Dec 2017 at 18:58, Andy Seaborne <a...@apache.org> wrote:
> Dell XPS 13 (model 9350) - the 2015 model.
> Ubuntu 17.10, not a VM.
> 1T SSD.
> 16G RAM.
> Two volumes = root and user.
> Swappiness = 10
>
> java version "1.8.0_151" (OpenJDK)
>
> Data: latest-truthy.nt.gz (version of 2017-11-24)
>
> == TDB1, tdbloader2
>
> 8 hours // 76,164 TPS
>
> Using SORT_ARGS: --temporary-directory=/home/afs/Datasets/tmp
> to make sure the temporary files are on the large volume.
>
> The run took 28877 seconds and resulted in a 173G database.
>
> All the index files are the same size.
>
> node2id: 12G
> OSP: 53G
> SPO: 53G
> POS: 53G
>
> Algorithm:
>
> Data phase:
>
> parse the file, create the node table and a temporary file of all
> triples (3 x 64-bit numbers, written as text).
>
> Index phase:
>
> for each index, sort the temp file (using sort(1), an external sort
> utility), and make the index file by writing the sorted results, filling
> the data blocks and creating any tree blocks needed. This is a
> stream-write process - calculate the data block, write it out when full
> and never touch it again.
>
> This results in data blocks being completely full, unlike the standard
> B+Tree insertion algorithm. It is why the indexes are exactly the same
> size.
>
> Building SPO is faster because the data is nearly sorted to start with.
> Data often tends to arrive grouped by subject.
>
> tdbloader2 is doing stream (append) I/O on index files, not a random
> access pattern.
>
> == TDB1 tdbloader1
>
> 29 hours 43 minutes // 20,560 TPS
>
> 106,975 seconds
> 297G DB-truthy
>
> node2id: 12G
> OSP: 97G
> SPO: 96G
> POS: 98G
>
> Same size node2id table, larger indexes.
>
> Algorithm:
>
> Data phase:
>
> parse the file and create the node table and the SPO index.
> The creation of SPO is by B+Tree insert, so blocks are partially full
> (the average is empirically about 2/3 full). When a block fills up, it
> is split into 2. The node table is exactly the same as tdbloader2
> because nodes are stored in the same order.
>
> Index phase:
>
> for each index, copy SPO to the index. This is a tree sort and the
> access pattern on blocks is fairly random, which is a bad thing. Doing
> one at a time is faster than two together because more RAM in the
> OS-managed file system cache is devoted to caching one index. A cache
> miss is a possible write to disk, and always a read from disk, which is
> a lot of work even with an SSD.
>
> Stream reading SPO is efficient - it is not random I/O, it is stream I/O.
>
> Once the cache efficiency of the OS disk cache drops, tdbloader slows
> down markedly.
>
> == Comparison of TDB1 loaders.
>
> Building an index is a sort, because the B+Trees hold data sorted.
>
> The approach of tdbloader2 is to use an external sort algorithm (i.e.
> sort larger than RAM using temporary files) done by a highly tuned
> utility, unix sort(1).
>
> The approach of tdbloader1 is to copy into a sorted data structure. For
> example, when copying index SPO to POS, it is creating a file with keys
> sorted by P then O then S, which is not the arrival order (the input is
> S-sorted). tdbloader1 maximises OS caching of memory-mapped files by
> doing indexes one at a time. Experimentation shows that doing two at
> once is slower, and doing two in parallel is no better, and sometimes
> worse, than doing them sequentially.
>
> == TDB2
>
> TDB2 is experimental. The current TDB2 loader is a functional
> placeholder.
>
> It is writing all three indexes at the same time. While for SPO this is
> not a bad access pattern (subjects are naturally grouped), for POS and
> OSP the I/O is a random pattern, not a stream pattern.
> There is more than double contention for OS disk cache, hence it is
> slow and gets slower faster.
>
> == More details.
>
> For more information, consult the Jena dev@ and user@ archives and the
> code.

--

---
Marco Neumann
KONA
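To make the tdbloader2 index phase described above concrete, here is a minimal, self-contained Java sketch of the same idea. It is not the Jena/TDB code; the paths, block size and sort(1) flags are illustrative assumptions. Triples already encoded as three numeric node ids (one "S P O" line each) are sorted externally with sort(1) in the key order of the target index, then stream-packed into fixed-size blocks that are appended once, completely full, and never revisited.

    // Sketch only: external sort with sort(1), then stream-pack full blocks.
    // Not Jena/TDB code; paths, block size and sort flags are illustrative.
    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.nio.file.StandardOpenOption;

    public class StreamPackSketch {

        static final int TRIPLES_PER_BLOCK = 100;   // pretend a data block holds 100 triples

        public static void main(String[] args) throws IOException, InterruptedException {
            Path tmpDir   = Paths.get("/home/afs/Datasets/tmp");   // large volume, as with SORT_ARGS
            Path triples  = tmpDir.resolve("triples.txt");          // one "S P O" id line per triple
            Path posIndex = tmpDir.resolve("POS.dat");

            // 1. External sort: key order P, then O, then S for the POS index.
            Process sort = new ProcessBuilder(
                    "sort", "--temporary-directory=" + tmpDir,
                    "-k2,2n", "-k3,3n", "-k1,1n",
                    triples.toString()).start();

            // 2. Stream-write the sorted triples into blocks: fill a block,
            //    append it, never touch it again.  Every block is 100% full.
            try (BufferedReader in = new BufferedReader(
                         new InputStreamReader(sort.getInputStream()));
                 FileChannel out = FileChannel.open(posIndex,
                         StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {

                ByteBuffer block = ByteBuffer.allocate(TRIPLES_PER_BLOCK * 3 * Long.BYTES);
                String line;
                while ((line = in.readLine()) != null) {
                    String[] id = line.trim().split("\\s+");
                    block.putLong(Long.parseLong(id[1]));   // P
                    block.putLong(Long.parseLong(id[2]));   // O
                    block.putLong(Long.parseLong(id[0]));   // S
                    if (!block.hasRemaining()) {            // block full: write once
                        block.flip();
                        out.write(block);
                        block.clear();
                    }
                }
                if (block.position() > 0) {                 // last, possibly partial, block
                    block.flip();
                    out.write(block);
                }
            }
            if (sort.waitFor() != 0)
                throw new IOException("sort(1) failed");
        }
    }

Because every block is written exactly once and in order, the I/O on the index file is pure append, which is why tdbloader2's indexes come out the same size and the load finishes so much faster.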
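The tdbloader1 slowdown is essentially a cache-locality effect: copying SPO into POS or OSP touches the target's blocks in what is, from the target's point of view, a random order. The toy model below (again not Jena code; the block size, cache size and key count are made-up numbers) pushes the same set of keys through a small LRU block cache twice, once in index order and once shuffled, and counts the misses, each miss standing in for a read from disk and a possible write-back.

    // Toy model of B+Tree-building cache locality; not Jena/TDB code.
    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.LinkedHashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.Random;

    public class CacheLocalitySketch {

        static final int KEYS_PER_BLOCK = 100;     // illustrative leaf-block capacity
        static final int CACHE_BLOCKS   = 1_000;   // cache far smaller than the index

        // Count misses when the key sequence is pushed through an LRU block cache.
        static long countMisses(List<Integer> keys) {
            LinkedHashMap<Integer, Boolean> lru =
                    new LinkedHashMap<Integer, Boolean>(CACHE_BLOCKS, 0.75f, true) {
                        @Override
                        protected boolean removeEldestEntry(Map.Entry<Integer, Boolean> e) {
                            return size() > CACHE_BLOCKS;
                        }
                    };
            long misses = 0;
            for (int key : keys) {
                int block = key / KEYS_PER_BLOCK;   // the block this key lands in
                if (!lru.containsKey(block)) {      // miss: "read from disk"
                    misses++;
                }
                lru.put(block, Boolean.TRUE);       // touch the block (updates LRU order)
            }
            return misses;
        }

        public static void main(String[] args) {
            int n = 2_000_000;                      // 20,000 blocks, 20x the cache size
            List<Integer> inOrder = new ArrayList<>(n);
            for (int i = 0; i < n; i++) inOrder.add(i);

            List<Integer> random = new ArrayList<>(inOrder);
            Collections.shuffle(random, new Random(42));   // same keys, unrelated arrival order

            System.out.println("in-order arrival (SPO-like): " + countMisses(inOrder) + " misses");
            System.out.println("random arrival  (POS-like) : " + countMisses(random)  + " misses");
        }
    }

In this model the in-order pass misses only once per block, while the shuffled pass misses on roughly 95% of accesses, which is the "one index at a time keeps the OS cache effective" point above, and also why a loader writing all three indexes at once degrades even sooner.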