Re: Testing tdb2.xloader

2021-12-16 Thread Andy Seaborne
On 16/12/2021 10:52, Andy Seaborne wrote: ... I am getting a slowdown during data ingestion. However, your summary figures don't show that in the ingest phase. The full logs may have the signal in them, but less pronounced. My working assumption is now that it is random access to the node t

Re: Testing tdb2.xloader

2021-12-16 Thread Andy Seaborne
On 16/12/2021 12:32, LB wrote: I couldn't get access to the full log as the output was too verbose for the screen and I forgot to pipe into a file ... Yes - familiar ... Maybe xloader should capture its logging. I can confirm the triples.tmp.gz size was something around 35-40G if I rememb

Re: Testing tdb2.xloader

2021-12-16 Thread LB
I couldn't get access to the full log as the output was too verbose for the screen and I forgot to pipe into a file ... I can confirm the triples.tmp.gz size was something around 35-40G if I remember correctly. I am rerunning the load now to a) keep logs and b) see if increasing the number of threa

Re: Testing tdb2.xloader

2021-12-16 Thread Andy Seaborne
Awesome! I'm really pleased to hear the news. That's better than I feared at this scale! How big is triples.tmp.gz? Twice that size plus the database size is the peak storage space used. My estimate is about 40G, making 604G overall. I'd appreciate having the whole log file. Could you email it to
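The arithmetic behind that estimate can be sketched as follows. This is only an illustration, assuming the 524G total reported for wikidata-tdb/Data-0001 elsewhere in this thread is the database size and taking 40G as the triples.tmp.gz estimate; the function name is hypothetical and not part of Jena.

```python
def peak_storage_gb(triples_tmp_gz_gb, database_gb):
    """Peak storage as described in the message above:
    twice the size of triples.tmp.gz plus the final database size."""
    return 2 * triples_tmp_gz_gb + database_gb


# Assumed inputs: 40G estimate for triples.tmp.gz, 524G for Data-0001.
estimate = peak_storage_gb(40, 524)
print(f"Estimated peak storage: {estimate}G")
```

With those two figures the result matches the 604G quoted in the message.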

Re: Testing tdb2.xloader

2021-12-16 Thread Marco Neumann
Thank you Lorenz, I am running this test myself again now with a larger disk. You may want to consider running a full load of Wikidata as well. The timing info and disk space you have should be sufficient. Did we figure out a place to post the parser messages? Marco On Thu, Dec 16, 2021 at 10:0

Re: Testing tdb2.xloader

2021-12-16 Thread LB
Sure
wikidata-tdb/Data-0001:
total 524G
-rw-r--r-- 1   24 Dez 15 05:41 GOSP.bpt
-rw-r--r-- 1 8,0M Dez 14 12:21 GOSP.dat
-rw-r--r-- 1 8,0M Dez 14 12:21 GOSP.idn
-rw-r--r-- 1   24 Dez 15 05:41 GPOS.bpt
-rw-r--r-- 1 8,0M Dez 14 12:21 GPOS.dat
-rw-r--r-- 1 8,0M Dez 14 12:21 GPOS.idn
-rw-r--r-- 1   2

Re: Testing tdb2.xloader

2021-12-16 Thread Marco Neumann
Thank you Lorenz, can you please post a directory list for Data-0001 with file sizes.
On Thu, Dec 16, 2021 at 8:49 AM LB wrote:
> Loading of latest WD truthy dump (6.6 billion triples) Bzip2 compressed:
>
> Server:
>
> AMD Ryzen 9 5950X (16C/32T)
> 128 GB DDR4 ECC RAM
> 2 x 3.84 TB NVMe SSD
>

[jira] [Created] (JENA-2220) Move Apache HttpClient-related code into jena-jdbc

2021-12-16 Thread Andy Seaborne (Jira)
Andy Seaborne created JENA-2220: --- Summary: Move Apache HttpClient-related code into jena-jdbc Key: JENA-2220 URL: https://issues.apache.org/jira/browse/JENA-2220 Project: Apache Jena Issue Type

[jira] [Created] (JENA-2221) Remove usage of Apache HttpClient in transition code.

2021-12-16 Thread Andy Seaborne (Jira)
Andy Seaborne created JENA-2221: --- Summary: Remove usage of Apache HttpClient in transition code. Key: JENA-2221 URL: https://issues.apache.org/jira/browse/JENA-2221 Project: Apache Jena Issue T

Re: Testing tdb2.xloader

2021-12-16 Thread LB
Loading of latest WD truthy dump (6.6 billion triples) Bzip2 compressed:
Server:
AMD Ryzen 9 5950X (16C/32T)
128 GB DDR4 ECC RAM
2 x 3.84 TB NVMe SSD
Environment:
- Ubuntu 20.04.3 LTS
- OpenJDK Runtime Environment (build 11.0.11+9-Ubuntu-0ubuntu2.20.04)
- Jena 4.3.1
Command:
tools/apache-