For your information,

a) It took 10.2 days to load the Wikidata RDF dump (wikidata-20190513-all-BETA.ttl, 379G) into Blazegraph 2.1.5.
The resulting bigdata.jnl file is 1.3T.

Server technical features:

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                16
On-line CPU(s) list:   0-15
Thread(s) per core:    2
Core(s) per socket:    8
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 79
Model name:            Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz
Stepping:              1
CPU MHz:               1200.476
CPU max MHz:           3000.0000
CPU min MHz:           1200.0000
BogoMIPS:              4197.65
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              20480K
RAM:                   128G

b) It took 43 hours to load the Wikidata RDF dump (wikidata-20190610-all-BETA.ttl, 383G) into the dev version of Virtuoso 07.20.3230.
I had to patch Virtuoso because it gave the following error each time I loaded the RDF data:

09:58:06 PL LOG: File /backup/wikidata-20190610-all-BETA.ttl error 42000 TURTLE RDF loader, line 2984680: RDFGE: RDF box with a geometry RDF type and a non-geometry content

The resulting virtuoso.db file is 340G.

Server technical features:

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                12
On-line CPU(s) list:   0-11
Thread(s) per core:    2
Core(s) per socket:    6
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 63
Model name:            Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz
Stepping:              2
CPU MHz:               1199.920
CPU max MHz:           3800.0000
CPU min MHz:           1200.0000
BogoMIPS:              6984.39
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              15360K
NUMA node0 CPU(s):     0-11
RAM:                   128G

Best,

On Tue, 4 Jun 2019 at 16:37, Vi to <vituzzu.w...@gmail.com> wrote:
>
> V4 has 8 cores instead of 6.
>
> But well, it's a server grade config on purpose!
>
> Vito
>
> On Tue, 4 Jun 2019 at 16:32, Guillaume Lederrey
> <gleder...@wikimedia.org> wrote:
>>
>> On Tue, Jun 4, 2019 at 3:14 PM Vi to <vituzzu.w...@gmail.com> wrote:
>> >
>> > AFAIR it's a double Xeon E5-2620 v3.
>> > With modern CPUs frequency is not so significant.
>>
>> Our latest batch of servers are: Intel(R) Xeon(R) CPU E5-2620 v4 @
>> 2.10GHz (so v4 instead of v3, but the difference is probably minimal).
>>
>> > Vito
>> >
>> > On Tue, 4 Jun 2019 at 13:00, Adam Sanchez
>> > <a.sanche...@gmail.com> wrote:
>> >>
>> >> Thanks Guillaume!
>> >> One question more, what is the CPU frequency (GHz)?
>> >>
>> >> On Tue, 4 Jun 2019 at 12:25, Guillaume Lederrey
>> >> <gleder...@wikimedia.org> wrote:
>> >> >
>> >> > On Tue, Jun 4, 2019 at 12:18 PM Adam Sanchez <a.sanche...@gmail.com>
>> >> > wrote:
>> >> > >
>> >> > > Hello,
>> >> > >
>> >> > > Does somebody know the minimal hardware requirements (disk size and
>> >> > > RAM) for loading wikidata dump in Blazegraph?
>> >> >
>> >> > The actual hardware requirements will depend on your use case. But for
>> >> > comparison, our production servers are:
>> >> >
>> >> > * 16 cores (hyper threaded, 32 threads)
>> >> > * 128G RAM
>> >> > * 1.5T of SSD storage
>> >> >
>> >> > > The downloaded dump file wikidata-20190513-all-BETA.ttl is 379G.
>> >> > > The bigdata.jnl file which stores all the triples data in Blazegraph
>> >> > > is 478G but still growing.
>> >> > > I had 1T disk but is almost full now.
>> >> >
>> >> > The current size of our jnl file in production is ~670G.
>> >> >
>> >> > Hope that helps!
>> >> >
>> >> > Guillaume
>> >> >
>> >> > > Thanks,
>> >> > >
>> >> > > Adam
>> >> > >
>> >> > > _______________________________________________
>> >> > > Wikidata mailing list
>> >> > > Wikidata@lists.wikimedia.org
>> >> > > https://lists.wikimedia.org/mailman/listinfo/wikidata
>> >> >
>> >> > --
>> >> > Guillaume Lederrey
>> >> > Engineering Manager, Search Platform
>> >> > Wikimedia Foundation
>> >> > UTC+2 / CEST
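
For a rough back-of-the-envelope comparison of the two loads reported at the top of this message, here is a minimal Python sketch. The class and field names are hypothetical, the sizes and times are taken as reported, and 1T is assumed to equal 1024G. The two runs used different dumps and different servers, so this is only an indication, not a controlled benchmark.

```python
from dataclasses import dataclass

G_PER_T = 1024  # assumption: sizes are binary units, so 1T = 1024G


@dataclass
class LoadRun:
    """One bulk-load run as reported in this message (names are illustrative)."""
    store: str
    dump_g: float      # size of the .ttl dump, in G
    store_g: float     # size of the resulting database file, in G
    load_hours: float  # wall-clock load time, in hours

    @property
    def expansion(self) -> float:
        # How large the on-disk store is relative to the input dump.
        return self.store_g / self.dump_g

    @property
    def rate_g_per_hour(self) -> float:
        # Average amount of dump data ingested per hour.
        return self.dump_g / self.load_hours


# Figures from items a) and b) of this message.
blazegraph = LoadRun("Blazegraph 2.1.5", 379, 1.3 * G_PER_T, 10.2 * 24)
virtuoso = LoadRun("Virtuoso 07.20.3230 (dev)", 383, 340, 43)

for run in (blazegraph, virtuoso):
    print(f"{run.store}: store/dump = {run.expansion:.2f}, "
          f"load rate = {run.rate_g_per_hour:.2f} G/hour")
```

With these inputs, the Blazegraph journal comes out at roughly 3.5 times the size of the dump, while the Virtuoso database is roughly 0.9 times the size of its dump, and the Blazegraph load took about 5.7 times as long.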