Hi,

I was able to reduce the load time to approximately 9.1 hours (32,890,338 msec) in Virtuoso 7. I used six SSDs of 1 TB each in RAID 0 (mdadm software RAID; I have not tried hardware RAID).

The virtuoso.ini for 256 GB of RAM is at
https://gist.github.com/asanchez75/58d5aed504051c7fbf9af0921c3c9130

I downloaded the dump from https://dumps.wikimedia.org/wikidatawiki/entities/latest-all.ttl.gz on August 30th. It is 387 GB uncompressed, and the resulting virtuoso.db file is 362 GB. The total number of triples is 9,470,700,617.

Have a look at the simple patch here (it is just a workaround):
https://github.com/asanchez75/virtuoso-opensource/commit/5d7b1b9b29e53cb8a25bed69f512a150f9f05d50

You can build your own Docker image with that patch using
https://github.com/asanchez75/docker-virtuoso/tree/brendan

Check the Dockerfile, which retrieves the patch from my forked Virtuoso git repository:
https://github.com/asanchez75/docker-virtuoso/blob/brendan/Dockerfile
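As a rough sanity check on the figures above, here is a small Python sketch that converts the reported load time to hours and derives the average load rate. The function names are my own, chosen for illustration. The 43-hour figure is the one from the spreadsheet quoted further down in this thread; that run used different hardware (128G RAM) and an older dump, so the ratio is only indicative.

```python
# Figures reported in this message.
LOAD_TIME_MSEC = 32_890_338
TOTAL_TRIPLES = 9_470_700_617

# Load time from the spreadsheet quoted in this thread
# (different machine and an older dump, so not a like-for-like comparison).
REFERENCE_LOAD_HOURS = 43


def load_hours(msec: int) -> float:
    """Convert a duration in milliseconds to hours."""
    return msec / 3_600_000


def triples_per_second(triples: int, msec: int) -> float:
    """Average load rate over the whole run."""
    return triples / (msec / 1000)


hours = load_hours(LOAD_TIME_MSEC)
rate = triples_per_second(TOTAL_TRIPLES, LOAD_TIME_MSEC)
ratio = REFERENCE_LOAD_HOURS / hours

print(f"load time: {hours:.2f} h")
print(f"average rate: {rate:,.0f} triples/s")
print(f"ratio to the 43 h run: {ratio:.1f}x")
```

This works out to about 9.14 hours, an average of roughly 288,000 triples per second, and a load time about 4.7 times shorter than the 43-hour run.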
Best,

On Sun, 1 Sep 2019 at 13:38, Edgar Meij <edgar.m...@gmail.com> wrote:

> Thanks for this, Kingsley.
>
> Based on
> https://docs.google.com/spreadsheets/d/1-stlTC_WJmMU3xA_NxA1tSLHw6_sbpjff-5OITtrbFw/edit#gid=1799898600
> (copy-pasted below), it seems that it takes 43 hours to load, is that
> correct?
>
> Also, what is the "patch for geometry" mentioned there? I'm assuming that
> is the patch meant to address
> https://github.com/openlink/virtuoso-opensource/issues/295 and
> https://community.openlinksw.com/t/non-terrestrial-geo-literals/359,
> correct? Is it simply disabling the data validation code? Can you share the
> patch?
>
> Thanks,
> Edgar
>
>
> Other Information
> Architecture                     x86_64
> CPU op-mode(s)                   32-bit, 64-bit
> Byte Order                       Little Endian
> CPU(s)                           12.00
> On-line CPU(s) list              0-11
> Thread(s) per core               2.00
> Core(s) per socket               6.00
> Socket(s)                        1.00
> NUMA node(s)                     1.00
> Vendor ID                        GenuineIntel
> CPU family                       6.00
> Model                            63.00
> Model name                       Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz
> Stepping                         2.00
> CPU MHz                          1,199.92
> CPU max MHz                      3,800.00
> CPU min MHz                      1,200.00
> BogoMIPS                         6,984.39
> Virtualization                   VT-x
> L1d cache                        32K
> L1i cache                        32K
> L2 cache                         256K
> L3 cache                         15360K
> NUMA node0 CPU(s)                0-11
> RAM                              128G
> wikidata-20190610-all-BETA.ttl   383G
> Virtuoso version                 07.20.3230 (with patch for geometry)
> Time to load                     43 hours
> virtuoso.db                      340G
>
> On Wed, Aug 14, 2019 at 12:10 AM Kingsley Idehen <kide...@openlinksw.com>
> wrote:
>
>> Hi Everyone,
>>
>> A little FYI.
>>
>> We have loaded Wikidata into a Virtuoso instance accessible via SPARQL
>> [1]. One benefit is helping to understand Wikidata using our Faceted
>> Browsing Interface for Entity Relationship Types [2][3].
>>
>> Links:
>>
>> [1] http://wikidata.demo.openlinksw.com/sparql -- SPARQL endpoint
>>
>> [2] http://wikidata.demo.openlinksw.com/fct -- Faceted Browsing Interface
>>
>> [3] About New York
>> <https://wikidata.demo.openlinksw.com/describe/?url=http%3A%2F%2Fwww.wikidata.org%2Fentity%2FQ60&gp=16&go=&lp=940&invfp=IFP_OFF&sas=SAME_AS_OFF&distinct=1>
>>
>> Enjoy!
>>
>> Feedback always welcome too :)
>>
>> --
>> Regards,
>>
>> Kingsley Idehen
>> Founder & CEO
>> OpenLink Software
>> Home Page: http://www.openlinksw.com
>> Community Support: https://community.openlinksw.com
>> Weblogs (Blogs):
>> Company Blog: https://medium.com/openlink-software-blog
>> Virtuoso Blog: https://medium.com/virtuoso-blog
>> Data Access Drivers Blog:
>> https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers
>>
>> Personal Weblogs (Blogs):
>> Medium Blog: https://medium.com/@kidehen
>> Legacy Blogs: http://www.openlinksw.com/blog/~kidehen/
>>               http://kidehen.blogspot.com
>>
>> Profile Pages:
>> Pinterest: https://www.pinterest.com/kidehen/
>> Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
>> Twitter: https://twitter.com/kidehen
>> Google+: https://plus.google.com/+KingsleyIdehen/about
>> LinkedIn: http://www.linkedin.com/in/kidehen
>>
>> Web Identities (WebID):
>> Personal: http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
>>         :
>> http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this
>>
>> _______________________________________________
>> Wikidata mailing list
>> Wikidata@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>>