Hi

I was able to reduce the load time to approximately 9.1 hours
(32,890,338 msec) in Virtuoso 7.
I used 6 SSDs of 1 TB each in RAID 0 (mdadm software RAID; I have not
tried hardware RAID).
The virtuoso.ini for 256 GB of RAM is here:
https://gist.github.com/asanchez75/58d5aed504051c7fbf9af0921c3c9130
I downloaded the dump from
https://dumps.wikimedia.org/wikidatawiki/entities/latest-all.ttl.gz
on August 30th.
The dump is 387G uncompressed, and the resulting virtuoso.db file is 362G.
The total number of triples is 9,470,700,617.
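
As a rough sanity check on these figures, the small Python sketch below
converts the reported load time to hours and derives an average load rate.
The variable and function names are only illustrative. The comparison with
the 43 hours quoted further down is indicative at best, since that run used
different hardware (128G RAM) and an older dump.

```python
# Figures taken from this thread; nothing here is measured by the script.
LOAD_TIME_MSEC = 32_890_338      # load time reported for this run
TRIPLES = 9_470_700_617          # total triples reported after the load
PREVIOUS_LOAD_HOURS = 43         # from the quoted spreadsheet (other hardware)


def msec_to_hours(msec: int) -> float:
    """Convert milliseconds to hours."""
    return msec / 1000 / 3600


def triples_per_second(triples: int, msec: int) -> float:
    """Average load rate over the whole run, in triples per second."""
    return triples / (msec / 1000)


load_hours = msec_to_hours(LOAD_TIME_MSEC)
rate = triples_per_second(TRIPLES, LOAD_TIME_MSEC)
# Indicative only: the two runs differ in RAM, disks and dump date.
speedup = PREVIOUS_LOAD_HOURS / load_hours

print(f"load time: {load_hours:.2f} hours")
print(f"average rate: {rate:,.0f} triples/sec")
print(f"ratio to the 43-hour run: {speedup:.1f}x")
```

The rate is an average over the whole load, so it says nothing about how
throughput changed as the database grew.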
Have a look at the simple patch here (it is just a workaround):
https://github.com/asanchez75/virtuoso-opensource/commit/5d7b1b9b29e53cb8a25bed69f512a150f9f05d50
You can build your own Docker image with that patch using
https://github.com/asanchez75/docker-virtuoso/tree/brendan
Check the Dockerfile, which retrieves the patch from my forked Virtuoso git
repository:
https://github.com/asanchez75/docker-virtuoso/blob/brendan/Dockerfile


Best,




On Sun, Sep 1, 2019 at 13:38, Edgar Meij <edgar.m...@gmail.com> wrote:

> Thanks for this, Kingsley.
>
> Based on
> https://docs.google.com/spreadsheets/d/1-stlTC_WJmMU3xA_NxA1tSLHw6_sbpjff-5OITtrbFw/edit#gid=1799898600
> (copy-pasted below), it seems that it takes 43 hours to load, is that
> correct?
>
> Also, what is the "patch for geometry" mentioned there? I'm assuming that
> is the patch meant to address
> https://github.com/openlink/virtuoso-opensource/issues/295 and
> https://community.openlinksw.com/t/non-terrestrial-geo-literals/359,
> correct? Is it simply disabling the data validation code? Can you share the
> patch?
>
> Thanks,
> Edgar
>
>
> Other Information
>
> Architecture                    x86_64
> CPU op-mode(s)                  32-bit, 64-bit
> Byte Order                      Little Endian
> CPU(s)                          12.00
> On-line CPU(s) list             0-11
> Thread(s) per core              2.00
> Core(s) per socket              6.00
> Socket(s)                       1.00
> NUMA node(s)                    1.00
> Vendor ID                       GenuineIntel
> CPU family                      6.00
> Model                           63.00
> Model name                      Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz
> Stepping                        2.00
> CPU MHz                         1,199.92
> CPU max MHz                     3,800.00
> CPU min MHz                     1,200.00
> BogoMIPS                        6,984.39
> Virtualization                  VT-x
> L1d cache                       32K
> L1i cache                       32K
> L2 cache                        256K
> L3 cache                        15360K
> NUMA node0 CPU(s)               0-11
> RAM                             128G
> wikidata-20190610-all-BETA.ttl  383G
> Virtuoso version                07.20.3230 (with patch for geometry)
> Time to load                    43 hours
> virtuoso.db                     340G
>
> On Wed, Aug 14, 2019 at 12:10 AM Kingsley Idehen <kide...@openlinksw.com>
> wrote:
>
>> Hi Everyone,
>>
>> A little FYI.
>>
>> We have loaded Wikidata into a Virtuoso instance accessible via SPARQL
>> [1]. One benefit is helping to understand Wikidata using our Faceted
>> Browsing Interface for Entity Relationship Types [2][3].
>>
>> Links:
>>
>> [1] http://wikidata.demo.openlinksw.com/sparql -- SPARQL endpoint
>>
>> [2] http://wikidata.demo.openlinksw.com/fct -- Faceted Browsing Interface
>>
>> [3] About New York
>> <https://wikidata.demo.openlinksw.com/describe/?url=http%3A%2F%2Fwww.wikidata.org%2Fentity%2FQ60&gp=16&go=&lp=940&invfp=IFP_OFF&sas=SAME_AS_OFF&distinct=1>
>>
>> Enjoy!
>>
>> Feedback always welcome too :)
>>
>> --
>> Regards,
>>
>> Kingsley Idehen      
>> Founder & CEO
>> OpenLink Software
>> Home Page: http://www.openlinksw.com
>> Community Support: https://community.openlinksw.com
>> Weblogs (Blogs):
>> Company Blog: https://medium.com/openlink-software-blog
>> Virtuoso Blog: https://medium.com/virtuoso-blog
>> Data Access Drivers Blog: 
>> https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers
>>
>> Personal Weblogs (Blogs):
>> Medium Blog: https://medium.com/@kidehen
>> Legacy Blogs: http://www.openlinksw.com/blog/~kidehen/
>>               http://kidehen.blogspot.com
>>
>> Profile Pages:
>> Pinterest: https://www.pinterest.com/kidehen/
>> Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
>> Twitter: https://twitter.com/kidehen
>> Google+: https://plus.google.com/+KingsleyIdehen/about
>> LinkedIn: http://www.linkedin.com/in/kidehen
>>
>> Web Identities (WebID):
>> Personal: http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
>>         : 
>> http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this
>>
>> _______________________________________________
>> Wikidata mailing list
>> Wikidata@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>>
