On 4/13/11 4:15 AM, Bernard Vatant wrote:
Hello all

Just trying to figure out the size of the personal information available as LOD vs the billions of person profiles stored by Google, Amazon, Facebook, LinkedIn, unameit ... in proprietary formats.

Any hint of the proportion of "living" people vs historical characters is also welcome.

Any idea?


Bernard Vatant
Senior Consultant
Vocabulary & Data Integration
Tel:       +33 (0) 971 488 459
Mail: bernard.vat...@mondeca.com
3, cité Nollez 75018 Paris France
Web: http://www.mondeca.com
Blog: http://mondeca.wordpress.com

LOD Cloud Cache has 3,321,094 foaf:Person entities [1]. Distinct count is 3,319,862 [2]. URIBurner has 4,564,981 foaf:Person entities [3]. Distinct count is 4,555,697 [4].

Both counts come from SPARQL aggregate queries against the respective endpoints. Note that no inference context was applied; there are a variety of rules across OpenCyc, UMBEL, Yago, and DBpedia that would alter these counts.

Tip re. the URLs below: simply change the "authority" part of the URL when seeking similar counts from other Virtuoso instances. With some luck the same approach could work for other SPARQL endpoints in general, subject to what those endpoints support and permit.

SPARQL queries used against each endpoint:

select count(?s) where  {?s a foaf:Person}

select count(distinct ?s) where  {?s a foaf:Person}
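
For anyone who wants to repeat these counts elsewhere, here is a minimal Python sketch of the tip about changing only the "authority" part of the URL. It makes several assumptions that are not in the message above: the endpoint is reachable over plain http at the path /sparql (the usual Virtuoso default, but not guaranteed), the helper name count_query_url is hypothetical, and the queries are restated with an explicit foaf: PREFIX and SPARQL 1.1 aggregate syntax so that they have a chance of working on non-Virtuoso endpoints. It only builds the request URLs; it does not run the queries.

```python
from urllib.parse import urlencode, urlunsplit

# Portable restatement of the two queries (assumption: the target endpoint
# supports SPARQL 1.1 aggregates; the originals rely on Virtuoso's
# predeclared foaf: prefix and its shorthand count syntax).
PREFIX = "PREFIX foaf: <http://xmlns.com/foaf/0.1/>\n"
COUNT_ALL = PREFIX + "SELECT (COUNT(?s) AS ?n) WHERE { ?s a foaf:Person }"
COUNT_DISTINCT = (
    PREFIX + "SELECT (COUNT(DISTINCT ?s) AS ?n) WHERE { ?s a foaf:Person }"
)


def count_query_url(authority, query, path="/sparql"):
    """Build a GET URL for `query`; only `authority` varies per endpoint.

    `path` defaults to /sparql, which is an assumption about the endpoint.
    """
    # `query` is the standard SPARQL Protocol parameter name.
    params = urlencode({"query": query})
    return urlunsplit(("http", authority, path, params, ""))


# Same two queries, different authorities.
for authority in ("lod.openlinksw.com", "uriburner.com"):
    for query in (COUNT_ALL, COUNT_DISTINCT):
        print(count_query_url(authority, query))
```

Each printed URL can be pasted into a browser or fetched with any HTTP client; whether a given endpoint answers depends on what it supports and permits.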


1. http://lod.openlinksw.com/c/CYIZZL4 -- LOD Cloud Cache
2. http://lod.openlinksw.com/c/COXER7C -- LOD Cloud Cache Distinct Count
3. http://uriburner.com/c/DYVU7N -- URIBurner
4. http://uriburner.com/c/DV6VPQ -- URIBurner Distinct Count



Kingsley Idehen 
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen
