On 4/13/11 9:59 AM, Michael Brunnbauer wrote:

Here is the current top 25 for foaf-search.net (number of RDF documents per
second-level domain). DBpedia is not included because we used the dumps, and
livejournal.com was not crawled completely. Not all RDF documents are about
persons: we index every document containing a foaf:name or foaf:nick.

opera.com           247281
ecademy.com         224875
livejournal.com     221321
identi.ca           192732
insanejournal.com   183046
ac.uk               176452 (mostly eprints.soton.ac.uk and eprints.ecs.soton.ac.uk)
deadjournal.com     161659
spin.de             131959
rambler.ru          111119
mybloglog.com        53633
i.ua                 43727 (narod.i.ua)
dreamwidth.org       39471
smart.fm             38945
dbtune.org           36498
bibliographica.org   34795
rdfabout.com         30237
rpi.edu              29913
co.uk                25073 (mostly ordnancesurvey.co.uk)
qdos.com             18816
wasab.dk             18763
sapo.pt              17029
photozou.jp          14560
phitter.com          11823
openei.org           10794
gov.uk               10589 (mostly data.gov.uk)
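
As a rough illustration of how a per-domain tally like the one above can be computed, here is a minimal Python sketch. It groups document URLs by the last two labels of the hostname, which appears consistent with entries such as ac.uk and co.uk in the list. The function names and the sample URLs are hypothetical and are not taken from foaf-search.net.

```python
from collections import Counter
from urllib.parse import urlparse


def second_level_domain(url):
    """Return the last two labels of the URL's hostname, e.g. 'ac.uk'."""
    host = urlparse(url).hostname or ""
    labels = host.lower().rstrip(".").split(".")
    # Naive rule: no public-suffix handling, so eprints.soton.ac.uk
    # is counted under 'ac.uk', as in the list above.
    return ".".join(labels[-2:])


def count_by_domain(urls):
    """Count documents per second-level domain."""
    return Counter(second_level_domain(u) for u in urls)


if __name__ == "__main__":
    # Hypothetical sample URLs, for illustration only.
    sample = [
        "http://my.opera.com/alice/xml/foaf",
        "http://my.opera.com/bob/xml/foaf",
        "http://eprints.soton.ac.uk/id/person/1",
    ]
    for domain, n in count_by_domain(sample).most_common():
        print(f"{domain}\t{n}")
```

A crawler that wanted registrable domains (soton.ac.uk rather than ac.uk) would need a public-suffix list instead of the two-label rule used here.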

How about a publicly shared Google spreadsheet doc? From there to Linked Data is a short journey :-)


Michael Brunnbauer

On Wed, Apr 13, 2011 at 01:37:48PM +0200, Mischa Tuffield wrote:
Hi All,

I looked at the number of FOAF files on the web over a year ago now;
the output looks like so:


On 13 Apr 2011, at 12:11, Melvin Carvalho wrote:

On 13 April 2011 10:54, Michael Brunnbauer <bru...@netestate.de> wrote:

On Wed, Apr 13, 2011 at 10:15:46AM +0200, Bernard Vatant wrote:
Just trying to figure what is the size of personal information available as
LOD vs billions of person profiles stored by Google, Amazon, Facebook,
LinkedIn, unameit ... in proprietary formats.
At www.foaf-search.net, we have approx. 3.5 million instances of foaf:Person.

The biggest chunk out there is probably livejournal.com with more than 25 million
users, not all of which we can index right now (we have 221090 of them).

Another big one is hi5.com but the FOAF is quite broken so we don't crawl it.
Gmail was at one point publishing FOAF profiles ... so that's quite a few more

facebook graph is not quite foaf but certainly machine readable JSON,
and could easily be transformed to FOAF, so that's another chunk
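
To make the "easily transformed" point concrete, here is a minimal Python sketch that turns a profile JSON object into FOAF, serialized as Turtle. The JSON field names ("name", "link"), the "#me" subject URI convention, and the choice of foaf:page are assumptions made for illustration; they are not a description of the actual Facebook Graph API output.

```python
import json

FOAF_PREFIX = "@prefix foaf: <http://xmlns.com/foaf/0.1/> .\n"


def turtle_literal(text):
    """Quote a string as a Turtle literal, escaping the common cases."""
    escaped = (text.replace("\\", "\\\\")
                   .replace('"', '\\"')
                   .replace("\n", "\\n")
                   .replace("\r", "\\r"))
    return '"' + escaped + '"'


def profile_to_foaf(profile_json):
    """Convert a JSON profile (assumed keys: 'name', 'link') to Turtle."""
    profile = json.loads(profile_json)
    # Assumption: the profile URL plus '#me' identifies the person.
    # The URL is inserted as-is; a real converter should validate it.
    subject = "<" + profile["link"] + "#me>"
    lines = [
        FOAF_PREFIX,
        subject + " a foaf:Person ;",
        "    foaf:name " + turtle_literal(profile["name"]) + " ;",
        "    foaf:page <" + profile["link"] + "> .",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical input, for illustration only.
    example = '{"id": "1", "name": "Alice Example", "link": "http://example.org/alice"}'
    print(profile_to_foaf(example))
```

The sketch builds the Turtle by hand to stay dependency-free; an RDF library would be the safer choice for anything beyond a demonstration.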

there are a few bridges too, such as the ones for last.fm, Flickr and SemanticTweet

So including bridges I'd guess 250 million; 99% should be alive today,
but that number will fall over time (obviously)

See also:



Michael Brunnbauer

++  Michael Brunnbauer
++  netEstate GmbH
++  Geisenhausener Straße 11a
++  81379 München
++  Tel +49 89 32 19 77 80
++  Fax +49 89 32 19 77 89
++  E-Mail bru...@netestate.de
++  http://www.netestate.de/
++  Sitz: München, HRB Nr.142452 (Handelsregister B München)
++  USt-IdNr. DE221033342
++  Geschäftsführer: Michael Brunnbauer, Franz Brunnbauer
++  Prokurist: Dipl. Kfm. (Univ.) Markus Hendel



Kingsley Idehen 
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen
