Hi,

As I am the one who produced the statistics you mention, let me try to clarify. [1] and [2] are based on the DBpedia dumps for 3.9 and 3.8, respectively. The latest DBpedia paper gives the numbers for 3.8, and the statistics page for 3.8 [2] indeed lists 3.7 mln entities. Why is "400 mln triples" not there? Because [2] counts *just* raw property statements extracted from infoboxes (65 mln), type statements (13.7 mln) and mapped (to the DBpedia ontology) property statements (33.7 mln). It does not, however, count many other triples: those coming from inter-language links, abstracts, categories, links to other resources and so on; check the download pages for the whole list [3,4]. If you count all of these, you will perhaps arrive at 400 mln triples. In fact,
SELECT COUNT(*) WHERE {?x ?y ?z}
executed against the DBpedia SPARQL endpoint, returns 825,761,509 at the moment. I am actually not sure that all datasets available at [5] are loaded into the endpoint, so the total number for English could be even bigger.
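If you prefer to run the count from a script, here is a minimal Python sketch. Please treat it as an illustration only: the endpoint URL http://dbpedia.org/sparql, the variable name ?triples and the helper names are my assumptions, not something fixed by the statistics pages. The query is written in the standard SPARQL 1.1 form, where the aggregate gets an alias. A public endpoint may also time out on a full count, so the number you get back is only as good as what the endpoint has loaded.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint URL; adjust it for other language editions.
ENDPOINT = "http://dbpedia.org/sparql"

# Standard SPARQL 1.1 form of the count query (the aggregate is aliased).
COUNT_QUERY = "SELECT (COUNT(*) AS ?triples) WHERE { ?s ?p ?o }"


def build_request(endpoint=ENDPOINT, query=COUNT_QUERY):
    """Build a GET request that asks for SPARQL JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def parse_count(body, var="triples"):
    """Extract the integer count from a SPARQL JSON results document."""
    bindings = json.loads(body)["results"]["bindings"]
    return int(bindings[0][var]["value"])


def count_triples(endpoint=ENDPOINT, timeout=60):
    """Run the count query against the endpoint (needs network access)."""
    request = build_request(endpoint)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_count(response.read().decode("utf-8"))
```

Calling count_triples() sends the query and returns the count as an integer; parse_count() can be tried offline on a saved response.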

To summarize, [1,2] are good sources for the numbers of things/instances. For the number of triples, it depends on what you want to count: for types and properties refer to [1,2]; for the total number of triples, refer to the SPARQL endpoints for English and for the other languages that have one. Or go through the dumps and count :)
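For the "count the dumps" route, a rough Python sketch could look like the one below. It assumes you have downloaded the N-Triples files (plain or bzip2-compressed, e.g. *.nt.bz2) from [5] and that each statement sits on its own line, as the N-Triples format prescribes. The function names are made up for this example. Note that it counts statements per file and does not detect triples repeated across files.

```python
import bz2


def count_ntriples(path):
    """Count statement lines in an N-Triples file (plain or .bz2)."""
    opener = bz2.open if path.endswith(".bz2") else open
    total = 0
    with opener(path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            stripped = line.strip()
            # Skip blank lines and comment lines; everything else is a statement.
            if stripped and not stripped.startswith("#"):
                total += 1
    return total


def count_dumps(paths):
    """Return per-file statement counts and their sum."""
    per_file = {path: count_ntriples(path) for path in paths}
    return per_file, sum(per_file.values())
```

Pass count_dumps() the list of dump files you care about; which files you include decides which "number of triples" you end up with.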

Cheers,
Volha


[1] http://wiki.dbpedia.org/Datasets39/DatasetStatistics
[2] http://wiki.dbpedia.org/Datasets38/DatasetStatistics
[3] http://wiki.dbpedia.org/Downloads39
[4] http://wiki.dbpedia.org/Downloads38
[5] http://downloads.dbpedia.org/3.9/en/




On 4/19/2014 11:59 PM, Gunaratna, Dalkandura Arachchige Kalpa Shashika Silva wrote:

Hi,

I want to know the correct number of instances and total triples for the English version of DBpedia 3.9. I have come across the DBpedia statistics page (http://wiki.dbpedia.org/Datasets39/DatasetStatistics), but I find it hard to get the right numbers from it.


The reason is that a paper I read states that DBpedia version 3.4 had 3.5 million entities (instances) and 672 million triples.


With that in mind, the DBpedia statistics page says that version 3.9 has 4 million (4,004,478) instances and 70 million (70,147,399) raw statements.


A recent DBpedia paper (http://svn.aksw.org/papers/2013/SWJ_DBpedia/public.pdf) says the English version has 3.7 million things described in 400 million triples. I believe they are talking about version 3.8, but this number does not match the version 3.8 table on the statistics page either.


Could somebody clarify the correct numbers for me, for both the English version and the whole of DBpedia: the total number of things (instances) and the total number of triples?


Thank you very much.



------------------------------------------------------------------------------
Learn Graph Databases - Download FREE O'Reilly Book
"Graph Databases" is the definitive new guide to graph databases and their
applications. Written by three acclaimed leaders in the field,
this first edition is now available. Download your free book today!
http://p.sf.net/sfu/NeoTech


_______________________________________________
Dbpedia-discussion mailing list
Dbpedia-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion


--
---------------------
Dr. Volha Bryl
Postdoctoral Researcher
Chair of Information Systems V
Web-based Systems Group
Universität Mannheim
B6, 26, Room C1.03
D-68131 Mannheim

Tel.: +49 621 181 2657
Mail: vo...@informatik.uni-mannheim.de
