Hi,
Since I produced the statistics you mention, let me try to
clarify. [1] and [2] are based on the DBpedia dumps for 3.9 and 3.8,
respectively. The latest DBpedia paper reports the numbers for 3.8, and
the statistics page for 3.8 [2] does indeed list 3.7 mln entities. Why is
"400 mln triples" not there? Because [2] counts *only* raw property
statements extracted from infoboxes (65 mln), type statements (13.7 mln)
and property statements mapped to the DBpedia ontology (33.7 mln), about
112.4 mln in total. It does not count many other triples: those coming
from inter-language links, abstracts, categories, links to other
resources and so on; see the download pages for the full list [3,4]. If
you count all of these, you may well arrive at 400 mln triples. In fact,
SELECT (COUNT(*) AS ?total) WHERE { ?x ?y ?z }
executed against the DBpedia SPARQL endpoint returns 825,761,509 at the
moment. I am also not sure that all datasets available at [5]
are loaded into the endpoint, so the total number for English could be
even bigger.
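If you want to repeat the count from a script, something like the
following might do. It is only a sketch: the endpoint URL
http://dbpedia.org/sparql and the function names are my assumptions,
and the query is written in the standard SPARQL 1.1 aggregate form. The
number you get back will of course differ from the one above as the
endpoint is reloaded.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public DBpedia endpoint lives at this URL.
ENDPOINT = "http://dbpedia.org/sparql"

# Count every triple the endpoint exposes (standard SPARQL 1.1 aggregate form).
COUNT_QUERY = "SELECT (COUNT(*) AS ?total) WHERE { ?x ?y ?z }"


def build_request_url(endpoint, query):
    """Encode the query as a GET request URL (SPARQL protocol 'query' parameter)."""
    return endpoint + "?" + urllib.parse.urlencode({"query": query})


def parse_total(json_text, variable="total"):
    """Read the single count value from a SPARQL JSON results document."""
    data = json.loads(json_text)
    binding = data["results"]["bindings"][0]
    return int(binding[variable]["value"])


def fetch_total(endpoint=ENDPOINT, query=COUNT_QUERY):
    """Send the query and return the triple count as an int (needs network access)."""
    request = urllib.request.Request(
        build_request_url(endpoint, query),
        # Ask for the standard JSON results format.
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return parse_total(response.read().decode("utf-8"))
```

Calling fetch_total() performs the actual request; parse_total() can be
tried offline on a saved response.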
To summarize, [1,2] are good sources for the numbers of
things/instances. For the number of triples, it depends on what you want
to count: for type and property statements refer to [1,2]; for the total
number of triples, query the SPARQL endpoints for English and the other
languages for which endpoints exist. Or go through the dumps and count :)
Cheers,
Volha
[1] http://wiki.dbpedia.org/Datasets39/DatasetStatistics
[2] http://wiki.dbpedia.org/Datasets38/DatasetStatistics
[3] http://wiki.dbpedia.org/Downloads39
[4] http://wiki.dbpedia.org/Downloads38
[5] http://downloads.dbpedia.org/3.9/en/
On 4/19/2014 11:59 PM, Gunaratna, Dalkandura Arachchige Kalpa Shashika
Silva wrote:
Hi,
I want to know the correct number of instances and total triples for
the English version of DBpedia 3.9. I have come across the DBpedia
statistics page (http://wiki.dbpedia.org/Datasets39/DatasetStatistics)
but it is confusing for me to get the numbers correct.
The reason is that I read a paper which mentions that DBpedia
version 3.4 had 3.5 million entities (instances) with 672 million triples.
Having that in mind, DBpedia statistics page says that version 3.9 has
4 million (4,004,478) instances and 70 million (70,147,399) raw
statements.
Recent DBpedia paper
(http://svn.aksw.org/papers/2013/SWJ_DBpedia/public.pdf) says the
English version has 3.7 million things described in 400 million
triples. I believe they are talking about version 3.8. But this number
also does not match with the version 3.8 table in the statistics page.
Could somebody clarify the correct numbers for me, for both the English
version and the whole of DBpedia: the total number of things (instances)
and the total number of triples?
Thank you very much.
_______________________________________________
Dbpedia-discussion mailing list
Dbpedia-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
--
---------------------
Dr. Volha Bryl
Postdoctoral Researcher
Chair of Information Systems V
Web-based Systems Group
Universität Mannheim
B6, 26, Room C1.03
D-68131 Mannheim
Tel.: +49 621 181 2657
Mail: vo...@informatik.uni-mannheim.de