Hi Denny,
you did not really find them because they are not yet publicly released.
Please treat them as a beta.
The main reason is that a handful of features are still missing and a
handful of stupid bugs remain.
Two examples:
- we discovered a Unicode issue in URIs which still allows valid
analysis, but prevents loading the data into dbpedia.org/sparql
- we built the Databus to have a group changelog and a dataset/artifact
changelog; however, these can only be changed at release time, so we
cannot record reported errors, like the one above, after a release has
been published.
It is not hard to fix, and Marvin has already run new extractions:
https://databus.dbpedia.org/marvin , there is just a bit missing.
> i.e. files such as
> http://downloads.dbpedia.org/2016-10/core-i18n/de/mappingbased_objects_wkd_uris_de.ttl.bz2
> - can you point me where I can find the canonicalized versions in the
> new files?
These are discontinued. Instead, there is
https://databus.dbpedia.org/dbpedia/id-management/global-ids , loaded into
this web service:
https://global.dbpedia.org/same-thing/lookup/?uri=http://www.wikidata.org/entity/Q8087
where you can resolve many URIs against clusters,
and the fused and enriched versions as described in
https://svn.aksw.org/papers/2019/ISWC_FlexiFusion/public.pdf
FlexiFusion is more systematic and can rewrite any dataset's subject with
any other subject from the ID management. So we could still produce these
datasets anyway.
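For anyone who wants to script the lookup, here is a minimal sketch. It assumes only the URL pattern shown in the example link (a base address plus a `uri` parameter). The helper name is hypothetical, and since the response format is not described in this thread, the sketch stops at building the request URL.

```python
from urllib.parse import urlencode

# Base address taken from the example link in this thread.
LOOKUP_BASE = "https://global.dbpedia.org/same-thing/lookup/"


def same_thing_lookup_url(entity_uri: str) -> str:
    """Build a lookup URL for one entity URI (hypothetical helper)."""
    # urlencode percent-encodes the URI so it is safe as a parameter value.
    # Assumption: the service accepts the encoded form as well as the raw one.
    return LOOKUP_BASE + "?" + urlencode({"uri": entity_uri})


if __name__ == "__main__":
    print(same_thing_lookup_url("http://www.wikidata.org/entity/Q8087"))
```

The resulting URL can be opened in a browser or fetched with any HTTP client.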
> Thanks for these pointers! I have run a few analyses, and now can
> rerun them again with the actual current data :) I expect this to
> improve DBpedia numbers by quite a bit.
You could also try the fused version:
https://databus.dbpedia.org/dbpedia/fusion
This is the one we are working on most, and it will aggregate a lot
more data in the future.
> I find it all a bit hard to navigate (although Databus has a few
> really neat features, thanks for that).
Any feedback is welcome; the issue tracker is linked at the top of the website.
Yes, another missing feature. However, we thought that the pros would
just look at the DataID files and then write SPARQL queries at
https://databus.dbpedia.org/yasgui/
-- Sebastian
On 03.06.19 19:49, Denny Vrandečić wrote:
Oh, wow, thanks Sebastian, thanks Kingsley for the answers!
I was entirely unaware of the DBpedia datasets over at
databus.dbpedia.org - when I search for "dbpedia downloads", that is
not where I end up. Also, when I go to dbpedia.org and then click on
"Downloads", I get to the 2016 datasets:
https://wiki.dbpedia.org/Datasets
https://wiki.dbpedia.org/develop/datasets
I honestly thought that the 2016 dataset was the latest one, and was
rather disappointed. Thank you for showing me that I was just looking
in the wrong place - but I would really suggest that you update your
websites to point to Databus. I am sure I am not the only one who
believes that there has been no DBpedia update since 2016.
Thanks for these pointers! I have run a few analyses, and now can
rerun them again with the actual current data :) I expect this to
improve DBpedia numbers by quite a bit.
One question: I liked using the canonicalized versions from
https://wiki.dbpedia.org/downloads-2016-10, i.e. files such as
http://downloads.dbpedia.org/2016-10/core-i18n/de/mappingbased_objects_wkd_uris_de.ttl.bz2
- can you point me where I can find the canonicalized versions in the
new files? I find it all a bit hard to navigate (although Databus has
a few really neat features, thanks for that).
Cheers,
Denny
On Sat, Jun 1, 2019 at 9:43 AM Kingsley Idehen <kide...@openlinksw.com> wrote:
On 6/1/19 5:45 AM, Sebastian Hellmann wrote:
Hi Denny,
* the old system was like this:
we load from here: http://downloads.dbpedia.org/2016-10/core/
metadata is in
http://downloads.dbpedia.org/2016-10/core/2016-10_dataid_core.ttl
with void:sparqlEndpoint <http://dbpedia.org/sparql> ;
Hi Sebastian,
I will also have the TTL referenced above loaded to a named graph
so that it becomes accessible from the query solution I shared in
my prior post.
* the new system is here: https://databus.dbpedia.org/dbpedia
There are 6 new releases and the metadata is in the endpoint
https://databus.dbpedia.org/repo/sparql
Once the collection-saving feature is finished, we will build a
collection of datasets on the bus, which will then be loaded. It
is basically a SPARQL query retrieving the download URLs, like this:
http://dev.dbpedia.org/Data#example-application-virtuoso-docker
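To make the idea of "a SPARQL query retrieving the download URLs" concrete, here is a rough sketch. The endpoint is the one named in this thread. `dcat:distribution` and `dcat:downloadURL` are standard DCAT terms, but the `dataid:` namespace and the `dataid:group` property are assumptions about how the Databus metadata is modelled, and both function names are hypothetical. The actual query on the linked page may differ.

```python
from urllib.parse import urlencode

# Metadata endpoint named in this thread.
DATABUS_SPARQL = "https://databus.dbpedia.org/repo/sparql"


def download_url_query(group_uri: str) -> str:
    """Return a SPARQL query listing download URLs for one Databus group.

    The dataid: vocabulary used here is an assumption, not confirmed
    by the thread.
    """
    return f"""
PREFIX dataid: <http://dataid.dbpedia.org/ns/core#>
PREFIX dcat:   <http://www.w3.org/ns/dcat#>
SELECT DISTINCT ?file WHERE {{
  ?dataset dataid:group <{group_uri}> ;   # assumed DataID property
           dcat:distribution ?dist .
  ?dist dcat:downloadURL ?file .
}}
"""


def sparql_request_url(endpoint: str, query: str) -> str:
    """Build a GET request URL following the SPARQL 1.1 Protocol."""
    # The protocol passes the query text in the "query" parameter.
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    q = download_url_query("https://databus.dbpedia.org/dbpedia/fusion")
    print(sparql_request_url(DATABUS_SPARQL, q))
```

The query text can also be pasted directly into https://databus.dbpedia.org/yasgui/ instead of being sent as a URL.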
Okay.
Please install the Faceted Browser so that URIs like
http://dev.dbpedia.org/Data#example-application-virtuoso-docker
can also be looked up.
As an aside, here's an Entity Type overview query results page
<https://databus.dbpedia.org/repo/sparql?default-graph-uri=&query=SELECT+%28SAMPLE%28%3Fs%29+AS+%3Fsample%29+%28COUNT%281%29+AS+%3Fcount%29++%28%3Fo+AS+%3FentityType%29%0D%0AWHERE+%7B%0D%0A++++++++%3Fs+a+%3Fo.+%0D%0A%09%09FILTER+%28isIRI%28%3Fs%29%29+%0D%0A++++++++++++++++FILTER+%28%21+contains%28str%28%3Fs%29%2C%22virt%22%29%29%0D%0A++++++%7D+%0D%0AGROUP+BY+%3Fo%0D%0AORDER+BY+DESC+%28%3Fcount%29&format=text%2Fhtml&timeout=0&debug=on>
for future use, etc.
Kingsley
On 31.05.19 21:59, Denny Vrandečić wrote:
Thank you for the answer!
I don't see how the query solution page that you linked
indicates that this is the English Wikipedia extraction. Where
does it say that? How can I tell? I am trying to understand, thanks.
Also, when I download the set of English extractions from here,
http://downloads.dbpedia.org/2016-10/core-i18n/en/
particularly this one,
http://downloads.dbpedia.org/2016-10/core-i18n/en/mappingbased_objects_en.ttl.bz2
it contains only about 17,467 people with parents, not 20,120, so that
dataset seems out of sync with the one in the SPARQL endpoint.
I am curious: where do you load the dataset from?
Thank you!
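For reference, a count like the one above can be reproduced from a dump file with a short script. This is only a sketch. It assumes the dump is line-based, with one triple per line and full IRIs in angle brackets (N-Triples style), and the function names are hypothetical.

```python
import bz2
from typing import Iterable

DBO_PARENT = "<http://dbpedia.org/ontology/parent>"


def count_subjects_with_predicate(lines: Iterable[str],
                                  predicate: str = DBO_PARENT) -> int:
    """Count distinct subjects that have `predicate` (one triple per line)."""
    subjects = set()
    for line in lines:
        stripped = line.strip()
        # Skip blank lines and comment lines.
        if not stripped or stripped.startswith("#"):
            continue
        # Split into subject, predicate, and the rest (object plus final dot).
        parts = stripped.split(None, 2)
        if len(parts) == 3 and parts[1] == predicate:
            subjects.add(parts[0])
    return len(subjects)


def count_in_dump(path: str) -> int:
    """Stream a .bz2 dump and count distinct subjects with dbo:parent."""
    # Text mode decompresses on the fly, so the file is never unpacked to disk.
    with bz2.open(path, "rt", encoding="utf-8") as handle:
        return count_subjects_with_predicate(handle)
```

Calling `count_in_dump("mappingbased_objects_en.ttl.bz2")` on a downloaded copy would give the number to compare against the endpoint's answer.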
On Fri, May 31, 2019 at 11:49 AM Kingsley Idehen <kide...@openlinksw.com> wrote:
On 5/31/19 2:23 PM, Denny Vrandečić wrote:
When I query the dbpedia.org/sparql endpoint asking for "how many
people with a parent do you know?", i.e. select (count
(distinct ?s) as ?c) where { ?s dbo:parent ?o }, I get as
the answer 20,120.
Where among the Downloads on wiki.dbpedia.org/downloads-2016-10 can I
find the dataset that the SPARQL endpoint actually serves? Is it the
English Wikipedia-based "Mappingbased" one? Or is it the
"Infobox Properties Mapped"?
Cheers,
Denny
The query solution page
<http://dbpedia.org/sparql?default-graph-uri=&query=prefix+dbo%3A+%3Chttp%3A%2F%2Fdbpedia.org%2Fontology%2F%3E+%0D%0A%0D%0Aselect+%3Fg+%28count+%28distinct+%3Fs%29+as+%3Fc%29%0D%0Awhere+%7B+%0D%0A+++++++%0D%0A+++++++++graph+%3Fg+%7B%3Fs+dbo%3Aparent+%3Fo.%7D%0D%0A%0D%0A+++++%7D%0D%0Agroup+by+%3Fg&format=text%2Fhtml&CXML_redir_for_subjs=121&CXML_redir_for_hrefs=&timeout=30000&debug=on&run=+Run+Query+>
indicates this is the English Wikipedia dataset. That's what
we've always loaded into the Virtuoso instance from which
DBpedia Linked Data and its associated SPARQL endpoint are
deployed.
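Since the query is hidden inside a percent-encoded link, a small helper can recover its text. This is a sketch with a hypothetical function name. The example link is a shortened variant of the one above, with whitespace normalised and the prefix declaration and extra parameters left out.

```python
from urllib.parse import urlparse, parse_qs


def extract_sparql_query(link: str) -> str:
    """Return the decoded `query` parameter of a SPARQL endpoint link."""
    params = parse_qs(urlparse(link).query)
    # parse_qs maps each parameter name to a list of decoded values.
    return params["query"][0]


if __name__ == "__main__":
    # Shortened variant of the query solution link (not the full original).
    link = ("http://dbpedia.org/sparql?default-graph-uri=&query="
            "select+%3Fg+%28count+%28distinct+%3Fs%29+as+%3Fc%29+"
            "where+%7B+graph+%3Fg+%7B+%3Fs+dbo%3Aparent+%3Fo+%7D+%7D+"
            "group+by+%3Fg")
    print(extract_sparql_query(link))
```

Decoded this way, the linked query counts distinct subjects of dbo:parent per named graph (`graph ?g { ?s dbo:parent ?o }` with `group by ?g`), so the graph name in the result indicates which dataset the triples came from.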
--
Regards,
Kingsley Idehen
Founder & CEO
OpenLink Software
Home Page:http://www.openlinksw.com
Community Support:https://community.openlinksw.com
Weblogs (Blogs):
Company Blog:https://medium.com/openlink-software-blog
Virtuoso Blog:https://medium.com/virtuoso-blog
Data Access Drivers
Blog:https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers
Personal Weblogs (Blogs):
Medium Blog:https://medium.com/@kidehen
Legacy Blogs:http://www.openlinksw.com/blog/~kidehen/
http://kidehen.blogspot.com
Profile Pages:
Pinterest:https://www.pinterest.com/kidehen/
Quora:https://www.quora.com/profile/Kingsley-Uyi-Idehen
Twitter:https://twitter.com/kidehen
Google+:https://plus.google.com/+KingsleyIdehen/about
LinkedIn:http://www.linkedin.com/in/kidehen
Web Identities (WebID):
Personal:http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
:http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this
_______________________________________________
DBpedia-discussion mailing list
DBpedia-discussion@lists.sourceforge.net
<mailto:DBpedia-discussion@lists.sourceforge.net>
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
--
All the best,
Sebastian Hellmann
Director of Knowledge Integration and Linked Data Technologies
(KILT) Competence Center
at the Institute for Applied Informatics (InfAI) at Leipzig
University
Executive Director of the DBpedia Association
Projects: http://dbpedia.org, http://nlp2rdf.org,
http://linguistics.okfn.org, https://www.w3.org/community/ld4lt
<http://www.w3.org/community/ld4lt>
Homepage: http://aksw.org/SebastianHellmann
Research Group: http://aksw.org