Thanks a lot for the answer, that is super useful. I'll see if I can recreate the canonicalized version :)
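A minimal sketch of what recreating a canonicalized file could involve, assuming the dump is line-based N-Triples and that a mapping from local DBpedia URIs to canonical URIs has already been obtained from somewhere (for instance from ID-management data). The function names and the mapping entries are hypothetical and made up for illustration; this is not the procedure DBpedia used for the 2016-10 files.

```python
import re

# Matches every <...> IRI in an N-Triples line. Assumption: literals do not
# themselves contain <...> sequences, which holds for object-only dumps.
IRI = re.compile(r"<([^<>\s]+)>")


def canonicalize_line(line, same_as):
    """Rewrite each IRI in one N-Triples line using the same_as mapping.

    IRIs without an entry in the mapping are left unchanged.
    """
    return IRI.sub(
        lambda m: "<" + same_as.get(m.group(1), m.group(1)) + ">", line
    )


def canonicalize(lines, same_as):
    """Yield rewritten lines; comments and blank lines pass through untouched."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            yield line
        else:
            yield canonicalize_line(line, same_as)


# Hypothetical mapping: both URIs are placeholders, not real identifiers.
EXAMPLE_MAP = {
    "http://de.dbpedia.org/resource/Beispiel": "http://example.org/canonical/1",
}

if __name__ == "__main__":
    sample = (
        "<http://de.dbpedia.org/resource/Beispiel> "
        "<http://dbpedia.org/ontology/parent> "
        "<http://de.dbpedia.org/resource/Anderes> .\n"
    )
    print(canonicalize_line(sample, EXAMPLE_MAP), end="")
```

Resources without a mapping entry keep their original URI, so the output mixes canonical and local identifiers unless those lines are filtered out separately.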
One question though: is there a cleaned version of the DBpedia ontology mapping-based data? I only found the uncleaned version. Do you have any plans for when the next release of DBpedia will be available?

On Mon, Jun 3, 2019 at 2:19 PM Sebastian Hellmann <hellm...@informatik.uni-leipzig.de> wrote:

> Hi Denny,
>
> you didn't find them really, because they are not yet publicly released.
> Please see them as a beta.
>
> The main reason is, that there are a handful of missing features and a
> handful of stupid bugs.
>
> One example:
>
> - we discovered a unicode issue in URIs which still allows valid analysis,
> but would not allow to load it into dbpedia.org/sparql
>
> - we built the Databus to have a group changelog and a dataset/artifact
> changelog, however, these can only be changed at release time, so we can
> not update reported errors after it was published, like the one above.
>
> It is not hard and marvin did new extractions already:
> https://databus.dbpedia.org/marvin , there is just a bit missing.
>
>
> i.e. files such as
> http://downloads.dbpedia.org/2016-10/core-i18n/de/mappingbased_objects_wkd_uris_de.ttl.bz2
> - can you point me where I can find the canonicalized versions in the new
> files?
>
>
> These are discontinued. Instead there is:
>
> https://databus.dbpedia.org/dbpedia/id-management/global-ids loaded into
> this webservice:
> https://global.dbpedia.org/same-thing/lookup/?uri=http://www.wikidata.org/entity/Q8087
> where you can resolve many URIs against clusters.
>
> and the fused and enriched versions as described in
> https://svn.aksw.org/papers/2019/ISWC_FlexiFusion/public.pdf
>
> Flexifusion is more systematic and can rewrite any dataset's subject with
> any other subject from the ID management. So we could produce these
> datasets any way.
>
>
> Thanks for these pointers! I have run a few analyses, and now can rerun
> them again with the actual current data :) I expect this to improve DBpedia
> numbers by quite a bit.
>
> You could also try the fused version:
> https://databus.dbpedia.org/dbpedia/fusion This is the one we are
> working on most and will aggregate a lot more data in the future.
>
>
> I find it all a bit hard to navigate (although Databus has a few really
> neat features, thanks for that).
>
> Any feedback welcome, the issue tracker is linked on top of the website.
>
>
> Yes, another missing feature. However, we thought that the pros will just
> look at the dataid files and then write sparql queries at
> https://databus.dbpedia.org/yasgui/
>
> -- Sebastian
>
>
> On 03.06.19 19:49, Denny Vrandečić wrote:
>
> Oh, wow, thanks Sebastian, thanks Kingsley for the answers!
>
> I was entirely unaware of the DBpedia datasets over at databus.dbpedia.org
> - when I search for "dbpedia downloads" that's not where I get to. Also,
> when I go to dbpedia.org and then click on "Downloads", I get to the 2016
> datasets.
>
> https://wiki.dbpedia.org/Datasets
>
> https://wiki.dbpedia.org/develop/datasets
>
> I honestly thought, that the 2016 dataset is the latest one, and was
> rather disappointed. Thank you for showing me that I was just looking in
> the wrong place - but I would really suggest that you update your Websites
> to point to databus. I am sure I am not the only one who believes that
> there has been no DBpedia update since 2016.
>
> Thanks for these pointers! I have run a few analyses, and now can rerun
> them again with the actual current data :) I expect this to improve DBpedia
> numbers by quite a bit.
>
> One question, I liked to use the canonicalized versions from here
> https://wiki.dbpedia.org/downloads-2016-10, i.e. files such as
> http://downloads.dbpedia.org/2016-10/core-i18n/de/mappingbased_objects_wkd_uris_de.ttl.bz2
> - can you point me where I can find the canonicalized versions in the new
> files? I find it all a bit hard to navigate (although Databus has a few
> really neat features, thanks for that).
>
> Cheers,
> Denny
>
>
> On Sat, Jun 1, 2019 at 9:43 AM Kingsley Idehen <kide...@openlinksw.com>
> wrote:
>
>> On 6/1/19 5:45 AM, Sebastian Hellmann wrote:
>>
>> Hi Denny,
>>
>> * the old system was like this:
>>
>> we load from here: http://downloads.dbpedia.org/2016-10/core/
>>
>> metadata is in
>> http://downloads.dbpedia.org/2016-10/core/2016-10_dataid_core.ttl with
>> void:sparqlEndpoint <http://dbpedia.org/sparql> ;
>>
>>
>> Hi Sebastian,
>>
>>
>> I will also have the TTL referenced above loaded to a named graph so that
>> it becomes accessible from the query solution I shared in my prior post.
>>
>>
>>
>> * the new system is here: https://databus.dbpedia.org/dbpedia
>>
>> There are 6 new releases and the metadata is in the endpoint
>> https://databus.dbpedia.org/repo/sparql
>>
>> Once the collection saving feature is finished, we will build a
>> collection of datasets on the bus, which will then be loaded. It is
>> basically a sparql query retrieving the downloadurls like this:
>>
>> http://dev.dbpedia.org/Data#example-application-virtuoso-docker
>>
>>
>> Okay.
>>
>> Please install the Faceted Browser so that URIs like
>> http://dev.dbpedia.org/Data#example-application-virtuoso-docker can also
>> be looked up.
>>
>> As an aside, here's an Entity Type overview query results page
>> <https://databus.dbpedia.org/repo/sparql?default-graph-uri=&query=SELECT+%28SAMPLE%28%3Fs%29+AS+%3Fsample%29+%28COUNT%281%29+AS+%3Fcount%29++%28%3Fo+AS+%3FentityType%29%0D%0AWHERE+%7B%0D%0A++++++++%3Fs+a+%3Fo.+%0D%0A%09%09FILTER+%28isIRI%28%3Fs%29%29+%0D%0A++++++++++++++++FILTER+%28%21+contains%28str%28%3Fs%29%2C%22virt%22%29%29%0D%0A++++++%7D+%0D%0AGROUP+BY+%3Fo%0D%0AORDER+BY+DESC+%28%3Fcount%29&format=text%2Fhtml&timeout=0&debug=on>
>> for future use etc..
>>
>>
>> Kingsley
>>
>>
>>
>>
>> On 31.05.19 21:59, Denny Vrandečić wrote:
>>
>> Thank you for the answer!
>>
>> I don't see how the query solution page that you linked indicates that
>> this is the English Wikipedia extraction. Where does it say that? How can I
>> tell? I am trying to understand, thanks.
>>
>> Also, when I download the set of English extractions from here,
>>
>> http://downloads.dbpedia.org/2016-10/core-i18n/en/
>>
>> particularly this one,
>>
>> http://downloads.dbpedia.org/2016-10/core-i18n/en/mappingbased_objects_en.ttl.bz2
>>
>> it is only about 17,467 people with parents, not 20,120, so that dataset
>> seems out of sync with the one in the SPARQL endpoint.
>>
>> I am curious where do you load the dataset from?
>>
>> Thank you!
>>
>>
>> On Fri, May 31, 2019 at 11:49 AM Kingsley Idehen <kide...@openlinksw.com>
>> wrote:
>>
>>> On 5/31/19 2:23 PM, Denny Vrandečić wrote:
>>>
>>> When I query the dbpedia.org/sparql endpoint asking for "how many
>>> people with a parent do you know?", i.e. select (count (distinct ?s) as ?c)
>>> where { ?s dbo:parent ?o }, I get as the answer 20,120.
>>>
>>> Where among the Downloads on wiki.dbpedia.org/downloads-2016-10 can I
>>> find the dataset that the SPARQL endpoint actually serves? Is it the
>>> English Wikipedia-based "Mappingbased" one? Or is it the "Infobox Properties
>>> Mapped"?
>>>
>>> Cheers,
>>> Denny
>>>
>>>
>>> The query solution page
>>> <http://dbpedia.org/sparql?default-graph-uri=&query=prefix+dbo%3A+%3Chttp%3A%2F%2Fdbpedia.org%2Fontology%2F%3E+%0D%0A%0D%0Aselect+%3Fg+%28count+%28distinct+%3Fs%29+as+%3Fc%29%0D%0Awhere+%7B+%0D%0A+++++++%0D%0A+++++++++graph+%3Fg+%7B%3Fs+dbo%3Aparent+%3Fo.%7D%0D%0A%0D%0A+++++%7D%0D%0Agroup+by+%3Fg&format=text%2Fhtml&CXML_redir_for_subjs=121&CXML_redir_for_hrefs=&timeout=30000&debug=on&run=+Run+Query+>
>>> indicates this is the English Wikipedia dataset. That's what we've always
>>> loaded into the Virtuoso instance from which DBpedia Linked Data and its
>>> associated SPARQL endpoint are deployed.
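The comparison discussed above (a count from the endpoint versus a count from the downloaded file) can be reproduced offline. The following is a rough sketch, assuming the `.ttl.bz2` dump holds one triple per line in N-Triples style; the function names are hypothetical, and the regular expression only handles IRI subjects and predicates.

```python
import bz2
import re

DBO_PARENT = "http://dbpedia.org/ontology/parent"

# Captures the subject IRI and the predicate IRI at the start of a line.
TRIPLE = re.compile(r"^<([^>]+)>\s+<([^>]+)>\s+")


def count_subjects_with(lines, predicate=DBO_PARENT):
    """Count distinct subjects that have at least one triple with `predicate`."""
    subjects = set()
    for line in lines:
        m = TRIPLE.match(line)
        if m and m.group(2) == predicate:
            subjects.add(m.group(1))
    return len(subjects)


def count_in_dump(path, predicate=DBO_PARENT):
    """Stream a bz2-compressed dump without unpacking it to disk."""
    with bz2.open(path, "rt", encoding="utf-8") as fh:
        return count_subjects_with(fh, predicate)
```

Calling `count_in_dump("mappingbased_objects_en.ttl.bz2")` on a local copy of the file named above would give the dump-side number. Setting it against the per-graph counts returned by the endpoint is one way to narrow down where a difference comes from.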
>>>
>>>
>>> --
>>> Regards,
>>>
>>> Kingsley Idehen
>>> Founder & CEO
>>> OpenLink Software
>>> Home Page: http://www.openlinksw.com
>>> Community Support: https://community.openlinksw.com
>>> Weblogs (Blogs):
>>> Company Blog: https://medium.com/openlink-software-blog
>>> Virtuoso Blog: https://medium.com/virtuoso-blog
>>> Data Access Drivers Blog:
>>> https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers
>>>
>>> Personal Weblogs (Blogs):
>>> Medium Blog: https://medium.com/@kidehen
>>> Legacy Blogs: http://www.openlinksw.com/blog/~kidehen/
>>> http://kidehen.blogspot.com
>>>
>>> Profile Pages:
>>> Pinterest: https://www.pinterest.com/kidehen/
>>> Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
>>> Twitter: https://twitter.com/kidehen
>>> Google+: https://plus.google.com/+KingsleyIdehen/about
>>> LinkedIn: http://www.linkedin.com/in/kidehen
>>>
>>> Web Identities (WebID):
>>> Personal: http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
>>> :
>>> http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this
>>>
>>> _______________________________________________
>>> DBpedia-discussion mailing list
>>> DBpedia-discussion@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
>>>
>>
>> --
>> All the best,
>> Sebastian Hellmann
>>
>> Director of Knowledge Integration and Linked Data Technologies (KILT)
>> Competence Center
>> at the Institute for Applied Informatics (InfAI) at Leipzig University
>> Executive Director of the DBpedia Association
>> Projects: http://dbpedia.org, http://nlp2rdf.org,
>> http://linguistics.okfn.org, https://www.w3.org/community/ld4lt
>> Homepage: http://aksw.org/SebastianHellmann
>> Research Group: http://aksw.org
>>
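The query links quoted in this thread carry their SPARQL text percent-encoded in a `query` parameter, which makes them hard to read. A small sketch for decoding such a link, or building a similar one, using only the Python standard library; the function names are hypothetical, and `query` and `format` are simply the parameter names visible in the quoted URLs.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def extract_sparql(url):
    """Return the decoded `query` parameter of a SPARQL link, or None if absent."""
    params = parse_qs(urlsplit(url).query)
    values = params.get("query")
    return values[0] if values else None


def build_query_url(endpoint, sparql, fmt="text/html"):
    """Build a GET URL carrying `query` and `format` parameters."""
    return endpoint + "?" + urlencode({"query": sparql, "format": fmt})


if __name__ == "__main__":
    # Prints the readable query text without sending any request.
    link = build_query_url(
        "http://dbpedia.org/sparql",
        "select ?g (count(distinct ?s) as ?c) "
        "where { graph ?g { ?s <http://dbpedia.org/ontology/parent> ?o } } "
        "group by ?g",
    )
    print(extract_sparql(link))
```

Neither function contacts the endpoint, so they can be used to inspect or prepare a query before running it in a browser or a SPARQL client.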