Hi Denny,

You didn't really find them because they are not yet publicly released. Please treat them as a beta.

The main reason is that there are a handful of missing features and a handful of stupid bugs.

Two examples:

- We discovered a Unicode issue in URIs. The data can still be analysed, but it cannot be loaded into dbpedia.org/sparql.

- We built the Databus to have a group changelog and a dataset/artifact changelog. However, these can only be changed at release time, so we cannot record reported errors, like the one above, after a release has been published.

It is not hard, and marvin has already done new extractions (https://databus.dbpedia.org/marvin); there is just a bit missing.


    i.e. files such as http://downloads.dbpedia.org/2016-10/core-i18n/de/mappingbased_objects_wkd_uris_de.ttl.bz2 - can you point me where I can find the canonicalized versions in the new files?

These are discontinued. Instead there is:

https://databus.dbpedia.org/dbpedia/id-management/global-ids loaded into this webservice: https://global.dbpedia.org/same-thing/lookup/?uri=http://www.wikidata.org/entity/Q8087 where you can resolve many URIs against clusters.

and the fused and enriched versions as described in https://svn.aksw.org/papers/2019/ISWC_FlexiFusion/public.pdf

FlexiFusion is more systematic and can rewrite any dataset's subjects with any other subject from the ID management, so we could still produce these datasets.
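
As a rough illustration of the lookup mentioned above, a request URL for the same-thing service could be built as follows. This is only a minimal sketch: the helper name is made up for this example, and only the base URL and the `uri` parameter come from the link above. The response format is not described in this thread, so the sketch stops at building the URL.

```python
from urllib.parse import urlencode

# Base URL copied from the link above.
LOOKUP_BASE = "https://global.dbpedia.org/same-thing/lookup/"


def same_thing_lookup_url(uri: str) -> str:
    """Build a lookup URL for one entity URI (hypothetical helper).

    The URI is percent-encoded, which differs from the unencoded form
    in the link above but should decode to the same parameter value.
    """
    return LOOKUP_BASE + "?" + urlencode({"uri": uri})


if __name__ == "__main__":
    print(same_thing_lookup_url("http://www.wikidata.org/entity/Q8087"))
```

The resulting URL can be opened in a browser or fetched with any HTTP client.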


    Thanks for these pointers! I have run a few analyses, and now can rerun them again with the actual current data :) I expect this to improve DBpedia numbers by quite a bit.

You could also try the fused version: https://databus.dbpedia.org/dbpedia/fusion This is the one we are working on most and will aggregate a lot more data in the future.


    I find it all a bit hard to navigate (although Databus has a few really neat features, thanks for that).

Any feedback welcome, the issue tracker is linked on top of the website.


Yes, another missing feature. However, we thought that the pros would just look at the DataID files and then write SPARQL queries at https://databus.dbpedia.org/yasgui/

-- Sebastian


On 03.06.19 19:49, Denny Vrandečić wrote:
Oh, wow, thanks Sebastian, thanks Kingsley for the answers!

I was entirely unaware of the DBpedia datasets over at databus.dbpedia.org - when I search for "dbpedia downloads" that's not where I get to. Also, when I go to dbpedia.org and then click on "Downloads", I get to the 2016 datasets.

https://wiki.dbpedia.org/Datasets

https://wiki.dbpedia.org/develop/datasets

I honestly thought that the 2016 dataset was the latest one, and was rather disappointed. Thank you for showing me that I was just looking in the wrong place - but I would really suggest that you update your websites to point to Databus. I am sure I am not the only one who believes that there has been no DBpedia update since 2016.

Thanks for these pointers! I have run a few analyses, and now can rerun them again with the actual current data :) I expect this to improve DBpedia numbers by quite a bit.

One question: I liked to use the canonicalized versions from here https://wiki.dbpedia.org/downloads-2016-10, i.e. files such as http://downloads.dbpedia.org/2016-10/core-i18n/de/mappingbased_objects_wkd_uris_de.ttl.bz2 - can you point me to where I can find the canonicalized versions in the new files? I find it all a bit hard to navigate (although Databus has a few really neat features, thanks for that).

Cheers,
Denny





On Sat, Jun 1, 2019 at 9:43 AM Kingsley Idehen <kide...@openlinksw.com> wrote:

    On 6/1/19 5:45 AM, Sebastian Hellmann wrote:

    Hi Denny,

    * the old system was like this:

    we load from here: http://downloads.dbpedia.org/2016-10/core/

    metadata is in
    http://downloads.dbpedia.org/2016-10/core/2016-10_dataid_core.ttl
    with void:sparqlEndpoint <http://dbpedia.org/sparql> ;


    Hi Sebastian,


    I will also have the TTL referenced above loaded to a named graph
    so that it becomes accessible from the query solution I shared in
    my prior post.



    * the new system is here: https://databus.dbpedia.org/dbpedia

    There are 6 new releases and the metadata is in the endpoint
    https://databus.dbpedia.org/repo/sparql

    Once the collection saving feature is finished, we will build a
    collection of datasets on the bus, which will then be loaded. It
    is basically a SPARQL query retrieving the download URLs, like this:

    http://dev.dbpedia.org/Data#example-application-virtuoso-docker


    Okay.

    Please install the Faceted Browser so that URIs like
    http://dev.dbpedia.org/Data#example-application-virtuoso-docker
    can also be looked up.

    As an aside, here's an Entity Type overview query results page
    
<https://databus.dbpedia.org/repo/sparql?default-graph-uri=&query=SELECT+%28SAMPLE%28%3Fs%29+AS+%3Fsample%29+%28COUNT%281%29+AS+%3Fcount%29++%28%3Fo+AS+%3FentityType%29%0D%0AWHERE+%7B%0D%0A++++++++%3Fs+a+%3Fo.+%0D%0A%09%09FILTER+%28isIRI%28%3Fs%29%29+%0D%0A++++++++++++++++FILTER+%28%21+contains%28str%28%3Fs%29%2C%22virt%22%29%29%0D%0A++++++%7D+%0D%0AGROUP+BY+%3Fo%0D%0AORDER+BY+DESC+%28%3Fcount%29&format=text%2Fhtml&timeout=0&debug=on>
    for future use etc..
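
For readability, the query encoded in the link above decodes to roughly the SPARQL shown in the sketch below (whitespace tidied). The sketch rebuilds a link for the same endpoint. The helper name is hypothetical, and only the `query` and `format` parameters from the original link are kept.

```python
from urllib.parse import urlencode

# Endpoint taken from the link above.
ENDPOINT = "https://databus.dbpedia.org/repo/sparql"

# Decoded from the link above: one sample subject and a count per rdf:type,
# skipping non-IRI subjects and subjects whose IRI contains "virt".
ENTITY_TYPE_QUERY = """\
SELECT (SAMPLE(?s) AS ?sample) (COUNT(1) AS ?count) (?o AS ?entityType)
WHERE {
  ?s a ?o .
  FILTER (isIRI(?s))
  FILTER (! contains(str(?s), "virt"))
}
GROUP BY ?o
ORDER BY DESC(?count)
"""


def sparql_url(endpoint: str, query: str, result_format: str = "text/html") -> str:
    """Build a GET link for a SPARQL endpoint (hypothetical helper)."""
    return endpoint + "?" + urlencode({"query": query, "format": result_format})


if __name__ == "__main__":
    print(sparql_url(ENDPOINT, ENTITY_TYPE_QUERY))
```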


    Kingsley




    On 31.05.19 21:59, Denny Vrandečić wrote:
    Thank you for the answer!

    I don't see how the query solution page that you linked
    indicates that this is the English Wikipedia extraction. Where
    does it say that? How can I tell? I am trying to understand, thanks.

    Also, when I download the set of English extractions from here,

    http://downloads.dbpedia.org/2016-10/core-i18n/en/

    particularly this one,

    
http://downloads.dbpedia.org/2016-10/core-i18n/en/mappingbased_objects_en.ttl.bz2


    it is only about 17,467 people with parents, not 20,120, so that
    dataset seems out of sync with the one in the SPARQL endpoint.
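
The thread does not say how the figure for the download was obtained. One way such a count might be reproduced is sketched below, assuming the dump holds one triple per line in N-Triples style with full IRIs in angle brackets. Both function names are made up for this example.

```python
import bz2
from typing import Iterable

# Full IRI for dbo:parent, as it would appear in a one-triple-per-line dump.
DBO_PARENT = "<http://dbpedia.org/ontology/parent>"


def count_subjects_with_predicate(lines: Iterable[str], predicate: str = DBO_PARENT) -> int:
    """Count distinct subjects that have at least one triple with `predicate`."""
    subjects = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 2)  # subject, predicate, rest of the line
        if len(parts) == 3 and parts[1] == predicate:
            subjects.add(parts[0])
    return len(subjects)


def count_in_dump(path: str) -> int:
    """Stream a .bz2 dump from disk without unpacking it first."""
    with bz2.open(path, "rt", encoding="utf-8") as handle:
        return count_subjects_with_predicate(handle)
```

Calling `count_in_dump("mappingbased_objects_en.ttl.bz2")` on a local copy of the file would give the number to compare with the endpoint's answer.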

    I am curious: where do you load the dataset from?

    Thank you!


    On Fri, May 31, 2019 at 11:49 AM Kingsley Idehen
    <kide...@openlinksw.com> wrote:

        On 5/31/19 2:23 PM, Denny Vrandečić wrote:
        When I query the dbpedia.org/sparql endpoint asking for "how many
        people with a parent do you know?", i.e. select (count
        (distinct ?s) as ?c) where { ?s dbo:parent ?o }, I get as
        the answer 20,120.

        Where among the Downloads on
        wiki.dbpedia.org/downloads-2016-10 can I find the
        dataset that the SPARQL endpoint actually serves? Is it the
        English Wikipedia-based "Mappingbased" one? Or is it the
        "Infobox Properties Mapped"?

        Cheers,
        Denny


        The query solution page
        
<http://dbpedia.org/sparql?default-graph-uri=&query=prefix+dbo%3A+%3Chttp%3A%2F%2Fdbpedia.org%2Fontology%2F%3E+%0D%0A%0D%0Aselect+%3Fg+%28count+%28distinct+%3Fs%29+as+%3Fc%29%0D%0Awhere+%7B+%0D%0A+++++++%0D%0A+++++++++graph+%3Fg+%7B%3Fs+dbo%3Aparent+%3Fo.%7D%0D%0A%0D%0A+++++%7D%0D%0Agroup+by+%3Fg&format=text%2Fhtml&CXML_redir_for_subjs=121&CXML_redir_for_hrefs=&timeout=30000&debug=on&run=+Run+Query+>
        indicates this is the English Wikipedia dataset. That's what
        we've always loaded into the Virtuoso instance from which
        DBpedia Linked Data and its associated SPARQL endpoint are
        deployed.
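
The query behind the link above is hard to read in its URL-encoded form. A small sketch for pulling it back out is shown below. The function name is hypothetical, and the link is trimmed to its `query` and `format` parameters. The decoded query counts distinct subjects with dbo:parent per named graph, and the graph name in the result is what identifies the loaded dataset.

```python
from urllib.parse import parse_qs, urlsplit

# The link from the message above, trimmed to the query and format parameters.
LINK = (
    "http://dbpedia.org/sparql?default-graph-uri=&query="
    "prefix+dbo%3A+%3Chttp%3A%2F%2Fdbpedia.org%2Fontology%2F%3E+%0D%0A%0D%0A"
    "select+%3Fg+%28count+%28distinct+%3Fs%29+as+%3Fc%29%0D%0A"
    "where+%7B+%0D%0A+++++++%0D%0A"
    "+++++++++graph+%3Fg+%7B%3Fs+dbo%3Aparent+%3Fo.%7D%0D%0A%0D%0A"
    "+++++%7D%0D%0Agroup+by+%3Fg"
    "&format=text%2Fhtml"
)


def extract_query(link: str) -> str:
    """Return the SPARQL text from a link's `query` parameter on one line."""
    params = parse_qs(urlsplit(link).query)
    # Collapse the original line breaks and indentation into single spaces.
    return " ".join(params["query"][0].split())


if __name__ == "__main__":
    print(extract_query(LINK))
```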


-- Regards,

        Kingsley Idehen 
        Founder & CEO
        OpenLink Software
        Home Page:http://www.openlinksw.com
        Community Support:https://community.openlinksw.com
        Weblogs (Blogs):
        Company Blog:https://medium.com/openlink-software-blog
        Virtuoso Blog:https://medium.com/virtuoso-blog
        Data Access Drivers Blog:https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers

        Personal Weblogs (Blogs):
        Medium Blog:https://medium.com/@kidehen
        Legacy Blogs:http://www.openlinksw.com/blog/~kidehen/
                       http://kidehen.blogspot.com

        Profile Pages:
        Pinterest:https://www.pinterest.com/kidehen/
        Quora:https://www.quora.com/profile/Kingsley-Uyi-Idehen
        Twitter:https://twitter.com/kidehen
        Google+:https://plus.google.com/+KingsleyIdehen/about
        LinkedIn:http://www.linkedin.com/in/kidehen

        Web Identities (WebID):
        Personal:http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
                :http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this

        _______________________________________________
        DBpedia-discussion mailing list
        DBpedia-discussion@lists.sourceforge.net
        <mailto:DBpedia-discussion@lists.sourceforge.net>
        https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion



-- All the best,
    Sebastian Hellmann

    Director of Knowledge Integration and Linked Data Technologies
    (KILT) Competence Center
    at the Institute for Applied Informatics (InfAI) at Leipzig
    University
    Executive Director of the DBpedia Association
    Projects: http://dbpedia.org, http://nlp2rdf.org,
    http://linguistics.okfn.org, https://www.w3.org/community/ld4lt
    Homepage: http://aksw.org/SebastianHellmann
    Research Group: http://aksw.org






