On Fri, Sep 22, 2017 at 3:07 PM, Jörn Hees <j_h...@cs.uni-kl.de> wrote:

> Hi all,
>
> > On 22 Sep 2017, at 11:39, Milan Dojchinovski <dojcinovski.mi...@gmail.com> wrote:
> >
> > - usefulness - do you find it useful? do you miss any useful
> > information? would you benefit if we enrich the data, e.g. with entity
> > types, annotate sentences, etc.?
>
> Yes, very useful... publishing them opens new ways to enhance DBpedia from
> within RDF without the need to go back to some external datasource or dump.
>
>
> > - usability - how easy/difficult is it to consume? Are there any specific
> > types of queries you would like to execute over the data but find
> > impossible or difficult?
>
> I'm pretty interested in the dbo:wikiPageWikiLink (wikilinks) and their
> position within the text. They carry a lot of information and are already
> used in tons of similarity / relatedness heuristics. Currently it's not
> really simple to answer questions like "give me the first 15 links on the
> page", as the SPARQL query for that looks mental. Maybe we could come up
> with some simple additional modelling for information like "this is
> wikilink number n on the page" or "this is a wikilink occurring in the
> abstract already". Maybe something like this (modulo properly defining or
> re-using existing vocabs):
>
> dbr:Berlin nif:wikiPageWikiLink <http://dbpedia.org/resource/Berlin?dbpv=2016-04&nif=wikilink&target=Germany&occ=1> .
> <http://dbpedia.org/resource/Berlin?dbpv=2016-04&nif=wikilink&target=Germany&occ=1> a nif:WikiLink ;
>   nif:link_source dbr:Berlin ;
>   nif:link_target dbr:Germany ;
>   nif:page_link_number 2 ;
>   nif:page_link_occurrence 1 ;
>   nif:page_link_occurrences 1 ;
>   nif:appears_in nif:WikiAbstract ;
>   nif:sec_link_number 2 ;  # here the same, but potentially useful for
>                            # ranking of related terms wrt. certain context
>   nif:sec_link_occurrence 1 ;
>   nif:sec_link_occurrences 1 ;
>   # ... all other nif props
>   nif:anchorOf "Germany" ;
>   nif:referenceContext ...
>   ...
>
> Despite the "first link principle" in Wikipedia the above modelling would
> already allow further links to the same other article (possibly with other
> anchor texts) to occur...
>
> Not sure how feasible that kind of extraction would be though...
>

A minor comment from my side here is that this information already exists
in DBpedia.
It is not (easily) queryable with SPARQL, but you can get it with a simple
script from the dumps:

If you look at the quad files, the context of each quad records the absolute
line in the wiki text from which the triple was extracted, e.g.
<http://dbpedia.org/resource/AccessibleComputing> <http://dbpedia.org/ontology/wikiPageWikiLink> <http://dbpedia.org/resource/Computer_accessibility> <http://en.wikipedia.org/wiki/AccessibleComputing?oldid=631144794#absolute-line=1> .

When we extract triples from a page section, the context looks like this:

http://en.wikipedia.org/wiki/Alabama?oldid=745139393#section=Geograph&relative-line=4&absolute-line=221
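
For illustration, here is a minimal sketch of such a script in Python. It
assumes the dump is N-Quads with IRIs in all four positions of each line, and
that the context fragment can be read as key=value pairs joined by "&", as in
the two examples above. The names first_links and parse_context are made up
for this sketch and are not part of any DBpedia tooling.

```python
import re
from urllib.parse import urlsplit, parse_qsl

WIKILINK = "http://dbpedia.org/ontology/wikiPageWikiLink"

# Matches one N-Quads line whose four terms are all IRIs
# (lines with literals or blank nodes are simply skipped).
QUAD_RE = re.compile(
    r'^<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.\s*$'
)


def parse_context(context_iri):
    """Return the key/value pairs stored in the context IRI's fragment.

    Assumption: the fragment has the form "key=value&key=value".
    """
    fragment = urlsplit(context_iri).fragment
    return dict(parse_qsl(fragment))


def first_links(lines, page_iri, n=15):
    """Return up to n (absolute_line, target, section) tuples for page_iri,
    ordered by the absolute wiki-text line of the link."""
    hits = []
    for line in lines:
        match = QUAD_RE.match(line.strip())
        if not match:
            continue
        subject, predicate, target, context = match.groups()
        if subject != page_iri or predicate != WIKILINK:
            continue
        info = parse_context(context)
        if "absolute-line" not in info:
            continue
        hits.append((int(info["absolute-line"]), target, info.get("section")))
    # sort() is stable, so links on the same line keep their dump order.
    hits.sort(key=lambda hit: hit[0])
    return hits[:n]
```

Passing the lines of a dump file to
first_links(lines, "http://dbpedia.org/resource/Berlin", n=15) would then
approximate the "first 15 links on the page" question, under the further
assumption that links sharing a wiki-text line appear in the dump in page
order.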

Hope that helps.

Cheers,
Dimitris



>
>
> Best,
> Jörn
>
>
> ------------------------------------------------------------
> ------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> DBpedia-discussion mailing list
> DBpedia-discussion@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
>



-- 
Kontokostas Dimitris