Re: [Wiki-research-l] Finding the number of links between two wikipedia pages

2017-02-24 Thread Mara Sorella
Thank you, Giovanni, I'll check it out!

Mara

On Fri, Feb 24, 2017 at 10:59 PM, Giovanni Luca Ciampaglia <
gciam...@indiana.edu> wrote:

> Hi Mara,
>
> since you were asking about ontologies, let me point you to our work on
> computational fact checking from knowledge networks (PLoS ONE).
> We developed a measure of semantic similarity based on shortest paths
> between any two concepts of Wikipedia using the linked data from DBpedia;
> these are the links found in the infoboxes of Wikipedia articles, so they
> are a subset of the hyperlinks of the whole web page.
>
> In the article we use it as a way to check simple relational statements,
> but it could serve other purposes too. There are also a couple of other
> approaches from the literature, which we cite in the paper, that could
> be relevant for what you are doing.
>
> HTH!
>
> Giovanni
>
>
> Giovanni Luca Ciampaglia ∙ Assistant Research Scientist, Indiana University
>
>
> On Sun, Feb 19, 2017 at 2:56 PM, Mara Sorella wrote:
>
>> Hi everybody, I'm new to the list and was referred here by a comment
>> from an SO user on my question [1], which I quote below:
>>
>> *I have been able to use the Wikipedia pagelinks SQL dump to obtain
>> hyperlinks between Wikipedia pages for a specific revision time.
>> However, there are cases where multiple instances of such links exist,
>> e.g. between the https://en.wikipedia.org/wiki/Wikipedia page and
>> https://en.wikipedia.org/wiki/Wikimedia_Foundation. I'm interested in
>> finding the number of links between pairs of pages for a specific
>> revision. Ideal solutions would involve dump files other than pagelinks
>> (which I'm not aware of), or using the MediaWiki API.*
>>
>> To elaborate, I need this information to weight (almost) every hyperlink
>> between article pages (that is, in NS0) that was present in a specific
>> Wikipedia revision (end of 2015). I would therefore prefer not to follow
>> the solution suggested by the SO user, which would be rather impractical.
>>
>> Indeed, my final aim is to use this weight as a threshold to sparsify
>> the Wikipedia graph (which, due to its short diameter, is more or less
>> a giant connected component) in a way that reflects the "relatedness"
>> of the linked pages (where relatedness is not meant as strictly
>> semantic, but at a higher "concept" level, if I may say so).
>> For this reason, other suggestions on how to determine such weights
>> (possibly using other data sources -- ontologies?) are more than welcome.
>>
>> The graph will be used as a dataset to test an event tracking algorithm
>> I am doing research on.
>>
>>
>> Thanks,
>>
>> Mara
>>
>>
>>
>>
>> [1] http://stackoverflow.com/questions/4223/number-of-links-between-two-wikipedia-pages/
>>
>> ___
>> Wiki-research-l mailing list
>> Wiki-research-l@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l


Re: [Wiki-research-l] Finding the number of links between two wikipedia pages

2017-02-24 Thread Giovanni Luca Ciampaglia
Hi Mara,

since you were asking about ontologies, let me point you to our work on
computational fact checking from knowledge networks (PLoS ONE).
We developed a measure of semantic similarity based on shortest paths
between any two concepts of Wikipedia using the linked data from DBpedia;
these are the links found in the infoboxes of Wikipedia articles, so they
are a subset of the hyperlinks of the whole web page.
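
To illustrate the general idea of a similarity based on shortest paths, here
is a minimal sketch. It is not the measure from the paper: the toy graph, the
function names and the 1 / (1 + distance) scoring are all assumptions made
for the example, and a real run would load the DBpedia links instead.

```python
from collections import deque

def shortest_path_length(graph, source, target):
    """Breadth-first search; returns the hop count, or None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None

def proximity(graph, a, b):
    """Hypothetical score: 1 for identical concepts, decaying with distance."""
    dist = shortest_path_length(graph, a, b)
    return 0.0 if dist is None else 1.0 / (1.0 + dist)

# Toy undirected concept graph (made-up data, adjacency sets).
toy_graph = {
    "Rome": {"Italy"},
    "Italy": {"Rome", "Europe"},
    "Europe": {"Italy", "France"},
    "France": {"Europe", "Paris"},
    "Paris": {"France"},
}

print(proximity(toy_graph, "Rome", "Italy"))
print(proximity(toy_graph, "Rome", "Paris"))
```

Closer concepts get a higher score, and unreachable pairs get 0, so the same
score could in principle be used as an edge weight.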

In the article we use it as a way to check simple relational statements,
but it could serve other purposes too. There are also a couple of other
approaches from the literature, which we cite in the paper, that could
be relevant for what you are doing.

HTH!

Giovanni


Giovanni Luca Ciampaglia ∙ Assistant Research Scientist, Indiana University


On Sun, Feb 19, 2017 at 2:56 PM, Mara Sorella wrote:

> Hi everybody, I'm new to the list and was referred here by a comment
> from an SO user on my question [1], which I quote below:
>
> *I have been able to use the Wikipedia pagelinks SQL dump to obtain
> hyperlinks between Wikipedia pages for a specific revision time.
> However, there are cases where multiple instances of such links exist,
> e.g. between the https://en.wikipedia.org/wiki/Wikipedia page and
> https://en.wikipedia.org/wiki/Wikimedia_Foundation. I'm interested in
> finding the number of links between pairs of pages for a specific
> revision. Ideal solutions would involve dump files other than pagelinks
> (which I'm not aware of), or using the MediaWiki API.*
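
One possible way to get per-pair counts, sketched under assumptions: since
pagelinks appears to keep only one row per linked pair, the occurrences could
be counted in the wikitext of the revision itself (taken, for instance, from
a full-history dump or from the API's revisions query). The regular
expression, the helper names, the prefix list and the sample text below are
all hypothetical, and links generated by templates or reached through
redirects are not handled.

```python
import re
from collections import Counter

# Matches [[Target]], [[Target|label]] and [[Target#Section|label]];
# group 1 is the target title without section or label.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")

# Illustrative, incomplete list of namespace prefixes to skip (assumption).
NON_ARTICLE_PREFIXES = {
    "file", "image", "category", "template", "wikipedia", "help", "portal",
}

def normalize_title(raw):
    """Underscores become spaces, whitespace collapses, first letter is capitalised."""
    title = " ".join(raw.replace("_", " ").split()).lstrip(":").strip()
    return title[:1].upper() + title[1:]

def count_links(wikitext):
    """Return a Counter mapping each linked article title to its number of occurrences."""
    counts = Counter()
    for match in WIKILINK.finditer(wikitext):
        target = normalize_title(match.group(1))
        prefix = target.split(":", 1)[0].lower() if ":" in target else ""
        if target and prefix not in NON_ARTICLE_PREFIXES:
            counts[target] += 1
    return counts

# Made-up wikitext standing in for one revision of the "Wikipedia" article.
sample = (
    "'''Wikipedia''' is hosted by the [[Wikimedia Foundation]]. "
    "The [[wikimedia_Foundation|Foundation]] also runs [[Wiktionary]]; "
    "see [[Wikimedia Foundation#History]] and [[File:Logo.png|thumb]]."
)
counts = count_links(sample)
print(counts.most_common())
```

Running this over every NS0 page of the chosen revision would give one
weight per (source, target) pair, at the cost of parsing the full text.
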
>
>
>
> To elaborate, I need this information to weight (almost) every hyperlink
> between article pages (that is, in NS0) that was present in a specific
> Wikipedia revision (end of 2015). I would therefore prefer not to follow
> the solution suggested by the SO user, which would be rather impractical.
>
> Indeed, my final aim is to use this weight as a threshold to sparsify
> the Wikipedia graph (which, due to its short diameter, is more or less
> a giant connected component) in a way that reflects the "relatedness"
> of the linked pages (where relatedness is not meant as strictly
> semantic, but at a higher "concept" level, if I may say so).
> For this reason, other suggestions on how to determine such weights
> (possibly using other data sources -- ontologies?) are more than welcome.
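
The thresholding step described above could look roughly like the sketch
below. The edge weights, the threshold value and the function names are made
up for illustration; the component check ignores link direction and is only
meant to show how the giant component breaks up as low-weight links are
dropped.

```python
def sparsify(weighted_edges, threshold):
    """Keep only the edges whose weight is at least `threshold`."""
    return {edge: w for edge, w in weighted_edges.items() if w >= threshold}

def weak_components(nodes, edges):
    """Connected components of the graph, treating every edge as undirected."""
    adjacency = {node: set() for node in nodes}
    for source, target in edges:
        adjacency[source].add(target)
        adjacency[target].add(source)
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components

# Hypothetical link counts between pairs of pages: (source, target) -> weight.
weights = {
    ("Wikipedia", "Wikimedia Foundation"): 3,
    ("Wikipedia", "Encyclopedia"): 2,
    ("Wikipedia", "Jimmy Wales"): 1,
    ("Encyclopedia", "Book"): 1,
    ("Jimmy Wales", "Wikimedia Foundation"): 2,
}
nodes = {page for edge in weights for page in edge}

kept = sparsify(weights, threshold=2)
components = weak_components(nodes, kept)
print(len(kept), "edges kept;", len(components), "components")
```

Sweeping the threshold and watching the number and size of components is one
simple way to pick a cut-off before running the event tracking experiments.
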
>
> The graph will be used as a dataset to test an event tracking algorithm
> I am doing research on.
>
>
> Thanks,
>
> Mara
>
>
>
>
> [1] http://stackoverflow.com/questions/4223/number-of-links-between-two-wikipedia-pages/