Re: [Commons-l] Some help needed

2016-11-26 Thread Maarten Dammers
For the other people reading this: I also got this question. I solved it 
by running a query against the database, see 
https://quarry.wmflabs.org/query/14350


Parsing wikitext is generally messy. Quite a few identifier templates on 
Commons (like https://commons.wikimedia.org/wiki/Template:Rijksmonument 
) set a tracker category and use the identifier as the sort key. This 
way it's possible to keep track of which identifier is used on which 
page (see https://www.mediawiki.org/wiki/Manual:Categorylinks_table for 
the database layout). In this case no tracker category was set, so the 
externallinks table was used as a fallback ( 
https://www.mediawiki.org/wiki/Manual:Externallinks_table ).
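
[Editorial sketch, not from the original mail: a minimal example of the kind of 
replica query this involves. The exact Quarry query linked above is not 
reproduced here; the replica host name, the credentials file and the 
pre-2021 externallinks layout (el_from / el_to / el_index) are assumptions.]

# Sketch: find File: pages on Commons that link to a Europeana record,
# by querying the externallinks table on the Wiki Replicas.
# Host name, schema and credentials file are assumptions, not taken from
# the thread; adjust to whatever the replicas currently use.
import pymysql

conn = pymysql.connect(
    host="commonswiki.analytics.db.svc.wikimedia.cloud",  # assumed replica host
    database="commonswiki_p",
    read_default_file="~/replica.my.cnf",  # Toolforge replica credentials
    charset="utf8mb4",
)

query = """
SELECT page_title, el_to
FROM externallinks
JOIN page ON page_id = el_from
WHERE page_namespace = 6                          -- File: namespace
  AND el_to LIKE '%europeana.eu/portal/record%'   -- el_index is the indexed column for faster lookups
LIMIT 100
"""

with conn.cursor() as cur:
    cur.execute(query)
    for title, url in cur.fetchall():
        # MediaWiki stores these columns as binary, so decode before printing
        print(title.decode("utf-8"), url.decode("utf-8"))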


Maarten


On 25-11-16 15:11, Hugo Manguinhas wrote:

Hi everyone,

I am new to the Commons API and would like to know how to get (in a machine 
readable way) the metadata found within the Summary section of a page.

In particular, given a File page like this one: 
https://commons.wikimedia.org/wiki/File:African_Dusky_Nightjar_(Caprimulgus_pectoralis)_(W1CDR386_BD28).ogg

I would like to get the "Europeana link" part... it is enough for me to get the 
data as Wiki markup, but parsing the whole HTML would be too much :S

... btw, is there any way to query for such data? I have been using the API 
Sandbox (https://en.wikipedia.org/wiki/Special:ApiSandbox ) but could not find 
a method that could do this...

Your help is really appreciated! Thank you in advance!

Best regards,
Hugo


Re: [Commons-l] Some help needed

2016-11-25 Thread Gergo Tisza
On Fri, Nov 25, 2016 at 6:11 AM, Hugo Manguinhas <hugo.manguin...@europeana.eu> wrote:

> In particular, given a File page like this one:
> https://commons.wikimedia.org/wiki/File:African_Dusky_Nightjar_(Caprimulgus_pectoralis)_(W1CDR386_BD28).ogg
>
> I would like to get the "Europeana link" part... it is enough for me to
> get the data as Wiki markup, but parsing the whole HTML would be too much :S
>
> ... btw, is there any way to query for such data? I have been using the
> API Sandbox (https://en.wikipedia.org/wiki/Special:ApiSandbox ) but could
> not find a method that could do this...
>

I don't think it's possible. You can query the main fields of the
information template (author, source etc.) via prop=imageinfo&iiprop=extmetadata,
but that field is just marked as a miscellaneous info field, so there isn't
really any way to find it.
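
[Editorial sketch, not from the original mail: a minimal Python example of the 
extmetadata query described above, using the requests library. The returned 
field names (Artist, Credit, LicenseShortName, ...) vary per file, and the 
free-form "Europeana link" row does not appear as a named field, which is 
exactly the limitation described.]

# Sketch: fetch the structured extmetadata for one file from the Commons API.
# Uses prop=imageinfo with iiprop=extmetadata; the custom "Europeana link"
# row of the Information template is not exposed as a named field here.
import requests

API = "https://commons.wikimedia.org/w/api.php"
TITLE = ("File:African Dusky Nightjar (Caprimulgus pectoralis) "
         "(W1CDR386 BD28).ogg")

data = requests.get(API, params={
    "action": "query",
    "titles": TITLE,
    "prop": "imageinfo",
    "iiprop": "extmetadata",
    "format": "json",
}).json()

page = next(iter(data["query"]["pages"].values()))
for name, field in page["imageinfo"][0]["extmetadata"].items():
    # each field is a dict with "value", "source", etc.; print a short preview
    print(name, "=>", field["value"][:80])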


Re: [Commons-l] Some help needed

2016-11-25 Thread Gaurav Vaidya
If you know what the external link looks like (does it always start with 
“http://www.europeana.eu/”?) and the page(s) you’re interested in, you can use 
‘extlinks’ to find all external links on a set of pages:

 - https://commons.wikimedia.org/w/api.php?action=query&titles=File:African%20Dusky%20Nightjar%20(Caprimulgus%20pectoralis)%20(W1CDR386%20BD28).ogg&prop=extlinks

You can also get a list of every page on the Commons that has a URL containing 
“europeana.eu/portal/record”, like in Special:Linksearch:

 - https://commons.wikimedia.org/w/api.php?action=query&list=exturlusage&euquery=europeana.eu/portal/record&eulimit=500
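
[Editorial sketch, not from the original mail: a small Python version of both 
calls above, using the requests library. The query string and limits are just 
example values, and API continuation handling is left out.]

# Sketch of the two queries above: external links on one page (prop=extlinks)
# and all pages whose external links match a pattern (list=exturlusage).
import requests

API = "https://commons.wikimedia.org/w/api.php"

# 1. External links on a single File page.
r = requests.get(API, params={
    "action": "query",
    "titles": ("File:African Dusky Nightjar (Caprimulgus pectoralis) "
               "(W1CDR386 BD28).ogg"),
    "prop": "extlinks",
    "ellimit": "max",
    "format": "json",
}).json()
for page in r["query"]["pages"].values():
    for link in page.get("extlinks", []):
        print(link["*"])

# 2. Every page with an external link containing the given string,
#    equivalent to Special:Linksearch.
r = requests.get(API, params={
    "action": "query",
    "list": "exturlusage",
    "euquery": "europeana.eu/portal/record",
    "eulimit": "500",
    "format": "json",
}).json()
for hit in r["query"]["exturlusage"]:
    print(hit["title"], hit["url"])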

I don’t think there’s an API to parse the Information template yet. DBpedia 
tries to do this (e.g. 
http://commons.dbpedia.org/page/File:These_three_geese.jpg), but I couldn’t 
find the file you were interested in on their website.

Hope that helps!

cheers,
Gaurav

> On 25 Nov 2016, at 9:21 AM, Magnus Manske wrote:
> 
> One option (old, unmaintained code, no support, no warranty, good luck) would 
> be my attempt at parsing this:
> https://tools.wmflabs.org/magnustools/commonsapi.php
> 
> On Fri, Nov 25, 2016 at 2:11 PM Hugo Manguinhas wrote:
> Hi everyone,
> 
> I am new to the Commons API and would like to know how to get (in a machine 
> readable way) the metadata found within the Summary section of a page.
> 
> In particular, given a File page like this one: 
> https://commons.wikimedia.org/wiki/File:African_Dusky_Nightjar_(Caprimulgus_pectoralis)_(W1CDR386_BD28).ogg
> 
> I would like to get the "Europeana link" part... it is enough for me to get 
> the data as Wiki markup, but parsing the whole HTML would be too much :S
> 
> ... btw, is there any way to query for such data? I have been using the API 
> Sandbox (https://en.wikipedia.org/wiki/Special:ApiSandbox ) but could not 
> find a method that could do this...
> 
> Your help is really appreciated! Thank you in advance!
> 
> Best regards,
> Hugo


Re: [Commons-l] Some help needed

2016-11-25 Thread Magnus Manske
One option (old, unmaintained code, no support, no warranty, good luck)
would be my attempt at parsing this:
https://tools.wmflabs.org/magnustools/commonsapi.php

On Fri, Nov 25, 2016 at 2:11 PM Hugo Manguinhas <hugo.manguin...@europeana.eu> wrote:

> Hi everyone,
>
> I am new to the Commons API and would like to know how to get (in a
> machine readable way) the metadata found within the Summary section of a
> page.
>
> In particular, given a File page like this one:
> https://commons.wikimedia.org/wiki/File:African_Dusky_Nightjar_(Caprimulgus_pectoralis)_(W1CDR386_BD28).ogg
>
> I would like to get the "Europeana link" part... it is enough for me to
> get the data as Wiki markup, but parsing the whole HTML would be too much :S
>
> ... btw, is there any way to query for such data? I have been using the
> API Sandbox (https://en.wikipedia.org/wiki/Special:ApiSandbox ) but could
> not find a method that could do this...
>
> Your help is really appreciated! Thank you in advance!
>
> Best regards,
> Hugo
___
Commons-l mailing list
Commons-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/commons-l