Re: [Wikidata] searching for Wikidata items

2019-06-07 Thread Amirouche Boubekki
Hello all,


On Tue, Jun 4, 2019 at 15:46, Marielle Volz wrote:

> Yes, the API is at
> https://www.wikidata.org/w/api.php?action=query&list=search&srsearch=Bush
>
> There's a sandbox where you can play with the various options:
>
> https://www.wikidata.org/wiki/Special:ApiSandbox#action=query&format=json&list=search&srsearch=Bush
>


Can anyone point me to the relevant code that supports the search feature?
Or explain to me how it is done?


Thanks in advance!


> On Tue, Jun 4, 2019 at 2:22 PM Tim Finin wrote:
>
>> What's the best way to search Wikidata for items whose name or alias
>> matches a string?  The search available via pywikibot seems to only find a
>> match if the search string is a prefix of an item's name or alias, so
>> searching for "Bush" does not return any of the George Bush items.  I
>> don't want to use a SPARQL query with a regex, since I expect that to be
>> slow.
>>
>> The search box on the Wikidata pages is closer to what I want.  Is there
>> a good way to call this via an API?
>>
>> Ideally, I'd like to be able to specify a language and also a set of
>> types, but I can do that once I've identified candidates based on a simple
>> match with a query string.
>>
>
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] searching for Wikidata items

2019-06-05 Thread Maarten Dammers

Hi Tim,

Pywikibot has generators built around the API. For search, for example, you have
https://doc.wikimedia.org/pywikibot/master/api_ref/pywikibot.html#pywikibot.pagegenerators.SearchPageGenerator .
So basically anything you can search for as a user can also be used as
a generator in Pywikibot.


Say, for example, you want all bands that have "Bush" in their name. We
have the band Bush at https://www.wikidata.org/wiki/Q247949 . With a bit
of a trick you can see what the search engine knows about a page:
https://www.wikidata.org/w/index.php?title=Q247949&action=cirrusdump .
We can use this to limit the search results to only instance of (P31)
band (Q215380), see
https://www.wikidata.org/w/index.php?search=bush+-wbhasstatement%3A%22P31%3DQ215380%22&title=Special%3ASearch&profile=advanced&fulltext=1&advancedSearch-current=%7B%7D&ns0=1&ns120=1
or as API output at
https://www.wikidata.org/w/api.php?action=query&list=search&srsearch=bush%20-wbhasstatement:%22P31=Q215380%22&format=json
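A minimal sketch of composing such a search string and API URL in Python, using only the standard library. The helper names (statement_filter, search_api_url) are made up for illustration. It assumes the keyword spelling haswbstatement, which is how the WikibaseCirrusSearch help pages document it (the links above spell it wbhasstatement), and that a leading "-" negates a keyword, so please compare the results against Special:Search before relying on it.

```python
from urllib.parse import urlencode

API = "https://www.wikidata.org/w/api.php"


def statement_filter(prop, value, negate=False):
    """Return a search keyword such as haswbstatement:P31=Q215380.

    Assumption: haswbstatement is the keyword spelling the search
    backend understands; a leading "-" excludes matching items.
    """
    keyword = f"haswbstatement:{prop}={value}"
    return "-" + keyword if negate else keyword


def search_api_url(text, *filters):
    """Build a list=search URL from free text plus keyword filters."""
    srsearch = " ".join([text, *filters])
    params = {
        "action": "query",
        "list": "search",
        "srsearch": srsearch,
        "format": "json",
    }
    return API + "?" + urlencode(params)


# Items matching "bush" that are instance of (P31) band (Q215380).
url = search_api_url("bush", statement_filter("P31", "Q215380"))
print(url)
```

The same search string (the part after srsearch=, before URL encoding) should also be usable as the query passed to Pywikibot.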


Pywikibot accepts the same search string:
>>> import pywikibot
>>> from pywikibot import pagegenerators
>>> query = 'bush -wbhasstatement:"P31=Q215380"'
>>> repo = pywikibot.Site().data_repository()
>>> searchgen = pagegenerators.SearchPageGenerator(query, site=repo)
>>> for item in searchgen:
...     print(item.title())
...
Q1156378
Q16945866
Q16953971
Q247949
Q2928714
Q5001360
Q5001432
Q7720714
Q7757229
>>>

Maarten

On 04-06-19 15:44, Marielle Volz wrote:
> Yes, the API is at
> https://www.wikidata.org/w/api.php?action=query&list=search&srsearch=Bush
>
> There's a sandbox where you can play with the various options:
> https://www.wikidata.org/wiki/Special:ApiSandbox#action=query&format=json&list=search&srsearch=Bush
>
> On Tue, Jun 4, 2019 at 2:22 PM Tim Finin wrote:
>
>> What's the best way to search Wikidata for items whose name or
>> alias matches a string?  The search available via pywikibot seems
>> to only find a match if the search string is a prefix of an item's
>> name or alias, so searching for "Bush" does not return any of the
>> George Bush items. I don't want to use a SPARQL query with a
>> regex, since I expect that to be slow.
>>
>> The search box on the Wikidata pages is closer to what I want.  Is
>> there a good way to call this via an API?
>>
>> Ideally, I'd like to be able to specify a language and also a set
>> of types, but I can do that once I've identified candidates based
>> on a simple match with a query string.


Re: [Wikidata] searching for Wikidata items

2019-06-04 Thread Stas Malyshev
Hi!

> Yes, the API is at
> https://www.wikidata.org/w/api.php?action=query&list=search&srsearch=Bush

There's also
https://www.wikidata.org/w/api.php?action=wbsearchentities&search=Bush&language=en&format=json

This is what the completion search on Wikidata uses.
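For what it's worth, here is a rough Python sketch of calling wbsearchentities and reading the reply, standard library only. The function names are mine, and the sample reply is hand-written to show the shape I expect (a top-level "search" list whose entries carry "id", "label" and "description"), not real output. As far as I know this module matches prefixes of labels and aliases in the given language, so it may not find "George Bush" for "Bush".

```python
from urllib.parse import urlencode

API = "https://www.wikidata.org/w/api.php"


def entity_search_url(text, language="en", limit=7):
    """Build a wbsearchentities URL for items in one language."""
    params = {
        "action": "wbsearchentities",
        "search": text,
        "language": language,
        "type": "item",
        "limit": limit,
        "format": "json",
    }
    return API + "?" + urlencode(params)


def summarize(response):
    """Return (id, label, description) tuples from a decoded reply."""
    return [
        (hit["id"], hit.get("label", ""), hit.get("description", ""))
        for hit in response.get("search", [])
    ]


# Hand-written sample reply, trimmed to the fields used above.
sample = {"search": [{"id": "Q247949", "label": "Bush", "description": "band"}]}

print(entity_search_url("Bush"))
print(summarize(sample))
```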
-- 
Stas Malyshev
smalys...@wikimedia.org



Re: [Wikidata] searching for Wikidata items

2019-06-04 Thread Marielle Volz
And I should add there's a third-party API, the OpenRefine reconciliation
service for Wikidata, that can filter by type:

https://tools.wmflabs.org/openrefine-wikidata/

https://tools.wmflabs.org/openrefine-wikidata/en/api?query=%7B%22query%22:%22bush%22,%22type%22:%22Q5%22%7D
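The query parameter in that second link is just a small JSON object, URL-encoded. A minimal Python sketch for building it, with a made-up helper name (reconcile_url) and assuming the endpoint still lives at the address above:

```python
import json
from urllib.parse import quote

ENDPOINT = "https://tools.wmflabs.org/openrefine-wikidata/en/api"


def reconcile_url(text, type_id=None):
    """Build a reconciliation URL; type_id is a Q-id such as Q5 (human)."""
    query = {"query": text}
    if type_id is not None:
        query["type"] = type_id
    # Compact separators reproduce the JSON layout used in the link above.
    payload = json.dumps(query, separators=(",", ":"))
    # Leave ":" and "," unescaped so the result matches the link literally.
    return ENDPOINT + "?query=" + quote(payload, safe=":,")


print(reconcile_url("bush", "Q5"))
```

Leaving out type_id gives an unfiltered query for the same string.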

On Tue, Jun 4, 2019 at 2:44 PM Marielle Volz wrote:

> Yes, the API is at
> https://www.wikidata.org/w/api.php?action=query&list=search&srsearch=Bush
>
> There's a sandbox where you can play with the various options:
>
> https://www.wikidata.org/wiki/Special:ApiSandbox#action=query&format=json&list=search&srsearch=Bush
>
>
> On Tue, Jun 4, 2019 at 2:22 PM Tim Finin  wrote:
>
>> What's the best way to search Wikidata for items whose name or alias
>> matches a string?  The search available via pywikibot seems to only find a
>> match if the search string is a prefix of an item's name or alias, so
>> searching for "Bush" does not return any of the George Bush items.  I
>> don't want to use a SPARQL query with a regex, since I expect that to be
>> slow.
>>
>> The search box on the Wikidata pages is closer to what I want.  Is there
>> a good way to call this via an API?
>>
>> Ideally, I'd like to be able to specify a language and also a set of
>> types, but I can do that once I've identified candidates based on a simple
>> match with a query string.
>>
>


Re: [Wikidata] searching for Wikidata items

2019-06-04 Thread Marielle Volz
Yes, the API is at
https://www.wikidata.org/w/api.php?action=query&list=search&srsearch=Bush

There's a sandbox where you can play with the various options:
https://www.wikidata.org/wiki/Special:ApiSandbox#action=query&format=json&list=search&srsearch=Bush
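In case it helps, a small Python sketch of calling this from a script and paging through the results, standard library only. The function names and the User-Agent string are placeholders of mine. It assumes the usual reply layout: hits under query.search, each with a "title" (the Q-id for items), and a continue.sroffset value while more pages remain.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API = "https://www.wikidata.org/w/api.php"
# Placeholder; replace with something that identifies your script.
USER_AGENT = "wikidata-search-example/0.1 (placeholder contact)"


def search_params(text, offset=0, limit=10):
    """Parameters for one page of list=search results."""
    return {"action": "query", "list": "search", "srsearch": text,
            "sroffset": offset, "srlimit": limit, "format": "json"}


def parse_search(response):
    """Return (titles, next_offset); next_offset is None on the last page."""
    hits = response.get("query", {}).get("search", [])
    titles = [hit["title"] for hit in hits]
    next_offset = response.get("continue", {}).get("sroffset")
    return titles, next_offset


def search_all(text, max_pages=3):
    """Network call: collect titles from at most max_pages pages."""
    offset, found = 0, []
    for _ in range(max_pages):
        url = API + "?" + urlencode(search_params(text, offset))
        request = Request(url, headers={"User-Agent": USER_AGENT})
        with urlopen(request) as resp:
            titles, offset = parse_search(json.load(resp))
        found.extend(titles)
        if offset is None:
            break
    return found


# Hand-written sample reply, so the parsing can be tried offline.
sample = {"continue": {"sroffset": 10},
          "query": {"search": [{"ns": 0, "title": "Q247949"}]}}
print(parse_search(sample))
```

search_all("Bush") would do the real requests; max_pages is only there to keep an experiment from running away.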


On Tue, Jun 4, 2019 at 2:22 PM Tim Finin  wrote:

> What's the best way to search Wikidata for items whose name or alias
> matches a string?  The search available via pywikibot seems to only find a
> match if the search string is a prefix of an item's name or alias, so
> searching for "Bush" does not return any of the George Bush items.  I
> don't want to use a SPARQL query with a regex, since I expect that to be
> slow.
>
> The search box on the Wikidata pages is closer to what I want.  Is there a
> good way to call this via an API?
>
> Ideally, I'd like to be able to specify a language and also a set of
> types, but I can do that once I've identified candidates based on a simple
> match with a query string.


[Wikidata] searching for Wikidata items

2019-06-04 Thread Tim Finin
What's the best way to search Wikidata for items whose name or alias
matches a string?  The search available via pywikibot seems to only find a
match if the search string is a prefix of an item's name or alias, so
searching for "Bush" does not return any of the George Bush items.  I
don't want to use a SPARQL query with a regex, since I expect that to be
slow.

The search box on the Wikidata pages is closer to what I want.  Is there a
good way to call this via an API?

Ideally, I'd like to be able to specify a language and also a set of types,
but I can do that once I've identified candidates based on a simple match
with a query string.
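
To make that second step concrete, here is roughly what I have in mind, as a Python sketch. It assumes the candidates have already been fetched as entity JSON (for example via wbgetentities), that "type" means the values of instance of (P31), and that the JSON has the usual claims / mainsnak / datavalue layout. The function names and the sample data are made up for illustration.

```python
def instance_of_ids(entity):
    """Collect the Q-ids of an entity's instance of (P31) statements."""
    ids = []
    for claim in entity.get("claims", {}).get("P31", []):
        value = claim.get("mainsnak", {}).get("datavalue", {}).get("value", {})
        if "id" in value:
            ids.append(value["id"])
    return ids


def label_in(entity, language):
    """Return the entity's label in the given language, or None."""
    return entity.get("labels", {}).get(language, {}).get("value")


def filter_candidates(entities, allowed_types, language):
    """Keep Q-ids that have a label in `language` and a P31 in allowed_types."""
    allowed = set(allowed_types)
    return [
        qid for qid, entity in entities.items()
        if label_in(entity, language) and allowed & set(instance_of_ids(entity))
    ]


def p31(qid):
    """Build one trimmed P31 claim for the hand-written sample below."""
    return {"mainsnak": {"datavalue": {"value": {"id": qid}}}}


# Hand-written, trimmed sample; the second entry is a pure placeholder.
entities = {
    "Q247949": {"labels": {"en": {"value": "Bush"}},
                "claims": {"P31": [p31("Q215380")]}},
    "Q_PLACEHOLDER": {"labels": {"en": {"value": "Bush"}},
                      "claims": {"P31": [p31("Q5")]}},
}
print(filter_candidates(entities, {"Q215380"}, "en"))
```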