I take the DOI/PubMed ID counts shown on https://tools.wmflabs.org/scholia/
and round up.
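
A minimal sketch of how a rounded-up total can drive the batching, written in Python rather than the Groovy of the gist. The helper names `batch_offsets` and `build_query` and the example numbers are assumptions for illustration only; the query template is the one quoted further down in this thread.

```python
import math
from string import Template

# Query template as quoted in this thread; $batchSize, $offset,
# $conceptQ and $concept are filled in once per batch.
QUERY = Template("""\
SELECT ?art ?artLabel
WITH {
  SELECT ?art WHERE {
    ?art wdt:P31 wd:Q13442814
  } LIMIT $batchSize OFFSET $offset
} AS %RESULTS {
  INCLUDE %RESULTS
  ?art wdt:P1476 ?artLabel .
  MINUS { ?art wdt:P921 wd:$conceptQ }
  FILTER (contains(lcase(str(?artLabel)), "$concept"))
}""")


def batch_offsets(total, batch_size):
    """Return the OFFSET of every batch needed to cover `total` items."""
    n_batches = math.ceil(total / batch_size)
    return [i * batch_size for i in range(n_batches)]


def build_query(concept, concept_q, batch_size, offset):
    """Fill in the template for one batch (hypothetical helper)."""
    return QUERY.substitute(
        batchSize=batch_size,
        offset=offset,
        conceptQ=concept_q,
        # The query lowercases the title, so lowercase the search word too.
        concept=concept.lower(),
    )


if __name__ == "__main__":
    # Small illustrative numbers, not real counts.
    for offset in batch_offsets(total=25, batch_size=10):
        print(build_query("microgravity", "Q48655", 10, offset))
        print()
```

Overestimating the total is harmless here: batches whose offset lies past the last article simply come back empty.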

Egon

On Sat, Dec 15, 2018 at 3:03 PM Fabrizio Carrai <fabrizio.car...@gmail.com>
wrote:

> Excellent, I ran some tests and after a few iterations I have already
> identified and classified several articles.
> I will have a look at your script in the next few days, but I already have a
> question: the number of iterations is based on the total number of
> articles; how do you know that number?
>
> ---
> Fabrizio
>
> On Sat, Dec 15, 2018 at 10:18 AM Egon Willighagen <
> egon.willigha...@gmail.com> wrote:
>
>>
>> The approach I use is the following, see this (Bioclipse/Groovy) script:
>> https://gist.github.com/egonw/ca4c348b9a2d1116efcdb55fa85dd158
>>
>> It takes advantage of a combination of a Blazegraph SPARQL trick (a named
>> subquery) and breaking things up into batches of a certain size:
>>
>> SELECT ?art ?artLabel
>> WITH {
>>   SELECT ?art WHERE {
>>     ?art wdt:P31 wd:Q13442814
>>   } LIMIT $batchSize OFFSET $offset
>> } AS %RESULTS {
>>   INCLUDE %RESULTS
>>   ?art wdt:P1476 ?artLabel .
>>   MINUS { ?art wdt:P921 wd:$conceptQ }
>>   FILTER (contains(lcase(str(?artLabel)), "$concept"))
>> }
>>
>> where "$concept" is my search word in the title, $conceptQ is the
>> Wikidata item for that concept, and $batchSize and $offset are set by the
>> script to take care of the batching. The script creates QuickStatements.
>>
>> Mind you, I manually check the created statements, because in my domain
>> (biochem) a simple search results in false positives, hence the
>> "blacklist" in the script :)
>>
>> Egon
>>
>> On Sat, Dec 15, 2018 at 10:13 AM Fabrizio Carrai <
>> fabrizio.car...@gmail.com> wrote:
>>
>>> Thanks Matthias,
>>> that's a pity. Your suggestion relies on the items being properly
>>> characterized, which, at the time of writing, is pretty poor for my area
>>> of interest. Could it be an idea to download all the "scholarly
>>> articles", locally select those matching the keyword of interest (e.g.
>>> "microgravity") and set the property P921 for all of them? QuickStatements
>>> may be helpful for the last step; any suggestions for other tools?
>>>
>>> Thanks
>>> Fabrizio
>>>
>>> On Fri, Dec 14, 2018 at 10:16 PM Matthias Erfurth <
>>> erfu...@gmx.de> wrote:
>>>
>>>> Hi Fabrizio,
>>>> unfortunately you can't full-text search all the scholarly articles
>>>> <https://www.wikidata.org/wiki/Q13442814>. You are better off working
>>>> with indexed properties, so that
>>>> you can query for other articles with microgravity as main subject ...
>>>> With the AJAX-based Wikidata search
>>>>
>>>> SELECT ?item
>>>> WHERE {
>>>>     ?item wdt:P31 wd:Q13442814;
>>>>           wdt:P921 wd:Q48655.
>>>> }
>>>>
>>>> Best regards,
>>>>
>>>> ciao matthias
>>>>
>>>>
>>>> *Sent:* Friday, December 14, 2018 at 6:55 PM
>>>> *From:* "Fabrizio Carrai" <fabrizio.car...@gmail.com>
>>>> *To:* "Discussion list for the Wikidata project" <
>>>> wikidata@lists.wikimedia.org>
>>>> *Subject:* Re: [Wikidata] Query on scholarly article fails
>>>> Thanks again to Ettore, but I immediately ran into another timeout
>>>> when I just added a FILTER to find all the articles with the word "biokis"
>>>> in the title:
>>>>
>>>> SELECT ?istanza_di ?instanza_diLabel WHERE {
>>>>   ?istanza_di wdt:P31 wd:Q13442814.
>>>>   ?istanza_di rdfs:label ?instanza_diLabel.
>>>>   FILTER((LANG(?instanza_diLabel)) = "en").
>>>>   FILTER(CONTAINS(LCASE(?instanza_diLabel), "biokis"))
>>>> }
>>>> LIMIT 100
>>>>
>>>> At least one article should be returned:
>>>> https://www.wikidata.org/wiki/Q57202937
>>>> but I got a timeout.
>>>>
>>>> Thanks to anybody that can help
>>>>
>>>> Fabrizio
>>>>
>>>>
>>>> Il giorno ven 14 dic 2018 alle ore 10:12 Ettore RIZZA <
>>>> ettoreri...@gmail.com> ha scritto:
>>>>
>>>>> Hello Fabrizio,
>>>>>
>>>>> It seems that the problem comes from SERVICE wikibase:label. As said
>>>>> in another discussion, the query executes in less than one second if you
>>>>> rewrite it in this way
>>>>> <https://query.wikidata.org/#SELECT%20%3Fistanza_di%20%3Finstanza_diLabel%20WHERE%20%7B%0A%20%20%3Fistanza_di%20wdt%3AP31%20wd%3AQ13442814.%0A%20%20%3Fistanza_di%20rdfs%3Alabel%20%3Finstanza_diLabel.%0A%20%20FILTER%28%28LANG%28%3Finstanza_diLabel%29%29%20%3D%20%22en%22%29%0A%7D%0ALIMIT%2010>
>>>>> .
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Ettore Rizza
>>>>>
>>>>> On Fri, Dec 14, 2018 at 9:59 AM Fabrizio Carrai <
>>>>> fabrizio.car...@gmail.com> wrote:
>>>>>
>>>>>> Hello all,
>>>>>> the following query ends with a timeout:
>>>>>>
>>>>>> SELECT ?istanza_di ?istanza_diLabel WHERE {
>>>>>>   SERVICE wikibase:label { bd:serviceParam wikibase:language
>>>>>> "[AUTO_LANGUAGE],en". }
>>>>>>   ?istanza_di wdt:P31 wd:Q13442814.
>>>>>> }
>>>>>> LIMIT 10
>>>>>>
>>>>>> Can anybody explain why?
>>>>>> Thanks in advance
>>>>>>
>>>>>> --
>>>>>> *Fabrizio*
>>>>>> _______________________________________________
>>>>>> Wikidata mailing list
>>>>>> Wikidata@lists.wikimedia.org
>>>>>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> *Fabrizio*
>>>>
>>>
>>>
>>> --
>>> *Fabrizio*
>>>
>>
>>
>> --
>> Hi, do you like citation networks? Already 51% of all citations are
>> available <https://i4oc.org/> for innovative new uses
>> <https://twitter.com/hashtag/acs2ioc>. Join me in asking the American
>> Chemical Society to join the Initiative for Open Citations too
>> <https://www.change.org/p/asking-the-american-chemical-society-to-join-the-initiative-for-open-citations>.
>> SpringerNature, the RSC and many others already did
>> <https://i4oc.org/#publishers>.
>>
>> -----
>> E.L. Willighagen
>> Department of Bioinformatics - BiGCaT
>> Maastricht University (http://www.bigcat.unimaas.nl/)
>> Homepage: http://egonw.github.com/
>> Blog: http://chem-bla-ics.blogspot.com/
>> PubList: https://www.zotero.org/egonw
>> ORCID: 0000-0001-7542-0286 <http://orcid.org/0000-0001-7542-0286>
>> ImpactStory: https://impactstory.org/u/egonwillighagen
>>
>
>
> --
> *Fabrizio*
>


