[Wikidata] Re: Wikidata Graph Split update

2024-06-11 Thread Egon Willighagen
Some Scholia query rewriting discussion is here:
https://github.com/WDscholia/scholia/issues/2423

Egon

On Tue, 11 Jun 2024 at 18:02, Samuel Klein  wrote:

> It would be helpful to see how the standard Scholia queries work under
> federation.  (those that need it)
>
> Are there evals for other graph dbs on how they handle federation?
>
> On Tue, Jun 11, 2024 at 10:39 AM Egon Willighagen <
> egon.willigha...@gmail.com> wrote:
>
>>
>> Hi, thank you for the update.
>>
>> The email writes that "Queries that need federation will need to be
>> rewritten. You can ask for help to rewrite queries".
>>
>> Do you have guidelines on how to do this? It took quite some effort to
>> make some of the (I thought simple) queries work, but later improvements
>> proved more workable. How were they developed? How do people rewrite
>> SPARQL queries when two or more query triples are distributed over the two
>> SPARQL endpoints, and particularly when they depend on each other?
>>
>> Egon
>>
>>
>> On Tue, 11 Jun 2024 at 16:17, Guillaume Lederrey 
>> wrote:
>>
>>> Hello all!
>>>
>>> The feedback period for our WDQS Graph Split proposal
>>> <https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_graph_split/WDQS_Split_Refinement>
>>> has come to an end. Many thanks to everyone who sent comments; your
>>> contributions are invaluable!
>>>
>>> We’ve incorporated most comments and proposals into our final set of
>>> rules for the graph split
>>> <https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_graph_split/Rules>.
>>> The main proposals (including some that were rejected) were:
>>>
>>>    - Duplicating properties in both graphs (wd:P*) does not seem
>>>      necessary and won't be done.
>>>    - The list of publication types that identify what is a scholarly
>>>      article has been improved; see the final list of items here:
>>>      <https://docs.google.com/spreadsheets/d/1eKX_2Z1rXj1s_zOapQvn_0uD6MVhc-qyqqxbn5loIvk/edit>
>>>    - It was discussed whether sitelinks should inform the nature of the
>>>      split or not; this idea was not incorporated because it might make
>>>      it harder to understand what is where.
>>>    - Discussions and investigations covered items that define multiple
>>>      instance of (P31) <https://www.wikidata.org/wiki/Property:P31>
>>>      values, which might be ambiguous. It appears that this might not
>>>      affect a lot of items and that the solution might be to
>>>      disambiguate these instances by creating separate entities (see
>>>      the Clinical Trials section
>>>      <https://www.wikidata.org/wiki/Wikidata_talk:SPARQL_query_service/WDQS_graph_split/WDQS_Split_Refinement#Clinical_trials>
>>>      of the Talk Page).
>>>    - Re-thinking how scholarly articles are modelled was raised,
>>>      especially identifying the nature of the publication using a
>>>      separate property rather than instance of (P31)
>>>      <https://www.wikidata.org/wiki/Property:P31>. This idea should
>>>      probably be explored and discussed by the WikiCite community, since
>>>      it does affect the nature of the split but could be a nice
>>>      criterion to take into consideration in the future.
>>>
>>> We are now working on implementing the appropriate tooling to manage
>>> this split, including a new way of processing the Wikidata dumps for an
>>> initial load, modification to the update pipeline to support the graph
>>> split, and additional automation. We hope to have new SPARQL endpoints that
>>> are live updated with the graph split by the end of June. This timeline is
>>> probably slightly optimistic; we’ll let you know when those are ready.
>>>
>>> Once the new SPARQL endpoints that are live updated with the graph split
>>> are available, we will provide a 6-month transition period, during which
>>> the current endpoint (query.wikidata.org/sparql) will keep serving the
>>> full graph. Once that transition is over, query.wikidata.org will only
>>> serve the main graph. Queries that need federation will need to be
>>> rewritten. You can ask for help to rewrite queries
>>> <https://www.wikidata.org/wiki/Wikidata:Request_a_query_rewrite>.
>>>
>>> Thank you all for your help and support!
>>>
>>>
>>> Guillaume
>>>

[Wikidata] Re: Wikidata Graph Split update

2024-06-11 Thread Egon Willighagen
Hi, thank you for the update.

The email writes that "Queries that need federation will need to be
rewritten. You can ask for help to rewrite queries".

Do you have guidelines on how to do this? It took quite some effort to make
some of the (I thought simple) queries work, but later improvements proved
more workable. How were they developed? How do people rewrite SPARQL
queries when two or more query triples are distributed over the two SPARQL
endpoints, and particularly when they depend on each other?

Egon
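
One possible shape for such a rewrite, sketched here under assumptions
rather than taken from any official guideline, is to wrap the triples that
live on the other endpoint in a SPARQL 1.1 SERVICE clause, so that a shared
variable joins the two parts. The endpoint URL below is a placeholder, the
helper name is invented for illustration, and the assignment of each triple
pattern to a graph is assumed, not confirmed by the split rules.

```python
# Sketch only: builds (does not run) a federated SPARQL query as a string.
# ASSUMPTIONS: SCHOLARLY_ENDPOINT is a placeholder URL, and the choice of
# which triple pattern lives on which endpoint is purely illustrative.
SCHOLARLY_ENDPOINT = "https://example.org/scholarly/sparql"  # hypothetical


def federated_query(local_patterns: str, remote_patterns: str,
                    endpoint: str = SCHOLARLY_ENDPOINT) -> str:
    """Wrap remote_patterns in a SERVICE clause; shared variables join the parts."""
    return (
        "SELECT * WHERE {\n"
        f"  {local_patterns}\n"
        f"  SERVICE <{endpoint}> {{\n"
        f"    {remote_patterns}\n"
        "  }\n"
        "}"
    )


# The two patterns depend on each other through the shared variable ?author.
query = federated_query(
    local_patterns="?author wdt:P31 wd:Q5 .",    # the author is a human
    remote_patterns="?work wdt:P50 ?author .",   # the work has that author
)
print(query)
```

Which side of the join is evaluated first can matter a lot for run time, so
the order of the local and remote parts may need experimenting per query.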




-- 
Some nanomaterials stress our cells and cause key event, some towards
adverse outcomes. Read about it in our new paper "From papers to RDF-based
integration of physicochemical data and adverse outcome pathways for
nanomaterials", https://doi.org/10.1186/s13321-024-00833

--
E.L. Willighagen
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Blog: https://chem-bla-ics.linkedchemistry.info/
Mastodon: https://social.edu.nl/@egonw
PubList: https://orcid.org/-0001-7542-0286
___
Wikidata mailing list -- wikidata@lists.wikimedia.org
Public archives at 
https://lists.wikimedia.org/hyperkitty/list/wikidata@lists.wikimedia.org/message/UMSJIAK5BFLGIBRJP6IVY572G4D64QCK/
To unsubscribe send an email to wikidata-le...@lists.wikimedia.org


[Wikidata] Re: Challenge of the day (2): ports without located in or next to body of water

2022-12-26 Thread Egon Willighagen
Thanks. I found it more difficult for regions I do not know, as I am not
familiar with the naming conventions; adding the ports to a map makes it
easier to shorten a to-do list:
https://query.wikidata.org/#%23defaultView%3AMap%0ASELECT%20DISTINCT%20%3Fitem%20%3FitemLabel%20%3Fcoordinates%20WHERE%20%7B%0A%20%20%3Fitem%20wdt%3AP31%2Fwdt%3AP279%2a%20wd%3AQ44782%20%3B%20wdt%3AP625%20%3Fcoordinates%20.%0A%20%20MINUS%20%7B%0A%20%20%20%20%3Fitem%20wdt%3AP206%20%3Fsomething%20.%0A%20%20%7D%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22%5BAUTO_LANGUAGE%5D%22.%20%7D%0A%7D%0A
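
The query is hard to read in its percent-encoded form. As a small sketch,
the link can be decoded with the Python standard library; the link text is
copied from the message above and only split across lines for readability.

```python
from urllib.parse import unquote, urlsplit

# The query link from the message above, split into pieces for readability.
link = (
    "https://query.wikidata.org/#%23defaultView%3AMap%0A"
    "SELECT%20DISTINCT%20%3Fitem%20%3FitemLabel%20%3Fcoordinates%20WHERE%20%7B%0A"
    "%20%20%3Fitem%20wdt%3AP31%2Fwdt%3AP279%2a%20wd%3AQ44782%20%3B%20"
    "wdt%3AP625%20%3Fcoordinates%20.%0A"
    "%20%20MINUS%20%7B%0A"
    "%20%20%20%20%3Fitem%20wdt%3AP206%20%3Fsomething%20.%0A"
    "%20%20%7D%0A"
    "%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20"
    "wikibase%3Alanguage%20%22%5BAUTO_LANGUAGE%5D%22.%20%7D%0A"
    "%7D%0A"
)

# Everything after '#' is the percent-encoded SPARQL text.
sparql = unquote(urlsplit(link).fragment)
print(sparql)
```

The decoded text shows the only differences from the original query: the
map default view, and the added coordinate location (P625) pattern.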

Egon

On Sun, 25 Dec 2022 at 11:44, Romaine Wiki  wrote:

> Hi all,
>
> Too many items on Wikidata still miss the basic statements. Perhaps we can
> focus together for a short period of time on a single subject to get this
> fixed.
>
> For example: all items with instance of (P31) (maritime) port should also
> contain the waterbody at which it is located (P206).
>
> When I just ran a query, I saw about 6000 ports that are still missing the
> waterbody at which they are located (P206).
>
> Query:
> https://query.wikidata.org/#SELECT%20%3Fitem%20%3FitemLabel%20WHERE%20%7B%0A%20%20%3Fitem%20wdt%3AP31%2Fwdt%3AP279%2a%20wd%3AQ44782%20.%0A%20%20MINUS%20%7B%0A%20%20%20%20%3Fitem%20wdt%3AP206%20%3Fsomething%20.%0A%20%20%7D%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22%5BAUTO_LANGUAGE%5D%22.%20%7D%0A%7D%0A
>
> I already did a few myself but for the largest part help is needed. Who
> has ideas and can help getting this statement added to all the items about
> ports?
>
> Thanks!
>
> Romaine
> ___
> Wikidata mailing list -- wikidata@lists.wikimedia.org
> Public archives at
> https://lists.wikimedia.org/hyperkitty/list/wikidata@lists.wikimedia.org/message/EZFSA6N2ZR5O33MNFIMO65IUUALGWJTR/
> To unsubscribe send an email to wikidata-le...@lists.wikimedia.org
>


-- 
Happy holiday season and new year!

--
E.L. Willighagen
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Homepage: https://egonw.github.io/
Blog: https://chem-bla-ics.blogspot.com/
Mastodon: https://scholar.social/@egonw
PubList: https://orcid.org/-0001-7542-0286
___
Wikidata mailing list -- wikidata@lists.wikimedia.org
Public archives at 
https://lists.wikimedia.org/hyperkitty/list/wikidata@lists.wikimedia.org/message/73CQOJNBJRLYKDM3K4Q3CR54BT4FN6DS/
To unsubscribe send an email to wikidata-le...@lists.wikimedia.org


Re: [Wikidata] Wikidata in the LOD cloud

2020-09-22 Thread Egon Willighagen
How will we handle LOD entries for data sets provided by a third party?

For example, Bio2RDF released over time quite a few data sets, each having
a separate entry in the LOD cloud, e.g. Bio2rdf::DrugBank?

Do we create a separate Wikidata item for that? Link it to the Wikidata
item for DrugBank, so that one database can have more than one LOD Cloud ID?

What do you think?

Egon




On Tue, Sep 22, 2020 at 11:02 AM Andy Mabbett 
wrote:

> On Wed, 16 Sep 2020 at 14:53, Lydia Pintscher
>  wrote:
>
> > And we now have the Property \o/
> https://www.wikidata.org/wiki/Property:P8605
>
> Now in Mix'n'match:
>
>https://mix-n-match.toolforge.org/#/catalog/3862
>
> --
> Andy Mabbett
> @pigsonthewing
> http://pigsonthewing.org.uk
>
> ___
> Wikidata mailing list
> Wikidata@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
>


-- 
Have you heard about Wikidata already? "Use Scholia and Wikidata to find
scientific literature" is a new tutorial from my colleague Lauren Dupuis.
https://laurendupuis.github.io/Scholia_tutorial/

-
E.L. Willighagen
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Homepage: http://egonw.github.com/
Blog: http://chem-bla-ics.blogspot.com/
PubList: https://www.zotero.org/egonw
ORCID: -0001-7542-0286 
ImpactStory: https://impactstory.org/u/egonwillighagen
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Wikidata in the LOD cloud

2020-09-22 Thread Egon Willighagen
Awesome, thanks!

How was it done? I would like to learn a bit more about the steps.

Egon

On Tue, Sep 22, 2020 at 11:02 AM Andy Mabbett 
wrote:

> On Wed, 16 Sep 2020 at 14:53, Lydia Pintscher
>  wrote:
>
> > And we now have the Property \o/
> https://www.wikidata.org/wiki/Property:P8605
>
> Now in Mix'n'match:
>
>https://mix-n-match.toolforge.org/#/catalog/3862
>
> --
> Andy Mabbett
> @pigsonthewing
> http://pigsonthewing.org.uk
>


Re: [Wikidata] Wikidata in the LOD cloud

2020-09-16 Thread Egon Willighagen
I guess the downstream tools still need to catch up. The
formatter URL is defined: https://www.wikidata.org/wiki/Property:P8605#P1630
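
For context, Wikidata stores only the identifier string, and a formatter
URL (P1630) is a template in which "$1" stands for that string. A minimal
sketch of the expansion a downstream tool would do follows; the pattern used
here is an assumption inferred from the example URL in the quoted question
below, and the value actually stored on P8605 may differ.

```python
# ASSUMPTION: pattern inferred from the example URL in the quoted question;
# the value actually stored in P1630 on P8605 may differ.
FORMATTER_URL = "http://lod-cloud.net/dataset/$1"


def expand(formatter_url: str, value: str) -> str:
    """Replace the $1 placeholder with the stored identifier value."""
    return formatter_url.replace("$1", value)


print(expand(FORMATTER_URL, "doi"))
```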

Egon

On Thu, Sep 17, 2020 at 8:52 AM Sebastian Hellmann <
hellm...@informatik.uni-leipzig.de> wrote:

> Hi all,
>
> a question here:
>
> P8605 is shown as a string, e.g. as "doi" in
> https://www.wikidata.org/wiki/Q5188229 , which is the last path segment
> of the identifier. Shouldn't this be "http://lod-cloud.net/dataset/doi"?
>
> At least for the LOD Cloud project.
>
> -- Sebastian
>
>
>
> On 16.09.20 15:53, Lydia Pintscher wrote:
> > On Sat, Aug 15, 2020 at 9:06 AM Egon Willighagen
> >  wrote:
> >> Proposed:
> https://www.wikidata.org/wiki/Wikidata:Property_proposal/Generic#Linked_Open_Data_Cloud_identifier
> >>
> >> Egon
> > And we now have the Property \o/
> https://www.wikidata.org/wiki/Property:P8605
> >
> >
> > Cheers
> > Lydia
> >
>


Re: [Wikidata] Wikidata in the LOD cloud

2020-09-16 Thread Egon Willighagen
All,

On Wed, Sep 16, 2020 at 3:54 PM Lydia Pintscher <
lydia.pintsc...@wikimedia.de> wrote:

> And we now have the Property \o/
> https://www.wikidata.org/wiki/Property:P8605


When approved, a fifth example was added, for a Bio2RDF subset in the LOD
cloud, linked to the Bio2RDF entry in Wikidata itself (and not to a Wikidata
item for the subset). That would be possible, but then bio2rdf-pubchem (and
many, many more) will be linked to that Wikidata item for Bio2RDF. Is that
what we want? Or should bio2rdf-pubchem be linked to the PubChem entry in
Wikidata? Etc.?

What do you think?

Egon



Re: [Wikidata] Wikidata in the LOD cloud

2020-09-16 Thread Egon Willighagen
Ah, awesome!

I've been very busy and hope someone will beat me to it, but I was thinking
of doing some web scraping and preparing a Mix'n'Match data set... but we can
all also just do a few more manually :)

Egon


On Wed, Sep 16, 2020 at 3:54 PM Lydia Pintscher <
lydia.pintsc...@wikimedia.de> wrote:

> On Sat, Aug 15, 2020 at 9:06 AM Egon Willighagen
>  wrote:
> > Proposed:
> https://www.wikidata.org/wiki/Wikidata:Property_proposal/Generic#Linked_Open_Data_Cloud_identifier
> >
> > Egon
>
> And we now have the Property \o/
> https://www.wikidata.org/wiki/Property:P8605
>
>
> Cheers
> Lydia
>
> --
> Lydia Pintscher - http://about.me/lydia.pintscher
> Product Manager for Wikidata
>
> Wikimedia Deutschland e.V.
> Tempelhofer Ufer 23-24
> 10963 Berlin
> www.wikimedia.de
>
> Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
>
> Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
> unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das
> Finanzamt für Körperschaften I Berlin, Steuernummer 27/029/42207.
>


Re: [Wikidata] Wikidata in the LOD cloud

2020-08-15 Thread Egon Willighagen
Proposed:
https://www.wikidata.org/wiki/Wikidata:Property_proposal/Generic#Linked_Open_Data_Cloud_identifier

Egon

On Sat, Aug 15, 2020 at 8:38 AM Egon Willighagen 
wrote:

>
> I'll have a go at it.
>
> Egon
>
> On Fri, Aug 14, 2020 at 3:51 PM Lydia Pintscher <
> lydia.pintsc...@wikimedia.de> wrote:
>
>> On Fri, Aug 14, 2020 at 3:46 PM Lydia Pintscher
>>  wrote:
>> > Hey :)
>> >
>> > Adam got the current numbers for us at
>> https://phabricator.wikimedia.org/P12190
>> > Anyone up for helping put those in?
>>
>> Oh and one thing I forgot: It would be pretty helpful for further
>> automating this if we had the mapping between Wikidata's property and
>> the LODCloud entries. Anyone up for proposing a new property for that?
>>
>>
>> Cheers
>> Lydia
>>


Re: [Wikidata] Wikidata in the LOD cloud

2020-08-14 Thread Egon Willighagen
I'll have a go at it.

Egon

On Fri, Aug 14, 2020 at 3:51 PM Lydia Pintscher <
lydia.pintsc...@wikimedia.de> wrote:

> On Fri, Aug 14, 2020 at 3:46 PM Lydia Pintscher
>  wrote:
> > Hey :)
> >
> > Adam got the current numbers for us at
> https://phabricator.wikimedia.org/P12190
> > Anyone up for helping put those in?
>
> Oh and one thing I forgot: It would be pretty helpful for further
> automating this if we had the mapping between Wikidata's property and
> the LODCloud entries. Anyone up for proposing a new property for that?
>
>
> Cheers
> Lydia
>


Re: [Wikidata] Wikidata in the LOD cloud

2020-08-06 Thread Egon Willighagen
Of particular interest are the link counts, based on formatter URI for RDF
resource (P1921).

There are 167 property - P1921 combinations: https://w.wiki/YsV

Calculating the number of links times out when run for all properties:
https://w.wiki/Ysk . So, an iterative script is likely more practical.

Basically, the query just counts the number of items with a property,
limited to the properties that have a P1921 defined. Surely, that can
be done more efficiently.

Egon
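
The iterative approach mentioned above could look roughly like the sketch
below: one small count query per property instead of one query over all of
them. This is not the query behind the short links; the function names, the
placeholder property IDs, and the stand-in query runner are all hypothetical,
and a real script would send each query to the SPARQL endpoint.

```python
from typing import Callable, Dict, Iterable


def count_query(prop: str) -> str:
    """SPARQL that counts the direct (truthy) statements of one property."""
    return f"SELECT (COUNT(?item) AS ?n) WHERE {{ ?item wdt:{prop} [] . }}"


def count_links(props: Iterable[str],
                run_query: Callable[[str], int]) -> Dict[str, int]:
    """Run one small query per property instead of one big query for all."""
    return {p: run_query(count_query(p)) for p in props}


# Stand-in for a real SPARQL client (hypothetical); returns dummy numbers
# so that the sketch runs without network access.
def fake_run(query: str) -> int:
    return 3 if "wdt:P111 " in query else 0


# P111 and P222 are placeholder IDs; the real list would be the properties
# that have a P1921 value.
counts = count_links(["P111", "P222"], fake_run)
print(counts)
```

Splitting the work this way keeps every request small, so a single slow
property only costs one retry rather than the whole run.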

On Wed, Aug 5, 2020 at 11:30 PM Andy Mabbett 
wrote:

> On Wed, 5 Aug 2020 at 22:12, Daniel Mietchen
>  wrote:
>
> > the newest version of the LOD cloud is just a week old,[1] but the
> underlying information for Wikidata is out of date by several years.[2]
> >
> > Has anyone looked into streamlining the data submission process?
>
> Yes; here's the discussion from 2018:
>
>https://lists.wikimedia.org/pipermail/wikidata/2018-April/011988.html
>
> Lucas Werkmeister was handling this from the Wikidata Dev Team's side:
>
>https://lists.wikimedia.org/pipermail/wikidata/2018-May/012042.html
>
> --
> Andy Mabbett
> @pigsonthewing
> http://pigsonthewing.org.uk
>


Re: [Wikidata] WDQS outage - 2020/07/23

2020-07-26 Thread Egon Willighagen
See https://phabricator.wikimedia.org/T242453 linked on the report page

On Sun, Jul 26, 2020 at 6:55 PM Kingsley Idehen 
wrote:

> On 7/24/20 3:18 PM, Ryan Kemper wrote:
> > Hi all,
> >
> > We experienced WDQS service disruptions on 2020/07/23. As a result
> > there was a full outage (inability to respond to all queries) for a
> > period of several minutes, and a more extended period of
> > intermittently degraded service (inability to respond to a subset of
> > queries) for 1-2 hours.
> >
> > The full incident report is available here:
> >
> https://wikitech.wikimedia.org/wiki/Incident_documentation/20200723-wdqs-outage
> >
> > Ultimately, we traced the proximate cause to a series of
> > non-performant queries, which caused a deadlock in blazegraph, the
> > backend for WDQS. We have placed a temporary block on the IP address
> > in question and are taking steps to better define service availability
> > expectations as well as processes to make detection of these events
> > more streamlined going forward.
>
>
> What was the problem query?
>
> I ask because I would like to try it against our Wikidata endpoint at:
> https://wikidata.demo.openlinksw.com/sparql .
>
> We have an "Anytime Query" feature designed for these kinds of problems,
> hence the  vested interest in these kinds of problem queries.
>
> --
> Regards,
>
> Kingsley Idehen
> Founder & CEO
> OpenLink Software
> Home Page: http://www.openlinksw.com
> Community Support: https://community.openlinksw.com
> Weblogs (Blogs):
> Company Blog: https://medium.com/openlink-software-blog
> Virtuoso Blog: https://medium.com/virtuoso-blog
> Data Access Drivers Blog:
> https://medium.com/openlink-odbc-jdbc-ado-net-data-access-drivers
>
> Personal Weblogs (Blogs):
> Medium Blog: https://medium.com/@kidehen
> Legacy Blogs: http://www.openlinksw.com/blog/~kidehen/
>   http://kidehen.blogspot.com
>
> Profile Pages:
> Pinterest: https://www.pinterest.com/kidehen/
> Quora: https://www.quora.com/profile/Kingsley-Uyi-Idehen
> Twitter: https://twitter.com/kidehen
> Google+: https://plus.google.com/+KingsleyIdehen/about
> LinkedIn: http://www.linkedin.com/in/kidehen
>
> Web Identities (WebID):
> Personal: http://kingsley.idehen.net/public_home/kidehen/profile.ttl#i
> :
> http://id.myopenlink.net/DAV/home/KingsleyUyiIdehen/Public/kingsley.ttl#this
>
>


-- 
Hi, do you like citation networks? Already 51% of all citations are
available for innovative new uses. Join me in asking the American
Chemical Society to join the Initiative for Open Citations too.
SpringerNature, the RSC and many others already did.



Re: [Wikidata] WDQS status

2020-07-09 Thread Egon Willighagen
Dear Guillaume,

On Thu, Jul 9, 2020 at 3:23 PM Guillaume Lederrey 
wrote:

> Some very preliminary analysis indicates that less than 2% of the queries
> on WDQS generate more than 90% of the load. This is definitely something we
> need to better understand.
>

Is the data behind that available? I wonder if I recognize any of the top
25 queries.

(I guess the top 2% can be simple queries run very many times, as well as
hard queries rarely run, correct?)

Egon




Re: [Wikidata] WDQS and SPARQL Endpoint Compatibility

2020-03-31 Thread Egon Willighagen
On Mon, Mar 30, 2020 at 11:15 PM Maarten Dammers 
wrote:

> Since Stas left last year, unfortunately nobody from the WMF has done
> anything with
> https://www.wikidata.org/wiki/Wikidata:SPARQL_federation_input . I don't
> know if the new SPARQL people are even aware of this page.
>

That's my impression too:


> My bot produces a weekly federation report at
>
> https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/Federation_report
>

The WikiPathways SPARQL endpoint URL has changed, and I have requested an
update (Jan 2020, [0]), but no update or reply yet.

Maarten, that is causing the simple query in this report to fail.

Egon

0.
https://www.wikidata.org/wiki/Wikidata_talk:SPARQL_federation_input#Updated_URL_for_the_WikiPathways_SPARQL_endpoint



[Wikidata] formatter URL that requires encoding the statement value?

2019-10-11 Thread Egon Willighagen
Hi all,

for the SMILES (line representation for chemical structures; [0]) there are
currently two properties (P233 and P2017). Each one of them has a formatter
URL to link out to an open source depiction tool (CDKDepict). But statement
values for both properties can have characters that need to be encoded for
the URL (/, +, #).
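
To make the problem concrete, here is a small sketch of the encoding step
that is needed before the value is placed in the URL. The service URL is a
placeholder and not the actual formatter URL of P233 or P2017, and the
function name is invented for illustration.

```python
from urllib.parse import quote

# ASSUMPTION: placeholder service URL, not the real formatter URL.
FORMATTER_URL = "https://example.org/depict?smi=$1"


def link_for(smiles: str, formatter_url: str = FORMATTER_URL) -> str:
    """Percent-encode the value (including '/', '+', '#') before substitution."""
    return formatter_url.replace("$1", quote(smiles, safe=""))


print(link_for("C/C=C/C#N"))
print(link_for("[NH4+]"))
```

Without the encoding step, a "#" in the SMILES would start a URL fragment
and everything after it would never reach the remote service.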

Some time ago I saw a solution, a separate Wikidata/-media page that did
this, before linking through to the remote service.

Because chemical structures are currently not always shown and are sometimes
wrong, I would really like to see a solution like that implemented. I'm
thinking of working on that during WikidataCon, but I would like to ask for
your input. What technical solutions could be used to solve the problem? What
are related recommendations? And anything else you would like to comment...

Looking forward to hearing from you,

greetings,

Egon

0.https://tools.wmflabs.org/scholia/topic/Q466769



Re: [Wikidata] Weekly Summary #384

2019-09-30 Thread Egon Willighagen
Hi Houcemeddine,

you can just add your article yourself to the "Next" page:
https://www.wikidata.org/wiki/Wikidata:Status_updates/Next

With kind regards,

Egon

On Mon, Sep 30, 2019 at 5:35 PM Houcemeddine A. Turki
 wrote:
>
> Dear Ms.,
> I thank you for your answer. You did not mention our work published in 
> Journal of Biomedical Informatics this week although I have already written 
> an email about it in our mailing list before. It is currently available at 
> https://doi.org/10.1016/j.jbi.2019.103292. You can find the full text of the 
> paper at 
> https://www.researchgate.net/publication/336001723_Wikidata_A_large-scale_collaborative_ontological_medical_database.
>  Please include it to next Wikidata Weekly Summary. I also ask about your 
> opinion about the output of the research paper and the deficiencies of 
> medical information in Wikidata. This will be useful for us to know more 
> about how to continue our project.
> Yours Sincerely,
> Houcemeddine Turki
> 
> From: Wikidata on behalf of Léa
> Lacroix
> Sent: Monday, 30 September 2019 16:04
> To: Discussion list for the Wikidata project.
> Subject: [Wikidata] Weekly Summary #384
>
> Here's your quick overview of what has been happening around Wikidata over 
> the last week.
>
> Discussions
>
> Open request for adminship: Catherine Laurence
>
> Events
>
> Past: Using Wikidata to Provide Visibility to Women in STEM during Dublin 
> Core 2019
> Upcoming: WikiCon, meeting of the German speaking Community from 4th to 6th 
> of October 2019 in Wuppertal (Germany)
> Upcoming: #5 Wikidata tea party/第5回 ウィキデータ茶話会, 18 Oct 2019 in Tokyo, Japan
> Upcoming: Wikidata Zurich Training in Zurich on the weekend of November 2-3. 
> There will be presentations and hands-on sessions on editing, querying and 
> coding for Wikidata.
> Upcoming: Wikidata Zurich Hackathon in Zurich on the weekend of November 
> 23-24.
>
> Press, articles, blog posts
>
> Scottish witches in the press and on TV
> Some witchy history and a very smart woman in data science
> Spoken Conversational Search for General Knowledge - Lina Maria 
> Rojas-Barahona, et al.
> GeneDB and Wikidata, by Magnus Manske
>
> Tool of the week
>
> This Recent changes tool provides a list of unpatrolled Wikidata changes
> with enhanced filters that are better adapted to Wikidata than the standard
> Recent Changes page. Available actions include mass patrolling and an easier
> revert interface.
>
> Other Noteworthy Stuff
>
> The state library of Berlin is using Wikidata to display a map of pictures 
> from their architecture journals: Berlin um 1900 - eine fotografische 
> Zeitreise
> Edit summaries coming from the wbeditentity API (for example from the mobile 
> termbox) will be improved from October 2nd (announcement, ticket)
> You can nominate your favourite Wikidata and Wikibase projects for the 
> WikidataCon Award until October 7th
>
> Did you know?
>
> Newest properties:
>
> General datatypes: regnal ordinal, peer review URL
> External identifiers: GCF Reference, Wi-Fi Certification ID, Rivals.com ID, 
> sixpackfilmdata film ID, sixpackfilmdata person ID, Dirección General de 
> Bibliotecas ID, Elitefootball player ID, The Wind Power farm ID, IGNrando' 
> ID, Fossiilid.info ID, Kivid.info ID
>
> New property proposals to review:
>
> General datatypes: bus, mwnf, Maximum number of playable characters, FIDAL 
> team ID, candidate name string, IP range start, code (image), name in hiero 
> markup, disputed by, name (image), style of karate
> External identifiers: NTS Radio artist ID, Gazetteer for Scotland ID, 
> Gazetteer for Scotland person ID, FootballFacts.ru team ID, 
> FootballDatabase.eu team ID, GENUKI ID, DC Books author ID, LaPreferente.com 
> player ID, Ukrainian Premier League player ID, UEFA team ID, UEFA coach ID, 
> Naver Encyclopedia ID, FAIMER school ID, China Martyrs ID, JournalBase ID, 
> PCEngine Catalog Project ID, PubPeer article ID, memoriademadrid publication 
> ID, doujinshi.org author ID, Beatport label ID, Knesset Law Id, 
> caves.4at.info ID, ADL Hate Symbols Database ID, Amazon Music album ID 2, 
> Qobuz album ID, Plus Music album ID, eska.pl release ID, iHeartRadio album 
> ID, Three Decks people ID, Diccionari de la Literatura Catalana ID, Empik 
> e-book or album ID, identificador GENavarra, Oricon News artist ID, Turkish 
> Football Federation Match ID, Turkish Football Federation Team ID, Turkish 
> Football Federation Referee ID, Turkish Football Federation Stadium ID, 
> Fortuna liga player ID
>
> Deleted properties: P5130 (island of location)
> Query examples:
>
> Civil parishes of Scotland, revealed through their listed buildings (source)
> Scottish civil parishes by county and present-day council area (source)
> Federated SPARQL query from the Swedish National Archive TORA project to 
> Wikidata
> Librairies in Africa (source)
> Most recently dissolved enterprises that were over 200 years old (source)
>
> Newest WikiProjects: Re

Re: [Wikidata] Personal news: a new role

2019-09-21 Thread Egon Willighagen
Dear Houcemeddine,

do you happen to have a preprint for that online?

Egon




Re: [Wikidata] ShEx to validate Medical Wikidata

2019-08-25 Thread Egon Willighagen
On Sun, Aug 25, 2019 at 12:07 PM Houcemeddine A. Turki
 wrote:
> However, it does not include many schemas. We should develop schemas for 
> medical classes before using ShEx to validate Medical Wikidata statements.

Yes, please do. I have been working on ShEx for other domains. It's a
fairly new effort (great work by the team, IMO), and the more people
adopt this additional way of data validation the better.

Egon



Re: [Wikidata] ShEx to validate Medical Wikidata

2019-08-24 Thread Egon Willighagen
Dear Houcemeddine,

On Sat, Aug 24, 2019 at 10:57 PM Houcemeddine A. Turki
 wrote:
> I ask if someone is interested in creating shape expressions of Medical 
> Wikidata classes.

Cool, please do!

> As well, I tried to access http://wikidata-shex.wmflabs.org/w/index.php. 
> However, it does not work.

As I understand it, the link you gave was the test server. It
went "live" earlier this year, so please check out this page now:
https://www.wikidata.org/wiki/Wikidata:WikiProject_ShEx

Grtz,

Egon



Re: [Wikidata] Important, Critical issues related to Wikidata

2019-08-21 Thread Egon Willighagen
On Wed, Aug 21, 2019 at 8:44 PM Houcemeddine A. Turki
 wrote:
>
> I thank you for your efforts. It was an honour for me to meet you in 
> Stockholm. Concerning the points you raised, I
> 1. Concerning the problematic use of Instance of and SubClass of, I invite 
> you to see https://tinyurl.com/y29lx9o4. This seems to be not accurate as 
> drugs differ from drug classes from a pharmacological view. For example, 
> Artemisinin, Ibuprofen and Sofosbuvir are drugs. However, drug classes should 
> include antibiotics and antivirals. Concerning WikiProject Ontology, I know 
> it. The project succeeded to solve many matters related to the structure of 
> Wikidata. However, the project is slow.

To me this is actually a nice example of *why* we need the distinction.

Sofosbuvir as an instance is often the active ingredient.
Sofosbuvir as a class is often the formulation, which is not a single
entity, but a class of formulations with different amounts (and
concentrations) of active ingredient.

It's just that in street/common language (not wrong, just different!)
we don't use different words for it. Instance-vs-subclass are tools
that help us make this distinction.

Now, related to this, please have a look at
https://www.wikidata.org/wiki/Wikidata:WikiProject_Chemistry where
class and instance are both used, for good reasons.

Egon



Re: [Wikidata] Important, Critical issues related to Wikidata

2019-08-21 Thread Egon Willighagen
On Wed, Aug 21, 2019 at 1:20 PM Houcemeddine A. Turki
 wrote:
> I thank you for your efforts. I tried to contact Ms. Léa Lacroix and Ms. 
> Lydia. However, I failed due to my participation to many sessions. I saw Ms. 
> Lydia when returning to the hotel on the third day. But, I had to go for the 
> old town. I had nine points to raise:
> 1. Instance of and Subclass of are not well defined for users although they 
> are quite different. I ask if these two properties can be merged as is-a. 
> This will be easier to process by users and developers.

I disagree with this suggestion. To me, the distinction between
classes and individuals is very useful. Removing it will not make
things easier, only harder.

Egon



Re: [Wikidata] Virtuoso hosted Wikidata Instance

2019-08-14 Thread Egon Willighagen
On Wed, Aug 14, 2019 at 1:10 AM Kingsley Idehen  wrote:
> We have loaded Wikidata into a Virtuoso instance accessible via SPARQL [1]. 
> One benefit is helping to understand Wikidata using our Faceted Browsing 
> Interface for Entity Relationship Types [2][3].

Awesome!

I've started checking how much of Scholia can run on it, and opened a
ticket: https://github.com/fnielsen/scholia/issues/809 It's great that the
Wikidata namespaces are loaded; I only had to add the 'bd' prefix to
the Scholia SPARQL. The sections that use the WDQS graphical
views obviously cannot use the VOS instance yet.
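
For anyone wanting to try the same, the prefix fix can be sketched in a few
lines of Python. This is only an illustration, not how Scholia itself does it:
the helper name is made up, and it assumes that all other Wikidata prefixes
(wd:, wdt:, wikibase:) are already known to the endpoint, as described above.

```python
import re

# The namespace that WDQS (Blazegraph) binds to the "bd" prefix by default.
BD_PREFIX = "PREFIX bd: <http://www.bigdata.com/rdf#>"


def ensure_bd_prefix(query: str) -> str:
    """Prepend the bd: declaration unless the query already declares it."""
    if re.search(r"(?im)^\s*PREFIX\s+bd:", query):
        return query
    return BD_PREFIX + "\n" + query


if __name__ == "__main__":
    scholia_like_query = (
        "SELECT ?item ?itemLabel WHERE {\n"
        "  ?item wdt:P31 wd:Q13442814 .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "} LIMIT 5"
    )
    print(ensure_bd_prefix(scholia_like_query))
```

Whether the label service itself behaves the same way on a non-Blazegraph
endpoint is a separate question that this sketch does not address.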

So, do you plan to run a WDQS instance on top of your EP? :)

Egon



Re: [Wikidata] Useful resources for Scholia

2019-08-03 Thread Egon Willighagen
On Sat, Aug 3, 2019 at 12:10 AM Houcemeddine A. Turki <
turkiabdelwa...@hotmail.fr> wrote:

> I thank you for your efforts. I invite you to use several useful resources
> available in https://shubhanshu.com/awesome-scholarly-data-analysis/ to
> enrich Scholia project.
>

Many of the linked resources are data resources (at least several of
them not CC0) and are relevant to Wikidata rather than Scholia.

Not all of the software in the list is directly suitable for Scholia
either, and some of it already works with Wikidata. Did you have something
specific in mind?

Egon



Re: [Wikidata] Query on scholarly article fails

2018-12-15 Thread Egon Willighagen
I round up from DOI/PubMed ID counts on https://tools.wmflabs.org/scholia/

Egon

On Sat, Dec 15, 2018 at 3:03 PM Fabrizio Carrai 
wrote:

> Excellent, I did some tests and with some cycles I already identified and
> classified several articles.
> I will have a look at your script in the  next days but I already have a
> question: the number of iterations is based on the total number of
> articles, how do you know that ?
>
> ---
> Fabrizio
>
> On Sat, 15 Dec 2018 at 10:18, Egon Willighagen <
> egon.willigha...@gmail.com> wrote:
>
>>
>> The approach I use is the following, see this (Bioclipse/Groovy) script:
>> https://gist.github.com/egonw/ca4c348b9a2d1116efcdb55fa85dd158
>>
>> It combines a Blazegraph SPARQL trick (a named subquery) with breaking
>> things up into batches of a certain size:
>>
>> SELECT ?art ?artLabel
>> WITH {
>> SELECT ?art WHERE {
>> ?art wdt:P31 wd:Q13442814
>> } LIMIT $batchSize OFFSET $offset
>> } AS %RESULTS {
>> INCLUDE %RESULTS
>> ?art wdt:P1476 ?artLabel .
>> MINUS { ?art wdt:P921 wd:$conceptQ }
>> FILTER (contains(lcase(str(?artLabel)), "$concept"))
>> }
>> where "$concept" is my search word in the title, and $batchSize and
>> $offset take care of the batching by the script. This script creates
>> QuickStatements.
>>
>> Mind you, I manually check the created statements, because in my domain
>> (biochem) a simple search results in false positives, hence the "blacklist"
>> in the script :)
>>
>> Egon
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> On Sat, Dec 15, 2018 at 10:13 AM Fabrizio Carrai <
>> fabrizio.car...@gmail.com> wrote:
>>
>>> Thanks Matthias,
>>> that's a pity. Your suggestion relies on the effective characterization
>>> of the item that,  at this writing time, is pretty poor for my interest.
>>> Could it be an idea to download all the "scholary articles", locally
>>> select  for the keyword of interest (e.g. "microgravity") and set the
>>> property P921 for all of them ? Quickstatements may be helpful for the last
>>> step, any suggestions for other tools ?
>>>
>>> Thanks
>>> Fabrizio
>>>
>>> On Fri, 14 Dec 2018 at 22:16, Matthias Erfurth <
>>> erfu...@gmx.de> wrote:
>>>
>>>> Hi Fabrizio,
>>>> unfortunately you can't fulltext search all the scholarly articles
>>>> <https://www.wikidata.org/wiki/Q13442814> , you should better work
>>>> with indexed properties, so
>>>> you can query for other articles with microgravity as main subject ...
>>>> With the ajax based wikidata search
>>>>
>>>> SELECT ?item
>>>> WHERE {
>>>> ?item wdt:P31 wd:Q13442814;
>>>>   wdt:P921 wd:Q48655.
>>>> }
>>>>
>>>> Best regards,
>>>>
>>>> ciao matthias
>>>>
>>>>
>>>> *Sent:* Friday, 14 December 2018 at 18:55
>>>> *From:* "Fabrizio Carrai" 
>>>> *To:* "Discussion list for the Wikidata project" <
>>>> wikidata@lists.wikimedia.org>
>>>> *Subject:* Re: [Wikidata] Query on scholarly article fails
>>>> Thanks again to Ettore, but I immediately found another timeout problem
>>>> when I just added a FILTER to find all the articles with the word "biokis"
>>>> in the title
>>>>
>>>> SELECT ?istanza_di ?instanza_diLabel WHERE {
>>>>   ?istanza_di wdt:P31 wd:Q13442814.
>>>>   ?istanza_di rdfs:label ?instanza_diLabel.
>>>>   FILTER((LANG(?instanza_diLabel)) = "en").
>>>>   FILTER(CONTAINS(LCASE(?instanza_diLabel), "biokis"))
>>>> }
>>>> LIMIT 100
>>>>
>>>> At least one article should be returned:
>>>> https://www.wikidata.org/wiki/Q57202937
>>>> but I got a timeout.
>>>>
>>>> Thanks to anybody that can help
>>>>
>>>> Fabrizio
>>>>
>>>>
>>>> On Fri, 14 Dec 2018 at 10:12, Ettore RIZZA <
>>>> ettoreri...@gmail.com> wrote:
>>>>
>>>>> Hello Fabrizio,
>>>>>
>>>>> It seems that the problem comes from SERVICE wikibase:label. As said
>>>>> in another discussion, the query executes in less than one second if yo

Re: [Wikidata] Query on scholarly article fails

2018-12-15 Thread Egon Willighagen
The approach I use is the following, see this (Bioclipse/Groovy) script:
https://gist.github.com/egonw/ca4c348b9a2d1116efcdb55fa85dd158

It combines a Blazegraph SPARQL trick (a named subquery) with breaking
things up into batches of a certain size:

SELECT ?art ?artLabel
WITH {
  SELECT ?art WHERE {
    ?art wdt:P31 wd:Q13442814
  } LIMIT $batchSize OFFSET $offset
} AS %RESULTS {
  INCLUDE %RESULTS
  ?art wdt:P1476 ?artLabel .
  MINUS { ?art wdt:P921 wd:$conceptQ }
  FILTER (contains(lcase(str(?artLabel)), "$concept"))
}
where "$concept" is my search word in the title, and $batchSize and $offset
take care of the batching by the script. This script creates
QuickStatements.
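
For readers who do not use Bioclipse, the batching idea can be sketched in
Python as follows. This is not the linked script, just an illustration of the
same template: the function name and the total article count are assumptions
(the total has to come from elsewhere and is rounded up to a whole number of
batches), and Q48655 with "microgravity" is the example from this thread.
Running the queries and turning the hits into QuickStatements is left out.

```python
import math
from string import Template

# Same query as above; string.Template fills in the $-placeholders.
QUERY = Template("""\
SELECT ?art ?artLabel
WITH {
  SELECT ?art WHERE {
    ?art wdt:P31 wd:Q13442814
  } LIMIT $batchSize OFFSET $offset
} AS %RESULTS {
  INCLUDE %RESULTS
  ?art wdt:P1476 ?artLabel .
  MINUS { ?art wdt:P921 wd:$conceptQ }
  FILTER (contains(lcase(str(?artLabel)), "$concept"))
}""")


def batch_queries(total_articles, batch_size, concept, concept_q):
    """Yield one query per batch; the number of batches is rounded up."""
    n_batches = math.ceil(total_articles / batch_size)
    for i in range(n_batches):
        yield QUERY.substitute(
            batchSize=batch_size,
            offset=i * batch_size,
            concept=concept.lower(),  # the FILTER compares against lcase(title)
            conceptQ=concept_q,
        )


if __name__ == "__main__":
    # Hypothetical numbers, only to show the shape of the generated queries.
    for query in batch_queries(250, 100, "microgravity", "Q48655"):
        print(query)
        print("-----")
```

Each generated query only scans one slice of the scholarly articles, which is
what keeps the individual requests small.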

Mind you, I manually check the created statements, because in my domain
(biochem) a simple search results in false positives, hence the "blacklist"
in the script :)

Egon










On Sat, Dec 15, 2018 at 10:13 AM Fabrizio Carrai 
wrote:

> Thanks Matthias,
> that's a pity. Your suggestion relies on the effective characterization of
> the item that,  at this writing time, is pretty poor for my interest.
> Could it be an idea to download all the "scholary articles", locally
> select  for the keyword of interest (e.g. "microgravity") and set the
> property P921 for all of them ? Quickstatements may be helpful for the last
> step, any suggestions for other tools ?
>
> Thanks
> Fabrizio
>
> On Fri, 14 Dec 2018 at 22:16, Matthias Erfurth
> wrote:
>
>> Hi Fabrizio,
>> unfortunately you can't fulltext search all the scholarly articles
>> <https://www.wikidata.org/wiki/Q13442814>, you should better work with
>> indexed properties, so
>> you can query for other articles with microgravity as main subject ...
>> With the ajax based wikidata search
>>
>> SELECT ?item
>> WHERE {
>> ?item wdt:P31 wd:Q13442814;
>>   wdt:P921 wd:Q48655.
>> }
>>
>> Best regards,
>>
>> ciao matthias
>>
>>
>> *Sent:* Friday, 14 December 2018 at 18:55
>> *From:* "Fabrizio Carrai" 
>> *To:* "Discussion list for the Wikidata project" <
>> wikidata@lists.wikimedia.org>
>> *Subject:* Re: [Wikidata] Query on scholarly article fails
>> Thanks again to Ettore, but I immediately found another timeout problem
>> when I just added a FILTER to find all the articles with the word "biokis"
>> in the title
>>
>> SELECT ?istanza_di ?instanza_diLabel WHERE {
>>   ?istanza_di wdt:P31 wd:Q13442814.
>>   ?istanza_di rdfs:label ?instanza_diLabel.
>>   FILTER((LANG(?instanza_diLabel)) = "en").
>>   FILTER(CONTAINS(LCASE(?instanza_diLabel), "biokis"))
>> }
>> LIMIT 100
>>
>> At least one article should be returned:
>> https://www.wikidata.org/wiki/Q57202937
>> but I got a timeout.
>>
>> Thanks to anybody that can help
>>
>> Fabrizio
>>
>>
>> On Fri, 14 Dec 2018 at 10:12, Ettore RIZZA <
>> ettoreri...@gmail.com> wrote:
>>
>>> Hello Fabrizio,
>>>
>>> It seems that the problem comes from SERVICE wikibase:label. As said in
>>> another discussion, the query executes in less than one second if you
>>> rewrite it in this way.
>>>
>>> Cheers,
>>>
>>> Ettore Rizza
>>>
>>> On Fri, 14 Dec 2018 at 09:59, Fabrizio Carrai
>>> wrote:
>>>
>>>> Hello all,
>>>> the following query ends with a timeout:
>>>>
>>>> SELECT ?istanza_di ?istanza_diLabel WHERE {
>>>>   SERVICE wikibase:label { bd:serviceParam wikibase:language
>>>> "[AUTO_LANGUAGE],en". }
>>>>   ?istanza_di wdt:P31 wd:Q13442814.
>>>> }
>>>> LIMIT 10
>>>>
>>>> Can anybody explain why?
>>>> Thanks in advance
>>>>
>>>> --
>>>> *Fabrizio*


Re: [Wikidata] You can now query the constraint violations with the Query Service

2018-08-18 Thread Egon Willighagen
Nice, I put it to use [0,1].

Question: what if the WDQS had a wikisource export format, allowing me to
copy/paste the output easily as a table in a Wikidata page? Is something like
that already possible? (Alternatively, maybe the Listeria bot can handle
this much better, but I still have to learn how to use that properly...)
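
Until something like that exists, one workaround could be to download the
results as JSON and convert them locally. The Python sketch below assumes the
standard SPARQL JSON results layout ("head"/"vars" and "results"/"bindings");
the function name and the sample data are made up for illustration and are not
an existing WDQS or Listeria feature.

```python
def to_wikitable(results: dict) -> str:
    """Turn a SPARQL JSON result document into wikitext table markup."""
    variables = results["head"]["vars"]
    lines = ['{| class="wikitable"', "! " + " !! ".join(variables)]
    for binding in results["results"]["bindings"]:
        # Unbound variables are simply absent from a binding; use an empty cell.
        cells = [binding.get(var, {}).get("value", "") for var in variables]
        lines += ["|-", "| " + " || ".join(cells)]
    lines.append("|}")
    return "\n".join(lines)


# Hand-written sample in the SPARQL JSON results layout (not real WDQS output).
sample = {
    "head": {"vars": ["x", "y"]},
    "results": {"bindings": [
        {"x": {"type": "uri", "value": "http://example.org/statement-1"},
         "y": {"type": "uri", "value": "http://example.org/constraint-1"}},
        {"x": {"type": "uri", "value": "http://example.org/statement-2"}},
    ]},
}

if __name__ == "__main__":
    print(to_wikitable(sample))
```

Values that themselves contain a "|" would need escaping before pasting; the
sketch ignores that.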

Egon

0.https://twitter.com/egonwillighagen/status/1030732574220582912
1.https://twitter.com/egonwillighagen/status/1030737717095788544


On Fri, Aug 3, 2018 at 12:44 PM Léa Lacroix 
wrote:

> Hello all,
>
> We started integrating the constraint violations into the Query Service.
> That means you can build queries using the constraint violations, with the
> predicate wikibase:hasViolationForConstraint. This will hopefully help
> you to watch better the quality of Wikidata content.
>
> Please note that this is a first step. Not all constraint violations are
> exposed yet, only the ones that can be checked fast enough. We're working
> on having more available in WDQS.
>
> You can base your queries on these few examples:
>
> # 10 statements with constraint violations that are currently included
> SELECT * WHERE { ?x wikibase:hasViolationForConstraint ?y. } LIMIT 10
>
> #Map/timeline/image grid of items that have a statement with a constraint violation
> #defaultView:Map
> SELECT DISTINCT ?item ?itemLabel ?image ?coordinate_location ?point_in_time ?date_of_birth WHERE {
>   ?s wikibase:hasViolationForConstraint ?y.
>   ?item ?z1 ?s.
>   SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
>   OPTIONAL { ?item wdt:P18 ?image. }
>   OPTIONAL { ?item wdt:P625 ?coordinate_location. }
>   OPTIONAL { ?item wdt:P585 ?point_in_time. }
>   OPTIONAL { ?item wdt:P569 ?date_of_birth. }
> }
>
> #Bar chart of statements that have a constraint violation, grouped by instance of the regarding item:
> #defaultView:BarChart
> #TEMPLATE={ "template": { "en": "Bar chart of statements that have a constraint violation grouped by ?property the regarding item" }, "variables": { "?property": { "query":"SELECT ?id  WHERE { VALUES ?id {  wd:P31 wd:P17 wd:P571 wd:P361 wd:P19 } }" } } }
> SELECT ?instance_ofLabel (COUNT(?instance_ofLabel) AS ?count) WHERE {
>   ?s wikibase:hasViolationForConstraint ?y.
>   ?item ?z1 ?s.
>   BIND(wdt:P31 AS ?property)
>   SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
>   OPTIONAL { ?item ?property ?instance_of. }
> }
> GROUP BY ?instance_ofLabel
> ORDER BY DESC(?count)
> LIMIT 30
>
> The modules included on the property talk pages, Module:Constraints,
> Module:Constraints/SPARQL etc., have been
> updated with a new query link (thanks Matěj!)
>
> See

Re: [Wikidata] Please support risk factor Wikidata property

2018-08-12 Thread Egon Willighagen
Dear Houcemeddine,

interesting proposal, indeed. I left a comment on the page, which basically
says we need qualifiers and references. References, because some claims
will be controversial (even if backed up by literature); qualifiers, because
I see these links as likely time-bound, due to changing health policies (I can
think of a few other reasons), at least for the residence examples (I hope
no one gets any ideas about Q4 'risk factor' 'residence' 'war zone').

With kind regards,

Egon





On Fri, Aug 10, 2018 at 10:05 PM Houcemeddine A. Turki <
turkiabdelwa...@hotmail.fr> wrote:

> Dear Mr. or Ms.,
> I thank you for your efforts. I invite you to support the proposal of risk
> factor as a Wikidata property. The property will be an excellent
> contribution to Wikidata as it can allow this high-scale knowledge base to
> be useful for digital epidemiology purposes. The proposal is available in
> https://www.wikidata.org/wiki/Wikidata:Property_proposal/risk_factor.
> This property is a generalization of the first property proposal I
> developed.
> Yours Sincerely,
> Houcemeddine Turki


-- 
E.L. Willighagen
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Homepage: http://egonw.github.com/
LinkedIn: http://se.linkedin.com/in/egonw
Blog: http://chem-bla-ics.blogspot.com/
PubList: https://www.zotero.org/egonw
ORCID: -0001-7542-0286 
ImpactStory: https://impactstory.org/u/egonwillighagen


Re: [Wikidata] On traceability and reliability of data we publish [was Re: [Wikimedia-l] Solve legal uncertainty of Wikidata]

2018-07-07 Thread Egon Willighagen
On Sat, Jul 7, 2018 at 5:59 PM mathieu lovato stumpf guntz <
psychosl...@culture-libre.org> wrote:

> I agree this is misconception that a copyright license make any direct
> change to data reliability. But attribution requirement does somewhat
> indirectly have an impact on it, as it legally enforce traceability.
>
I know that "law" has a special corner, but therefore not always the
best... law, in the end, is just a social construct, just like anything we
agree on. First, we all agree (it seems to me) that provenance is valuable.

However, having something in law (or contract) effectively criminalizes
failing to add the provenance. Is that what you really wish? Do you want
to be able to legally punish people if they fail to give provenance?
Honestly, that sounds a bit harsh to me... and, as a personal opinion
rather than an argument, I think Wikidata is more open, more
inclusive than that: Wikidata offers carrots, not sticks.

Egon



Re: [Wikidata] Recoin2: New version of relative completeness status indicator for all entities

2017-12-10 Thread Egon Willighagen
On Wed, Dec 6, 2017 at 10:01 AM, Simon Razniewski 
wrote:

> More information can be found on the tools page at
> https://www.wikidata.org/wiki/User:Ls1g/Recoin2.
> 
>
Interesting! At the moment the colored indicator refers to that page, but
people who have the Recoin2 indicator know about it already... how about
having it link to the list of expected (but missing) properties?

Egon

-- 
E.L. Willighagen
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Homepage: http://egonw.github.com/
LinkedIn: http://se.linkedin.com/in/egonw
Blog: http://chem-bla-ics.blogspot.com/
PubList: http://www.citeulike.org/user/egonw/tag/papers
ORCID: -0001-7542-0286
ImpactStory: https://impactstory.org/u/egonwillighagen


Re: [Wikidata] An answer to Lydia Pintscher regarding its considerations on Wikidata and CC-0

2017-12-02 Thread Egon Willighagen
Dear Mathieu,

On Thu, Nov 30, 2017 at 2:28 PM, mathieu stumpf guntz
 wrote:
> Le 30/11/2017 à 10:13, Egon Willighagen a écrit :
>> On Wed, Nov 29, 2017 at 10:45 PM, Mathieu Stumpf Guntz 
>>  wrote:

>> As having contributed to many open database and as user of many open
>> database, the CCZero is my default choice for making data open. Adoption of
>> this license is, IMHO, the prime reason Wikidata is growing so fast, and
>> integrated so fast in many use cases.
>
> Well, that would indeed be a huge point in favor of CC0 then. Unfortunately,
> I'm not aware of any way to turn that into a measurable analyze, as too many
> factors might come coincidentally to this. However, since you are
> contributor of many open database, maybe you are aware of some studies on
> the subject which can back your opinion.

Generally for open projects, the impact is hard to measure. It's not
as simple as determining the sales.

An overview of reuse and adoption by independent projects is for me the
most important measure. For Wikipedia, for example, Google shows it
prominently in its search results, and students around the world
frequently use it as a first source to get an overview of a topic.

For Wikidata this is not as established, but I would look at the
collaborations. Other databases have adopted the Wikidata Q-numbers as
identifiers, for example, as we did in WikiPathways, and, less
domain-specific, OpenStreetMap, if I am not mistaken. Those
collaborations are a good indication of success: projects invest time
in adopting it, and would not do so if they did not expect a "return
on investment".

>> I also note that public domain (which CCZero formalizes across
>> jurisdictions) is still the "ideal" license when uploading images to
>> Wikimedia, suggesting more of Wikimedia actually finds the CCZero idea very
>> welcome.
>
> I'm not sure what you mean here. If you are talking about things like
> pictures that NASA releases, I think it falls under the case described above.
> If you are speaking of the license most used on Wikimedia by volunteer
> contributors, I'm not aware of the statistics on this topic, but would be
> interested to have some.

My point was that the impression I get when uploading media is that
the more liberal the license, the happier Wikimedia is about it.

Egon



Re: [Wikidata] An answer to Lydia Pintscher regarding its considerations on Wikidata and CC-0

2017-11-30 Thread Egon Willighagen
Dear Mathieu,

On Wed, Nov 29, 2017 at 10:45 PM, Mathieu Stumpf Guntz <
psychosl...@culture-libre.org> wrote:

> I forward here the message I initially posted on the Meta Tremendous
> Wiktionary User Group talk page,
> because I'm interested to have a wider feedback of the community on this
> point. Whether you think that my view is completely misguided or that I
> might have a few relevant points, I'm extremely interested to know it, so
> please be bold.
>
As a contributor to many open databases and a user of many open
databases, CCZero is my default choice for making data open. Adoption of
this license is, IMHO, the prime reason Wikidata is growing so fast, and
is being integrated so fast in many use cases. License incompatibilities
have been a major concern in open source development and academic research.
Yes, there too, there is a continuous, almost-religious, and unresolved
discussion about copylefting, but the plain experience there is that the
closer to the idea of public domain, the easier it is to use. The advantages
of CCZero have been widely discussed in the life sciences, and while it is
not everyone's choice, the benefits outweigh the disadvantages for many. I
also note that public domain (which CCZero formalizes across jurisdictions)
is still the "ideal" license when uploading images to Wikimedia, suggesting
more of Wikimedia actually finds the CCZero idea very welcome.

I also want to stress that in no way do I recognize myself in your comments
about Denny and Google. And your comment that "freedom of one is murder and
slavery of others" needs some refinement, IMHO; my definition of "freedom"
is quite different and I experience your definition as abusive and offensive.

The CCZero license of Wikidata is essential to my contributions and use of
Wikimedia products. The chemistry knowledge in Wikidata is 100x more useful
(to me) than that in Wikipedia etc. That is in part because of the machine
readability, but also in large part because of the choice of CCZero.

I hope this helps,

with kind regards,

Egon



Re: [Wikidata] Improve Wikidata links to Dutch municipalities

2017-05-20 Thread Egon Willighagen
On Sat, May 20, 2017 at 6:47 PM, Andy Mabbett 
wrote:

> On 19 May 2017 at 11:49, Egon Willighagen 
> wrote:
>
> > I am not sure I see how you want to achieve that beyond Wikidata at
> > this moment?
>
> My comment referred only to Wikidata.


Then I think that we were not talking about the same thing... what I meant
is, restricting myself to chemical databases, ChEBI and ChemSpider have
old-fashioned identifiers and IRIs (based on those identifiers), but HMDB,
for example, does not. That's outside Wikidata's control, so Wikidata has to
accept both types of identifiers. Right?

Egon




Re: [Wikidata] Improve Wikidata links to Dutch municipalities

2017-05-19 Thread Egon Willighagen
On Thu, May 18, 2017 at 8:45 PM, Andy Mabbett  wrote:
> I'm ambivalent as to which is the ideal solution; but using both,
> inconsistently, and on an arbitrary basis, is certainly not it.

I agree, but we're in a mixed world right now... the semantic web is being
picked up, and I think we must take advantage of this (IRIs are
identifiers too), but many databases are not semantic and have
old-fashioned IDs. (As maintainer of an ID mapping system called BridgeDb,
this is my daily work: dealing with two types of identifiers from two
related but different worlds...)

I absolutely agree that ideally we would have one solution, and while on
Wikidata we might try to reach that by community consensus, I am not sure
I see how you want to achieve that beyond Wikidata at this moment?

Egon



Re: [Wikidata] Improve Wikidata links to Dutch municipalities

2017-05-17 Thread Egon Willighagen
On Wed, May 17, 2017 at 4:09 PM, Wouter Beek  wrote:
> Wikidata currently includes links to the Kadaster, but these do not point to
> the proper IRIs.  For example, Dutch municipalities in Wikidata currently
> link to an HTML viewer of Kadaster [1], but they should point to our
> dereferenceable IRI [2].

It can have both... (if I understand correctly what you want)... see
https://www.wikidata.org/wiki/Property:P662

It has "formatter URL" (P1630) for the HTML and "RDF URI template"
(P1921) for the IRI
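
A minimal sketch of how the two templates sit side by side, assuming the
usual Wikidata convention that "$1" in a template stands for the stored
identifier value. The template strings and the identifier below are
hypothetical placeholders, not the actual values on P662 or on the
Kadaster property:

```python
def expand_template(template: str, identifier: str) -> str:
    """Fill a URL template by substituting the "$1" placeholder."""
    if "$1" not in template:
        raise ValueError("template has no $1 placeholder")
    return template.replace("$1", identifier)


# Hypothetical templates, for illustration only.
FORMATTER_URL = "https://example.org/viewer/$1"   # role of P1630: HTML page
RDF_URI_TEMPLATE = "https://example.org/id/$1"    # role of P1921: IRI

identifier = "12345"  # hypothetical identifier value stored on an item

html_link = expand_template(FORMATTER_URL, identifier)
rdf_iri = expand_template(RDF_URI_TEMPLATE, identifier)

print(html_link)
print(rdf_iri)
```

The point is that one stored identifier can yield both the human-readable
link and the dereferenceable IRI, so neither has to replace the other.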

> Low-hanging fruit seems to be to update the Kadaster links for all Dutch
> municipalities.  What's the best way to go about this?  Can we make these
> corrections ourselves / what is the procedure for editing?

So, I would say, just add the RDF links...

Egon



[Wikidata] [update] Re: Joining May 29, National Plan Open Science meeting in Delft, The Netherlands

2017-05-13 Thread Egon Willighagen
Hi all,

here's a quick update below... (in case you missed it: there is this
event page: https://www.wikidata.org/wiki/Wikidata:Events/NPOS2017)

On Mon, Apr 24, 2017 at 9:36 AM, Léa Lacroix  wrote:
> Thanks Egon for letting us know.
> Feel free to add the event on https://www.wikidata.org/wiki/Wikidata:Events :)

We had a last in-person meeting with the organisation of the "" event
yesterday. The schedule is now fixed:

https://www.openscience.nl/binaries/content/assets/surf/en/2017/programme-open-science-researcher-meeting-170529.pdf

From 12:00 to 13:30 is the time we have our table, during the
"Knowledge Commons" session... it's in the (free) lunch area...

Andra Waagmeester and Yaroslav Blanter have indicated that they will join too.

As far as I am concerned, we can upgrade the table to a Wikimedia-NL
table... it's just that I am not too familiar with others... I have
pinged this team though:
https://nl.wikipedia.org/wiki/Wikipedia:GLAM/Koninklijke_Bibliotheek_en_Nationaal_Archief/Activiteiten2015

As of today, only a handful of citizen scientists have signed up,
so we're still short on that... outreach ideas for Dutch citizen
scientist communities are very much appreciated!

If you would like to join the event, please sign up at *two* places:

The aforementioned NPOS2017 event page *and* the official registration
page (given in that event page) allowing us to make a good estimate of
the number of free lunches.

Greetings,

Egon



Re: [Wikidata] Joining May 29, National Plan Open Science meeting in Delft, The Netherlands

2017-04-30 Thread Egon Willighagen
Dear Arne,

On Tue, Apr 25, 2017 at 11:18 AM, Arne Wossink  wrote:
> Thank you for taking on the role of coordinator! I'm happy to present
> anything if necessary, but that depends on the format of the day, I guess.

This Friday we will likely finalize the program of the full meeting,
but the idea of our Wikidata presence is the 1.5h "market" during
lunch time, where researchers ("hoge school", uni, research institute)
can share Open Science experiences...

I'm also hoping for Dutch use of Wikidata in research from the social
sciences, humanities, etc... Also, Wikidata is just textual things,
but I can imagine that photos, scans, etc., as in Wikimedia Commons, are
actively used by Dutch researchers... projects like:
https://www.kb.nl/nieuws/2014/kb-en-nationaal-archief-zetten-samenwerking-met-wikipedia-voort

> Keep us informed if you need any additional information, or once you have
> new information to share with us!

I'll keep the event page (see other email) updated!

Egon



Re: [Wikidata] Joining May 29, National Plan Open Science meeting in Delft, The Netherlands

2017-04-30 Thread Egon Willighagen
Dear Yaroslav,

Happy if you can join the first part of it. Can you please add your
Wikidata account here:
https://www.wikidata.org/wiki/Wikidata:Events/NPOS2017 ?

The formal registration page is this Google Form, which I could not
add to the above page, as it gets tagged as spam:
https://goo.gl/forms/RxswajBbFtzvBnvG2

Egon

On Mon, Apr 24, 2017 at 12:47 PM, Yaroslav Blanter  wrote:
> Dear Egon,
>
> I will be teaching starting 13:45, but if this is in the Aula (which is next
> to my building) I can just stay through the whole session, leaving at 13:30.
>
> Yaroslav
>
> On Mon, Apr 24, 2017 at 12:33 PM, Arne Wossink  wrote:
>>
>> Hi Egon,
>>
>> Sounds good. How do we proceed from here? Are you going to propose a
>> session for this day and invite attendees? Do we need to sign up anywhere?
>>
>> Best,
>>
>>
>> Arne Wossink
>>
>> Projectleider / Project Manager Wikimedia Nederland
>>
>> (Werkdagen: maandag, dinsdag, donderdag / Office hours: Monday, Tuesday,
>> Thursday)
>>
>> Tel. +31 (0)6 11000505
>> E-mail: woss...@wikimedia.nl
>>
>> Postadres:  Bezoekadres:
>> Postbus 167        Mariaplaats 3
>> 3500 AD  Utrecht Utrecht
>>
>> 2017-04-24 11:43 GMT+02:00 Egon Willighagen :
>>>
>>> On Mon, Apr 24, 2017 at 10:19 AM, Arne Wossink 
>>> wrote:
>>> > Interesting opportunity! I would be interested in attending and/or
>>> > helping
>>> > out as WMNL representative, but I don't have a case study or anything
>>> > else I
>>> > could contribute.
>>>
>>> Yes, of course, Wikidata is just part of Wikimedia and a WMNL
>>> representative that knows about other interactions of Wikimedia with
>>> research is very welcome indeed.
>>>
>>> Egon
>>>





Re: [Wikidata] Joining May 29, National Plan Open Science meeting in Delft, The Netherlands

2017-04-30 Thread Egon Willighagen
Dear Léa,

On Mon, Apr 24, 2017 at 9:36 AM, Léa Lacroix  wrote:
> Thanks Egon for letting us know.
> Feel free to add the event on https://www.wikidata.org/wiki/Wikidata:Events :)

OK, done: https://www.wikidata.org/wiki/Wikidata:Events/NPOS2017

Egon



Re: [Wikidata] Joining May 29, National Plan Open Science meeting in Delft, The Netherlands

2017-04-24 Thread Egon Willighagen
On Mon, Apr 24, 2017 at 12:33 PM, Arne Wossink  wrote:
> Sounds good. How do we proceed from here? Are you going to propose a session
> for this day and invite attendees? Do we need to sign up anywhere?

Yes, we need to register. But we don't need a proposal: 1. it will be
an open "market", 2. I'm also in the meeting organization :)

I can coordinate the Wikidata presence...

Egon




Re: [Wikidata] Joining May 29, National Plan Open Science meeting in Delft, The Netherlands

2017-04-24 Thread Egon Willighagen
On Mon, Apr 24, 2017 at 10:19 AM, Arne Wossink  wrote:
> Interesting opportunity! I would be interested in attending and/or helping
> out as WMNL representative, but I don't have a case study or anything else I
> could contribute.

Yes, of course, Wikidata is just part of Wikimedia and a WMNL
representative that knows about other interactions of Wikimedia with
research is very welcome indeed.

Egon



Re: [Wikidata] Joining May 29, National Plan Open Science meeting in Delft, The Netherlands

2017-04-24 Thread Egon Willighagen
Great, thanks. When the plan has become a bit more clear, I will add
information, but it seems we have critical mass in terms of people
interested in joining :)

Egon

On Mon, Apr 24, 2017 at 9:36 AM, Léa Lacroix  wrote:
> Thanks Egon for letting us know.
> Feel free to add the event on https://www.wikidata.org/wiki/Wikidata:Events
> :)
>
> On 23 April 2017 at 11:24, Egon Willighagen 
> wrote:
>>
>> Hi Wikidata community,
>>
>> on May 29 in Delft, The Netherlands, the first national meeting is
>> planned for researchers (at various stages of their career) about the
>> Dutch National Plan Open Science. There will be a large session where
>> organisations and individuals can present their experiences with Open
>> Science...
>>
>> I will join the meeting and want to see Wikidata there, and plan to
>> host a table about Wikidata in research... ever since the joint
>> H2020 funding application (which we didn't get), I have been using
>> Wikidata for our interoperability work in various research projects...
>>
>> However, the more the merrier, and I'm hoping to co-host a Wikidata
>> table at this meeting... who else is interested in teaming up and
>> showing the Dutch research community how Wikidata can help them with
>> their Open Science? My own work is in the area of the life sciences,
>> but I know many others are using Wikidata for other research fields,
>> and the meeting is for all research, not just the natural sciences...
>>
>> Looking forward to hearing from you,
>>
>> greetings Egon
>>
>
>
>
>
> --
> Léa Lacroix
> Project Manager Community Communication for Wikidata
>
> Wikimedia Deutschland e.V.
> Tempelhofer Ufer 23-24
> 10963 Berlin
> www.wikimedia.de
>
> Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
>
> Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter
> der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das Finanzamt für
> Körperschaften I Berlin, Steuernummer 27/029/42207.
>
>





Re: [Wikidata] Joining May 29, National Plan Open Science meeting in Delft, The Netherlands

2017-04-24 Thread Egon Willighagen
Dear Yaroslav,

On Sun, Apr 23, 2017 at 2:37 PM, Yaroslav Blanter  wrote:
> I will be available for the whole morning (unfortunately have my agenda full
> in the afternoon) and will be happy to attend the round table or even to
> chair it if needed, but I do not have any own story to present.

The 'marktplaats' is scheduled for 12-13:30... this will be informal
(no presentation; no chair needed, just volunteers), just us demoing
Wikidata, answering questions, showing use cases in research, maybe
teaching some people how to contribute themselves, etc.

Please let me know if that time fits, but even if you can only join the
first x minutes, I would love to meet up and learn a bit about how you
use Wikidata in your research :)

Egon


> On Sun, Apr 23, 2017 at 1:35 PM, AMIT KUMAR JAISWAL
>  wrote:
>>
>> Dear Egon,
>>
>> Thanks a lot for updating me about this meeting.
>>
>> I have worked a bit on SPARQL and wikidata items but for my
>> undergraduate thesis I recently published a paper on NLP :
>>
>> https://www.academia.edu/32188868/SQL_Query_Generator_For_Natural_Language.
>>
>> Yes, I'll surely invite Mandar(the project lead of DynaML) to join
>> this meeting and currently I'm in India so if I get the chance then
>> I'll try to join this meeting.
>> It's great to know that you already have ML background.
>>
>> Looking forward to hear from you.
>>
>> Regards
>>
>> Thank you
>> Amit Kumar Jaiswal
>>
>> On 4/23/17, Egon Willighagen  wrote:
>> > Sounds excellent! Looking forward to meet Yaroslav.
>> >
>> > Egon
>> >
>> > On Sun, Apr 23, 2017 at 1:06 PM, Maarten Dammers 
>> > wrote:
>> >> Hi Egon,
>> >>
>> >> Yaroslav is one of our (very active) users/admins/bureaucrats and a
>> >> professor at TU Delft.  Maybe he can join you?
>> >>
>> >> Maarten
>> >>
>> >>
>> >>
>> >> On 23-04-17 12:34, Egon Willighagen wrote:
>> >>>
>> >>> Dear Amit,
>> >>>
>> >>> you make me painfully aware of something I should have mentioned (my
>> >>> apologies for haven forgotten that; still tired from the science march
>> >>> yesterday): it is not primarily international Open Science meeting...
>> >>> the context is really about how to implement this Dutch Plan Open
>> >>> Science... so, I was planning to target researchers using Wikidata in
>> >>> the local Dutch region... other sessions will be about Open Science in
>> >>> the Dutch funding environment, etc.
>> >>>
>> >>> That said: there is more information which will get more informative
>> >>> over the next weeks here:
>> >>>
>> >>> "Open Science: the National Plan and you" ->
>> >>> https://www.openscience.nl/nationaal-plan
>> >>>
>> >>> Your reply also makes me wonder if there is Mozilla Open Science
>> >>> projects running in NL?
>> >>>
>> >>> Egon
>> >>>
>> >>>
>> >>> On Sun, Apr 23, 2017 at 12:04 PM, AMIT KUMAR JAISWAL
>> >>>  wrote:
>> >>>>
>> >>>> Hey Egon,
>> >>>>
>> >>>> Thanks for letting us know about this Open Science meeting.
>> >>>>
>> >>>> I'm interested in forming a team and currently I'm working with
>> >>>> couple
>> >>>> of Open Source projects ranges from Machine Learning/AI, Natural
>> >>>> Language Processing and recently started with Deep Learning.
>> >>>> Apart from this I'm also doing few competitions on Kaggle :
>> >>>> https://www.kaggle.com/amitkumarjaiswal.
>> >>>>
>> >>>> Please let me know how can I join/participate in this meeting.
>> >>>>
>> >>>> Regards
>> >>>>
>> >>>> Thank you
>> >>>> Amit Kumar Jaiswal
>> >>>>
>> >>>> On 4/23/17, Egon Willighagen  wrote:
>> >>>>>
>> >>>>> Hi Wikidata community,
>> >>>>>
>> >>>>> on May 29 in Delft, The Netherlands, the first national meeting is
>> >>>>> planned for researchers (at a various stage of their career) about
>> >>>>> the
>> >>>>> Dutch National Plan Open Science. There 

Re: [Wikidata] Joining May 29, National Plan Open Science meeting in Delft, The Netherlands

2017-04-23 Thread Egon Willighagen
Sounds excellent! Looking forward to meeting Yaroslav.

Egon

On Sun, Apr 23, 2017 at 1:06 PM, Maarten Dammers  wrote:
> Hi Egon,
>
> Yaroslav is one of our (very active) users/admins/bureaucrats and a
> professor at TU Delft.  Maybe he can join you?
>
> Maarten
>
>
>
> On 23-04-17 12:34, Egon Willighagen wrote:
>>
>> Dear Amit,
>>
>> you make me painfully aware of something I should have mentioned (my
>> apologies for having forgotten that; still tired from the science march
>> yesterday): it is not primarily an international Open Science meeting...
>> the context is really about how to implement this Dutch Plan Open
>> Science... so, I was planning to target researchers using Wikidata in
>> the local Dutch region... other sessions will be about Open Science in
>> the Dutch funding environment, etc.
>>
>> That said: there is more information here, which will become more
>> detailed over the coming weeks:
>>
>> "Open Science: the National Plan and you" ->
>> https://www.openscience.nl/nationaal-plan
>>
>> Your reply also makes me wonder if there are Mozilla Open Science
>> projects running in NL?
>>
>> Egon
>>
>>
>> On Sun, Apr 23, 2017 at 12:04 PM, AMIT KUMAR JAISWAL
>>  wrote:
>>>
>>> Hey Egon,
>>>
>>> Thanks for letting us know about this Open Science meeting.
>>>
>>> I'm interested in forming a team and currently I'm working with couple
>>> of Open Source projects ranges from Machine Learning/AI, Natural
>>> Language Processing and recently started with Deep Learning.
>>> Apart from this I'm also doing few competitions on Kaggle :
>>> https://www.kaggle.com/amitkumarjaiswal.
>>>
>>> Please let me know how can I join/participate in this meeting.
>>>
>>> Regards
>>>
>>> Thank you
>>> Amit Kumar Jaiswal
>>>
>>> On 4/23/17, Egon Willighagen  wrote:
>>>>
>>>> Hi Wikidata community,
>>>>
>>>> on May 29 in Delft, The Netherlands, the first national meeting is
>>>> planned for researchers (at various stages of their career) about the
>>>> Dutch National Plan Open Science. There will be a large session where
>>>> organisations and individuals can present their experiences with Open
>>>> Science...
>>>>
>>>> I will join the meeting and want to see Wikidata there, and plan to
>>>> host a table about Wikidata in research... ever since the joint
>>>> H2020 funding application (which we didn't get), I have been using
>>>> Wikidata for our interoperability work in various research projects...
>>>>
>>>> However, the more the merrier, and I'm hoping to co-host a Wikidata
>>>> table at this meeting... who else is interested in teaming up and
>>>> showing the Dutch research community how Wikidata can help them with
>>>> their Open Science? My own work is in the area of the life sciences,
>>>> but I know many others are using Wikidata for other research fields,
>>>> and the meeting is for all research, not just the natural sciences...
>>>>
>>>> Looking forward to hearing from you,
>>>>
>>>> greetings Egon
>>>>
>>>>
>>>
>>> --
>>> Amit Kumar Jaiswal
>>> Mozilla Representative <http://reps.mozilla.org/u/amitkumarj441> |
>>> LinkedIn
>>> <http://in.linkedin.com/in/amitkumarjaiswal1> | Portfolio
>>> <http://amitkumarj441.github.io>
>>> Kanpur, India
>>> Mo. : +91-8081187743 | T : @AMIT_GKP | PGP : EBE7 39F0 0427 4A2C
>>>
>>
>>
>>
>
>





Re: [Wikidata] Joining May 29, National Plan Open Science meeting in Delft, The Netherlands

2017-04-23 Thread Egon Willighagen
Dear Amit,

you make me painfully aware of something I should have mentioned (my
apologies for having forgotten that; still tired from the science march
yesterday): it is not primarily an international Open Science meeting...
the context is really about how to implement this Dutch Plan Open
Science... so, I was planning to target researchers using Wikidata in
the local Dutch region... other sessions will be about Open Science in
the Dutch funding environment, etc.

That said: there is more information here, which will become more
detailed over the coming weeks:

"Open Science: the National Plan and you" ->
https://www.openscience.nl/nationaal-plan

Your reply also makes me wonder if there are Mozilla Open Science
projects running in NL?

Egon


On Sun, Apr 23, 2017 at 12:04 PM, AMIT KUMAR JAISWAL
 wrote:
> Hey Egon,
>
> Thanks for letting us know about this Open Science meeting.
>
> I'm interested in forming a team and currently I'm working with couple
> of Open Source projects ranges from Machine Learning/AI, Natural
> Language Processing and recently started with Deep Learning.
> Apart from this I'm also doing few competitions on Kaggle :
> https://www.kaggle.com/amitkumarjaiswal.
>
> Please let me know how can I join/participate in this meeting.
>
> Regards
>
> Thank you
> Amit Kumar Jaiswal
>
> On 4/23/17, Egon Willighagen  wrote:
>> Hi Wikidata community,
>>
>> on May 29 in Delft, The Netherlands, the first national meeting is
>> planned for researchers (at various stages of their career) about the
>> Dutch National Plan Open Science. There will be a large session where
>> organisations and individuals can present their experiences with Open
>> Science...
>>
>> I will join the meeting and want to see Wikidata there, and plan to
>> host a table about Wikidata in research... ever since the joint
>> H2020 funding application (which we didn't get), I have been using
>> Wikidata for our interoperability work in various research projects...
>>
>> However, the more the merrier, and I'm hoping to co-host a Wikidata
>> table at this meeting... who else is interested in teaming up and
>> showing the Dutch research community how Wikidata can help them with
>> their Open Science? My own work is in the area of the life sciences,
>> but I know many others are using Wikidata for other research fields,
>> and the meeting is for all research, not just the natural sciences...
>>
>> Looking forward to hearing from you,
>>
>> greetings Egon
>>
>> --
>> E.L. Willighagen
>> Department of Bioinformatics - BiGCaT
>> Maastricht University (http://www.bigcat.unimaas.nl/)
>> Homepage: http://egonw.github.com/
>> LinkedIn: http://se.linkedin.com/in/egonw
>> Blog: http://chem-bla-ics.blogspot.com/
>> PubList: http://www.citeulike.org/user/egonw/tag/papers
>> ORCID: -0001-7542-0286
>> ImpactStory: https://impactstory.org/u/egonwillighagen
>>
>> ___
>> Wikidata mailing list
>> Wikidata@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikidata
>>
>
>
> --
> Amit Kumar Jaiswal
> Mozilla Representative <http://reps.mozilla.org/u/amitkumarj441> | LinkedIn
> <http://in.linkedin.com/in/amitkumarjaiswal1> | Portfolio
> <http://amitkumarj441.github.io>
> Kanpur, India
> Mo. : +91-8081187743 | T : @AMIT_GKP | PGP : EBE7 39F0 0427 4A2C



-- 
E.L. Willighagen
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Homepage: http://egonw.github.com/
LinkedIn: http://se.linkedin.com/in/egonw
Blog: http://chem-bla-ics.blogspot.com/
PubList: http://www.citeulike.org/user/egonw/tag/papers
ORCID: -0001-7542-0286
ImpactStory: https://impactstory.org/u/egonwillighagen

___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata] Joining May 29, National Plan Open Science meeting in Delft, The Netherlands

2017-04-23 Thread Egon Willighagen
Hi Wikidata community,

on May 29 in Delft, The Netherlands, the first national meeting is
planned for researchers (at various stages of their careers) about the
Dutch National Plan Open Science. There will be a large session where
organisations and individuals can present their experiences with Open
Science...

I will join the meeting and want to see Wikidata there, and plan to
host a table about Wikidata in research... ever since the joint
H2020 funding application (which we didn't get), I have been using
Wikidata for our interoperability work in various research projects...

However, the more the merrier, and I'm hoping to co-host a Wikidata
table at this meeting... who else is interested in teaming up and
showing the Dutch research community how Wikidata can help them with
their Open Science? My own work is in the area of the life sciences,
but I know many others are using Wikidata for other research fields,
and the meeting is for all research, not just the natural sciences...

Looking forward to hearing from you,

greetings Egon



Re: [Wikidata] External link Twitter --> Wikidata

2016-10-09 Thread Egon Willighagen
On Sun, Oct 9, 2016 at 5:25 PM, Brill Lyle  wrote:

> How on EARTH could this be the wrong place to talk about this? I am still
> laughing.
>

Because it's a political/social decision to make, not technical. Wikidata
can only provide solutions, but the decision is with each Wikipedia, not
Wikidata.

Egon



Re: [Wikidata] Greater than 400 char limit for Wikidata string data types

2016-10-08 Thread Egon Willighagen
Dear Thomas,

On Sat, Oct 8, 2016 at 12:07 PM, Thomas Douillard <
thomas.douill...@gmail.com> wrote:

> Probably a silly question but ... did you all consider creating a datatype
> for molecule representation? This seems to be a very similar use case to
> mathematical formulas. Essentially we're not dealing with a raw string but a
> representation of molecule formulas, with its own encoding ...
>

The InChI is actually not a structural representation, but a derived unique
identifier.

What you propose would, however, apply to the SMILES. That one is generally
of about the same size as the InChI, and there your solution sounds like a
great idea!

Egon


> Changing the limit seems to be a poor workaround compared to a dedicated
> datatype - nobody seems to have found a relevant use case, and it seems to
> me that we're essentially abusing strings for storing blobs ...




Re: [Wikidata] Greater than 400 char limit for Wikidata string data types

2016-10-08 Thread Egon Willighagen
On Sat, Oct 8, 2016 at 11:28 AM, Lydia Pintscher <
lydia.pintsc...@wikimedia.de> wrote:

> On Sat, Oct 8, 2016 at 11:23 AM, Egon Willighagen
>  wrote:
> > Ah, those numbers are for https://www.wikidata.org/wiki/Property:P234
> ...
>
> External identifier then. Cool. And for string like in
> https://www.wikidata.org/wiki/Property:P233? Sebastian's initial email
> says 1500 to 2000. Is this still a good number after this discussion?
>

Yes, that would cover more than 99.9% of all InChIs in PubChem. (See
Sebastian's reply earlier in this thread.)

Egon



Re: [Wikidata] Greater than 400 char limit for Wikidata string data types

2016-10-08 Thread Egon Willighagen
On Sat, Oct 8, 2016 at 11:19 AM, Lydia Pintscher <
lydia.pintsc...@wikimedia.de> wrote:

> On Sat, Oct 8, 2016 at 11:14 AM, Egon Willighagen
>  wrote:
> > For small compounds this is answered by Sebastian's analysis... 5K would
> > cover all currently known small molecules. 1K would cover 99.9%.
>
> Ok. That is for strings, correct? Input for other use cases?


Ah, those numbers are for https://www.wikidata.org/wiki/Property:P234 ...

Egon



Re: [Wikidata] Greater than 400 char limit for Wikidata string data types

2016-10-08 Thread Egon Willighagen
On Sat, Oct 8, 2016 at 11:07 AM, Lydia Pintscher <
lydia.pintsc...@wikimedia.de> wrote:
>
> Based on this my proposal is to increase string and URL and
> potentially external identifier if you request it. One open question
> is still what the new limit should be.
>

For small compounds this is answered by Sebastian's analysis... 5K would
cover all currently known small molecules. 1K would cover 99.9%.
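
How such coverage figures can be derived is easy to sketch. The snippet
is only an illustration under stated assumptions, not Sebastian's actual
analysis: given a list of identifier strings it returns the smallest
character limit that fits a requested fraction of them (nearest-rank
percentile). The function name and the toy data are made up for the
example.

```python
import math


def limit_for_coverage(strings, fraction):
    """Smallest length limit such that at least `fraction` of the strings fit."""
    if not strings:
        raise ValueError("need at least one string")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    lengths = sorted(len(s) for s in strings)
    # Nearest-rank percentile: take the k-th smallest length,
    # with k = ceil(fraction * n).
    k = math.ceil(fraction * len(lengths))
    return lengths[k - 1]


# Toy data only: nine strings of lengths 1..9 plus one long outlier of 50.
toy = ["x" * n for n in range(1, 10)] + ["x" * 50]
for fraction in (0.5, 0.9, 1.0):
    print(fraction, limit_for_coverage(toy, fraction))
```

On real data the input would be the full list of InChI or SMILES
strings; a single very long outlier then only moves the 100% figure,
which is why the 99.9% and 100% limits can differ so much.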

Lydia, do I understand correctly that a formal request needs to be
filed? Who will do that?

Egon



Re: [Wikidata] signing license declarations

2016-10-06 Thread Egon Willighagen
On Wed, Oct 5, 2016 at 9:55 PM, Benjamin Good  wrote:
> How do you plan to handle updates ?

Bug the upstream author when they make a release... as in the other
part of the thread, it is upstream that should make it clearly available
under CCZero from their site... but we need to show better what people
get in return for that.

Egon



Re: [Wikidata] signing license declarations

2016-10-05 Thread Egon Willighagen
On Wed, Oct 5, 2016 at 7:44 PM, Benjamin Good  wrote:
> When negotiating the import of data from resources that are not CC0, it
> would be very valuable to have a somewhat formal process to allow them to
> declare that some portions of their databases may be imported into wikidata
> and thus join its CC0 collection.

For the EPA CompTox Dashboard I asked Antony Williams to release the
mapping data as CCZero specifically, which he did on Figshare [0].

Egon

[0] https://figshare.com/articles/Mapping_file_of_InChIStrings_InChIKeys_and_DTXSIDs_for_the_EPA_CompTox_Dashboard/3578313



Re: [Wikidata] Greater than 400 char limit for Wikidata string data types

2016-09-23 Thread Egon Willighagen
On Fri, Sep 23, 2016 at 5:53 PM, Denny Vrandečić 
wrote:

> One stupid question: due to the length of these identifiers, and since
> they are not simple intransparent identifiers but rather encode semantics -
> if I understand it correctly - could a single such identifier be encoding
> content or ideas which are potentially covered by copyright or patent law?
> Is there some background available on that?
>


Not the InChI. The standard itself is meant to be reused as much as
possible and the software is open source.

Some information here:
http://jcheminf.springeropen.com/articles/10.1186/1758-2946-5-7

Egon




Re: [Wikidata] Greater than 400 char limit for Wikidata string data types

2016-09-23 Thread Egon Willighagen
Sebastian, great you found time for it! I didn't :/ (Stats are worth a
tweet, IMHO :)

Egon

On Fri, Sep 23, 2016 at 12:20 PM, Sebastian Burgstaller <
sebastian.burgstal...@gmail.com> wrote:

> Hi Denny,
> Sorry, I missed this email. I just did the calculation for InChI string
> lengths on the 92 million PubChem compounds:
>   99% 99.9%  100%
>   311   676  4502
>
> That said, there is no upper limit for the length, but 4502 is the
> longest string in the PubChem database. The other IDs, canonical and
> isomeric SMILES, have the same distribution shape, but are overall
> slightly shorter.
>
> Best,
> Sebastian
>
> On Sun, Sep 18, 2016 at 9:19 PM, Denny Vrandečić 
> wrote:
> > Can you figure out what a good limit would be for these two use cases?
> I.e.
> > what would support 99%, 99.9%, and 100%?





Re: [Wikidata] Greater than 400 char limit for Wikidata string data types

2016-09-18 Thread Egon Willighagen
Hi all,

sorry for joining the party late...

On Tue, Sep 13, 2016 at 11:39 AM, Sebastian Burgstaller
 wrote:
> I think this topic might have been discussed many months ago. For
> certain data types in the chemical compound space (P233, canonical
> SMILES; P2017, isomeric SMILES; and P234, InChI) a higher character
> limit than 400 would be really helpful (1500 to 2000 chars; I sense
> that this might cause problems with SPARQL). Are there any plans on
> implementing this? In general, for quality assurance, many string
> property types would profit from a fixed max string length.

400 characters is not a lot for chemicals... InChIs can be a lot
larger indeed. 2k would allow us to capture a lot more chemicals. BTW,
this also applies to the canonical SMILES, which also doesn't have an
upper bound. Tannic acid (Q427956) is an example (which, looking at the
InChIKey, came up when running the bot :). From working with ChEMBL as
RDF I know it has InChIs of length > 1024, which was the max length in
Virtuoso... I think it's important for biology and chemistry to
increase the limit.
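
To make the effect of a limit concrete, here is a small sketch of what
a bot has to do with values that do not fit. It is not the actual bot
code; the function name and the placeholder strings are made up, and
the three limits are simply the ones mentioned in this thread.

```python
def split_by_limit(values, limit):
    """Partition values into those that fit within `limit` characters
    and those that are too long to be stored."""
    fits, too_long = [], []
    for value in values:
        (fits if len(value) <= limit else too_long).append(value)
    return fits, too_long


# Placeholder strings standing in for identifiers of different lengths.
sample = ["A" * 120, "B" * 390, "C" * 1100, "D" * 2500]
for limit in (400, 1024, 2000):
    fits, too_long = split_by_limit(sample, limit)
    print(f"limit {limit}: {len(fits)} fit, {len(too_long)} would be skipped")
```

Every value in the second list is a compound that ends up in Wikidata
without that identifier, so the limit decides directly how complete
the chemistry data can be.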

Egon



Re: [Wikidata] Op-ed on Wikipedia Signpost regarding Wikidata licensing

2016-06-18 Thread Egon Willighagen
All,

first, thanks for your reply blog, Gerard.

On Sat, Jun 18, 2016 at 1:00 PM, Markus Kroetzsch
 wrote:
>> I have written my opinion on the licensing of Wikidata data.. [1]
>
> I agree with your position there.

So do I.

The original article ends with:

"Among all Wikimedia projects, Wikidata is conspicuously alone in not
being copylefted."

Copylefting or not has been heavily and religiously debated for many,
many years in the open source community. I have never seen strong
examples why either would be better for open source.

Second, data is not text and is not source code. It's different and
"conspicuously alone" is a false argument.

"Perhaps we should start asking why that is the case"

Two possible reasons given above. Add to that what Gerard wrote.

"and whose interests benefit from weak licensing choices,"

No, that's the wrong way around. CCZero is a stronger license
(actually, it's not a license, but a waiver): it gives people more
freedom, removes many more hurdles.

"and start to organize ourselves to fix this"

There is nothing to fix. CCZero without copylefting gives more freedom,
and for me that is the main reason to invest my time. Before you start
talking about "fixing", realize you will also lose.

Greetings,

Egon
(this is a personal opinion, and may not reflect the interest of my employer)

-- 
E.L. Willighagen
Homepage: http://egonw.github.com/
LinkedIn: http://se.linkedin.com/in/egonw
Blog: http://chem-bla-ics.blogspot.com/
PubList: http://www.citeulike.org/user/egonw/tag/papers
ORCID: -0001-7542-0286
ImpactStory: https://impactstory.org/EgonWillighagen



Re: [Wikidata] SQID evolved again: references

2016-06-03 Thread Egon Willighagen
Super!

Egon

On Thu, Jun 2, 2016 at 11:26 PM, Markus Kroetzsch
 wrote:
> Dear all,
>
> By popular demand, SQID now also shows references for most statements
> (collapsed by default, of course). You can see it, e.g., here:
>
> http://tools.wmflabs.org/sqid/#/view?id=Q42
>
> Cheers,
>
> Markus
>
> --
> Markus Kroetzsch
> Faculty of Computer Science
> Technische Universität Dresden
> +49 351 463 38486
> http://korrekt.org/





Re: [Wikidata] Ontology

2016-05-14 Thread Egon Willighagen
On Sat, May 14, 2016 at 5:36 PM, Gerard Meijssen
 wrote:
> We are talking DSM. When the DSM had never called something a disease and
> never had a consistent presentation. When there is a lot of literature
> showing how that something is NOT a disease, why persist on what has always
> been wrong in any which case?

So, what specific Q-entry do you have in mind (what entry??)? Would it
be enough to file a bug report against that (what??) ontology, and
blacklist making that link, or so?

But what term are you referring to? Or is this ontology so crap that
it disagrees in major parts with DSM and common knowledge?

Egon




Re: [Wikidata] Ontology

2016-05-14 Thread Egon Willighagen
On Sat, May 14, 2016 at 3:58 PM, Gerard Meijssen
 wrote:
> The problem is that when there is no agreement on its existence, when it is
> highly stigmatic, when it determines the life of people because of an
> opinion.

This reminds me of discussions around the Basic Formal Ontology that
you cannot define something if it is not real...

> It is damaging to persist on including it as a disease and
> accepting the consequences that it has.

Yes, agreed. However, DSM switches opinion about what is a disease
too. That would make Wikidata a temporal knowledge base.

It is always damaging if you make judgments based on labels given to
something (as you know from current Western politics...) But you
cannot wave away the fact that people talk about things and that
things have an impact on society. Is RSI a real thing? Is 'chronic
fatigue' a disease or not?

What matters more? That we record who calls it a disease (with
provenance) or whether scientists reached consensus?

> Should we allow for things that are diametrical the opposite of each other.

To return to that question of yours: this is currently the situation in
many areas of Wikidata. This is not something Wikidata can always
solve, and certainly not if you stick to the idea that it does not
intend to be an authority, but takes its authority from its data
sources... another example where things "diametrically opposite to each
other" currently occur is chemical structures, where something cannot
be both charged and uncharged and still have one specific chemical
formula... yet, that happens.

But I guess you have a specific thing in mind, which is not included
in the discussion so far... knowing the case at hand may help
me understand the problem and what could be a good solution... very
often this is formalizing the uncertainty... (where the uncertainty
here seems to be human opinion: that of DSM versus some ontology
development team...)
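
One way to picture "formalizing the uncertainty" is to keep every
statement together with its source and to report disagreement instead
of resolving it. The sketch is purely illustrative: the class, the
labels and the sources are all hypothetical placeholders, and this is
not how Wikidata stores statements internally.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    subject: str  # what the claim is about (placeholder label)
    prop: str     # the property being stated
    value: str    # the stated value
    source: str   # who states it (the provenance)


def conflicts(claims):
    """Return {(subject, prop): {value: [sources]}} for every
    subject/property pair that has more than one distinct value."""
    grouped = defaultdict(lambda: defaultdict(list))
    for claim in claims:
        grouped[(claim.subject, claim.prop)][claim.value].append(claim.source)
    return {
        key: dict(values)
        for key, values in grouped.items()
        if len(values) > 1
    }


# Hypothetical claims from two hypothetical sources.
claims = [
    Claim("condition X", "instance of", "disease", "ontology A"),
    Claim("condition X", "instance of", "not a disease", "classification B"),
    Claim("condition Y", "instance of", "disease", "ontology A"),
]
for key, values in conflicts(claims).items():
    print(key, values)
```

Nothing is thrown away here: both views on "condition X" stay
available, each attributed to its source, and a reader or a query can
decide which source to trust.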

Egon


> Thanks,
>      GerardM





Re: [Wikidata] Ontology

2016-05-14 Thread Egon Willighagen
On Sat, May 14, 2016 at 3:39 PM, Gerard Meijssen
 wrote:
> When an external ontology says that something is a disease and the DSM-5
> says it is not. There is a huge problem.

How is DSM-5 not an ontology itself? Why is this a huge problem? Isn't
this just two sources that contradict each other? Moreover, I am even
tempted to say it's not even a formal contradiction; it's just
different definitions of something which is hard to define...

More interesting would be: should Wikidata have separate items for both?

Egon



Re: [Wikidata] Multiple properties/identifiers for the same resource

2016-05-02 Thread Egon Willighagen
Hi Jerven, all

On Fri, Apr 29, 2016 at 3:29 PM, Jerven Tjalling Bolleman
 wrote:
> Could I be so bold as to suggest that in Wikidata we should strive
> to use external URIs for identifiers, not strings.
>
> For example in Wikidata, there are a lot of UniProt accessions.
> e.g. behind the property https://www.wikidata.org/wiki/P352
> and there is a formatter for a URL.
>
> I think this is the wrong way round, there should be an URL/URI there
> and a formatter to generate a local string for display purposes.
>
> And of course for chembl the URL/URI to use would be
>
>
> There are 2 advantages to this. It allows easier federated queries from
> the source databases into Wikidata (no URI conversions etc.).
> The second is that these URIs are clearly not ambiguous.

What would you suggest for identifiers that do not have an official
RDF serialization?
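
For reference, the two directions contrasted in the quoted message
(store a string and format it into a URL, or store a URI and derive
the string for display) can be sketched as follows. Wikidata formatter
URLs use `$1` as the placeholder for the identifier; the formatter
value and the identifier in the example are hypothetical, and the
function names are mine.

```python
PLACEHOLDER = "$1"  # placeholder used in Wikidata formatter URLs


def to_uri(formatter_url, identifier):
    """Expand a bare identifier string into a full URI."""
    if formatter_url.count(PLACEHOLDER) != 1:
        raise ValueError("formatter URL must contain exactly one $1")
    return formatter_url.replace(PLACEHOLDER, identifier)


def to_identifier(formatter_url, uri):
    """Recover the bare identifier from a URI, or None if the URI
    does not match the formatter pattern."""
    if formatter_url.count(PLACEHOLDER) != 1:
        raise ValueError("formatter URL must contain exactly one $1")
    prefix, suffix = formatter_url.split(PLACEHOLDER)
    if len(uri) <= len(prefix) + len(suffix):
        return None
    if not (uri.startswith(prefix) and uri.endswith(suffix)):
        return None
    return uri[len(prefix):len(uri) - len(suffix)]


# Hypothetical formatter URL and identifier, for illustration only.
FORMATTER = "https://example.org/entry/$1"
uri = to_uri(FORMATTER, "ABC123")
print(uri)
print(to_identifier(FORMATTER, uri))
```

Both directions are mechanical as long as one stable URI pattern
exists, which is exactly what is missing for identifiers without an
official RDF serialization.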

Egon


> Regards,
> Jerven
>
>
> On 28/04/16 23:49, Julie McMurry wrote:
>>>
>>> "One should also point out to the authorities maintaining these IDs
>>
>> that they should spend some effort on producing a workable solution for
>> this. It seems they should be the first to provide a resolver service
>> (or maybe it would be an "ID search engine" if it is so complicated).
>>
>> With the qualifiers in place, Wikidata can also be used to achieve this,
>> of course, but it seems we are just manually reverse engineering
>> something that should be done at the site of whoever is controlling the
>> ID registration."
>>
>> Well said, Markus. A most hearty agreement here on my side and one
>> colleagues and I have been trying to raise awareness of for a long time
>> now (http://bit.ly/id-guidance). One of the challenges is that databases
>> are already being asked to do more with less. They can see the utility
>> of such a service to others, but when I've asked DBs before (not naming
>> names), traction has been limp (I've yet to ask Chembl). Sometimes it
>> works out though. For instance, KEGG used to have 12 different
>> type-specific URLs, corresponding to:
>>
>> kegg.compound
>> kegg.disease
>> kegg.drug
>> kegg.environ
>> kegg.genes
>> kegg.genome
>> kegg.glycan
>> kegg.metagenome
>> kegg.module
>> kegg.orthology
>> kegg.pathway
>> kegg.reaction
>>
>> Thankfully, they've collapsed those to a single URL pattern.
>>
>> The databases that find it the toughest are not those who simply don't
>> embed typing, but rather those that don't embed typing AND ALSO have
>> local identifiers that would otherwise collide. For instance, a
>> prominent bio database is in this boat (not naming names) and would like
>> to make things better but it is hard and messy due to the collisions.
>>
>> FYI 345 of the 560+ records in the identifiers.org corpus are
>> type-specific at the level of identifiers.org's namespace; these roll up to
>> ~300 providers.
>>
>> The question though is what Wikidata is trying to accomplish. Say you
>> encounter the ChEMBL ID CHEMBL308052: do you need
>> to retrieve the type of the entity for reasons other than determining
>> what URL to use?
>>
>> How are you representing entity labels / IDs to users?
>>
>> Best,
>> Julie
>>



-- 
E.L. Willighagen
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Homepage: http://egonw.github.com/
LinkedIn: http://se.linkedin.com/in/egonw
Blog: http://chem-bla-ics.blogspot.com/
PubList: http://www.citeulike.org/user/egonw/tag/papers
ORCID: 0000-0001-7542-0286
ImpactStory: https://impactstory.org/EgonWillighagen

___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Wikidata to search chemical compound names

2016-04-25 Thread Egon Willighagen
On Mon, Apr 25, 2016 at 7:23 PM, Sebastian Burgstaller
 wrote:
> A way to achieve this could be to fetch all labels and aliases for all
> chemical compounds in one query and store them locally in your web
> application. This certainly is only feasible if the number of compounds does
> not get too big in Wikidata. Currently, the query takes ~ 6 sec.

But the search time goes down when you have something to search on, it
seems... the following query takes <1.5s:

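PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>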

SELECT DISTINCT ?cmpnd ?label WHERE {
  {?cmpnd wdt:P279 wd:Q11173 .} UNION
  {?cmpnd wdt:P31 wd:Q11173 .}
  ?cmpnd rdfs:label ?label .
  FILTER (strstarts(?label, "a"))
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}

BTW, like Magnus said... if you only want to find things with the
PubChem compound identifier, you could take that route:

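PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>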

SELECT DISTINCT ?cmpnd ?label ?pubchemid WHERE {
  ?cmpnd wdt:P662 ?pubchemid .
  ?cmpnd rdfs:label ?label .
  FILTER (strstarts(?label, "a"))
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}

But I am not sure that is a lot faster...

Also keep in mind that it seems to do a reasonable job at caching
search results...
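
For the alternative Sebastian describes (fetch all labels once and search them locally), a rough Python sketch could look like the following. It is only an illustration under assumptions: the rows are made-up placeholders rather than real query output, the function names are hypothetical, and the matching is case-insensitive, unlike strstarts() in the queries above.

```python
from bisect import bisect_left


def build_index(rows):
    """rows: iterable of (item_id, label); returns a sorted list of (lowercased label, item_id)."""
    return sorted((label.lower(), item_id) for item_id, label in rows)


def prefix_search(index, prefix):
    """Return (item_id, label) pairs whose label starts with the prefix."""
    prefix = prefix.lower()
    # (prefix,) sorts before every (label, item_id) whose label starts with prefix.
    start = bisect_left(index, (prefix,))
    hits = []
    for label, item_id in index[start:]:
        if not label.startswith(prefix):
            break  # sorted order: no later label can match
        hits.append((item_id, label))
    return hits


# Placeholder data standing in for one cached query result.
SAMPLE_ROWS = [
    ("item-1", "acetone"),
    ("item-2", "aspirin"),
    ("item-3", "benzene"),
    ("item-4", "Acetic acid"),
]
SAMPLE_INDEX = build_index(SAMPLE_ROWS)
```

The index is built once per refresh; each lookup is then a binary search plus a scan over the matches, so the cost of the full label query is paid only when the cache is rebuilt.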

Egon



Re: [Wikidata] Status and ETA External ID conversion

2016-03-10 Thread Egon Willighagen
I think the predicate may need to depend on the type of thing we're
linking, and on what is being linked. A strong predicate, like
owl:sameAs, requires a (very) strong similarity between concepts...
this is often not the case... mind you, this applies also to the
things being linked... there are enough alternatives, from rdfs:seeAlso
as probably one of the least informative predicates, via
skos:closeMatch to skos:exactMatch ... for chemicals, I have been
involved in work by the Open PHACTS project (now foundation), led by
Alasdair Gray, on "scientific lenses"... I'm biased, but the
(conference) papers are a good read anyway... [eg Q23034460]

Egon

On Thu, Mar 10, 2016 at 8:08 PM, Young,Jeff (OR)  wrote:
> Then perhaps umbel:isLike instead of owl:sameAs?
>
> http://wiki.opensemanticframework.org/index.php/UMBEL_Vocabulary#isLike_Property
>
> It conveys sameAs but with a hint of uncertainty.
>
>> -Original Message-
>> From: Wikidata [mailto:wikidata-boun...@lists.wikimedia.org] On Behalf Of
>> Stas Malyshev
>> Sent: Thursday, March 10, 2016 1:52 PM
>> To: Discussion list for the Wikidata project. 
>> Subject: Re: [Wikidata] Status and ETA External ID conversion
>>
>> Hi!
>>
>> > Couldn't you use P460 when there is doubt?
>> >
>> > https://www.wikidata.org/wiki/Property:P460
>>
>> P460's type is Item, which means it is relation between two Wikidata items.
>> External ID is relation between Wikidata item and something outside Wikidata.
>>
>> --
>> Stas Malyshev
>> smalys...@wikimedia.org
>>


Re: [Wikidata] Status and ETA External ID conversion

2016-03-10 Thread Egon Willighagen
On Thu, Mar 10, 2016 at 6:12 PM, Tom Morris  wrote:
> On Wed, Mar 9, 2016 at 7:37 PM, Stas Malyshev  wrote:
> From a machine processing point of view, a more interesting statement is 
> probably:
>
> wd:Q1000336 owl:sameAs 

Yes, but this proposal matches part of the discussion... owl:sameAs is
in many cases not appropriate and likely should not be the goal in the
first place: in many cases there is not such a clear 1-to-1 relation,
and even if there is a 1-to-1 relation, the above may still be
inappropriate.

Egon



Re: [Wikidata] Status and ETA External ID conversion

2016-03-07 Thread Egon Willighagen
On Mon, Mar 7, 2016 at 9:13 AM, Lydia Pintscher
 wrote:
> Ok. I think we're making this much more complicated than necessary. The
> question you should ask yourself is: Does this identify a concept in another
> database/website/...? Nice to have: a website to link to.
> Once we have that we can look at corner cases and exceptions.

OK, thanks for the clarification. Then I will oppose arguments about
uniqueness with my opinions, experiences, and arguments, and focus on
this instead.

This helps a lot!

Egon

> Wikimedia Deutschland e.V.
> Tempelhofer Ufer 23-24
> 10963 Berlin
> www.wikimedia.de
>
> Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
>
> Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter
> der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das Finanzamt für
> Körperschaften I Berlin, Steuernummer 27/029/42207.
>


Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Egon Willighagen
Never mind. I found these under "already done".

Egon

On Sat, Mar 5, 2016 at 3:42 PM, Egon Willighagen
 wrote:
> Mmm... I previously added a few chemical identifiers, like KEGG,
> ChEBI, DrugBank, but I cannot find them anymore... :/
>
> Egon
>
> On Sat, Mar 5, 2016 at 3:16 PM, Egon Willighagen
>  wrote:
>> Hi Lydia, all,
>>
>> On Sat, Mar 5, 2016 at 2:54 PM, Markus Krötzsch
>>  wrote:
>>> On 05.03.2016 14:45, Lydia Pintscher wrote:
>>>> Give it another 2 to 3 weeks and it'll get there. More and more editors
>>>> are exposed to the separation in the UI now and start noticing the ones
>>>> that intuitively should be moved into the identifier section.
>>>
>>> Ok, let's see what happens. I am not saying that the other criteria applied
>>> now in the discussions are bad. It's just another use of the datatype than I
>>> would have expected.
>>
>> I'm one of the people who noticed the separation and indeed wondered
>> why some of the chemistry-related identifiers I tagged and added in
>> the long lists of identifiers were not included yet...
>>
>> What is the exact process? Do you just plan to wait longer to see if
>> anyone supports/contradicts my tagging? Should I get other Wikidata
>> users and contributors to back up my suggestion?
>>
>> Originally, I thought the idea was just to remove/leave/add them in/to
>> the list, but people started making comments now. I will do this more
>> explicitly now. Also for the IDs I added.
>>
>> Egon
>>


Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Egon Willighagen
Mmm... I previously added a few chemical identifiers, like KEGG,
ChEBI, DrugBank, but I cannot find them anymore... :/

Egon

On Sat, Mar 5, 2016 at 3:16 PM, Egon Willighagen
 wrote:
> Hi Lydia, all,
>
> On Sat, Mar 5, 2016 at 2:54 PM, Markus Krötzsch
>  wrote:
>> On 05.03.2016 14:45, Lydia Pintscher wrote:
>>> Give it another 2 to 3 weeks and it'll get there. More and more editors
>>> are exposed to the separation in the UI now and start noticing the ones
>>> that intuitively should be moved into the identifier section.
>>
>> Ok, let's see what happens. I am not saying that the other criteria applied
>> now in the discussions are bad. It's just another use of the datatype than I
>> would have expected.
>
> I'm one of the people who noticed the separation and indeed wondered
> why some of the chemistry-related identifiers I tagged and added in
> the long lists of identifiers were not included yet...
>
> What is the exact process? Do you just plan to wait longer to see if
> anyone supports/contradicts my tagging? Should I get other Wikidata
> users and contributors to back up my suggestion?
>
> Originally, I thought the idea was just to remove/leave/add them in/to
> the list, but people started making comments now. I will do this more
> explicitly now. Also for the IDs I added.
>
> Egon
>


Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Egon Willighagen
On Sat, Mar 5, 2016 at 3:25 PM, Lydia Pintscher
 wrote:
> On Sat, Mar 5, 2016 at 3:17 PM Egon Willighagen 
>> What is the exact process? Do you just plan to wait longer to see if
>> anyone supports/contradicts my tagging? Should I get other Wikidata
>> users and contributors to back up my suggestion?
>
> Add them to the list Katie linked if you think they should be converted. We
> wait a bit to see if anyone disagrees and I also do a quick sanity check for
> each property myself before conversion.

I am adding comments for now. I am also looking at the comments for
what it takes to be an "identifier":

https://www.wikidata.org/wiki/User:Addshore/Identifiers#Characteristics_of_external_identifiers

What is the resolution on these? There are some strong, often
contradictory, opinions...

For example, the uniqueness requirement is interesting... if an
identifier must be unique for a single Wikidata entry, this is
effectively disqualifying most identifiers used in the life
sciences... simply because Wikidata rarely has the exact same concept
in Wikidata as it has in the remote database.

I'm sure we can give examples from any life science field, but
consider a gene: the concept of a gene in Wikidata is not like a gene
sequence in a DNA sequence database. Hence, an identifier from that
database could not be linked as "identifier" to that Wikidata entry.

Same for most identifiers for small organic compounds (like drugs,
metabolites, etc). I already commented on CAS (P231) and InChI (P234):
both are used as identifiers, but neither is unique to the concepts used as
"types" in Wikidata. The CAS number for formaldehyde and formalin is
identical. The InChI may be unique, but only if you strongly type the
definition as a chemical graph instead of a substance (as is now)...
etc.
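
The uniqueness point can be checked mechanically. Below is a minimal Python sketch, with hypothetical names and toy data, that groups (entry, identifier value) pairs and reports values attached to more than one entry. The formaldehyde/formalin pair is the case mentioned above; the entry names stand in for Wikidata items and are not real item IDs.

```python
from collections import defaultdict


def find_shared_identifiers(statements):
    """statements: iterable of (entry, identifier_value).

    Returns {identifier_value: sorted entries} for values used by more than one entry.
    """
    entries_by_value = defaultdict(set)
    for entry, value in statements:
        entries_by_value[value].add(entry)
    return {
        value: sorted(entries)
        for value, entries in entries_by_value.items()
        if len(entries) > 1
    }


# Toy data: two distinct concepts sharing one CAS number, plus one unique case.
SAMPLE_CAS = [
    ("formaldehyde", "50-00-0"),
    ("formalin", "50-00-0"),
    ("ethanol", "64-17-5"),
]
```

Under a strict "unique per entry" requirement, every value this function returns would disqualify the property, which is why the requirement matters so much for chemistry.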

So, deciding which chemical identifiers should be
marked as "identifier" type depends on the resolution of those required
characteristics...

Can you please inform me about the state of those characteristics
(accepted or declined)?

Egon

> Cheers
> Lydia
> --
> Lydia Pintscher - http://about.me/lydia.pintscher
> Product Manager for Wikidata
>


Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Egon Willighagen
Hi Lydia, all,

On Sat, Mar 5, 2016 at 2:54 PM, Markus Krötzsch
 wrote:
> On 05.03.2016 14:45, Lydia Pintscher wrote:
>> Give it another 2 to 3 weeks and it'll get there. More and more editors
>> are exposed to the separation in the UI now and start noticing the ones
>> that intuitively should be moved into the identifier section.
>
> Ok, let's see what happens. I am not saying that the other criteria applied
> now in the discussions are bad. It's just another use of the datatype than I
> would have expected.

I'm one of the people who noticed the separation and indeed wondered
why some of the chemistry-related identifiers I tagged and added in
the long lists of identifiers were not included yet...

What is the exact process? Do you just plan to wait longer to see if
anyone supports/contradicts my tagging? Should I get other Wikidata
users and contributors to back up my suggestion?

Originally, I thought the idea was just to remove/leave/add them in/to
the list, but people started making comments now. I will do this more
explicitly now. Also for the IDs I added.

Egon



Re: [Wikidata] WDQS stability

2016-02-05 Thread Egon Willighagen
On Fri, Feb 5, 2016 at 10:57 AM, Markus Kroetzsch
 wrote:
> Thanks for the update. Maybe this is a good opportunity to say that you and
> everybody involved in WDQS are doing a tremendous job in maintaining this
> service. Even with the small glitches in the last week, this is still one of
> the most reliable public SPARQL endpoints that I have seen. This is not a
> small achievement, considering how the load is continuously shifting and
> changing (e.g., if someone announces a tool that queries the hitherto
> neglected "GAS Service" every time that a user clicks on the link!).

Very much agreed!

Egon




Re: [Wikidata] property value max lenght? (aka: why doesn't this InChI fit??)

2016-01-16 Thread Egon Willighagen
On Fri, Jan 15, 2016 at 2:24 PM, Lydia Pintscher
 wrote:
> There is currently a character limit in place for everything (labels,
> description, values, etc). The reason is that we need to make sure
> people don't start entering long text that in the end again isn't
> machine-readable. The property you mention is currently scheduled to
> be converted to the new identifier datatype. What we could consider is
> increasing the length for values allowed in this particular datatype.

For ChEMBL I have used 1024 in the past for InChIs. That should do for now.

Egon



Re: [Wikidata] Help with SPARQL queries

2016-01-10 Thread Egon Willighagen
Daniel,

On Sun, Jan 10, 2016 at 3:53 AM, Daniel Mietchen
 wrote:
> What I want now is a list of (ideally) statements (or - less ideally -
> items with such statements) that have references of one of the types
> given in (1) and (2).

It would look something like:

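PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX pr: <http://www.wikidata.org/prop/reference/>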

SELECT ?statement ?PMID ?PMCID WHERE {
  ?statement prov:wasDerivedFrom/pr:P248 ?paper .
  ?paper wdt:P31 wd:Q13442814 .
  OPTIONAL { ?paper wdt:P698 ?PMID . }
  OPTIONAL { ?paper wdt:P932 ?PMCID . }
}

https://query.wikidata.org/#PREFIX%20wd%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fentity%2F%3E%0APREFIX%20wdt%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fdirect%2F%3E%0APREFIX%20prov%3A%20%3Chttp%3A%2F%2Fwww.w3.org%2Fns%2Fprov%23%3E%0APREFIX%20pr%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Freference%2F%3E%0A%0ASELECT%20%3Fstatement%20%3FPMID%20%3FPMCID%20WHERE%20%7B%0A%20%20%3Fstatement%20prov%3AwasDerivedFrom%2Fpr%3AP248%20%3Fpaper%20.%0A%20%20%3Fpaper%20wdt%3AP31%20wd%3AQ13442814%20.%0A%20%20OPTIONAL%20%7B%20%3Fpaper%20wdt%3AP698%20%3FPMID%20.%20%7D%0A%20%20OPTIONAL%20%7B%20%3Fpaper%20wdt%3AP932%20%3FPMCID%20.%20%7D%0A%7D

Then you can filter for your two options with:

FILTER (BOUND(?PMID))
FILTER (!BOUND(?PMID))
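
If you would rather run the query once and split the rows afterwards, the same BOUND / !BOUND distinction can be applied client-side. The Python sketch below assumes results in the standard SPARQL JSON results format, where a variable left unbound by OPTIONAL is simply absent from that row's binding. The rows and the function name are made-up placeholders, not real query output.

```python
def split_by_bound(bindings, variable):
    """Split result rows into (rows where variable is bound, rows where it is not)."""
    bound = [row for row in bindings if variable in row]
    unbound = [row for row in bindings if variable not in row]
    return bound, unbound


# Placeholder rows shaped like results["results"]["bindings"].
SAMPLE_BINDINGS = [
    {
        "statement": {"type": "uri", "value": "http://example.org/statement/1"},
        "PMID": {"type": "literal", "value": "placeholder-pmid"},
    },
    {
        "statement": {"type": "uri", "value": "http://example.org/statement/2"},
        "PMCID": {"type": "literal", "value": "placeholder-pmcid"},
    },
]

with_pmid, without_pmid = split_by_bound(SAMPLE_BINDINGS, "PMID")
```

Here with_pmid corresponds to FILTER (BOUND(?PMID)) and without_pmid to FILTER (!BOUND(?PMID)).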

Grtz,

Egon



Re: [Wikidata] query.wikidata.org and wikibase:Statement ?

2016-01-08 Thread Egon Willighagen
Dear Stas,

thanks for your reply!

On Sat, Jan 9, 2016 at 7:29 AM, Stas Malyshev  wrote:
>> statements (about 2.5M) and on the question if SPARQL could list all
>> entries in Wikidata that do not have statements. I played a bit with
>
> Technically, it could, but since it's so many of them, they might not
> finish in time. The problem is that since there's no indexes on
> something not existing, what probably happens is that the database would
> go entity by entity trying to find one that doesn't have a statement,
> and that is slow. I think there may be a bug with LIMIT implementation,
> or maybe it's just indeed taking too long...

Yeah, ideally LIMIT would make it stop searching when it found that
many hits... but it indeed may really be trying that.

>> combinations of OPTIONAL and FILTER-BOUND and FILTER NOT EXISTS...
>> something like:
>>
>> PREFIX wikibase: <http://wikiba.se/ontology#>
>> SELECT DISTINCT ?entry ?label ?statement WHERE {
>>   ?entry rdfs:label ?label . FILTER (lang(?label) = "en")
>>   FILTER NOT EXISTS {
>>     ?statement ?prop ?entry ;
>>       wikibase:rank ?rank .
>>   }
>> } LIMIT 5
>
> This query also seems a bit wrong since it looks for ?entry as object,
> not subject.

There exist predicates between the ?entry and the statement in both
directions. I played a bit with both.

>> But there was something else I noted... statements are not typed...
>> that would probably kick in some index, rather than the above query,
>> and the documentation actually speaks about wikibase:Statement [1] but
>> if I search for anything rdf:type-d as such, then it finds nothing in
>> the SPARQL end point:
>
> Right, please check out:
> https://www.mediawiki.org/wiki/Wikibase/Indexing/RDF_Dump_Format#WDQS_data_differences
>
> wikibase:Statement is omitted from the database for performance
> reasons.

Ah, I was guessing something like that; thanks for the confirmation.

> You could still match statements by URL by converting them to
> str() and then using substr() function, but that probably wouldn't help
> much since there's a lot of statements so the filtering would not be
> very selective.

Indeed. Well, maybe I'll give this a try. I'll let you know if I get it working...

Thanks!

Egon



[Wikidata] query.wikidata.org and wikibase:Statement ?

2016-01-08 Thread Egon Willighagen
Hi all,

after a tweet from Magnus [0] we talked a bit about entries without
statements (about 2.5M) and on the question if SPARQL could list all
entries in Wikidata that do not have statements. I played a bit with
combinations of OPTIONAL and FILTER-BOUND and FILTER NOT EXISTS...
something like:

PREFIX wikibase: <http://wikiba.se/ontology#>
SELECT DISTINCT ?entry ?label ?statement WHERE {
  ?entry rdfs:label ?label . FILTER (lang(?label) = "en")
  FILTER NOT EXISTS {
    ?statement ?prop ?entry ;
      wikibase:rank ?rank .
  }
} LIMIT 5

These queries tend to fail. Now, I'm not at all a SPARQL wizard and
maybe there is a simple way to avoid the timeouts that I run into
now.

But there was something else I noted... statements are not typed...
that would probably kick in some index, rather than the above query,
and the documentation actually speaks about wikibase:Statement [1] but
if I search for anything rdf:type-d as such, then it finds nothing in
the SPARQL endpoint:

https://query.wikidata.org/#PREFIX%20wikibase%3A%20%3Chttp%3A%2F%2Fwikiba.se%2Fontology%23%3E%0Aselect%20%3Fstat%20where%20%7B%0A%20%20%3Fstat%20%3Fpred%20wikibase%3AStatement%0A%7D%20limit%205
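
For anyone who wants to share queries the same way: the link above is just the query text percent-encoded after the "#" of the query service page. A small Python sketch using only the standard library follows; the function name is hypothetical, and the query constant is the test query from this message.

```python
from urllib.parse import quote, unquote

WDQS_UI = "https://query.wikidata.org/#"

# The small test query from this message, with its original line breaks.
QUERY = (
    "PREFIX wikibase: <http://wikiba.se/ontology#>\n"
    "select ?stat where {\n"
    "  ?stat ?pred wikibase:Statement\n"
    "} limit 5"
)


def wdqs_link(query: str) -> str:
    """Build a shareable query-service link by percent-encoding the whole query."""
    # safe="" also encodes "/", so IRIs inside the query are escaped.
    return WDQS_UI + quote(query, safe="")


link = wdqs_link(QUERY)
```

Decoding the part after "#" with unquote() gives back the original query text, which is handy when a link shows up in a mail without the query itself.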

Did this typing get lost at some point or is the documentation outdated?

Greetings,

Egon

[0] https://twitter.com/MagnusManske/status/685499058228715520
[1] https://www.mediawiki.org/wiki/Wikibase/Indexing/RDF_Dump_Format#Statement_types



[Wikidata] property value max lenght? (aka: why doesn't this InChI fit??)

2015-12-28 Thread Egon Willighagen
Hi all,

I just discovered that there seems to be a limit on the length of
property values... some properties for compounds, however, are longer,
the InChI being a good example... 400 chars is not enough for some
compounds in Wikipedia, like teixobactin (Q18720369).

This length is not defined by the property definition itself (InChI
(P234)), so I am wondering if this max length is system-wide, or if
there are options to vary it? A max length of 1024 is better, though
it still would not allow InChI values for all compounds...

Looking forward to hearing from you, and a happy new year,

Egon



Re: [Wikidata] Units are live! \o/

2015-09-19 Thread Egon Willighagen
On Tue, Sep 15, 2015 at 10:12 PM, Michael Peel  wrote:

> It seems to assume a default uncertainty on values, though: I just added
> the elevation above sea level to:
> https://www.wikidata.org/wiki/Q1513315
> just specifying the central value, and it assumes that this value is +-
> 0.1 km - which isn't a good assumption to make...
>

Did you mean to write 2.80 km above sea level? Then the error would be 0.01
km. I am guessing the uncertainty follows the scientific notation of
the number... 2.8 has the numeric uncertainty of (about) +/- 0.1...

That sounds like a reasonable approach to me...
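
To spell out the guess above: if the default uncertainty is one unit in the last digit as typed, it can be derived from the decimal exponent of the input. This Python sketch only illustrates that rule; I have not checked that it is what Wikibase actually does, and the function name is hypothetical.

```python
from decimal import Decimal


def implied_uncertainty(text: str) -> Decimal:
    """Return one unit in the last written digit of a decimal number given as text."""
    # Decimal keeps trailing zeros, so "2.80" has exponent -2 while "2.8" has -1.
    exponent = Decimal(text).as_tuple().exponent
    return Decimal(1).scaleb(exponent)
```

So entering "2.8" would give +/- 0.1 and entering "2.80" would give +/- 0.01, which matches what Michael observed for the first case.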

Egon



Re: [Wikidata] Help needed for Freebase to Wikidata migration

2015-06-17 Thread Egon Willighagen
On Wed, Jun 17, 2015 at 4:01 PM, Andy Mabbett  wrote:
> On 16 June 2015 at 19:43, Thomas Pellissier-Tanon  wrote:
>
>> I've just added to the page properties for chemistry:
>> https://www.wikidata.org/wiki/Wikidata:WikiProject_Freebase/Mapping#Chemistry
>
> It seems to me that, before the data can be imported, we'll need to
> add several new Wikidata properties. Was that your intention?

Yeah, I noted that too. I think this should be discussed in the
Wikichemistry project, and we should also look at what Marco sent
around...

Egon



Re: [Wikidata] Help needed for Freebase to Wikidata migration

2015-06-16 Thread Egon Willighagen
Dear Thomas,

I am personally interested in the chemistry bits, and checked the
list, but did not see the "chemistry" domain from Freebase:

https://www.freebase.com/chemistry/chemical_compound?schema=

Is that just a matter of timing, or is it left out because WP/WD
already has a good coverage?

Egon

On Tue, Jun 16, 2015 at 1:24 AM, Thomas Pellissier-Tanon
 wrote:
> Hey everyone,
>
> As you may already know, I am currently working on the importation of
> Freebase content into Wikidata [1] using the primary source tool [2].
>
> One of the big challenges of the migration is to build a good mapping of the
> properties of Freebase to Wikidata ones. There are a few thousand
> properties, so it is a task too big to be done alone. Your help is far more
> than welcome for this task on this page:
> https://www.wikidata.org/wiki/Wikidata:WikiProject_Freebase/Mapping
>
> Cheers,
>
> Thomas
>
> [1] https://www.wikidata.org/wiki/Wikidata:WikiProject_Freebase
> [2] https://www.wikidata.org/wiki/Wikidata:Primary_sources_tool
>