Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Maarten Dammers

Hi Luca,

On 5-3-2016 at 16:45, Luca Martinelli wrote:


> Point taken, I apologise for using too dramatic tones.

Looks like more people are eager to get this over with and can't wait to
get everything converted.

> Nonetheless, I stick to the point that probably a ">99% unique
> identifier" threshold is too high. Just to make another example
> (disclaimer: I asked for this property since it is yet another
> catalogue that my institution runs), P1949 has not been converted to
> identifier because it has "only 98.82% unique out of 507 uses", which
> translates to only *six* items out of 505 that have two P1949
> identifiers.

That's correct. As I said in my previous email: We're first doing the 
easy properties. You can see the easy properties at 
https://www.wikidata.org/wiki/User:ArthurPSmith/Identifiers/1 . The easy 
ones are the ones that have 99%+ single value and 99%+ unique. Compare 
that with https://www.wikidata.org/wiki/User:Addshore/Identifiers/1 and 
you'll notice we still have loads of easy ones we have to process (the 
unchecked list is still quite long).
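
As a rough illustration of the two thresholds, here is a minimal Python sketch. It assumes that "single value" means the share of items carrying exactly one value for the property, and that "unique" means the share of values used on exactly one item; how the report pages actually compute their percentages is not stated in this thread, so both definitions, the function names and the sample data are assumptions.

```python
from collections import Counter


def identifier_stats(statements):
    """Return (single_value_share, unique_share) for one property.

    `statements` is an iterable of (item_id, identifier_value) pairs.
    Both shares are assumed definitions, not the report's own formula.
    """
    pairs = set(statements)  # ignore exact duplicate statements
    if not pairs:
        raise ValueError("no statements to analyse")
    values_per_item = Counter(item for item, _ in pairs)
    items_per_value = Counter(value for _, value in pairs)
    # Share of items that carry exactly one value.
    single_value = sum(n == 1 for n in values_per_item.values()) / len(values_per_item)
    # Share of values that appear on exactly one item.
    unique = sum(n == 1 for n in items_per_value.values()) / len(items_per_value)
    return single_value, unique


def is_easy(statements, threshold=0.99):
    """True when both shares reach the (assumed) 99% threshold."""
    single_value, unique = identifier_stats(statements)
    return single_value >= threshold and unique >= threshold
```

With such a check, a property where a handful of items carry two values can fall just below the threshold even though nearly all of its statements are unproblematic.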


Once we get those out of the way, we'll get to the more difficult ones. 
I prefer quality over speed here. I don't expect any problems with 
converting P1949.


Maarten


___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Gerard Meijssen
Hi,
Let's take things slowly. It is vital that we get Wikipedia well connected
first. There are plenty of challenges there. If we concentrate on what Wikipedia
needs in all its languages, we will get a perspective on what is notable
for us. Other sources have their own criteria.
Thanks,
 GerardM

On 5 March 2016 at 19:56, Andy Mabbett wrote:

> On 5 March 2016 at 16:15, Markus Krötzsch wrote:
>
> > I agree with Egon that the uniqueness requirement is rather weird. What it
> > means is that a thing is only considered an "identifier" if it points to a
> > database that uses a similar granularity for modelling the world as
> > Wikidata. If the external database is more fine-grained than Wikidata
> > (several ids for one item), then it is not a valid "identifier", according
> > to the uniqueness idea.
>
> Then we should create a Wikidata item for each concept on that
> external database.
>
> --
> Andy Mabbett
> @pigsonthewing
> http://pigsonthewing.org.uk


Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Andy Mabbett
On 5 March 2016 at 16:15, Markus Krötzsch wrote:

> I agree with Egon that the uniqueness requirement is rather weird. What it
> means is that a thing is only considered an "identifier" if it points to a
> database that uses a similar granularity for modelling the world as
> Wikidata. If the external database is more fine-grained than Wikidata
> (several ids for one item), then it is not a valid "identifier", according
> to the uniqueness idea.

Then we should create a Wikidata item for each concept on that
external database.

-- 
Andy Mabbett
@pigsonthewing
http://pigsonthewing.org.uk



Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Andy Mabbett
On 5 March 2016 at 14:25, Lydia Pintscher wrote:

> I also do a quick sanity check for each property myself before conversion.

You might also like to do a sanity check on those marked as not
suitable for conversion.

-- 
Andy Mabbett
@pigsonthewing
http://pigsonthewing.org.uk



Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread James Heald

Just do them all, as fast as the bot can go.

Revert them /if/ somebody complains (which is unlikely).

Make this a process of having to opt out for an identifier /not/ to
be converted, rather than having to opt in for it to be converted.



Personally, I am rather more interested in what happens next, after the 
datatype-renaming stage is done.


How does the external-ID datatype then evolve?

How does it cope with an external ID possibly having a short-form
representation, a URL for humans (currently specified by P1630 for the
group as a whole), a URL for RDF (currently specified by P1921 for the
group as a whole), and sometimes also a locally preferred name or a locally
disambiguated name in the external source?
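
To make the "one short-form ID, several URLs" point concrete, here is a minimal Python sketch. P1630 and P1921 values are patterns containing a `$1` placeholder for the ID; the two `example.org` patterns, the function name and the percent-encoding of the ID are assumptions made for this illustration, not what Wikidata itself does.

```python
from urllib.parse import quote


def expand_formatter(pattern, external_id):
    """Substitute an external ID into a formatter pattern containing "$1"."""
    if "$1" not in pattern:
        raise ValueError("formatter pattern lacks the $1 placeholder")
    # Percent-encoding the ID is an assumption of this sketch.
    return pattern.replace("$1", quote(external_id, safe=""))


# Hypothetical patterns standing in for a P1630 and a P1921 value.
human_url = expand_formatter("https://example.org/record/$1", "AB 123")
rdf_uri = expand_formatter("http://example.org/rdf/$1", "AB 123")
```

Both results derive from the same stored string, which is one reason the question of what the datatype itself should hold is not trivial.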


What becomes its wdt: value for SPARQL?

What other object-values will get hung off its detailed statement form?

What will be specified using qualifiers?


Some more clarifications of current forward thinking on this might also 
help with people's concerns about how to respond to departures from 
strict 1-to-1-ness in the mappings (whether many-to-one or one-to-many).



  -- James.






Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Markus Krötzsch

Hi,

I agree with Egon that the uniqueness requirement is rather weird. What 
it means is that a thing is only considered an "identifier" if it points 
to a database that uses a similar granularity for modelling the world as 
Wikidata. If the external database is more fine-grained than Wikidata 
(several ids for one item), then it is not a valid "identifier", 
according to the uniqueness idea. I wonder what good this may do. In 
particular, anybody who cares about uniqueness can easily determine it 
from the data without any property type that says this.
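
A minimal Python sketch of determining this from the data, assuming the statements for one property are available as (item, external id) pairs; the function name and the sample pairs are hypothetical placeholders, not real Wikidata content.

```python
from collections import defaultdict


def mapping_departures(pairs):
    """Find departures from a strict one-to-one mapping.

    Returns (items with several ids, ids shared by several items).
    """
    ids_by_item = defaultdict(set)
    items_by_id = defaultdict(set)
    for item, ext_id in pairs:
        ids_by_item[item].add(ext_id)
        items_by_id[ext_id].add(item)
    # External database is more fine-grained: several ids on one item.
    several_ids = {i: sorted(v) for i, v in ids_by_item.items() if len(v) > 1}
    # External database is more coarse-grained: one id on several items.
    shared_ids = {e: sorted(v) for e, v in items_by_id.items() if len(v) > 1}
    return several_ids, shared_ids


# Placeholder data: item-A has two ids, and X3 is shared by two items.
sample = [
    ("item-A", "X1"),
    ("item-A", "X2"),
    ("item-B", "X3"),
    ("item-C", "X3"),
]
finer, coarser = mapping_departures(sample)
```

Nothing in this check depends on the property's datatype, only on the statements themselves.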


Markus




Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Luca Martinelli
2016-03-05 16:09 GMT+01:00 Maarten Dammers:
> Hi Luca,
>
> On 5-3-2016 at 14:30, Luca Martinelli wrote:
>>
>> Probably the threshold we set up for the conversion is too high, and
>> this might be one of the causes why the whole process has slowed down
>> to a dying pace.
>
> You call
> https://www.wikidata.org/wiki/Special:Contributions/Maintenance_script a
> dying pace?
>
> Instead of complaining here people should participate in
> https://www.wikidata.org/wiki/User:Addshore/Identifiers/0 . Still plenty of
> easy properties that are clearly distinct, unique and have an external url.
> It doesn't make sense to discuss the more complicated cases if we haven't
> gotten the easy cases out of the way yet.

Point taken, I apologise for using too dramatic tones.

Nonetheless, I stick to the point that probably a ">99% unique
identifier" threshold is too high. Just to make another example
(disclaimer: I asked for this property since it is yet another
catalogue that my institution runs), P1949 has not been converted to
identifier because it has "only 98.82% unique out of 507 uses", which
translates to only *six* items out of 505 that have two P1949
identifiers.

Moreover, I did not intervene because of my blatant conflict of interest
AND because I do not know with whom to discuss this and where, or even where
the general "what is an identifier" discussion takes place. Probably there is a
place where this discussion is going on, and I apologise again for not
knowing (though I have some pretty good excuses), and I'm serious when
I say that I'd be thankful if you could point me in the
general direction of where this is happening. :)
(https://www.wikidata.org/wiki/User:Addshore/Identifiers maybe? Though
that discussion seems to be pretty blocked)

L.



Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Maarten Dammers

Hi Luca,

On 5-3-2016 at 14:30, Luca Martinelli wrote:

> Probably the threshold we set up for the conversion is too high, and
> this might be one of the causes why the whole process has slowed down
> to a dying pace.

You call
https://www.wikidata.org/wiki/Special:Contributions/Maintenance_script a
dying pace?

Instead of complaining here people should participate in
https://www.wikidata.org/wiki/User:Addshore/Identifiers/0 . Still plenty
of easy properties that are clearly distinct, unique and have an
external url.
It doesn't make sense to discuss the more complicated cases if we haven't
gotten the easy cases out of the way yet.


Maarten




Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Egon Willighagen
Never mind. I found them in the already-done list.

Egon

On Sat, Mar 5, 2016 at 3:42 PM, Egon Willighagen
 wrote:
> Mmm... I previously added a few chemical identifiers, like KEGG,
> ChEBI, DrugBank, but I cannot find them anymore... :/
>
> Egon
>
> On Sat, Mar 5, 2016 at 3:16 PM, Egon Willighagen
>  wrote:
>> Hi Lydia, all,
>>
>> On Sat, Mar 5, 2016 at 2:54 PM, Markus Krötzsch
>>  wrote:
>>> On 05.03.2016 14:45, Lydia Pintscher wrote:
>>>> Give it another 2 to 3 weeks and it'll get there. More and more editors
>>>> are exposed to the separation in the UI now and start noticing the ones
>>>> that intuitively should be moved into the identifier section.
>>>
>>> Ok, let's see what happens. I am not saying that the other criteria applied
>>> now in the discussions are bad. It's just another use of the datatype than I
>>> would have expected.
>>
>> I'm one of the people who noticed the separation and indeed wondered
>> why some of the chemistry-related identifiers I tagged and added in
>> the long lists of identifiers were not included yet...
>>
>> What is the exact process? Do you just plan to wait longer to see if
>> anyone supports/contradicts my tagging? Should I get other Wikidata
>> users and contributors to back up my suggestion?
>>
>> Originally, I thought the idea was just to remove/leave/add them in/to
>> the list, but people started making comments now. I will do this more
>> explicitly now. Also for the IDs I added.
>>
>> Egon



-- 
E.L. Willighagen
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Homepage: http://egonw.github.com/
LinkedIn: http://se.linkedin.com/in/egonw
Blog: http://chem-bla-ics.blogspot.com/
PubList: http://www.citeulike.org/user/egonw/tag/papers
ORCID: -0001-7542-0286
ImpactStory: https://impactstory.org/EgonWillighagen



Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Egon Willighagen
Mmm... I previously added a few chemical identifiers, like KEGG,
ChEBI, DrugBank, but I cannot find them anymore... :/

Egon

On Sat, Mar 5, 2016 at 3:16 PM, Egon Willighagen
 wrote:
> Hi Lydia, all,
>
> On Sat, Mar 5, 2016 at 2:54 PM, Markus Krötzsch
>  wrote:
>> On 05.03.2016 14:45, Lydia Pintscher wrote:
>>> Give it another 2 to 3 weeks and it'll get there. More and more editors
>>> are exposed to the separation in the UI now and start noticing the ones
>>> that intuitively should be moved into the identifier section.
>>
>> Ok, let's see what happens. I am not saying that the other criteria applied
>> now in the discussions are bad. It's just another use of the datatype than I
>> would have expected.
>
> I'm one of the people who noticed the separation and indeed wondered
> why some of the chemistry-related identifiers I tagged and added in
> the long lists of identifiers were not included yet...
>
> What is the exact process? Do you just plan to wait longer to see if
> anyone supports/contradicts my tagging? Should I get other Wikidata
> users and contributors to back up my suggestion?
>
> Originally, I thought the idea was just to remove/leave/add them in/to
> the list, but people started making comments now. I will do this more
> explicitly now. Also for the IDs I added.
>
> Egon



-- 
E.L. Willighagen
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Homepage: http://egonw.github.com/
LinkedIn: http://se.linkedin.com/in/egonw
Blog: http://chem-bla-ics.blogspot.com/
PubList: http://www.citeulike.org/user/egonw/tag/papers
ORCID: -0001-7542-0286
ImpactStory: https://impactstory.org/EgonWillighagen



Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Egon Willighagen
On Sat, Mar 5, 2016 at 3:25 PM, Lydia Pintscher
 wrote:
> On Sat, Mar 5, 2016 at 3:17 PM Egon Willighagen 
>> What is the exact process? Do you just plan to wait longer to see if
>> anyone supports/contradicts my tagging? Should I get other Wikidata
>> users and contributors to back up my suggestion?
>
> Add them to the list Katie linked if you think they should be converted. We
> wait a bit to see if anyone disagrees and I also do a quick sanity check for
> each property myself before conversion.

I am adding comments for now. I am also looking at the comments for
what it takes to be "identifier":

https://www.wikidata.org/wiki/User:Addshore/Identifiers#Characteristics_of_external_identifiers

What is the resolution on these? There are some strong, often
contradictory, opinions...

For example, the uniqueness requirement is interesting... if an
identifier must be unique for a single Wikidata entry, this is
effectively disqualifying most identifiers used in the life
sciences... simply because Wikidata rarely has the exact same concept
as the remote database.

I'm sure we can give examples from any life science field, but
consider a gene: the concept of a gene in Wikidata is not like a gene
sequence in a DNA sequence database. Hence, an identifier from that
database could not be linked as "identifier" to that Wikidata entry.

Same for most identifiers for small organic compounds (like drugs,
metabolites, etc). I already commented on CAS (P231) and InChI (P234):
both are used as identifiers, but neither is unique to concepts used as
"types" in Wikidata. The CAS for formaldehyde and formalin is
identical. The InChI may be unique, but only if you strongly type the
definition of a chemical graph instead of a substance (as is now)...
etc.

So, a decision on which chemical identifiers should be
marked as "identifier" type depends on the resolution of those required
characteristics...

Can you please inform me about the state of those characteristics
(accepted or declined)?

Egon




-- 
E.L. Willighagen
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Homepage: http://egonw.github.com/
LinkedIn: http://se.linkedin.com/in/egonw
Blog: http://chem-bla-ics.blogspot.com/
PubList: http://www.citeulike.org/user/egonw/tag/papers
ORCID: -0001-7542-0286
ImpactStory: https://impactstory.org/EgonWillighagen



Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Lydia Pintscher
On Sat, Mar 5, 2016 at 3:17 PM Egon Willighagen wrote:

> Hi Lydia, all,
>
> On Sat, Mar 5, 2016 at 2:54 PM, Markus Krötzsch wrote:
> > On 05.03.2016 14:45, Lydia Pintscher wrote:
> >> Give it another 2 to 3 weeks and it'll get there. More and more editors
> >> are exposed to the separation in the UI now and start noticing the ones
> >> that intuitively should be moved into the identifier section.
> >
> > Ok, let's see what happens. I am not saying that the other criteria applied
> > now in the discussions are bad. It's just another use of the datatype than I
> > would have expected.
>
> I'm one of the people who noticed the separation and indeed wondered
> why some of the chemistry-related identifiers I tagged and added in
> the long lists of identifiers were not included yet...
>
> What is the exact process? Do you just plan to wait longer to see if
> anyone supports/contradicts my tagging? Should I get other Wikidata
> users and contributors to back up my suggestion?


Add them to the list Katie linked if you think they should be converted. We
wait a bit to see if anyone disagrees and I also do a quick sanity check
for each property myself before conversion.

Cheers
Lydia
-- 
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata

Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.

Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter
der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das Finanzamt für
Körperschaften I Berlin, Steuernummer 27/029/42207.


Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Egon Willighagen
Hi Lydia, all,

On Sat, Mar 5, 2016 at 2:54 PM, Markus Krötzsch
 wrote:
> On 05.03.2016 14:45, Lydia Pintscher wrote:
>> Give it another 2 to 3 weeks and it'll get there. More and more editors
>> are exposed to the separation in the UI now and start noticing the ones
>> that intuitively should be moved into the identifier section.
>
> Ok, let's see what happens. I am not saying that the other criteria applied
> now in the discussions are bad. It's just another use of the datatype than I
> would have expected.

I'm one of the people who noticed the separation and indeed wondered
why some of the chemistry-related identifiers I tagged and added in
the long lists of identifiers were not included yet...

What is the exact process? Do you just plan to wait longer to see if
anyone supports/contradicts my tagging? Should I get other Wikidata
users and contributors to back up my suggestion?

Originally, I thought the idea was just to remove/leave/add them in/to
the list, but people started making comments now. I will do this more
explicitly now. Also for the IDs I added.

Egon

-- 
E.L. Willighagen
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Homepage: http://egonw.github.com/
LinkedIn: http://se.linkedin.com/in/egonw
Blog: http://chem-bla-ics.blogspot.com/
PubList: http://www.citeulike.org/user/egonw/tag/papers
ORCID: -0001-7542-0286
ImpactStory: https://impactstory.org/EgonWillighagen



Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread David Cuenca Tudela
Markus, you are not the only one; I am also skeptical about the criteria
used. For me the main problem is perhaps the misunderstanding that the
"external identifier" label creates. What I was actually expecting was
something more like "external references": one place to put all the
sources external to Wikidata. But we'll see how it goes.

Cheers,
Micru

On Sat, Mar 5, 2016 at 2:54 PM, Markus Krötzsch <
mar...@semantic-mediawiki.org> wrote:

> On 05.03.2016 14:45, Lydia Pintscher wrote:
>
>> On Sat, Mar 5, 2016 at 1:28 PM Markus Krötzsch
>> <mar...@semantic-mediawiki.org> wrote:
>>
>>> Thanks, Katie. I see that the external ID datatype does not work as
>>> planned. At least I thought the original idea was to clean up the UI by
>>> moving hard-to-understand string IDs to a separate section. From the
>>> discussions on these pages, I see that the community uses criteria that
>>> are completely unrelated to UI aspects, but have something to do with
>>> the degree to which the property encodes a one-to-one mapping. I guess
>>> this is also valid, but won't be useful for UI purposes. I will need to
>>> use another solution for my case then.
>>
>>
>> Give it another 2 to 3 weeks and it'll get there. More and more editors
>> are exposed to the separation in the UI now and start noticing the ones
>> that intuitively should be moved into the identifier section.
>>
>
> Ok, let's see what happens. I am not saying that the other criteria
> applied now in the discussions are bad. It's just another use of the
> datatype than I would have expected.
>
> Markus



-- 
Etiamsi omnes, ego non


Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Markus Krötzsch

On 05.03.2016 14:45, Lydia Pintscher wrote:

> On Sat, Mar 5, 2016 at 1:28 PM Markus Krötzsch
> <mar...@semantic-mediawiki.org> wrote:
>
>> Thanks, Katie. I see that the external ID datatype does not work as
>> planned. At least I thought the original idea was to clean up the UI by
>> moving hard-to-understand string IDs to a separate section. From the
>> discussions on these pages, I see that the community uses criteria that
>> are completely unrelated to UI aspects, but have something to do with
>> the degree to which the property encodes a one-to-one mapping. I guess
>> this is also valid, but won't be useful for UI purposes. I will need to
>> use another solution for my case then.
>
> Give it another 2 to 3 weeks and it'll get there. More and more editors
> are exposed to the separation in the UI now and start noticing the ones
> that intuitively should be moved into the identifier section.


Ok, let's see what happens. I am not saying that the other criteria 
applied now in the discussions are bad. It's just another use of the 
datatype than I would have expected.


Markus





Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Lydia Pintscher
On Sat, Mar 5, 2016 at 1:28 PM Markus Krötzsch <
mar...@semantic-mediawiki.org> wrote:

> Thanks, Katie. I see that the external ID datatype does not work as
> planned. At least I thought the original idea was to clean up the UI by
> moving hard-to-understand string IDs to a separate section. From the
> discussions on these pages, I see that the community uses criteria that
> are completely unrelated to UI aspects, but have something to do with
> the degree to which the property encodes a one-to-one mapping. I guess
> this is also valid, but won't be useful for UI purposes. I will need to
> use another solution for my case then.
>

Give it another 2 to 3 weeks and it'll get there. More and more editors are
exposed to the separation in the UI now and start noticing the ones that
intuitively should be moved into the identifier section.

Cheers
Lydia
-- 
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata

Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de

Wikimedia Deutschland - Society for the Promotion of Free Knowledge e. V.

Entered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 Nz. Recognized as charitable by
the Finanzamt für Körperschaften I Berlin, tax number 27/029/42207.
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Luca Martinelli
2016-03-05 13:26 GMT+01:00 Markus Krötzsch:
> Thanks, Katie. I see that the external ID datatype does not work as planned.
> At least I thought the original idea was to clean up the UI by moving
> hard-to-understand string IDs to a separate section. From the discussions on
> these pages, I see that the community uses criteria that are completely
> unrelated to UI aspects, but have something to do with the degree to which
> the property encodes a one-to-one mapping. I guess this is also valid, but
> won't be useful for UI purposes. I will need to use another solution for my
> case then.

My 2c, sorry if I'm going off-topic.

My impression is that, for some properties, we're probably
underestimating some problems that are beyond our control, such as:
* the possibility that the original catalogue might have some
duplicates, and we can actually help the original catalogue to correct
this issue;
* the possibility that the Wikimedia approach and the catalogue's
approach might lead one side to define something as two different
things, while the other side treats it as a whole (for example,
"palace+gardens");
* the possibility that some identifiers *are* standardised, but the
authority did not publish a single catalogue, leaving the individual
institutes to maintain their own catalogues (for example, the
International Standard Identifier for Libraries and Related
Organizations, aka P791);
* and so on.

The ISIL one in particular is an important example to me, since I work
for the Italian institution that is responsible for conducting the
census of Italian libraries and assigning the ISIL code to each and
every library in Italy. There is no single world catalogue of that
identifier? I really don't see that as a problem, as long as there is
at least one national authority that does the job. We're probably
underestimating the fact that not everything has been standardised at
a world level - and that we can live with that just fine.

The threshold we set up for the conversion is probably too high, and
this might be one of the reasons why the whole process has slowed down
to a crawl.

L.

___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Markus Krötzsch
Thanks, Katie. I see that the external ID datatype does not work as 
planned. At least I thought the original idea was to clean up the UI by 
moving hard-to-understand string IDs to a separate section. From the 
discussions on these pages, I see that the community uses criteria that 
are completely unrelated to UI aspects, but have something to do with 
the degree to which the property encodes a one-to-one mapping. I guess 
this is also valid, but won't be useful for UI purposes. I will need to 
use another solution for my case then.
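
To make the "one-to-one mapping" criterion concrete, here is a minimal
sketch of how such a degree could be measured for one property. The
function name, the input shape (item, identifier pairs) and the two
ratio definitions are assumptions for illustration only; they are not
taken from the reports the community actually uses, and the sample data
is made up.

```python
from collections import defaultdict


def mapping_stats(statements):
    """Return (single_value_ratio, unique_ratio) for (item, identifier) pairs.

    single_value_ratio: share of items that carry exactly one identifier.
    unique_ratio: share of distinct identifiers used on exactly one item.
    """
    ids_by_item = defaultdict(set)
    items_by_id = defaultdict(set)
    for item, identifier in statements:
        ids_by_item[item].add(identifier)
        items_by_id[identifier].add(item)
    if not ids_by_item:
        raise ValueError("no statements given")
    single = sum(1 for ids in ids_by_item.values() if len(ids) == 1)
    unique = sum(1 for items in items_by_id.values() if len(items) == 1)
    return single / len(ids_by_item), unique / len(items_by_id)


# Made-up sample: Q2 has two identifiers, and "B" is shared by Q2 and Q3.
sample = [("Q1", "A"), ("Q2", "B"), ("Q2", "C"), ("Q3", "B"), ("Q4", "D")]
single_ratio, unique_ratio = mapping_stats(sample)
print(f"single-value: {single_ratio:.2%}, unique: {unique_ratio:.2%}")
```

A property would count as a perfect one-to-one mapping under this
sketch only if both ratios are 1.0; any cut-off below that is a policy
choice rather than something the code decides.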


Markus

On 05.03.2016 11:20, Katie Filbert wrote:

On Sat, Mar 5, 2016 at 11:14 AM, Markus Krötzsch
<mar...@semantic-mediawiki.org> wrote:

Hi,

I noticed that many id properties still use the string datatype
(including extremely frequent ids like
https://www.wikidata.org/wiki/Property:P213 and
https://www.wikidata.org/wiki/Property:P227).

Why is the conversion so slow, and when is it supposed to be completed?


The community is checking each property to verify it should be converted:

https://www.wikidata.org/wiki/User:Addshore/Identifiers/0

https://www.wikidata.org/wiki/User:Addshore/Identifiers/1

https://www.wikidata.org/wiki/User:Addshore/Identifiers/2

Once checked, the properties are converted in batches.

I'm sure help is welcome in checking properties.

Cheers,
Katie


Cheers,

Markus



--
Katie Filbert
Wikidata Developer

Wikimedia Germany e.V. | Tempelhofer Ufer 23-24, 10963 Berlin
Phone (030) 219 158 26-0

http://wikimedia.de

Wikimedia Germany - Society for the Promotion of free knowledge eV
Entered in the register of Amtsgericht Berlin-Charlottenburg under the
number 23 855 as recognized as charitable by the Inland Revenue for
corporations I Berlin, tax number 27/681/51985.


___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Katie Filbert
On Sat, Mar 5, 2016 at 11:14 AM, Markus Krötzsch <
mar...@semantic-mediawiki.org> wrote:

> Hi,
>
> I noticed that many id properties still use the string datatype (including
> extremely frequent ids like https://www.wikidata.org/wiki/Property:P213
> and https://www.wikidata.org/wiki/Property:P227).
>
> Why is the conversion so slow, and when is it supposed to be completed?
>

The community is checking each property to verify it should be converted:

https://www.wikidata.org/wiki/User:Addshore/Identifiers/0

https://www.wikidata.org/wiki/User:Addshore/Identifiers/1

https://www.wikidata.org/wiki/User:Addshore/Identifiers/2

Once checked, the properties are converted in batches.

I'm sure help is welcome in checking properties.

Cheers,
Katie



>
> Cheers,
>
> Markus
>



-- 
Katie Filbert
Wikidata Developer

Wikimedia Germany e.V. | Tempelhofer Ufer 23-24, 10963 Berlin
Phone (030) 219 158 26-0

http://wikimedia.de

Wikimedia Germany - Society for the Promotion of free knowledge eV Entered
in the register of Amtsgericht Berlin-Charlottenburg under the number 23
855 as recognized as charitable by the Inland Revenue for corporations I
Berlin, tax number 27/681/51985.
___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


[Wikidata] Status and ETA External ID conversion

2016-03-05 Thread Markus Krötzsch

Hi,

I noticed that many id properties still use the string datatype 
(including extremely frequent ids like 
https://www.wikidata.org/wiki/Property:P213 and 
https://www.wikidata.org/wiki/Property:P227).
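
For anyone who wants to check the datatype of a given property
themselves, a rough sketch follows. It assumes the MediaWiki API module
wbgetentities with props=datatype and assumes the response nests the
value under entities -> <property id> -> datatype. The helper names are
made up, and the sample response is hand-written so the sketch runs
offline; it is not real API output.

```python
import json
from urllib.parse import urlencode

API = "https://www.wikidata.org/w/api.php"


def datatype_url(property_id):
    """Build a wbgetentities request URL that asks only for the datatype."""
    query = urlencode({
        "action": "wbgetentities",
        "ids": property_id,
        "props": "datatype",
        "format": "json",
    })
    return f"{API}?{query}"


def parse_datatype(response_text, property_id):
    """Extract the datatype string from a wbgetentities JSON response."""
    return json.loads(response_text)["entities"][property_id]["datatype"]


# Hand-written stand-in for a response (assumed shape, not real output).
sample_response = (
    '{"entities": {"P213": {"type": "property", '
    '"datatype": "string", "id": "P213"}}}'
)
print(datatype_url("P213"))
print(parse_datatype(sample_response, "P213"))
```

Fetching the printed URL and passing the body to parse_datatype would
show whether a property is still a string or has been converted.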


Why is the conversion so slow, and when is it supposed to be completed?

Cheers,

Markus

___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata