Re: [Wikidata-l] weekly summary #108

2014-05-06 Thread Gerard Meijssen
Hoi,
We have made big progress: a policy is now in place whereby languages
defined in ISO-639-3 can gain eligibility from the WMF language
committee. Eligibility allows labels to be added in Wikidata without
the localisation requirement that the policy imposes for any other
project. At the same time, it is technically possible to have languages
enabled for Wikidata only.

The plan is to ask to enable all eligible languages that have an Incubator
presence for Wikidata first. What needs doing is for someone to make a list
of the languages involved. Obviously, we want to see what impact it has.
Combined with the Reasonator, it has a great potential as it does provide
fall back languages that can be configured.

When new languages are requested, it will be ISO-639-3 only, as per the
policy. Good arguments will need to be provided, because we will not treat
Wikidata as a "stamp collection" of any and all languages.
Consequently, the involvement of native speakers will be an important plus.

If this feels like me "throwing cold water" on the enthusiasm for many more
languages then do understand that Wikidata does not support Wiktionary yet.
When lexical values become possible it is soon enough to revisit things
again.
Thanks,
  GerardM


On 6 May 2014 20:55, P. Blissenbach  wrote:

> Gerard Meijssen  wrote:
>
> > Hoi, Purodha, what you say about Ethnologue is very biased,
> > wrong and often hardly relevant.
>
> I am sorry if my contribution was biased. My main goal was to
> warn that there are more than 7000-odd languages, extending
> ISO 639-3 is time consuming, and that we have the BCP47 defining
> language variants in addition to ISO 639.
>
> > When you know your history,
> > Ethnologue was asked if they would bring in their expertise
> > and system in the ISO processes because the existing ISO-639-2
> > was extremely inadequate. When it was included, it became
> > part of an established process whereby experts from national
> > standard bodies decide on the further development. Effectively
> > the role of Ethnologue is one of administrator, not initiator.
>
> True.
>
> > Saying that all the issues about languages are due to Ethnologue
> > is completely false.
>
> I was not meaning to say that.
>
> > The notion if there are many more languages is
> > very much open to debate. There is no good answer.
>
> Sure, it depends. Also, I do not want to put blame on anyone.
> Naturally, whatever you collect, you start somewhere, it will take time,
> and at some point you have an incomplete, but growing list. That is how
> I see Ethnologue. I keep mailing them data knowing that they are going
> to need their time to verify and process it.
>
> Take into account what we will likely have to use as a definition of
> "language": whether or not labels, lexemes, or similar are spelled,
> pronounced, signalled, or syntactically/grammatically put together
> differently enough to warrant calling them distinct from another
> "language".
> I am well aware that this is a foggy thing and there are many instances
> that can cause controversies.
>
> > When you are interested in looking beyond the ISO-639-3 consider
> > the ISO-639-6. It aims to include any and all language variants
> > and it is not that interested in using the political term what
> > language has become.
>
> I considered mentioning it in my post. I did not, mainly for brevity.
>
> Also, I doubt that it is in a useful state yet. Last fall or late
> summer, it had almost twice as many entries as ISO 639-3, its language
> coverage in my main field was as incomplete as that of ISO 639, and it
> was not published in a readily usable way. (The website had been down for
> a long time. Before that, it could be queried only in a complicated and
> inefficient manner, for individual entries and small sets. No listings
> were available online, and no details beyond language names. Good news:
> the web site is partly online again as of today.)
>
> Yes, I do consider ISO 639-6. I am happy about its clearer and
> simpler approach to the subject matter, and I am looking forward to using
> it as its coverage grows.
>
> Purodha
>
>
___
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l


Re: [Wikidata-l] weekly summary #108

2014-05-06 Thread James Forrester
On 4 May 2014 13:17, Daniel Kinzler  wrote:

> On 04.05.2014 09:00, Lydia Pintscher wrote:
> > On Sun, May 4, 2014 at 1:28 AM, Joe Filceolaire wrote:
> >>
> >> Where are we with fallback languages?
> >
> > The status is that we have a plan for the next steps. I realize it is
> > important but currently not doable in the next say 3 months.
>
> I would like to add some information about why language fallback is not as
> easily done as it may seem. Fallback for *display* is simple enough (as
> reasonator proves) - but we allow editing, which makes this much harder.
>
> Consider the case of a user with their language set to "en-gb", but seeing
> a label in "en" due to fallback. What should happen if they click "edit"?
> Which label will they be editing, the "en" one or the "en-gb" one? They
> should really be able to do both, and the consequences of their edit should
> be obvious to them. When automatic transliteration comes into play, as is
> the case with some Chinese variants, things become more complex still.
>
> This is not impossible to solve (e.g. by showing edit boxes for all the
> relevant variants, with some additional information), but needs careful
> design. This cannot be done overnight.
>

BTW, this is a shared design need in VisualEditor for language variant
support: we should show one item/term because that's what users expect from
read mode, but then, when the user is editing that term, how do we show which
of their changes have been applied automatically to other variants, which
haven't, and why?
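
To illustrate why fallback for display is the easy half of the problem, here
is a minimal Python sketch. The fallback chains and the labels are made-up
examples, not Wikidata's actual configuration, and resolve_label is a
hypothetical helper. It returns the language the label really came from,
which is the piece of information an editing UI would need in order to make
the consequences of an edit obvious.

```python
# Hypothetical fallback chains for illustration only.
FALLBACKS = {
    "en-gb": ["en"],
    "gsw": ["de", "en"],
}


def resolve_label(labels, lang, fallbacks=FALLBACKS):
    """Return (text, source_language) for the first language in the
    fallback chain that has a label, or (None, None) if none has one."""
    for candidate in [lang, *fallbacks.get(lang, [])]:
        if candidate in labels:
            return labels[candidate], candidate
    return None, None


# A user with language "en-gb" sees the "en" label via fallback.
text, source = resolve_label({"en": "Berlin"}, "en-gb")
shown_via_fallback = source != "en-gb"
```

When shown_via_fallback is true, the display side is done, but an edit form
still has to decide whether the user is changing the "en" label or creating
a new "en-gb" one, which is the design question raised above.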

J.
-- 
James D. Forrester
jdforres...@gmail.com
[[Wikipedia:User:Jdforrester|James F.]] (speaking purely in a personal
capacity)


Re: [Wikidata-l] weekly summary #108

2014-05-06 Thread P. Blissenbach
Gerard Meijssen  wrote:

> Hoi, Purodha, what you say about Ethnologue is very biased,
> wrong and often hardly relevant.

I am sorry if my contribution was biased. My main goal was to
warn that there are more than 7000-odd languages, extending
ISO 639-3 is time consuming, and that we have the BCP47 defining
language variants in addition to ISO 639.

> When you know your history,
> Ethnologue was asked if they would bring in their expertise
> and system in the ISO processes because the existing ISO-639-2
> was extremely inadequate. When it was included, it became
> part of an established process whereby experts from national
> standard bodies decide on the further development. Effectively
> the role of Ethnologue is one of administrator, not initiator.

True.

> Saying that all the issues about languages are due to Ethnologue
> is completely false.

I was not meaning to say that.

> The notion if there are many more languages is
> very much open to debate. There is no good answer.

Sure, it depends. Also, I do not want to put blame on anyone.
Naturally, whatever you collect, you start somewhere, it will take time,
and at some point you have an incomplete, but growing list. That is how
I see Ethnologue. I keep mailing them data knowing that they are going
to need their time to verify and process it.

Take into account what we will likely have to use as a definition of
"language": whether or not labels, lexemes, or similar are spelled,
pronounced, signalled, or syntactically/grammatically put together differently
enough to warrant calling them distinct from another "language".
I am well aware that this is a foggy thing and that there are many instances
that can cause controversies.

> When you are interested in looking beyond the ISO-639-3 consider
> the ISO-639-6. It aims to include any and all language variants
> and it is not that interested in using the political term what
> language has become.

I considered mentioning it in my post. I did not, mainly for brevity.

Also, I doubt that it is in a useful state yet. Last fall or late summer,
it had almost twice as many entries as ISO 639-3, its language coverage in my
main field was as incomplete as that of ISO 639, and it was not published in
a readily usable way. (The website had been down for a long time. Before
that, it could be queried only in a complicated and inefficient manner, for
individual entries and small sets. No listings were available online, and no
details beyond language names. Good news: the web site is partly online again
as of today.)

Yes, I do consider ISO 639-6. I am happy about its clearer and
simpler approach to the subject matter, and I am looking forward to using
it as its coverage grows.

Purodha




Re: [Wikidata-l] weekly summary #108

2014-05-06 Thread Scott MacLeod
Great, Purodha, GerardM and Wikidatans,

I've gathered together some "Language Code" standardization sources, all
potentially helpful for unfolding good design, here ...


Language Code

Ethnologue
(Ethnologue now uses ISO 639 codes)
http://www.ethnologue.com/browse/codes

ISO 639
(International Organization for Standardization)
http://www.iso.org/iso/home/standards/language_codes.htm

ISO-639-3
(International Organization for Standardization)
http://www-01.sil.org/iso639-3/codes.asp

ISO-639-6 (International Organization for Standardization)
(This aims to include any and all language variants and it is not that
interested in using the political term what language has become).
http://www.geolang.com/iso639-6/

Language Subtag Lookup
(A nice tool maintained by W3C collaborator Richard Ishida to look up
current IANA-defined language tags and their constituents (subtags)).
http://rishida.net/utils/subtags/

I've also added these initially to some CC wiki WUaS "Language" pages (see
below), where plans for 7,106+ MIT OCW-centric wiki schools will allow for
many more language additions with time.

As one Wikidata focus, probably already explored, it seems to make sense to
engage the ISO 639 codes and standards, since ISO-639-3 and ISO-639-6 seem
to address some of both of your concerns.

Does anyone know how ISO-639-6, for example, allows for, or encodes,
invented languages, "dead" languages, animal/species communication (or even
computer languages as "human languages")?

Cheers,
Scott




On Tue, May 6, 2014 at 5:26 AM, P. Blissenbach  wrote:

> Gerard Meijssen  writes:
>
> > Hoi,
> > There are standards that define British English et al.
> > They are part of the ISO codes. We do not have to invent
> > something like "ISO 639-3eng".
>
> Indeed.
>
> There is a nice tool maintained by W3C collaborator Richard
> Ishida to look up current IANA-defined language tags and their
> constituents (subtags) at:
>
> http://rishida.net/utils/subtags/
>
> Greetings -- Purodha
>



-- 
- http://worlduniversity.wikia.com/wiki/Languages
- http://worlduniversity.wikia.com/wiki/LANGUAGE_TEMPLATE
- http://scottmacleod.com/interlingual/worlduniversityandschool.html
- World University and School - like Wikipedia with MIT OpenCourseWare (not
endorsed by MIT OCW) - incorporated as a nonprofit effective April 2010.


Re: [Wikidata-l] When the source says the information provided is dubious

2014-05-06 Thread Thomas Douillard
One alternative would be
XX author *unknown value* with the disputer as a source.

To express uncertainty we could also use a statement which says the author
is *one of *, and
create the appropriate class, although we do not have all the expressive
power to say that right now. Basic set operations like "set union", "set
complement in another set" or "disjoint with" could be good for that, by the
way (unfortunately, "disjoint with" has not really been well accepted by the
community).


2014-05-06 17:18 GMT+02:00 Joe Filceolaire :

> Having a property with multiple values can mean a number of things:
> * All the values are equally valid e.g. because a work has multiple authors
> * All values are valid but one is preferred - usually the current value
> e.g. when we have population figures back over time or all the kings of
> Denmark.
> * One of the values is shown because it is widely used but is deprecated
> because it is wrong, e.g. Beethoven born on 17 December 1770 (that is his
> date of baptism, so he must have been born a few days earlier).
>
> The case described by Friedrich, where we have two (or more) values which
> are both disputed (because they can't both be right) although one value is
> more widely supported, is harder to represent semantically. I
> would go with adding a 'disputed by' qualifier to BOTH claims and marking
> the more widely accepted value as 'rank:preferred'.
>
> But that is just me
>
> Joe
>


Re: [Wikidata-l] When the source says the information provided is dubious

2014-05-06 Thread Gerard Meijssen
Hoi,
In the Netherlands it used to be that people were baptised as soon as
possible after birth. The notion that "he must have been born a few days
earlier" is not necessarily correct.
Thanks,
 GerardM


On 6 May 2014 17:18, Joe Filceolaire  wrote:

> Having a property with multiple values can mean a number of things:
> * All the values are equally valid e.g. because a work has multiple authors
> * All values are valid but one is preferred - usually the current value
> e.g. when we have population figures back over time or all the kings of
> Denmark.
> * One of the values is shown because it is widely used but is deprecated
> because it is wrong, e.g. Beethoven born on 17 December 1770 (that is his
> date of baptism, so he must have been born a few days earlier).
>
> The case described by Friedrich, where we have two (or more) values which
> are both disputed (because they can't both be right) although one value is
> more widely supported, is harder to represent semantically. I
> would go with adding a 'disputed by' qualifier to BOTH claims and marking
> the more widely accepted value as 'rank:preferred'.
>
> But that is just me
>
> Joe
>


Re: [Wikidata-l] When the source says the information provided is dubious

2014-05-06 Thread Thomas Douillard
@Friedrich: It would be a poor substitute for a negation implementation :)


@Micru:
It does not mix sources and statements; it is additional information
about a statement, so a qualifier seems to be the right tool for that. Of
course, if someone wants to reason with Wikidata data, it requires special
treatment or semantics, but it seems fine, as it would be inline in a text:
"PPP is probably the author of WWW, but the expert EEE disputes that" seems
like a sentence you could see in a Wikipedia article, one which is rather
NPOV and cites its sources properly.



2014-05-06 16:49 GMT+02:00 Friedrich Röhrs :

> Hi,
>
> This sort of thing could also be modeled with another statement and
> opposite properties.
>
> If there is one statement with the claim Chopin -- creator_of --> Nr. 17
> with multiple sources (Kobylańska and others), another statement with the
> claim Chopin -- not_creator_of --> Nr. 17 with a source (Chomiński) can be
> added.
>
> I don't know whether this sort of property is wanted, though.
>
>
> On Tue, May 6, 2014 at 2:06 PM, David Cuenca  wrote:
>
>> On Tue, May 6, 2014 at 1:37 PM, Thomas Douillard <
>> thomas.douill...@gmail.com> wrote:
>>
>>> We could create a new qualifier like ''contradicted by'' or ''disputed
>>> by''. The sources are a problem, though, as we can source only the totality
>>> of a claim, not just a qualifier of this claim, so we would have to put all
>>> the sources for the claim and the sources disputing it into the references,
>>> without any order.
>>>
>>
>> I have mixed feelings about that... it is good because it doesn't require
>> any development, it isn't that good because it mixes claim and source...
>> And having a "reference rank" to indicate whether the source is "supporting",
>> "against" or "unsure" about the claim seems like too much work for the
>> number of times that such a feature would be needed.
>>
>> Thanks,
>> Micru
>>


Re: [Wikidata-l] When the source says the information provided is dubious

2014-05-06 Thread Joe Filceolaire
Having a property with multiple values can mean a number of things:
* All the values are equally valid e.g. because a work has multiple authors
* All values are valid but one is preferred - usually the current value
e.g. when we have population figures back over time or all the kings of
Denmark.
* One of the values is shown because it is widely used but is deprecated
because it is wrong, e.g. Beethoven born on 17 December 1770 (that is his date
of baptism, so he must have been born a few days earlier).

The case described by Friedrich, where we have two (or more) values which
are both disputed (because they can't both be right) although one value is
more widely supported, is harder to represent semantically. I
would go with adding a 'disputed by' qualifier to BOTH claims and marking
the more widely accepted value as 'rank:preferred'.
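
As a rough sketch of this proposal in Python, assuming a much simplified
statement model: the Statement class, its field names and the placeholder
values ("Candidate A", "Source 1", ...) are hypothetical and do not reflect
Wikidata's real data model or API. Both competing claims carry a
'disputed by' qualifier, and the rank decides which one a reader sees first.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    prop: str
    value: str
    rank: str = "normal"  # "preferred", "normal" or "deprecated"
    qualifiers: dict = field(default_factory=dict)


def best_values(statements):
    """Return the preferred values if there are any, otherwise the
    normal-rank ones; deprecated values are never returned."""
    preferred = [s.value for s in statements if s.rank == "preferred"]
    if preferred:
        return preferred
    return [s.value for s in statements if s.rank == "normal"]


# Two mutually exclusive claims, each disputed by the other's source.
claims = [
    Statement("author", "Candidate A", rank="preferred",
              qualifiers={"disputed by": ["Source 2"]}),
    Statement("author", "Candidate B",
              qualifiers={"disputed by": ["Source 1"]}),
]

shown_by_default = best_values(claims)
```

Both values stay in the data with their disputes attached; only the default
display narrows down to the more widely accepted one.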

But that is just me

Joe


Re: [Wikidata-l] When the source says the information provided is dubious

2014-05-06 Thread Friedrich Röhrs
Hi,

This sort of thing could also be modeled with another statement and
opposite properties.

If there is one statement with the claim Chopin -- creator_of --> Nr. 17
with multiple sources (Kobylańska and others), another statement with the
claim Chopin -- not_creator_of --> Nr. 17 with a source (Chomiński) can be
added.

I don't know whether this sort of property is wanted, though.
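
A small Python sketch of how a consumer could pair such opposite properties
up again. The tuple layout, the OPPOSITES table and find_disputes are
assumptions made for illustration; neither property exists in Wikidata.

```python
# Hypothetical mapping from a property to its negated counterpart.
OPPOSITES = {"creator_of": "not_creator_of"}

# (subject, property, object, source)
claims = [
    ("Chopin", "creator_of", "Nr. 17", "Kobylańska"),
    ("Chopin", "not_creator_of", "Nr. 17", "Chomiński"),
]


def find_disputes(claims, opposites=OPPOSITES):
    """Pair each positive claim with the sources of the matching
    negative claim about the same subject and object."""
    disputes = []
    for subj, prop, obj, source in claims:
        negative = opposites.get(prop)
        if negative is None:
            continue
        against = [src for (s2, p2, o2, src) in claims
                   if (s2, p2, o2) == (subj, negative, obj)]
        if against:
            disputes.append(((subj, prop, obj), source, against))
    return disputes


disputed = find_disputes(claims)
```

The sketch also shows the cost of this model: every property needs a negated
twin, and every consumer has to know the pairing.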


On Tue, May 6, 2014 at 2:06 PM, David Cuenca  wrote:

> On Tue, May 6, 2014 at 1:37 PM, Thomas Douillard <
> thomas.douill...@gmail.com> wrote:
>
>> We could create a new qualifier like ''contradicted by'' or ''disputed
>> by''. The sources are a problem, though, as we can source only the totality
>> of a claim, not just a qualifier of this claim, so we would have to put all
>> the sources for the claim and the sources disputing it into the references,
>> without any order.
>>
>
> I have mixed feelings about that... it is good because it doesn't require
> any development, it isn't that good because it mixes claim and source...
> And having a "reference rank" to indicate whether the source is "supporting",
> "against" or "unsure" about the claim seems like too much work for the
> number of times that such a feature would be needed.
>
> Thanks,
> Micru
>


Re: [Wikidata-l] weekly summary #108

2014-05-06 Thread P. Blissenbach
Gerard Meijssen  writes:

> Hoi,
> There are standards that define British English et al.
> They are part of the ISO codes. We do not have to invent
> something like "ISO 639-3eng".

Indeed.

There is a nice tool maintained by W3C collaborator Richard
Ishida to look up current IANA-defined language tags and their
constituents (subtags) at:

http://rishida.net/utils/subtags/

Greetings -- Purodha



Re: [Wikidata-l] weekly summary #108

2014-05-06 Thread Gerard Meijssen
Hoi,
Purodha, what you say about Ethnologue is very biased, wrong and often
hardly relevant. If you know your history, Ethnologue was asked whether they
would bring their expertise and system into the ISO processes because the
existing ISO-639-2 was extremely inadequate. When it was included, it
became part of an established process whereby experts from national
standards bodies decide on the further development. Effectively the role of
Ethnologue is one of administrator, not initiator.

Saying that all the issues about languages are due to Ethnologue is
completely false.

The question of whether there are many more languages is very much open to
debate. There is no good answer. If you are interested in looking beyond
ISO-639-3, consider ISO-639-6. It aims to include any and all language
variants, and it is not that interested in the political term that
"language" has become.
Thanks,
 GerardM


On 6 May 2014 14:02, P. Blissenbach  wrote:

> "Scott MacLeod"  writes:
>
> > Hi Joe, Magnus, Andrew, GerardM, Jane, Daniel and Wikidatans,
> > Since "Language fallback is not a luxury like it is for
> > British English, it is essential for all the smaller languages.
> > It is what prevents it from being editable / usable" (per GerardM),
> > and in terms of Reasonator, statements, and careful design (DanielK),
> > what are current Wikidata processes to plan eventually for all
> > 7,106 living languages (plus even dead and invented languages)
> > in the world per "Ethnologue: Languages of the World, Seventeenth
> > edition" (http://www.ethnologue.com/statistics/size), as people add
> > them, and use, for example, the ISO coding system (or similar) for
> > this, to anticipate
> > not yet added languages, and especially for 'smaller' languages
> > that GerardM mentions?
>
> Just FYI, the ISO 639 and Ethnologue are grossly incomplete in their
> coverage of world languages. One must assume some 10 times to 100 times
> more natural languages are currently in use than listed.
>
> Some individual additions have been made through BCP 47 and IANA, such as
> "en-GB-scouse", representing the Scouse dialect of British English, or
> "sl-rozaj-lipaw", the Lipovaz dialect of Resian, which is itself a
> variant of Slovenian spoken in Italy. In other areas, due differentiation
> is still lacking. For example, in the Swiss Alps, almost every village in
> every valley has its own language variety, and these are often hardly
> mutually comprehensible, but all together they have only one language code,
> "gsw", which also covers a large part of Germany's south west and of
> south-eastern France and their local language varieties. You can easily see
> from a map that there are hundreds of cities, towns, villages and valleys,
> and even if only a tenth of them had a language of their own, "gsw" would
> actually represent more than 1000 distinct languages. Considering both
> spelling AND pronunciation, they deserve to be differentiated.
>
> This is not meant to discourage you, or to say that it is not manageable.
> You only need to be aware that taking care of the few languages currently
> listed in Ethnologue will not suffice, and that coding them must be expected
> to be a bit more complex than it appears at first sight.
>
> > In terms of British English (en-gb) and English (en) distinction,
> > why not just code English in Wikidata as "ISO 639-3eng" per
> > http://www.ethnologue.com/language/eng
> > as part of a careful design for all languages, and then build
> > out for smaller languages? (CC wiki WUaS is planning wiki schools
> > in all 7,106 languages, plus dead and invented languages).
>
> While the current 7106 is way too low, it does include some
> "Macrolanguages" (i.e. language groups) and many extinct and some
> invented languages.
>
> > It seems that using or keying in on the ISO system, or a similar
> > one, would allow for remarkable extensibility and careful design
> > of Wikidata, as well as fallback for other languages such as Hindi,
> > Odia or Malayalam.
>
> Yes indeed, only blindly following a body like SIL (editor of ISO 639-3
> and Ethnologue, btw. a fundamental Christian missionary organization) with
> their rather slow process of adding languages (taking years) might
> limit our capacities and speed. I suggest that we evaluate our own
> needs first, then determine how best to meet them, and then cooperate with
> others.
>
> Purodha
>


Re: [Wikidata-l] weekly summary #108

2014-05-06 Thread Gerard Meijssen
Hoi,
There are standards that define British English et al. They are part of the
ISO codes. We do not have to invent something like "ISO 639-3eng".
Thanks,
 GerardM


On 5 May 2014 20:39, Scott MacLeod wrote:

> Hi Joe, Magnus, Andrew, GerardM, Jane, Daniel and Wikidatans,
>
> Since "Language fallback is not a luxury like it is for British English,
> it is essential for all the smaller languages. It is what prevents it from
> being editable / usable" (per GerardM), and in terms of Reasonator,
> statements, and careful design (DanielK), what are current Wikidata
> processes to plan eventually for all 7,106 living languages (plus even dead
> and invented languages) in the world per "Ethnologue: Languages of the
> World, Seventeenth edition" (http://www.ethnologue.com/statistics/size),
> as people add them, and use, for example, the ISO coding system (or
> similar) for this, to anticipate not yet added languages, and especially
> for 'smaller' languages that GerardM mentions?
>
> In terms of British English (en-gb) and English (en) distinction, why not
> just code English in Wikidata as "ISO 639-3eng" per
> http://www.ethnologue.com/language/eng as part of a careful design for
> all languages, and then build out for smaller languages? (CC wiki WUaS is
> planning wiki schools in all 7,106 languages, plus dead and invented
> languages).
>
> It seems that using or keying in on the ISO system, or a similar one,
> would allow for remarkable extensibility and careful design of Wikidata, as
> well as fallback for other languages such as Hindi, Odia or Malayalam.
> Cheers,
> Scott
>
>
>
>
>
> On Mon, May 5, 2014 at 4:11 AM, Gerard Meijssen wrote:
>
>> Hoi,
>> I am talking about statements.. I am not asking for selecting items that
>> have no label in a language.. This would only work if auto descriptions are
>> in use.
>> Thanks,
>>  GerardM
>>
>>
>> On 5 May 2014 12:52, Daniel Kinzler  wrote:
>>
>>> On 05.05.2014 10:57, Gerard Meijssen wrote:
>>> > Hoi,
>>> > When the "other languages" box needs to become more flexible, it is a
>>> > different problem that has nothing to do with the ability to understand
>>> > what statements are made. At this time it is an absolute inability when
>>> > there is no label in *YOUR* language.
>>>
>>> You are talking about picking an item as a link target when creating a
>>> statement when there is no label for the target item in your exact
>>> variant?
>>>
>>> Yes, we can and should implement fallback for that more swiftly. In
>>> fact, I was under the impression this was already in place... Lydia, do
>>> we have a ticket for that?
>>>
>>> -- daniel
>>>
>>> PS: it's not an *absolute* inability: you can enter the ID directly. But
>>> that's not very nice, I know.
>>>
>>> --
>>> Daniel Kinzler
>>> Senior Software Developer
>>>
>>> Wikimedia Deutschland
>>> Gesellschaft zur Förderung Freien Wissens e.V.
>>>
>
>
> --
> - Scott MacLeod - Founder & President
> - http://scottmacleod.com/interlingual/worlduniversityandschool.html
> - World University and School - like Wikipedia with MIT OpenCourseWare
> (not endorsed by MIT OCW) - incorporated as a nonprofit effective April
> 2010.
>


Re: [Wikidata-l] When the source says the information provided is dubious

2014-05-06 Thread David Cuenca
On Tue, May 6, 2014 at 1:37 PM, Thomas Douillard  wrote:

> We could create a new qualifier like ''contradicted by'' or ''disputed
> by''. The sources are a problem, though, as we can source only the totality
> of a claim, not just a qualifier of this claim, so we would have to put all
> the sources for the claim and the sources disputing it into the references,
> without any order.
>

I have mixed feelings about that... it is good because it doesn't require
any development, it isn't that good because it mixes claim and source...
And having a "reference rank" to indicate whether the source is "supporting",
"against" or "unsure" about the claim seems like too much work for the number
of times that such a feature would be needed.

Thanks,
Micru


Re: [Wikidata-l] weekly summary #108

2014-05-06 Thread P. Blissenbach
"Scott MacLeod"  writes:

> Hi Joe, Magnus, Andrew, GerardM, Jane, Daniel and Wikidatans, 
> Since "Language fallback is not a luxury like it is for
> British English, it is essential for all the smaller languages.
> It is what prevents it from being editable / usable" (per GerardM),
> and in terms of Reasonator, statements, and careful design (DanielK),
> what are current Wikidata processes to plan eventually for all
> 7,106 living languages (plus even dead and invented languages)
> in the world per "Ethnologue: Languages of the World, Seventeenth edition"
> (http://www.ethnologue.com/statistics/size), as people add them, and use,
> for example, the ISO coding system (or similar) for this, to anticipate
> not yet added languages, and especially for 'smaller' languages
> that GerardM mentions?

Just FYI, the ISO 639 and Ethnologue are grossly incomplete in their
coverage of world languages. One must assume some 10 times to 100 times
more natural languages are currently in use than listed.

Some individual additions have been made through BCP 47 and IANA, such as
"en-GB-scouse", representing the Scouse dialect of British English, or
"sl-rozaj-lipaw", the Lipovaz dialect of Resian, which is itself a
variant of Slovenian spoken in Italy. In other areas, due differentiation
is still lacking. For example, in the Swiss Alps, almost every village in
every valley has its own language variety, and these are often hardly
mutually comprehensible, but all together they have only one language code,
"gsw", which also covers a large part of Germany's south west and of
south-eastern France and their local language varieties. You can easily see
from a map that there are hundreds of cities, towns, villages and valleys,
and even if only a tenth of them had a language of their own, "gsw" would
actually represent more than 1000 distinct languages. Considering both
spelling AND pronunciation, they deserve to be differentiated.
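
To make the structure of such tags concrete, here is a simplified Python
sketch that splits a tag into its subtags by their shape. split_tag is a
hypothetical helper, not a full BCP 47 parser: it ignores extended language,
extension and private-use subtags and does not check anything against the
IANA registry.

```python
def split_tag(tag):
    """Split a language tag into language, script, region and variants,
    classifying each subtag by length and character class only."""
    language, *rest = tag.split("-")
    parsed = {"language": language.lower(), "script": None,
              "region": None, "variants": []}
    for sub in rest:
        if len(sub) == 4 and sub.isalpha():
            parsed["script"] = sub.title()       # e.g. "Latn"
        elif (len(sub) == 2 and sub.isalpha()) or (len(sub) == 3 and sub.isdigit()):
            parsed["region"] = sub.upper()       # e.g. "GB"
        elif sub.isalnum() and (5 <= len(sub) <= 8
                                or (len(sub) == 4 and sub[0].isdigit())):
            parsed["variants"].append(sub.lower())
        else:
            raise ValueError(f"unsupported subtag: {sub!r}")
    return parsed


scouse = split_tag("en-GB-scouse")
lipaw = split_tag("sl-rozaj-lipaw")
alemannic = split_tag("gsw")
```

Here scouse has the region "GB" and one variant, lipaw has two variants and
no region, and alemannic is a bare language code with nothing to tell its
many local varieties apart.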

This is not meant to discourage you, or to say that it is not manageable.
You only need to be aware that taking care of the few languages currently
listed in Ethnologue will not suffice, and that coding them must be expected
to be a bit more complex than it appears at first sight.

> In terms of British English (en-gb) and English (en) distinction,
> why not just code English in Wikidata as "ISO 639-3eng" per 
> http://www.ethnologue.com/language/eng
> as part of a careful design for all languages, and then build
> out for smaller languages? (CC wiki WUaS is planning wiki schools
> in all 7,106 languages, plus dead and invented languages).

While the current 7106 is way too low, it does include some "Macrolanguages"
(i.e. language groups) and many extinct and some invented languages.

> It seems that using or keying in on the ISO system, or a similar
> one, would allow for remarkable extensibility and careful design
> of Wikidata, as well as fallback for other languages such as Hindi,
> Odia or Malayalam. 

Yes indeed, only blindly following a body like SIL (editor of ISO 639-3
and Ethnologue, btw. a fundamental Christian missionary organization) with
their rather slow process of adding languages (taking years) might
limit our capacities and speed. I suggest that we evaluate our own
needs first, then determine how best to meet them, and then cooperate with
others.

Purodha



Re: [Wikidata-l] When the source says the information provided is dubious

2014-05-06 Thread Thomas Douillard
We could create a new qualifier like ''contradicted by'' or ''disputed
by''. The sources are a problem, though, as we can source only the totality of
a claim, not just a qualifier of this claim, so we would have to put all
the sources for the claim and the sources disputing it into the references,
without any order.


2014-05-05 18:26 GMT+02:00 P. Blissenbach :

>  "David Cuenca"  writes:
>
> > Jane, this info is in Wikipedia. For instance see:
> > https://en.wikipedia.org/wiki/Waltzes_(Chopin)
>
> > N. 17 was attributed to Chopin (Kobylańska and others),
> > Chomiński says that claim is spurious. And that is just
> > one of many examples.
> > According to Wikidata principles we should collect both
> > statements and let the reader decide which source to believe.
> > I can enter Kobylańska's claim, but I have no way to enter
> > Chomiński's counter-claim.
>
> > I think it is important to be able to model that information
> > because that is how sources act, they don't limit themselves
> > to make "certain" claims, they also make "uncertain" claims
> > or counter other claims (even if they don't offer better ones).
>
> Since attributions in arts, history, composition and many other
> fields are uncertain, doubtful, questioned, or contradicted
> without an alternative at significant rates - on the order of
> 10% if you go back in time a bit - we ought to have
> them.
>
> Contradictions are indeed a new type of statement, because they
> have to refer to the statements they disclaim.
>
> Purodha
>