Re: [Wikidata] Label gaps on Wikidata

2017-02-27 Thread Daniel Kinzler
On 27.02.2017 at 18:18, James Heald wrote:
> From what Daniel is saying, it seems this may not be possible, because the
> template expansion would then depend on the user's preferred language(s),
> which would not be compatible with the template caching.
>
> Is that right?   Or is there a way round this?

We are currently aiming for a compromise: we render the page with the user's
interface language as the target language, and apply fallback accordingly. We do
not take into account secondary user languages, as defined e.g. by the Babel or
Translate extensions.

This means a user with the UI language set to French will see French if
available, but will not see Spanish, even if they somehow declared that they
also speak Spanish.

This way, we split the parser cache once per UI language - a factor of 300, but
not the combinatorial explosion we would get if we split on every possible
permutation of languages (does anyone want to compute 300 factorial?).
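
For a rough sense of scale, the trade-off above can be sketched as a count of
cache variants per page. This is only back-of-the-envelope arithmetic, assuming
a round figure of 300 languages; the function names are hypothetical and are not
how MediaWiki actually builds its parser cache keys.

```python
import math

# Assumption: a round figure of 300 supported languages, as quoted in the thread.
LANGUAGE_COUNT = 300


def variants_per_ui_language(language_count: int) -> int:
    """Cache variants per page when splitting only on the UI language."""
    return language_count


def variants_per_ranked_list(language_count: int, list_length: int) -> int:
    """Cache variants per page when splitting on an ordered list of
    `list_length` distinct preferred languages."""
    return math.perm(language_count, list_length)


if __name__ == "__main__":
    print("UI language only:", variants_per_ui_language(LANGUAGE_COUNT))
    for length in (2, 3):
        print(f"ranked list of {length}:",
              variants_per_ranked_list(LANGUAGE_COUNT, length))
    # A full ranking of all languages gives 300! variants.
    print("digits in 300!:", len(str(math.factorial(LANGUAGE_COUNT))))
```

Even a ranked list of just two languages multiplies the number of variants by
roughly another factor of 300.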


-- 
Daniel Kinzler
Principal Platform Engineer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.

___
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata


Re: [Wikidata] Label gaps on Wikidata

2017-02-27 Thread James Heald
Something I have been wondering is whether it is possible to get a 
template on e.g. Commons for a templated WDQS query to take account of the 
user's language (and also, ideally, preferred fallback languages, as 
perhaps indicated by their {{#babel}} settings).


I had hoped it might be possible to include these preferences as a 
parameter string in the "label service" part of the query text.


From what Daniel is saying, it seems this may not be possible, because 
the template expansion would then depend on the user's preferred 
language(s), which would not be compatible with the template caching.


Is that right?   Or is there a way round this?

 -- James.


On 27/02/2017 16:03, Daniel Kinzler wrote:
> On 27.02.2017 at 17:01, James Hare wrote:
>> One option is to allow users to define their own ranked preferences for
>> language beyond just first place. (I personally would enjoy having French
>> as a fallback to English.)
>
> That would badly fragment the parser cache. I don't think it's viable.
>
>> This has the downside of only really working for people with
>> accounts, which I suspect might be a minority of overall traffic.
>
> Currently, we only support English for anon visitors (yes, this is very sad;
> the reason is, again, caching - Varnish, this time).



Re: [Wikidata] Label gaps on Wikidata

2017-02-27 Thread Thad Guidry
Good fallback languages for English would be any of the Germanic or Romance
languages.
As a native American, I would also agree with this article's listing of
languages that are more easily understood by my brain:

1. Afrikaans
2. Danish
3. French
4. Italian
5. Norwegian
etc.

9 easy languages for English Speakers.
https://matadornetwork.com/abroad/9-easy-languages-for-english-speakers-to-learn/

-Thad


Re: [Wikidata] Label gaps on Wikidata

2017-02-27 Thread Daniel Kinzler
On 27.02.2017 at 17:01, James Hare wrote:
> One option is to allow users to define their own ranked preferences for
> language beyond just first place. (I personally would enjoy having French as
> a fallback to English.)

That would badly fragment the parser cache. I don't think it's viable.

> This has the downside of only really working for people with
> accounts, which I suspect might be a minority of overall traffic.

Currently, we only support English for anon visitors (yes, this is very sad; the
reason is, again, caching - Varnish, this time).

-- 
Daniel Kinzler
Principal Platform Engineer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.



Re: [Wikidata] Label gaps on Wikidata

2017-02-27 Thread James Hare
On February 27, 2017 at 7:54:43 AM, Daniel Kinzler (
daniel.kinz...@wikimedia.de) wrote:

> The fallback mechanism works OK, but is not great for English speaking users
> who see a lot of items that have no English label. For English, we just don't
> know what to fall back to. Just anything? Or try European languages first?
> What should the rule be? If we can decide on a good rule, it should actually
> be pretty simple to add such fallback for English.




One option is to allow users to define their own ranked preferences for
language beyond just first place. (I personally would enjoy having French
as a fallback to English.) This has the downside of only really working for
people with accounts, which I suspect might be a minority of overall
traffic.


Cheers,
James Hare


Re: [Wikidata] Label gaps on Wikidata

2017-02-27 Thread Daniel Kinzler
On 19.02.2017 at 17:00, Romaine Wiki wrote:
> Hi all,
>
> If you look in the recent changes, most items have labels in English and those
> are shown in the recent changes and elsewhere (so we know what the item is
> about without opening first).

Wikidata actually tries to show you the labels in your preferred interface
language. And if your user language is not available, it uses a fallback
mechanism to show the next-best language, which may even include automated
transcriptions. When all else fails, it will show the English label. If that
doesn't exist, it shows the ID.
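
The lookup order described above can be sketched roughly as follows. This is a
minimal illustration only: the fallback chains, the function name and the sample
labels are made up for the example and are not Wikibase's real configuration or
API, and the automated transcription step is left out.

```python
from typing import Dict, List

# Hypothetical fallback chains for illustration; the real chains are defined
# in MediaWiki's language configuration and differ from these.
FALLBACK_CHAINS: Dict[str, List[str]] = {
    "de-at": ["de"],
    "kk-cyrl": ["kk"],
    "fr": [],
}


def resolve_label(labels: Dict[str, str], item_id: str, ui_language: str) -> str:
    """Return the best label for the UI language.

    Order: the UI language itself, then its fallback chain, then English,
    and finally the item ID if no label matched.
    """
    candidates = [ui_language, *FALLBACK_CHAINS.get(ui_language, []), "en"]
    for language in candidates:
        label = labels.get(language)
        if label:
            return label
    return item_id


if __name__ == "__main__":
    # Falls back from de-at to de.
    print(resolve_label({"de": "Wien", "en": "Vienna"}, "Q1741", "de-at"))
    # No French label, so English is used.
    print(resolve_label({"en": "Vienna"}, "Q1741", "fr"))
    # English UI with no English label: only the ID is left.
    print(resolve_label({"zh": "維也納"}, "Q1741", "en"))
```

The last call shows the gap discussed in this thread: with English as the UI
language there is nothing left to fall back to, so the reader sees only the ID.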

> But not all items have labels, and these items without
> English label are often items with only a label in Chinese, Arabic, Cyrillic
> script, Hebrew, etc. This forms a significant gap.

The fallback mechanism works OK, but is not great for English speaking users who
see a lot of items that have no English label. For English, we just don't know
what to fall back to. Just anything? Or try European languages first? What
should the rule be? If we can decide on a good rule, it should actually be
pretty simple to add such fallback for English.

> Is there a way to easily make a transcription from one language to another?

We have such rules for some languages/variants, e.g. between the Cyrillic and
the Roman representations of Kazakh or Uzbek. But transliteration rules can be
complex, and covering every pair of the 300 languages we support would
mean we'd need about 45000 rule sets...
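
As a quick check of that estimate: "about 45000" matches the number of
unordered language pairs, assuming a round 300 languages and one rule set per
pair. If each direction needed its own rules, the count would double. The
function names below are only for illustration.

```python
import math

# Assumption: a round figure of 300 supported languages, as quoted above.
LANGUAGE_COUNT = 300


def unordered_pairs(language_count: int) -> int:
    """One rule set per pair of languages, covering both directions."""
    return math.comb(language_count, 2)


def ordered_pairs(language_count: int) -> int:
    """One rule set per (source, target) direction."""
    return math.perm(language_count, 2)


if __name__ == "__main__":
    print("unordered pairs:", unordered_pairs(LANGUAGE_COUNT))  # 44850
    print("ordered pairs:", ordered_pairs(LANGUAGE_COUNT))      # 89700
```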

> Or alternatively if there is a database that has such transcriptions?

Not yet. One of the goals of Wikidata is to be that database.

-- 
Daniel Kinzler
Principal Platform Engineer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.



[Wikidata] Weekly Summary #249

2017-02-27 Thread Léa Lacroix
*Here's your quick overview of what has been happening around Wikidata over
the last week.*

Discussions

   - New request for comments: Proposal to include command line arguments
   to Command line tool
   - We need your input on SPARQL federation

Events/Press/Blogs


   - Describing Wikidata items with OpenStreetMap tags
   - Deadline for applying to Wikicite 2017 is February 27th
   - Quora blog post about their collaboration with Wikidata. They are now
   displaying links to Wikidata items in their topic management pages (about
   88K of them, so far).
   - Community Digest: Using data to visualize Wikipedia knowledge gaps;
   news in brief

Other Noteworthy Stuff

   - There is now a Community for Wikidata editors on Facebook
   - You can now make your Harvest Templates tasks run automatically. After
   you have generated a permalink to your task, add &run= to the URL. When you
   open it next time, it will load and then run automatically. Alternatively,
   you can use &load= which will only prepare the task for running.
   - Wikidata description editing in the Wikipedia Android app for Hebrew,
   Russian and Catalan


Did you know?

   - Newest properties: GPnotebook ID, regulated by, NCMEC person ID,
   MEROPS enzyme ID, social classification, NISH Hall of Fame ID,
   Recreation.gov facility ID, category for value not in Wikidata,
   objective of a project or mission, Vanderkrogt.net Statues ID,
   Jewish Encyclopedia Daat ID, category for value different from Wikidata,
   PhDTree person ID, Gridabase glacier ID, RITVA Person ID,
   RITVA Program ID, KMDb film ID, JMDb person ID,
   Catalogue of Illuminated Manuscripts ID, incarnation of, NHF player ID,
   Transfermarkt referee ID, Tennis Australia player ID, SRCFB player ID,
   SRCBB player ID, SpeedSkatingStats speed skater ID,
   SpeedSkatingNews speed skater ID, ShorttrackOnLine speed skater ID,
   NCAA sports team ID, ISHOF swimmer ID, IFSC climber ID,
   ICF slalom canoer ID, ICF sprint canoer ID, ESPN NHL player ID,
   ESPN NFL player ID, ESPN NBA player ID, DriverDB driver ID,
   LFP.fr player ID, AOC athlete ID, ESPN FC player ID
   , statement supported by