[Wikidata-bugs] [Maniphest] [Commented On] T212843: [EPIC] Access to Wikidata's lexicographical data from Wiktionaries and other WMF sites

2019-10-10 Thread reosarevok
reosarevok added a comment.


  I would like to be able to access, from Wiktionary, all forms matching a 
particular set of grammatical features, so that a template can be given a 
lexeme ID and return a table with all the forms, laid out as Wiktionary does. 
For a very basic example, see the table on 
https://no.wiktionary.org/wiki/tirsdag#Grammatikk
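
  To make the request concrete, here is a minimal sketch in Python (not Lua) 
of the selection step. It assumes the lexeme is available as a dictionary 
shaped like the Wikibase lexeme JSON (`forms`, each with `representations` and 
`grammaticalFeatures`). The sample lexeme, the ID `L0`, and the `Q_...` 
placeholder feature IDs are made up for illustration; only Q110786 
('singular') is taken from this task. The function names are hypothetical.

```python
from typing import Iterable, List

# Illustrative data only: shaped like Wikibase lexeme JSON, but the lexeme ID
# and the Q_* feature IDs are placeholders, not real Wikidata identifiers.
SAMPLE_LEXEME = {
    "id": "L0",
    "forms": [
        {"id": "L0-F1",
         "representations": {"nb": {"language": "nb", "value": "tirsdag"}},
         "grammaticalFeatures": ["Q110786", "Q_INDEFINITE"]},
        {"id": "L0-F2",
         "representations": {"nb": {"language": "nb", "value": "tirsdagen"}},
         "grammaticalFeatures": ["Q110786", "Q_DEFINITE"]},
        {"id": "L0-F3",
         "representations": {"nb": {"language": "nb", "value": "tirsdager"}},
         "grammaticalFeatures": ["Q_PLURAL", "Q_INDEFINITE"]},
        {"id": "L0-F4",
         "representations": {"nb": {"language": "nb", "value": "tirsdagene"}},
         "grammaticalFeatures": ["Q_PLURAL", "Q_DEFINITE"]},
    ],
}


def forms_matching(lexeme: dict, required: Iterable[str]) -> List[dict]:
    """Return the forms that carry every grammatical feature in `required`."""
    wanted = set(required)
    return [
        form for form in lexeme.get("forms", [])
        if wanted <= set(form.get("grammaticalFeatures", []))
    ]


def representations(form: dict) -> List[str]:
    """Return the spellings of a form, one per language variant."""
    return [rep["value"] for rep in form.get("representations", {}).values()]


if __name__ == "__main__":
    for form in forms_matching(SAMPLE_LEXEME, ["Q110786"]):
        print(form["id"], representations(form))
```

  A template would call something like `forms_matching` once per table cell, 
passing the feature set that the cell stands for.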

TASK DETAIL
  https://phabricator.wikimedia.org/T212843

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: reosarevok
Cc: reosarevok, TomT0m, Yurik, Vesihiisi, ArthurPSmith, Iniquity, Tobias1984, 
Theklan, Fnielsen, RexxS, Pamputt, Mike_Peel, MarcoSwart, Geertivp, 
Liuxinyu970226, Addshore, Jdforrester-WMF, deryckchan, Lydia_Pintscher, 
Lea_Lacroix_WMDE, darthmon_wmde, DannyS712, Nandana, Mringgaard, Lahi, Gq86, 
Cinemantique, GoranSMilovanovic, QZanden, LawExplorer, _jensen, rosalieper, 
jberkel, Psychoslave, Wikidata-bugs, aude, GPHemsley, Shizhao, Nemo_bis, 
Darkdadaah, Mbch331, Ltrlg, Krenair
___
Wikidata-bugs mailing list
Wikidata-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs


[Wikidata-bugs] [Maniphest] [Commented On] T212843: [EPIC] Access to Wikidata's lexicographical data from Wiktionaries and other WMF sites

2019-09-27 Thread Fnielsen
Fnielsen added a comment.


  The query is a bit hard on WDQS. If one executes it twice, the second run 
can apparently use some caching from the first.
  
  The query counts, e.g., 7 Danish lexemes for 'led', out of what Ordia shows 
to be 9 different forms. In Wiktionary, I suppose we would like to have all 9 
forms shown, either fully or just as a redirect. The 9 forms can be fetched 
from 7 different Wikidata lexeme pages. That is just for one language. I have a 
problem formulating a language-agnostic query that doesn't time out in WDQS. 
The Ordia page https://tools.wmflabs.org/ordia/representation/led shows that 
there are two more lexemes we should get to make the full Wiktionary page for 
'led': one Czech lexeme (with two forms) and one English lexeme.



[Wikidata-bugs] [Maniphest] [Commented On] T212843: [EPIC] Access to Wikidata's lexicographical data from Wiktionaries and other WMF sites

2019-09-27 Thread Yurik
Yurik added a comment.


  @Fnielsen I am not sure I understand what that query does, could you 
elaborate?  In particular, I am confused about why you look at the forms -- 
from the perspective of Wiktionary, you request a single Lexeme, not individual 
forms. (BTW, the query times out for me.)
  
  Also, I just realized that I shouldn't have grouped by the language, because 
in Wiktionary each page is per Lemma, regardless of which language contains it. 
So if Wiktionary wants to show data about all lexemes spelled a certain way, 
the query becomes https://w.wiki/8yY (the results are nearly identical -- the 
vast majority of words are still unique, at least with what we currently have 
in WD):
  
  | lexemes_per_word | words  |
  | 1                | 173657 |
  | 2                | 4670   |
  | 3                | 351    |
  | 4                | 66     |
  | 5                | 15     |
  | 6                | 8      |
  | 7                | 3      |
  | 8                | 1      |
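
  For anyone who cannot run the query, the aggregation behind this kind of 
table can be sketched offline in Python. This is not the SPARQL behind the 
short link; it assumes you already have (lemma, lexeme ID) pairs, e.g. from a 
dump, and the sample pairs and lexeme IDs below are invented for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Set, Tuple


def lexemes_per_word_histogram(pairs: Iterable[Tuple[str, str]]) -> Dict[int, int]:
    """Map 'number of lexemes sharing a spelling' to 'number of such spellings'.

    `pairs` yields (lemma, lexeme_id); the language is deliberately ignored,
    because a Wiktionary page is per spelling, not per language.
    """
    lexemes_by_lemma: Dict[str, Set[str]] = {}
    for lemma, lexeme_id in pairs:
        lexemes_by_lemma.setdefault(lemma, set()).add(lexeme_id)
    counts = Counter(len(ids) for ids in lexemes_by_lemma.values())
    return dict(sorted(counts.items()))


# Invented sample data: three lexemes spelled 'led', two spelled 'bager',
# and two spellings that occur only once.
SAMPLE_PAIRS = [
    ("led", "L1"), ("led", "L2"), ("led", "L3"),
    ("bager", "L4"), ("bager", "L5"),
    ("tirsdag", "L6"),
    ("lede", "L7"),
]

if __name__ == "__main__":
    for lexemes_per_word, words in lexemes_per_word_histogram(SAMPLE_PAIRS).items():
        print(f"| {lexemes_per_word} | {words} |")
```

  Grouping by (language, lemma) instead of lemma alone would give the 
per-language variant of the same histogram.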



[Wikidata-bugs] [Maniphest] [Commented On] T212843: [EPIC] Access to Wikidata's lexicographical data from Wiktionaries and other WMF sites

2019-09-27 Thread Fnielsen
Fnielsen added a comment.


  Here is a variation on @Yurik's query that counts within-language forms: 
https://w.wiki/8y8 (the current count is obscured by Tamil annotations). For 
instance, 'led' in Ordia shows 9 lexemes from 3 languages: 
https://tools.wmflabs.org/ordia/representation/led
  
  | lexemes_per_representation | number_of_representations | example_representation |
  | 55                         | 1                         | பெயர்ச்சொல்            |
  | 47                         | 1                         | ஒருமை                  |
  | 44                         | 1                         | noun                   |
  | 33                         | 1                         | singular               |
  | 8                          | 10                        | сибирка                |
  | 7                          | 26                        | led                    |
  | 6                          | 61                        | かえる                 |
  | 5                          | 191                       | かえ                   |
  | 4                          | 678                       | lede                   |
  | 3                          | 2885                      | engagerat              |
  | 2                          | 30482                     | bager                  |
  | 1                          | 1572338                   | كتبت                   |



[Wikidata-bugs] [Maniphest] [Commented On] T212843: [EPIC] Access to Wikidata's lexicographical data from Wiktionaries and other WMF sites

2019-09-27 Thread Yurik
Yurik added a comment.


  P.S. @Fnielsen does bring up a valid point about various linked lexemes, and 
that might be useful -- for example, if a lexeme lists another lexeme as being 
a synonym, it would be good to show it as a word rather than an L-number.
  
  That said, I do not believe we need it just yet -- it will take a while for 
the synonyms to be populated to the level of Wiktionary, so for now lexemes 
will be needed just for the "infoboxes" -- e.g. to list all forms and basic 
info, not for the advanced features.
  
  At this point, I can easily replace the `{{noun ru|...}}` template (which 
generates a morphology summary and a forms table), but I won't be able to 
easily replace the synonyms section with auto-generated content, and thus 
linked lexemes are somewhat useless until they have much better coverage.



[Wikidata-bugs] [Maniphest] [Commented On] T212843: [EPIC] Access to Wikidata's lexicographical data from Wiktionaries and other WMF sites

2019-09-27 Thread Yurik
Yurik added a comment.


  @Lydia_Pintscher most of the Wiktionary pages have just one corresponding 
lexeme - and that's all I would expect to load.
  
  Some statistics:  https://w.wiki/8xw  (note that this is per language, not 
just when lemmas match)
  
  | lexemes_per_word | words  |
  | 1                | 173680 |
  | 2                | 4659   |
  | 3                | 351    |
  | 4                | 65     |
  | 5                | 15     |
  | 6                | 8      |
  | 7                | 3      |
  | 8                | 1      |
  
  The tricky bit comes when a page has multiple associated lexemes -- yes, in 
theory there could be up to 8 (per the query result), but I think it is a 
mistake to store so many lexemes per word -- most of them have identical forms, 
pronunciation, and top-level claims. They differ only in their meaning, and as 
such, we should put that meaning inside the senses.



[Wikidata-bugs] [Maniphest] [Commented On] T212843: [EPIC] Access to Wikidata's lexicographical data from Wiktionaries and other WMF sites

2019-09-27 Thread Fnielsen
Fnielsen added a comment.


  I count over 30 basic lexemes on https://en.wiktionary.org/wiki/for while 
there may be more when we start to count inflections and derived terms ...



[Wikidata-bugs] [Maniphest] [Commented On] T212843: [EPIC] Access to Wikidata's lexicographical data from Wiktionaries and other WMF sites

2019-09-27 Thread Lydia_Pintscher
Lydia_Pintscher added a comment.


  Thank you for all your input so far. That's really helpful.
  I have one more question: How many Lexemes would you expect to load on a 
single Wiktionary page on average? How many Lexemes would you need to load for 
it to be useful for you?



[Wikidata-bugs] [Maniphest] [Commented On] T212843: [EPIC] Access to Wikidata's lexicographical data from Wiktionaries and other WMF sites

2019-09-23 Thread Yurik
Yurik added a comment.


  @RexxS you do bring up a valid point about watchlists. The minor difference 
here is that a lexeme is tied to a specific language, so it is less likely to 
have content not relevant to that one language / Wiktionary.  The only 
exception might be the descriptions of senses in other languages. TBH, I am not 
sure that adding sense descriptions in a non-native language is a scalable 
solution -- we are repeating the issue of sitelinks, where every wiki page 
referenced all other wiki pages on the same subject. But this is a separate 
discussion, unrelated to this ticket.
  
  Performance-wise, there is not much difference -- lexemes are not attached 
(yet) to Wiktionary pages the way Wikidata items are attached through their 
sitelinks, so every lexeme retrieval will be "expensive". On the other hand, 
getting just a handful (at most) of lexemes per Wiktionary page should not 
affect performance in a significant way.  And since most of the content will be 
relevant to the page generation, having multiple calls might actually be slower 
than rendering a large chunk of the page in a single template with a module, 
where that module would get the whole lexeme content.
  
  Lastly, we could always optimize the process, but remember that a simple 
interface to get the entire lexeme is far quicker to implement than a very 
complex system -- in the end the complex system might be better, but in the 
meantime you won't have it for several years (?), and you may need to allocate 
resources to this project at the expense of another project.



[Wikidata-bugs] [Maniphest] [Commented On] T212843: [EPIC] Access to Wikidata's lexicographical data from Wiktionaries and other WMF sites

2019-09-23 Thread RexxS
RexxS added a comment.


  We started using Scribunto to read Wikidata items in exactly that way - just 
loading the entire entity as an object and working from that. There are two 
downsides that became apparent:
  
  First, the resources consumed made this an "expensive" call unless it was 
done from the page that was already linked to the Wikidata item.
  
  Second, because all of the Wikidata object was loaded, including 
descriptions, aliases, labels, etc. in every language, any change to any of 
those in any language threw up an entry in the watchlist for anybody watching 
the Wikipedia article where that item was loaded. That swamped watchlists with 
irrelevant entries and caused many Wikipedia editors to turn off monitoring of 
Wikidata changes.
  
  Unless anyone can think of a good reason not to, having calls that return 
single items from a large entity is far more efficient and can make 
watchlisting feasible. We should definitely be planning to achieve that 
functionality, even if we begin by loading the entire entity in order to get 
started.



[Wikidata-bugs] [Maniphest] [Commented On] T212843: [EPIC] Access to Wikidata's lexicographical data from Wiktionaries and other WMF sites

2019-09-22 Thread TomT0m
TomT0m added a comment.


  I implemented something like that in 
https://www.wikidata.org/wiki/Module:Iterators without the coroutine module. 
There is also the Luafun library https://github.com/luafun/luafun, which does 
functional-style operations and is fully usable as a MediaWiki module: 
https://www.wikidata.org/wiki/Module:Luafun, again without the coroutine 
module.
  
  My own code is not perfect, however, as the iterators are stateful.



[Wikidata-bugs] [Maniphest] [Commented On] T212843: [EPIC] Access to Wikidata's lexicographical data from Wiktionaries and other WMF sites

2019-09-13 Thread Yurik
Yurik added a comment.


  P.S. To sum up -- Wiktionary needs just a single Lua function for the 
minimum viable product:   `getEntity('L10')`  that simply returns the whole 
Lexeme JSON.  Everything else is optional.
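
  To show what "the whole Lexeme JSON" looks like from outside the wiki, here 
is a rough Python sketch that fetches it through Special:EntityData. This is 
not the requested Lua function, only an off-wiki stand-in; the function names 
and the User-Agent string are made up for the example, and it assumes the 
response keeps its usual `{"entities": {"L10": {...}}}` wrapper.

```python
import json
from urllib.request import Request, urlopen


def entity_data_url(entity_id: str) -> str:
    """Build the Special:EntityData URL for an entity such as 'L10'."""
    return f"https://www.wikidata.org/wiki/Special:EntityData/{entity_id}.json"


def extract_entity(payload: str, entity_id: str) -> dict:
    """Unwrap one entity from a Special:EntityData JSON response body."""
    return json.loads(payload)["entities"][entity_id]


def get_entity(entity_id: str) -> dict:
    """Fetch the whole entity over HTTP (needs network access)."""
    request = Request(
        entity_data_url(entity_id),
        # Placeholder User-Agent for this sketch; use your own contact details.
        headers={"User-Agent": "lexeme-sketch/0.1 (example only)"},
    )
    with urlopen(request) as response:
        return extract_entity(response.read().decode("utf-8"), entity_id)


if __name__ == "__main__":
    lexeme = get_entity("L10")
    print(sorted(lexeme.keys()))
```

  Everything a template needs (lemmas, forms, senses, claims) is then 
available in one object, which is the whole point of the single-function 
proposal.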



[Wikidata-bugs] [Maniphest] [Commented On] T212843: [EPIC] Access to Wikidata's lexicographical data from Wiktionaries and other WMF sites

2019-09-12 Thread Yurik
Yurik added a comment.


  I have imported some Russian nouns (~20,000 so far, with more coming soon), 
and added a link from Wiktionary to the corresponding Lexeme.  I think the 
simplest use case for Lexemes would be to let a Wiktionary Lua script load a 
Lexeme by its ID.  This will instantly make Lexemes useful to Wiktionary 
because the Lua script will be able to:
  
  - generate a table of the word forms
  - generate etymology and pronunciation sections
  - do the above for every lexeme if more than one is used on the page.
  
  Note that the last point makes it substantially different from the regular 
Wikipedia usage because it is likely that more than one Lexeme corresponds to a 
single Wiktionary page.  Also, while nice to have, it is not really required 
for Wiktionary to be able to read Wikidata Q items because those could be 
hardcoded in Lua (the list of used Q-IDs is not too big - under a thousand)
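
  As a rough illustration of the first and last bullet, and of the hardcoded 
Q-ID idea, here is a Python sketch (the real thing would be a Lua module). It 
assumes each lexeme is already loaded as a dictionary shaped like the Wikibase 
lexeme JSON. The label map, the `Q_PLURAL` placeholder ID, the sample lexeme, 
and the function names are all invented for the example; only Q110786 
('singular') comes from this task.

```python
from typing import Iterable

# Hardcoded labels instead of reading Q-items; Q_PLURAL is a placeholder ID.
FEATURE_LABELS = {
    "Q110786": "singular",
    "Q_PLURAL": "plural",
}

# Invented sample lexeme with two forms.
SAMPLE_LEXEME = {
    "id": "L0",
    "forms": [
        {"id": "L0-F1",
         "representations": {"ru": {"language": "ru", "value": "стол"}},
         "grammaticalFeatures": ["Q110786"]},
        {"id": "L0-F2",
         "representations": {"ru": {"language": "ru", "value": "столы"}},
         "grammaticalFeatures": ["Q_PLURAL"]},
    ],
}


def forms_table(lexeme: dict, lang: str) -> str:
    """Render the forms of one lexeme as a wikitext table."""
    lines = ['{| class="wikitable"', "! Form !! Grammatical features"]
    for form in lexeme.get("forms", []):
        spelling = form.get("representations", {}).get(lang, {}).get("value", "")
        # Unknown feature IDs fall back to the raw ID rather than failing.
        features = ", ".join(
            FEATURE_LABELS.get(qid, qid)
            for qid in form.get("grammaticalFeatures", [])
        )
        lines.append("|-")
        lines.append(f"| {spelling} || {features}")
    lines.append("|}")
    return "\n".join(lines)


def page_tables(lexemes: Iterable[dict], lang: str) -> str:
    """Render one table per lexeme, for pages that use several lexemes."""
    return "\n\n".join(forms_table(lexeme, lang) for lexeme in lexemes)


if __name__ == "__main__":
    print(page_tables([SAMPLE_LEXEME], "ru"))
```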



[Wikidata-bugs] [Maniphest] [Commented On] T212843: [EPIC] Access to Wikidata's lexicographical data from Wiktionaries and other WMF sites

2019-09-12 Thread Iniquity
Iniquity added a comment.


  In T212843#5488427, @ArthurPSmith wrote:
  
  > One UI suggestion would be: when searching for a word in a wiktionary, if 
it is NOT found, any matching Wikidata forms from that or any other language 
could be shown, so this provides an immediate supplement to small Wiktionaries, 
and there may even be a few words missing from enwikt that could be found in 
Wikidata.
  
  Perhaps #articleplaceholder will be interested in this? @Lydia_Pintscher 
what do you think about this idea?



[Wikidata-bugs] [Maniphest] [Commented On] T212843: [EPIC] Access to Wikidata's lexicographical data from Wiktionaries and other WMF sites

2019-09-12 Thread ArthurPSmith
ArthurPSmith added a comment.


  The Basque collection is even more complete now!
  I do think some customization may be needed for Lexemes due to the different 
structure - the forms and senses etc. Perhaps the most useful link for a 
wiktionary may be from words to senses to wikidata items via the "item for this 
sense" property. That in principle allows translations to be provided, grouped 
by sense.
  
  One UI suggestion would be: when searching for a word in a wiktionary, if it 
is NOT found, any matching Wikidata forms from that or any other language could 
be shown, so this provides an immediate supplement to small Wiktionaries, and 
there may even be a few words missing from enwikt that could be found in 
Wikidata.



[Wikidata-bugs] [Maniphest] [Commented On] T212843: [EPIC] Access to Wikidata's lexicographical data from Wiktionaries and other WMF sites

2019-06-28 Thread Theklan
Theklan added a comment.


  We have a bunch of words and forms uploaded in Basque; there should be at 
least 5,000, and as euwikt is quite dead, this could be a good boost to the 
project.
  
  If someone wants to use the Basque Wiktionary for testing purposes, let's 
talk about it.



[Wikidata-bugs] [Maniphest] [Commented On] T212843: [EPIC] Access to Wikidata's lexicographical data from Wiktionaries and other WMF sites

2019-06-04 Thread RexxS
RexxS added a comment.


  I'd like to have a complete collection of API calls exposed to Scribunto. I 
should be able to get the following:
  
  getEntity - the whole object (probably expensive, but would mostly be used to 
look at structures)
  getLanguage - entity ID like Q1860 for 'English'
  getLexicalCategory - entity ID like Q24905 for 'verb'
  getStatements - table
  getSenses - table
  getForms - table (each value is an entity ID along with qualifiers 
'Grammatical features', a table of entity IDs like Q110786 for 'singular', etc.)
  
  That would be enough, in my opinion, for me to write almost any Scribunto 
code that the folks at the Wiktionaries and other sites could ask for (until 
you start changing the structure of the lexemes, of course). If all of these 
returned values are normal q-numbers (entity IDs), I already have plenty of 
code to handle getting labels, sitelinks, etc. to display in the local or 
preferred language, so we probably wouldn't need to worry about further 
internationalisation.
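
  A rough Python sketch of what those accessors could return, to make the 
shapes concrete. It is not the Scribunto API: it assumes a lexeme already 
loaded as a dictionary shaped like the Wikibase lexeme JSON, and the 
snake_case function names, the lexeme ID `L0`, and the sample content are 
invented for illustration. The Q-IDs are the ones quoted above.

```python
from typing import Dict, List, Tuple

# Invented sample lexeme; Q1860, Q24905 and Q110786 are the IDs quoted above.
SAMPLE_LEXEME = {
    "id": "L0",
    "language": "Q1860",
    "lexicalCategory": "Q24905",
    "claims": {},
    "senses": [
        {"id": "L0-S1",
         "glosses": {"en": {"language": "en", "value": "example gloss"}}},
    ],
    "forms": [
        {"id": "L0-F1",
         "representations": {"en": {"language": "en", "value": "example"}},
         "grammaticalFeatures": ["Q110786"]},
    ],
}


def get_language(lexeme: dict) -> str:
    """Entity ID of the lexeme's language."""
    return lexeme["language"]


def get_lexical_category(lexeme: dict) -> str:
    """Entity ID of the lexical category."""
    return lexeme["lexicalCategory"]


def get_statements(lexeme: dict) -> Dict[str, list]:
    """Top-level statements, keyed by property ID."""
    return lexeme.get("claims", {})


def get_senses(lexeme: dict) -> List[dict]:
    """All senses of the lexeme."""
    return lexeme.get("senses", [])


def get_forms(lexeme: dict) -> List[Tuple[str, List[str]]]:
    """Each form as (form ID, list of grammatical-feature entity IDs)."""
    return [
        (form["id"], list(form.get("grammaticalFeatures", [])))
        for form in lexeme.get("forms", [])
    ]


if __name__ == "__main__":
    print(get_language(SAMPLE_LEXEME), get_lexical_category(SAMPLE_LEXEME))
    print(get_forms(SAMPLE_LEXEME))
```

  Because every value comes back as a plain entity ID, existing label and 
sitelink code can be reused unchanged for display.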



[Wikidata-bugs] [Maniphest] [Commented On] T212843: [EPIC] Access to Wikidata's lexicographical data from Wiktionaries and other WMF sites

2019-06-04 Thread Pamputt
Pamputt added a comment.


  In T212843#4912902, @deryckchan wrote:
  
  > This will address migration blocks like 
https://www.wikidata.org/wiki/Wikidata:Properties_for_deletion#Property:P2521, 
where the lack of a feature to call Lexemes in Wikipedias is blocking the 
migration of a property.
  
  
  Note that the discussion has been archived. It is now available here: 
https://www.wikidata.org/wiki/Wikidata:Requests_for_deletions/Archive/2019/Properties/1#female_form_of_label_(P2521)



[Wikidata-bugs] [Maniphest] [Commented On] T212843: [EPIC] Access to Wikidata's lexicographical data from Wiktionaries and other WMF sites

2019-06-04 Thread Pamputt
Pamputt added a comment.


  For the French Wiktionary, I do not know what the community will decide, but 
if we one day decide to use the Lexeme data from Wikidata, it will most 
probably be for the Forms (conjugation, inflection, declension, etc). I think 
we will never use the Senses. So what Mike Peel proposed just before makes 
sense, as it gives full flexibility.



[Wikidata-bugs] [Maniphest] [Commented On] T212843: [EPIC] Access to Wikidata's lexicographical data from Wiktionaries and other WMF sites

2019-06-04 Thread Mike_Peel
Mike_Peel added a comment.


  My suggestion would be to simply mirror the functions that are currently 
available for Q-items - either by duplicating the code that does that and 
changing "Q" to "L", or better, by generalizing it so that it works for all of 
Wikibase's namespaces (P/Q/L/M/...). Then it can be built upon on-wiki as 
needed (e.g., through Module:WikidataIB). That would also help structured data 
on Commons, and future projects using Wikibase.
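
  A small Python sketch of the "generalize" option, to show how little the 
calling code would need to care about the prefix. It is only an illustration 
and not Wikibase's actual implementation: the prefix table, the in-memory 
`store`, and the function names are assumptions made for the example.

```python
import re
from typing import Dict, Optional

# Assumed mapping from ID prefix to entity type, following P/Q/L/M above.
ENTITY_TYPES = {
    "Q": "item",
    "P": "property",
    "L": "lexeme",
    "M": "mediainfo",
}

# One uppercase letter followed by a positive integer, e.g. Q1860 or L10.
_ENTITY_ID = re.compile(r"^([A-Z])([1-9][0-9]*)$")


def entity_type(entity_id: str) -> str:
    """Return the entity type for an ID, or raise ValueError if unknown."""
    match = _ENTITY_ID.match(entity_id)
    if match is None or match.group(1) not in ENTITY_TYPES:
        raise ValueError(f"not a recognised entity ID: {entity_id!r}")
    return ENTITY_TYPES[match.group(1)]


def get_entity(entity_id: str, store: Dict[str, dict]) -> Optional[dict]:
    """Look up any kind of entity through one code path.

    `store` stands in for whatever backend actually holds the entities.
    """
    entity_type(entity_id)  # validates the ID, whatever its prefix
    return store.get(entity_id)


if __name__ == "__main__":
    store = {"Q1860": {"id": "Q1860"}, "L10": {"id": "L10"}}
    for entity_id in ("Q1860", "L10"):
        print(entity_id, entity_type(entity_id), get_entity(entity_id, store))
```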
