[Wikidata-bugs] [Maniphest] T320263: [Wikidata] Provide a feature to link Item labels to Lexemes

2022-11-17 Thread AGutman-WMF
AGutman-WMF added a comment. There is a discussion of my stop-gap solution on the item where I added the //literal translation// property: https://www.wikidata.org/wiki/Talk:Q467#Lexemes To remedy this, I have created a property proposal **Verbalization by lexeme <ht

[Wikidata-bugs] [Maniphest] T320263: [Wikidata] Provide a feature to link Item labels to Lexemes

2022-10-25 Thread AGutman-WMF
AGutman-WMF added a comment. As a stop-gap solution, I'm suggesting we use the literal translation <https://www.wikidata.org/wiki/Property:P2441> property to link items to senses. As an example of its usage, I've linked Q467 <https://www.wikidata.org/wiki/Q467> to Hebrew L6

[Wikidata-bugs] [Maniphest] T320263: [Wikidata] Provide a feature to link Item labels to Lexemes

2022-10-13 Thread AGutman-WMF
AGutman-WMF added a comment. - Yes, you're right, it makes more sense to link to a sense. - Using a Wikidata property can work if it's a multilingual property (one value per language code). To some extent, this is already done with properties "female form of label" and

[Wikidata-bugs] [Maniphest] T320263: [Wikidata] Linking Item labels to Lexemes

2022-10-11 Thread AGutman-WMF
AGutman-WMF added a comment. As said, both issues can be solved. The issue is that, as currently construed, the labels/descriptions are not really machine-readable: currently they are usable mostly for human consumption. Having only multi-lingual labels in ontologies, without backing

[Wikidata-bugs] [Maniphest] T320263: [Wikidata] Linking Item labels to Lexemes

2022-10-11 Thread AGutman-WMF
AGutman-WMF added a comment. Do you mean that will clutter the UI or the database itself? If the former, this can be solved by selectively showing these links in the UI. If you refer to cluttering the database itself, I agree this would require extra capacity, but I don't think

[Wikidata-bugs] [Maniphest] T320263: [Wikidata] Linking Item labels to Lexemes

2022-10-07 Thread AGutman-WMF
AGutman-WMF renamed this task from "[Wikidata] Linking items labels to lexemes" to "[Wikidata] Linking Item labels to Lexemes". TASK DETAIL https://phabricator.wikimedia.org/T320263 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/

[Wikidata-bugs] [Maniphest] T320263: [Wikidata] Linking items labels to lexemes

2022-10-07 Thread AGutman-WMF
AGutman-WMF created this task. AGutman-WMF added projects: Abstract Wikipedia team, Wikidata. TASK DESCRIPTION **Feature summary**: I would like to be able to link Wikidata Item labels to corresponding lexemes. For instance, Item Q467 <https://www.wikidata.org/wiki/Q467> has the E

[Wikidata-bugs] [Maniphest] T317193: Add language codes for Sepedi and isiNdebele

2022-10-04 Thread AGutman-WMF
AGutman-WMF added a comment. Ok, who is responsible for this approval? Could we ping them? Currently we would like to add lexemes in these languages, but I suppose all use cases should ultimately be supported. TASK DETAIL https://phabricator.wikimedia.org/T317193

[Wikidata-bugs] [Maniphest] T317193: Add language codes for Sepedi and isiNdebele

2022-10-03 Thread AGutman-WMF
AGutman-WMF added a comment. Can we go forward with nd & nr codes?

[Wikidata-bugs] [Maniphest] T317193: Add language codes for Sepedi and isiNdebele

2022-09-07 Thread AGutman-WMF
AGutman-WMF claimed this task. AGutman-WMF added a comment. I'm taking care of the Ndebele language codes (nd & nr) in https://gerrit.wikimedia.org/r/828887. As for the nso code, it seems to be supported already (see https://www.wikidata.org/wiki/Lexeme:L690524).

[Wikidata-bugs] [Maniphest] T289776: Enable all ISO 639-3 codes on Wikidata

2022-09-01 Thread AGutman-WMF
AGutman-WMF added a comment. I would like to support the idea of having Wikidata (and Abstract Wikipedia) support all ISO 639-3 language codes. Notwithstanding @mrephabricator's comments, this is the de facto standard for enumerating all the world's languages

[Wikidata-bugs] [Maniphest] T236593: Cannot enter multiple forms for the same language variant

2022-07-25 Thread AGutman-WMF
AGutman-WMF added a comment. @Asaf Insofar as two forms are considered distinct lexemes, it is probably the case that not all statements hold for both forms (e.g. the pronunciation may be different, and possibly other details such as etymology). If the two forms are close enough (e.g. just

[Wikidata-bugs] [Maniphest] T236593: Cannot enter multiple forms for the same language variant

2022-07-22 Thread AGutman-WMF
AGutman-WMF added a comment. @LucasWerkmeister I agree with you that if two variants have different pronunciations, they should probably be split into two different lexemes (in general, I think we should avoid having multiple forms with the same grammatical features within one lexeme

[Wikidata-bugs] [Maniphest] T236593: Cannot enter multiple forms for the same language variant

2022-07-12 Thread AGutman-WMF
AGutman-WMF added a comment. I believe the current situation, where multiple forms are added to account for spelling variations, goes against the spirit of the lexicographical data model, and in particular the idea that there should be exactly one form for each combination of grammatical

[Wikidata-bugs] [Maniphest] T236593: Cannot enter multiple forms for the same language variant

2022-06-30 Thread AGutman-WMF
AGutman-WMF added a comment. I've now created a patch <https://gerrit.wikimedia.org/r/c/808244> that does allow associating several spelling variants with the same private language code. If the patch gets merged, it will allow associating spelling variants of forms or lexemes with

[Wikidata-bugs] [Maniphest] T236593: Cannot enter multiple forms for the same language variant

2022-06-24 Thread AGutman-WMF
AGutman-WMF added a comment. @Fnielsen as far as I can see, each variant spelling forms its own set of inflected forms, so you have a paradigm related to //mørklægge// and another paradigm related to the variant spelling //mørkelægge//. So conceptually you don't have a single list of forms

[Wikidata-bugs] [Maniphest] T236593: Cannot enter multiple forms for the same language variant

2022-06-24 Thread AGutman-WMF
AGutman-WMF changed the task status from "Open" to "In Progress". AGutman-WMF added a comment. I'm working on a patch to allow multiple forms associated with the same private language code. TASK DETAIL https://phabricator.wikimedia.org/T236593

[Wikidata-bugs] [Maniphest] T236593: Cannot enter multiple forms for the same language variant

2022-06-24 Thread AGutman-WMF
AGutman-WMF added a comment. @Fnielsen given that the pronunciation of these forms is in fact different (according to the X-Sampa notation), and each has its own distinct inflection set, I would treat these as two distinct (synonymous) lexemes. I don't see the advantage of lumping all

[Wikidata-bugs] [Maniphest] T236593: Cannot enter multiple forms for the same language variant

2022-06-24 Thread AGutman-WMF
AGutman-WMF added a comment. @mxn If these are purely orthographic variants (i.e. the pronunciation is the same), I would list them under a single lexeme. In that case, the most natural way would be to list them as spelling variants rather than distinct forms. To attach statements

[Wikidata-bugs] [Maniphest] T236593: Cannot enter multiple forms for the same language variant

2022-06-24 Thread AGutman-WMF
AGutman-WMF added a comment. In T236593#8016636 <https://phabricator.wikimedia.org/T236593#8016636>, @Fnielsen wrote: > In Danish, we are currently using multiple forms and linking them with https://www.wikidata.org/wiki/Property:P8530 See also the discussion

[Wikidata-bugs] [Maniphest] T236593: Cannot enter multiple forms for the same language variant

2022-06-21 Thread AGutman-WMF
AGutman-WMF added a comment. The ideal solution would be to allow (in the language code validator) arbitrary language codes including a rank identifier. For instance, for Vietnamese one should be able to use codes such as vi-x-Q8201-1, vi-x-Q8201-2 etc. Currently this doesn't pass
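
The ranked codes mentioned in the 2022-06-21 comment (vi-x-Q8201-1, vi-x-Q8201-2) could be recognised with a pattern check along the following lines. This is only a sketch and not the Wikibase language code validator: the function name `parse_variant_code`, the `VariantCode` type, and the assumed grammar (a two- or three-letter language subtag, the private-use marker `x`, a Wikidata Item ID, and an optional numeric rank) are hypothetical and inferred solely from the two example codes.

```python
import re
from typing import NamedTuple, Optional

# Assumed shape, inferred only from the examples "vi-x-Q8201-1" and
# "vi-x-Q8201-2": <language>-x-<Wikidata Item ID>[-<rank>]
_VARIANT_CODE = re.compile(r"([a-z]{2,3})-x-(Q[1-9][0-9]*)(?:-([1-9][0-9]*))?")


class VariantCode(NamedTuple):
    """Parts of a private-use language code (hypothetical helper type)."""
    language: str
    item_id: str
    rank: Optional[int]  # None when the code carries no rank suffix


def parse_variant_code(code: str) -> Optional[VariantCode]:
    """Split a code into its parts, or return None if it does not fit
    the assumed shape."""
    match = _VARIANT_CODE.fullmatch(code)
    if match is None:
        return None
    language, item_id, rank = match.groups()
    return VariantCode(language, item_id, int(rank) if rank else None)
```

Under these assumptions, `parse_variant_code("vi-x-Q8201-1")` yields the language `vi`, the Item `Q8201`, and rank 1, while a code without the `Q` prefix is rejected.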