[Corpora-List] Re: Lemon and Noncognitive morphology

Ada Wan via Corpora Tue, 19 Sep 2023 09:33:42 -0700

Dear Max

Thanks for your message.


My suggestion works with all texts (however "prototypical" or
"clean"/"dirty"; note that what needs adjustment is our assumption about
how real and frequent "prototypical/canonical/textbook linguistic
phenomena" are) and with all "morphological" types*, using just a few lines
of code. I had read Hugh's question and Christian's replies.
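
As a minimal sketch of the suggestion quoted further down in this thread
(grouping characters into consonant and vowel classes with a Python dict and
computing transition probabilities over continuous text, whitespace
included), something like the following could serve. The character groups,
the names `classify` and `transition_probabilities`, and the sample string
are assumptions made purely for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical character groups; a real inventory depends on the
# language and script being examined.
GROUPS = {
    "V": set("aeiou"),
    "C": set("bcdfghjklmnpqrstvwxyz"),
}


def classify(ch):
    """Map a character to its group label ('_' = whitespace, '?' = other)."""
    for label, chars in GROUPS.items():
        if ch.lower() in chars:
            return label
    return "_" if ch.isspace() else "?"


def transition_probabilities(text):
    """Estimate P(next label | previous label) from continuous text."""
    labels = [classify(ch) for ch in text]
    counts = defaultdict(Counter)
    for prev, nxt in zip(labels, labels[1:]):
        counts[prev][nxt] += 1
    return {
        prev: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for prev, c in counts.items()
    }


# Illustrative input only (two transliterated forms, not corpus data).
probs = transition_probabilities("kataba kutiba")
```

On this toy input every consonant is followed by a vowel, so
`probs["C"]` contains only the key `"V"`. Real use would call for longer
continuous text and a carefully chosen character inventory.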

*Please recall that despite how morphology has come to be an academic
concentration, it is not a necessary method for the classification of
language varieties. (There may be other decomposition methods that are more
universal, but "linguistic morphology" or analysis thereof is not one.)

Thanks for your feedback nonetheless.

Best
Ada


On Tue, Sep 19, 2023 at 1:02 PM Max Ionov via Corpora <
corpora@list.elra.info> wrote:

> Dear Ada,
>
> I am afraid your suggestion is something that the Ontolex-lemon model does
> not support (which is important since the question was specifically about
> that model for representing lexicographic information, not just general
> approaches for dealing with non-concatenative morphology). It is, though,
> very similar to the current underlying approach that we use in the model
> (which Christian described in his response).
>
> While developing this model as a community effort, we try really hard to
> avoid Eurocentric views, but since the whole premise of OntoLex is to model
> data, we rely on existing resources rather than inventing them from scratch.
> So the model can represent word lists as well as computational lexicons or
> many other types of existing lexicographic data.
>
> Best regards,
>
> Max
>
> On 18/09/2023 19:29, Ada Wan via Corpora wrote:
>
> Dear Hugh
>
> An alternative would be to use dictionaries (as in, "{ }" in Python) to
> group characters belonging to the consonant and vowel groups (or at least
> one of them) and then examine accordingly. This should yield more
> scientific insights on sequences than relying on bigger spans of hard-coded
> information (on "word"/"morph(eme)"-level), esp. when one calculates the
> transition probabilities and interprets them accordingly. (Remember the
> whitespaces (if any) and use continuous texts/data (i.e. not just colonial
> data, e.g. "word lists")!) What is the purpose, though, of your task that is
> supposed to be related to "morphology"?
>
> Best
> Ada
>
> On Mon, Sep 18, 2023 at 6:07 PM Christian Chiarcos via Corpora <
> corpora@list.elra.info> wrote:
>
>> What I forgot to state is the most elementary aspect: an
>> ontolex:LexicalEntry can be associated with a (nonconcatenative)
>> morph:Morph or a (nonconcatenative) morph:Rule/morph:Replacement in the
>> following ways:
>> - for word formation: the lexical entry (e.g., a lexinfo:Root) from which
>> one or more forms are derived can be encoded as the vartrans:source of a
>> morph:WordFormationRelation and an associated morph:WordFormationRule
>> - for inflection: the lexical entry can have an
>> ontolex:morphologicalPattern relation pointing to a morph:Paradigm. Such
>> paradigms are the morph:paradigm of morph:InflectionRules.
>>
>> Both morph:InflectionRule and morph:WordFormationRule are morph:Rules and
>> can thus be connected to a non-concatenative morpheme (or replacement) as
>> described in the other email.
>>
>> Our current real-world examples for nonconcatenative morphology are from
>> word formation only. I guess your use case is more in the inflection area
>> (because for word formation, it would be practical to give a lexical sense,
>> and then you'd need a LexicalEntry anyway), but the nonconcatenative part of
>> the specification (by means of regular expressions and capturing groups in
>> morph:Replacement) is identical in both use scenarios.
>>
>> Best,
>> Christian
>>
>> Am Mo., 18. Sept. 2023 um 16:45 Uhr schrieb Christian Chiarcos <
>> christian.chiar...@gmail.com>:
>>
>>> Dear Hugh,
>>>
>>> this has been addressed in the context of the emerging OntoLex-Morph
>>> vocabulary (https://www.w3.org/community/ontolex/wiki/Morphology,
>>> https://github.com/ontolex/morph; most recent diagram under
>>> https://github.com/ontolex/morph/blob/master/doc/diagrams/Readme.md).
>>> Here, a morph:Morph object (a lexical entry in a lexical resource for
>>> morphemes; depending on the type of resource, this can be a morpheme or an
>>> allomorph of a morpheme) can be the object of a morph:involves property
>>> that connects it with a morph:Rule. This morph:Rule can have one or more
>>> morph:replacement properties. The morph:Replacement objects that these
>>> point to use regular expressions to formalize the source and target
>>> strings of the rule associated with that particular morph(eme). These use
>>> Perl/Java/SPARQL-style regex syntax, which includes support for
>>> capturing groups.
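
The idea of a source/target regular expression pair with capturing groups
can be sketched in Python as follows. This is only an illustration of the
string mechanics, not the OntoLex-Morph serialization: the `Replacement`
class, its `source`/`target` field names, the `apply` method, and the
transliterated root "ktb" are all hypothetical. Note also that Python
writes group references as `\g<1>`, whereas Perl/Java-style syntax would
typically use `$1`.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Replacement:
    """Hypothetical stand-in for a source/target regex pair."""
    source: str  # regex with capturing groups, matched against the base form
    target: str  # replacement template referring to those groups

    def apply(self, form: str) -> str:
        # Rewrite the form; a form that does not match is returned unchanged.
        return re.sub(self.source, self.target, form)


# One pattern per vowel melody over a three-consonant root (assumed examples).
C1aC2aC3 = Replacement(r"^(.)(.)(.)$", r"\g<1>a\g<2>a\g<3>")
C1uC2iC3 = Replacement(r"^(.)(.)(.)$", r"\g<1>u\g<2>i\g<3>")

root = "ktb"
form_a = C1aC2aC3.apply(root)  # "katab"
form_b = C1uC2iC3.apply(root)  # "kutib"
```

Under these assumptions, a single root entry plus a handful of such rules
covers the Cv1Cv2C variants, rather than one dictionary entry per vowel
combination.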
>>>
>>> Note that this formalizes the form side of morphemes only, not the
>>> meaning side. However, a morph:Rule can also have a
>>> morph:grammaticalMeaning property to which such information can be added.
>>> Last week, Max Ionov and Mike Rosner described the application (and an
>>> extension) of this mechanism for Maltese in an LDK paper: Beyond
>>> Concatenative Morphology: Applying OntoLex-Morph to Maltese *Maxim
>>> Ionov, Mike Rosner*. (Not online yet.) We were also looking into other
>>> Semitic languages (and related phenomena such as Umlaut in German or vowel
>>> harmony in Turkic), but only on individual examples. If anyone is
>>> interested in discussing this further, please join the biweekly
>>> OntoLex-Morph calls ;)
>>>
>>> The OntoLex-Morph vocabulary is relatively advanced, and we are in the
>>> process of freezing it in order to prepare its publication. Finalization of
>>> the report is expected for mid-next year.
>>>
>>> Best,
>>> Christian
>>>
>>> Am Mo., 18. Sept. 2023 um 15:31 Uhr schrieb Hugh Paterson III via
>>> Corpora <corpora@list.elra.info>:
>>>
>>>> Greetings,
>>>>
>>>> Does anyone know of any descriptions or approaches to using
>>>> Ontolex/lemon with non-concatenative morphology? Is the assumption that
>>>> Cv1Cv2C shaped words will have their own entries for each instance of
>>>> changes for v1 and v2? If this is the case, then this radically increases
>>>> the number of items in a dictionary compared with languages that have
>>>> affix-type morphology.
>>>>
>>>>
>>>> Any pointers appreciated,
>>>>
>>>> Kind regards,
>>>> Hugh
>>>> _______________________________________________
>>>> Corpora mailing list -- corpora@list.elra.info
>>>> https://list.elra.info/mailman3/postorius/lists/corpora.list.elra.info/
>>>> To unsubscribe send an email to corpora-le...@list.elra.info
