On Tue, Jun 21, 2022 at 12:14 AM Flor, Michael <mf...@ets.org> wrote:
> The notion of 'word' has difficulties in linguistics.
> But not enough for abandoning it.

Except we don't need it at all, for either human or machine processing.

> The argument from the paper "Fairness in Representation for Multilingual
> NLP" is not convincing at all.
> Even if the early findings are correct for transformers,
> applicability to human language faculty is not yet supported.

Right, this version of the paper does not yet tell the whole story, which I still have to continue. But one can get the gist from conditional probability, context, and finer granularity.

> On the other hand, it is not even needed.
> Developmental linguists have noted long ago that babies acquire all
> natural languages at approximately the same rate (under some 'standard
> conditions'), despite vast morphological and other differences between
> languages.
> Thus, in some sense, all natural human languages are already deemed
> 'equal' vis-a-vis acquisition complexity.

Well, tell that to the NLP crowd, or to those who expect LM/MT results to differ across languages even when all else is equal. (I remember how hard I had to work, and through how many rounds, on my rebuttals....)

> For language learning later in life,
> if one's native language is morphologically rich, learning (some types of)
> morphologically rich languages (as an adult) is a bit easier than learning
> a language that is very different, etc.

That's the thing about this paper. My personal take on L_n learning is that, no, it is also just a matter of length and vocabulary relative to whatever one is used to (e.g. one's L1), plus the environment/support available and one's personal propensity towards new languages.

> Complexity of words in a language for non-native speakers/learners is
> actually a big issue and a field of research in EFL (and now in NLP as
> well).

See above.

> Finally,
> word complexity is often defined within the same language (e.g.
> able-ability, function-dysfunctional),
> and so a notion of cross-linguistic hegemony or malice is not even
> applicable here.

What would it take for me to convince you that such "complexity" really boils down to just length and vocab (think of the examples you gave, viewed from, say, a character perspective)? E.g. is 'Xjfewijpiweoheymqaweopaf'h' more or less complex than 'multiple-dysfunction-prone' to you?
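
To make the character perspective concrete, here is a minimal sketch in Python. It assumes that "length" means the number of characters and "vocab" means the number of distinct characters; `char_profile` is a hypothetical helper written for this illustration and is not taken from the paper.

```python
def char_profile(s: str) -> dict:
    """Describe a string only by its length and its character vocabulary.

    Assumption for illustration: "length" is the character count and
    "vocab" is the set of distinct characters. No notion of 'word' is used.
    """
    return {
        "length": len(s),            # number of characters
        "vocab_size": len(set(s)),   # number of distinct characters
    }


# The two strings from the question above.
a = "Xjfewijpiweoheymqaweopaf'h"
b = "multiple-dysfunction-prone"

for s in (a, b):
    print(s, char_profile(s))
```

Seen this way, the two strings are described by the same kind of quantities, and any remaining difference in perceived "complexity" would have to come from what the reader is already used to.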
_______________________________________________
Corpora mailing list -- corpora@list.elra.info