In Occitan, the spelling rules often require eliding the beginning of
pronouns and determiners, e.g. `que o > que'u`. There are also numerous
fusions, e.g. `de lo > del`, `de lo > deu` or `de lo > deth`, depending on
the variety of Occitan. Add to this the great (sub)dialectal variety of
Occitan, and the result is close to a combinatorial explosion. At present
we have hundreds of lines in the Occitan monodix that try to deal with
these cases, but that is not enough.
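
To make the scale concrete, here is a minimal Python sketch (not Apertium
code) that enumerates the surface combinations one would have to list by
hand. The host and clitic lists are toy assumptions that contain only forms
mentioned in this message; the real inventories are much longer and differ
from one variety to another.

```python
from itertools import product

# Toy inventories (assumptions for illustration only): hosts that can
# precede an elided clitic, and the elided clitic forms themselves.
HOSTS = ["que", "çò que"]
ELIDED_CLITICS = ["u"]


def elided_combinations(hosts, clitics, apostrophe="’"):
    """Return every host + apostrophe + clitic surface form."""
    return [f"{host}{apostrophe}{clitic}" for host, clitic in product(hosts, clitics)]


forms = elided_combinations(HOSTS, ELIDED_CLITICS)
print(forms)       # ['que’u', 'çò que’u']
print(len(forms))  # grows as len(hosts) * len(clitics)
```

Every new host or clitic multiplies the number of forms, and each variety
brings its own lists.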

One embarrassing instance of this problem came up this morning:
`çò que’u`. `çò que` is one of the many forms of a given relative pronoun
(it can also be analysed as the pronoun `çò` followed by the word `que`,
which here may be at least a kind of adverb). The issue is that the Occitan
monodix has no entry defining `çò que’u` as `çò que` + `u` (nor as `çò` +
`que` + `u`) using `<j/>`; it is not among the hundreds of combinations we
already have. As a result, the translation is almost correct, but the
translations of `çò que` and `u` are joined without a blank, since there is
no blank between them in the input. That is why we have to define so many
combinations using `<j/>`:

```
$ echo "00192. Lo privilègi de l’editorialista qu’es de poder escríver **çò que’u** passa peu cap." | apertium -d . oci_gascon-fra
00192. Le privilège de l'éditorialiste  est de pouvoir écrire **ce quela** passe pour la tête.
```
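
The missing blank can be reproduced with a toy model. This is only a sketch
of the effect, not how Apertium is implemented: the function `rejoin` is
hypothetical, and the mapping of `çò que` to `ce que` and of `u` to `la` is
simply read off the output above.

```python
def rejoin(translated_units, separators):
    """Interleave translated units with the separators found in the input.

    separators[i] is whatever stood between unit i and unit i + 1 in the
    source text: " " for a blank, "" when the units were written together.
    """
    if len(separators) != len(translated_units) - 1:
        raise ValueError("need exactly one separator between each pair of units")
    out = [translated_units[0]]
    for sep, unit in zip(separators, translated_units[1:]):
        out.append(sep)
        out.append(unit)
    return "".join(out)


# `çò que’u`: the input has no blank between `çò que` and `u`.
print(rejoin(["ce que", "la"], [""]))   # ce quela
# The same units with a blank between them.
print(rejoin(["ce que", "la"], [" "]))  # ce que la
```

In this model the output spacing is inherited entirely from the input, which
is why an elided form that is not explicitly defined as a joined sequence
ends up glued together.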

Does anyone have any ideas on how to solve this without doing it "the hard
way" (as we have done so far)?

Hèctor