Hi Jonathan, Fran,

Thanks for looking at this, I really appreciate it :)

Of those two options, I think the first would be better. If I understood correctly, the first is pure segmentation, while the second inserts an additional д.

Segmenting поезде as поез>де (1st option) would let us recover the original word easily from the segmented version, simply by removing the boundary marker. Segmenting it as поезд>де (2nd option) would not, since we could wrongly recover the original word as поездде.
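
To make the round-trip point concrete, here is a minimal sketch in Python. It assumes that ">" is the only boundary marker in the segmented output; the helper names join_segments and is_pure_segmentation are hypothetical and not part of Apertium.

```python
# Assumption: ">" is the only morpheme boundary marker in the segmented string.
BOUNDARY = ">"


def join_segments(segmented: str) -> str:
    """Remove boundary markers to get back a plain surface string."""
    return segmented.replace(BOUNDARY, "")


def is_pure_segmentation(segmented: str, surface: str) -> bool:
    """True if deleting the boundaries reproduces the surface form exactly."""
    return join_segments(segmented) == surface


surface = "поезде"
option_1 = "поез>де"   # boundaries inserted into the surface form
option_2 = "поезд>де"  # lemma kept whole, so the joined string gains a д

for option in (option_1, option_2):
    print(option, "->", join_segments(option),
          "| matches surface:", is_pure_segmentation(option, surface))
```

With the first option the joined string equals поезде; with the second it comes out as поездде, so the original word could not be recovered by simply dropping the boundaries.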


best,

a.

On 06/03/2019 23:00, Francis Tyers wrote:
On 2019-03-06 21:51, Jonathan Washington wrote:
Hi Antonio,

I have something mostly working, but have a few questions about what
specifically you're after.

I guess to start things out, for
поезд<n><loc>:поезд>{D}{A}:поезде, do you prefer
поез>де or поезд>де, or something else?


To clarify:

поезд "train" is the lemma,
-{D}{A} is the morpheme for locative,
-де is the morph,
поез is a substring.

There is a rule that the final -д of a word deletes in certain cases
if the following morpheme starts with {D} (or something like that).

So, the "exact string" option would give you a non-word (поез), but
if you want each part to be an actual word you'd need поезд, and
then you wouldn't get an exact string match when removing the morpheme
boundaries.

Fran


_______________________________________________
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
