My guess is that the transfer rule for Franco-Japanese has a two-word
input, so the only stored blank is "-". The output, however, has three
words, "una Franco-Japonêsa". Since the blanks are printed in order, the
"-" is printed in the first available <b/> spot in the rule output.
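
To make that guess concrete, here is a minimal Python sketch of the queue
behaviour described in the announcement quoted below. It is a toy model,
not the actual apertium transfer code: fill_blank_spots and its arguments
are hypothetical names, and it assumes one <b/> spot between every pair of
output words.

```python
from collections import deque


def fill_blank_spots(output_words, input_blanks):
    """Toy model: fill the gaps between output words from a FIFO of blanks.

    Assumption: every gap between two output words is a <b/> spot.
    """
    # An empty input blank is stored as a space.
    queue = deque(blank if blank != "" else " " for blank in input_blanks)
    pieces = []
    for index, word in enumerate(output_words):
        if index > 0:
            # Take the next input blank; once the queue is empty,
            # the remaining spots become plain spaces.
            pieces.append(queue.popleft() if queue else " ")
        pieces.append(word)
    # Input blanks that found no spot are flushed after the rule output.
    pieces.extend(queue)
    return "".join(pieces)


# Two-word input (one blank, "-"), three-word output (two <b/> spots).
print(fill_blank_spots(["una", "Franco", "Japonêsa"], ["-"]))
# prints: una-Franco Japonêsa
```

In this model the hyphen lands after "una" rather than between the last
two words. The string in Hèctor's diff is not identical ("una- Franco
Japonêsa"), so the real pipeline is doing more than this sketch shows, but
the misplacement is the same kind.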

There are a few possible solutions for this. One idea is to have two kinds
of blank markers: one that always prints a space, and one that prints the
next available input blank. This can also be implemented by having a
<lit v=" "/> in the output rule and then a <b/> in the next spot. If this
seems too hacky a solution, we can discuss other options.
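
As a rough illustration of the two-marker idea, the sketch below treats the
rule output as a template in which one marker always prints a space and the
other consumes the next input blank. SPACE, BLANK and render_template are
hypothetical names for this toy model only; they are not part of apertium.

```python
from collections import deque

SPACE = object()  # hypothetical marker: always prints a literal space
BLANK = object()  # hypothetical marker: prints the next input blank (<b/>)


def render_template(template, input_blanks):
    """Toy model of a rule output with two kinds of blank markers."""
    # An empty input blank is stored as a space.
    queue = deque(blank if blank != "" else " " for blank in input_blanks)
    pieces = []
    for item in template:
        if item is SPACE:
            pieces.append(" ")  # never touches the queue
        elif item is BLANK:
            pieces.append(queue.popleft() if queue else " ")
        else:
            pieces.append(item)  # an ordinary output word
    pieces.extend(queue)  # leftover blanks go after the rule output
    return "".join(pieces)


# A forced space after the new word, then the stored "-" in the next spot.
print(render_template(["una", SPACE, "Franco", BLANK, "Japonêsa"], ["-"]))
# prints: una Franco-Japonêsa
```

Because the forced space does not take anything from the queue, the stored
"-" is still available for the spot between the last two words.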

*तन्मय खन्ना *
*Tanmai Khanna*


On Sun, Aug 30, 2020 at 12:09 PM Tanmai Khanna <khanna.tan...@gmail.com>
wrote:

> Hèctor,
> No worries I'll look into this. Can you send the input sentences? I want
> to see the transfer rules that are applying to the erroneous parts. They
> might need some changing.
>
> तन्मय खन्ना
> Tanmai Khanna
>
> ------------------------------
> *From:* Hèctor Alòs i Font <hectora...@gmail.com>
> *Sent:* Sunday, August 30, 2020 11:57:16 AM
> *To:* [apertium-stuff] <apertium-stuff@lists.sourceforge.net>
> *Subject:* Re: [Apertium-stuff] Update about superblanks in transfer
>
> Unfortunately, I found a lot of problems caused by superblanks, especially
> with the handling of hyphens. See a couple of differences in translations
> of my French test corpus into Arpitan before and after the update:
>
> < 00607. Tandis que les Tétes Broulâyes sont en *permission sur *Espritos
> Marcos, tomba amouerox de Yvonne, una Franco-Japonêsa.
> ---
> > 00607. Tandis que les Tétes Broulâyes sont en *permission sur *Espritos
> Marcos, tomba amouerox de Yvonne, una- Franco Japonêsa.
>
> < 00748. On povêt per ègzemplo parlar, sot Charlo-lo-Pelâ, de la
> "*foresta" des pêrches de la Sêna.
> ---
> > 00748. On povêt per ègzemplo parlar, sot Charlo-lo- Pelâ, de la
> "*foresta" des pêrches de la Sêna.
>
> Hèctor
>
> Message from Tanmai Khanna <khanna.tan...@gmail.com> on Sat, 29 Aug 2020
> at 16:50:
>
> Hey guys!
> The wordbound blanks project handles blanks that are supposed to be
> reordered. Therefore, we no longer need the user to be worried about blank
> positions in transfer rules. The latest update to the apertium code makes
> <b pos="X"/> behave the same as <b/>. You can change each
> <b pos="X"/> in your transfer rules to just <b/> and it will work.
>
> Now, the only thing you need to worry about when writing transfer rules is
> whether you want a blank between the two LUs or not. *Input blanks will
> be stored as a queue and will be printed in order in all
> available <b/> spots in the rule output. *
>
> *Note:*
> - If the output rule has more blank spots than input blanks, then the
> remaining blank spots will be spaces.
> - If the output rule has fewer blank spots than input blanks, then the
> remaining input blanks will be output after the rule output.
> - If the input blank is an empty string, it is stored as a space.
>
> In some transfer rules, there are input patterns which don't have a space
> between them. In the output section of these transfer rules, <b pos="1"/> used
> to give an empty string, but it will now give a space. To remove the blank
> from the output, you will need to remove the <b pos="1"/> from the
> transfer rule and it will be fine.
>
> Here are some examples from the tests.
>
> EXAMPLE 1:
> Input:
>
> [blank1] ^worda<det>/wordta<det>$ ;[blank2]; ^wordb<adj>/wordtb<adj>$ 
> [blank3];  ^hun<n><acr>/ho<n><acr>$ [blank4]
>
> There's no <b/> in the rule output, so all blanks are flushed after the
> rule output.
>
> Output:
>
> [blank1] ^test1<adj>{^wordta<det>$^wordtb<adj>$^ho<n><acr>$}$ ;[blank2];  
> [blank3];   [blank4]
>
> EXAMPLE 2:
> Input:
>
> [blank1] ^wordb<adj>/wordtb<adj>$ ;[blank2]; ^worda<det>/wordta<det>$ 
> [blank3];  ^hun<n><acr>/ho<n><acr>$ [blank4]
>
> There's one <b/> in rule output, so it prints one and flushes the rest.
>
> Output:
>
> [blank1] ^test1<det>{^wordta<det>$ ;[blank2]; ^ho<n><acr>$}$ [blank3];   
> [blank4]
>
> This has been implemented for the chunker, interchunk, and postchunk.
>
> If you have any questions, suggestions, comments, etc., I'll be happy to
> respond to them.
>
> Thanks and Regards,
> *तन्मय खन्ना *
> *Tanmai Khanna*
> _______________________________________________
> Apertium-stuff mailing list
> Apertium-stuff@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
>
>