Unfortunately, I found a lot of problems caused by superblanks, especially in the handling of hyphens. Here are a couple of differences in the translations of my French test corpus into Arpitan, before (<) and after (>) the update:

< 00607. Tandis que les Tétes Broulâyes sont en *permission sur *Espritos Marcos, tomba amouerox de Yvonne, una Franco-Japonêsa.
---
> 00607. Tandis que les Tétes Broulâyes sont en *permission sur *Espritos Marcos, tomba amouerox de Yvonne, una- Franco Japonêsa.
< 00748. On povêt per ègzemplo parlar, sot Charlo-lo-Pelâ, de la "*foresta" des pêrches de la Sêna.
---
> 00748. On povêt per ègzemplo parlar, sot Charlo-lo- Pelâ, de la "*foresta" des pêrches de la Sêna.

Hèctor

Message from Tanmai Khanna <khanna.tan...@gmail.com> on Sat, 29 Aug 2020 at 16:50:

> Hey guys!
> The wordbound blanks project handles blanks that are supposed to be
> reordered. Therefore, the user no longer needs to worry about blank
> positions in transfer rules. With the latest update to the apertium code,
> <b pos="X"/> is now the same as <b/>. You can change the <b pos="X"/>
> in your transfer rules to just <b/> and it'll work.
>
> Now, the only thing you need to worry about when writing transfer rules is
> whether you want a blank between the two LUs or not. *Input blanks will
> be stored as a queue and will be printed in order in all
> available <b/> spots in the rule output.*
>
> *Note:*
> - If the output rule has more blank spots than input blanks, then the
>   remaining blank spots will be spaces.
> - If the output rule has fewer blank spots than input blanks, then the
>   remaining input blanks will be output after the rule output.
> - If the input blank is an empty string, it is stored as a space.
>
> In some transfer rules, there are input patterns which don't have a space
> between them. In the output section of these transfer rules, <b pos="1"/>
> used to give an empty string, but it will now give a space. To remove the
> blank from the output, you will need to remove the <b pos="1"/> from the
> transfer rule and it will be fine.
>
> Here are some examples from the tests.
>
> EXAMPLE 1:
> Input:
>
> [blank1] ^worda<det>/wordta<det>$ ;[blank2]; ^wordb<adj>/wordtb<adj>$
> [blank3]; ^hun<n><acr>/ho<n><acr>$ [blank4]
>
> There's no <b/> in the rule output, so all blanks are flushed after the
> rule output.
>
> Output:
>
> [blank1] ^test1<adj>{^wordta<det>$^wordtb<adj>$^ho<n><acr>$}$ ;[blank2];
> [blank3]; [blank4]
>
> EXAMPLE 2:
> Input:
>
> [blank1] ^wordb<adj>/wordtb<adj>$ ;[blank2]; ^worda<det>/wordta<det>$
> [blank3]; ^hun<n><acr>/ho<n><acr>$ [blank4]
>
> There's one <b/> in the rule output, so it prints one and flushes the rest.
>
> Output:
>
> [blank1] ^test1<det>{^wordta<det>$ ;[blank2]; ^ho<n><acr>$}$ [blank3];
> [blank4]
>
> This has been implemented for the chunker, interchunk, and postchunk.
>
> If you have any questions, suggestions, comments, etc., I'll be happy to
> respond to them.
>
> Thanks and Regards,
> *तन्मय खन्ना*
> *Tanmai Khanna*
> _______________________________________________
> Apertium-stuff mailing list
> Apertium-stuff@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
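
For reference, the queue behaviour described in the quoted message can be sketched in a few lines of Python. This is only an illustration of the three rules in the note, not the actual apertium-transfer code: the function `fill_blanks` and the `BLANK` marker are hypothetical names, and the rule output is simplified to a flat list of strings and blank slots.

```python
from collections import deque

BLANK = object()  # hypothetical marker standing in for a <b/> slot in the rule output


def fill_blanks(output_items, input_blanks):
    """Fill <b/> slots from a FIFO queue of input blanks.

    output_items: rule output as a list of strings and BLANK markers.
    input_blanks: blanks found between the input LUs, in input order.
    """
    # An empty input blank is stored as a single space.
    queue = deque(b if b != "" else " " for b in input_blanks)
    parts = []
    for item in output_items:
        if item is BLANK:
            # More slots than blanks: the remaining slots become spaces.
            parts.append(queue.popleft() if queue else " ")
        else:
            parts.append(item)
    # Fewer slots than blanks: leftovers are flushed after the rule output.
    parts.extend(queue)
    return "".join(parts)


# One slot, two input blanks: the first fills the slot, the second is flushed.
print(fill_blanks(["^a$", BLANK, "^b$"], ["[x]", "[y]"]))
```

Under this model, a rule that merges two LUs across a hyphen has no slot in which to keep the hyphen's surrounding blank, so the blank ends up after the rule output. That may be related to the misplaced hyphens in the diff above, though I have not verified this against the actual code.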