Hi Hèctor,

I'm dealing with the issues I see one by one.

1. I was flushing the remaining blanks after processOut because I assumed
a rule usually has only one <out>..</out> block, but some of your rules
have several. In the latest commit to apertium/apertium, the remaining
blanks are flushed only once the rule has finished outputting entirely.
This solves some of the issues, such as:
$ echo "au lycée Louis-le-Grand" | apertium -d .. fra-frp
u licê Louis-lo-Grant.

2. The spaces between numbers in your output are probably there because
you have <b/> in the rules. If you remove those, the spaces will go away.

I'm still evaluating some other issues.

*तन्मय खन्ना*
*Tanmai Khanna*

On Sun, Aug 30, 2020 at 1:21 PM Hèctor Alòs i Font <hectora...@gmail.com> wrote:

> Message from Tanmai Khanna <khanna.tan...@gmail.com> on Sun, Aug 30,
> 2020 at 9:49:
>
>> My guess is that the transfer rule for Franco-Japanese has a two-word
>> input, so the stored blank is "-". The output has three words, "una
>> Franco-Japonêsa", and since the blanks are printed in order, the "-" is
>> printed in the first available <b/> spot in the output rule.
>
> Yes, that's it. "Franco" is a prefix and it is analysed as such. I have
> a few dozen prefixes to avoid having hundreds of words in the
> dictionaries and, more importantly, to be able to deal with unknown
> pairs like "franco-tibétain" or "franco-silésien".
>
>> There are a few possible solutions for this. One idea is to have two
>> kinds of blank markers: one that always prints a space, and one that
>> prints the available input blanks. This can also be implemented by
>> having a <lit v=" "/> in the output rule and then <b/> in the next
>> spot. If this seems too hacky a solution, we can discuss other options.
>>
>> *तन्मय खन्ना*
>> *Tanmai Khanna*
>>
>> On Sun, Aug 30, 2020 at 12:09 PM Tanmai Khanna <khanna.tan...@gmail.com>
>> wrote:
>>
>>> Hèctor,
>>> No worries, I'll look into this. Can you send the input sentences? I
>>> want to see the transfer rules that are applying to the erroneous
>>> parts. They might need some changing.
>>>
>>> तन्मय खन्ना
>>> Tanmai Khanna
>>>
>>> ------------------------------
>>> *From:* Hèctor Alòs i Font <hectora...@gmail.com>
>>> *Sent:* Sunday, August 30, 2020 11:57:16 AM
>>> *To:* [apertium-stuff] <apertium-stuff@lists.sourceforge.net>
>>> *Subject:* Re: [Apertium-stuff] Update about superblanks in transfer
>>>
>>> Unfortunately, I found a lot of problems caused by superblanks,
>>> especially with the handling of hyphens. See a couple of differences
>>> in translations of my French test corpus into Arpitan before and after
>>> the update:
>>>
>>> < 00607. Tandis que les Tétes Broulâyes sont en *permission sur
>>> *Espritos Marcos, tomba amouerox de Yvonne, una Franco-Japonêsa.
>>> ---
>>> > 00607. Tandis que les Tétes Broulâyes sont en *permission sur
>>> *Espritos Marcos, tomba amouerox de Yvonne, una- Franco Japonêsa.
>>>
>>> < 00748. On povêt per ègzemplo parlar, sot Charlo-lo-Pelâ, de la
>>> "*foresta" des pêrches de la Sêna.
>>> ---
>>> > 00748. On povêt per ègzemplo parlar, sot Charlo-lo- Pelâ, de la
>>> "*foresta" des pêrches de la Sêna.
>>>
>>> Hèctor
>>>
>>> Message from Tanmai Khanna <khanna.tan...@gmail.com> on Sat, Aug 29,
>>> 2020 at 16:50:
>>>
>>> Hey guys!
>>> The wordbound blanks project handles blanks that are supposed to be
>>> reordered. Therefore, the user no longer needs to worry about blank
>>> positions in transfer rules. The latest update to the apertium code
>>> makes <b pos="X"/> equivalent to <b/>. You can change the
>>> <b pos="X"/> in your transfer rules to just <b/> and it'll work.
>>>
>>> Now, the only thing you need to worry about when writing transfer
>>> rules is whether you want a blank between the two LUs or not. *Input
>>> blanks will be stored as a queue and will be printed in order in all
>>> available <b/> spots in the rule output.*
>>>
>>> *Note:*
>>> - If the output rule has more blank spots than input blanks, then the
>>> remaining blank spots will be spaces.
>>> - If the output rule has fewer blank spots than input blanks, then the
>>> remaining input blanks will be output after the rule output.
>>> - If the input blank is an empty string, it is stored as a space.
>>>
>>> In some transfer rules, there are input patterns which don't have a
>>> space between them. In the output section of these transfer rules,
>>> <b pos="1"/> used to give an empty string, but it will now give a
>>> space. To remove the blank from the output, you will need to remove
>>> the <b pos="1"/> from the transfer rule.
>>>
>>> Here are some examples from the tests.
>>>
>>> EXAMPLE 1:
>>> Input:
>>>
>>> [blank1] ^worda<det>/wordta<det>$ ;[blank2]; ^wordb<adj>/wordtb<adj>$
>>> [blank3]; ^hun<n><acr>/ho<n><acr>$ [blank4]
>>>
>>> There's no <b/> in the rule output, so all blanks are flushed after
>>> the rule output.
>>>
>>> Output:
>>>
>>> [blank1] ^test1<adj>{^wordta<det>$^wordtb<adj>$^ho<n><acr>$}$ ;[blank2];
>>> [blank3]; [blank4]
>>>
>>> EXAMPLE 2:
>>> Input:
>>>
>>> [blank1] ^wordb<adj>/wordtb<adj>$ ;[blank2]; ^worda<det>/wordta<det>$
>>> [blank3]; ^hun<n><acr>/ho<n><acr>$ [blank4]
>>>
>>> There's one <b/> in the rule output, so it prints one blank and
>>> flushes the rest.
>>>
>>> Output:
>>>
>>> [blank1] ^test1<det>{^wordta<det>$ ;[blank2]; ^ho<n><acr>$}$ [blank3];
>>> [blank4]
>>>
>>> This has been implemented for the chunker, interchunk, and postchunk.
>>>
>>> If you have any questions, suggestions, comments, etc., I'll be happy
>>> to respond to them.
>>>
>>> Thanks and Regards,
>>> *तन्मय खन्ना*
>>> *Tanmai Khanna*
>>> _______________________________________________
>>> Apertium-stuff mailing list
>>> Apertium-stuff@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
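For anyone following the thread, the blank-queue behaviour described in the
original announcement can be sketched roughly as below. This is only an
illustrative Python model, not the actual apertium-transfer code: the names
B and render_rule_output are made up for this example, the rule output is
simplified to a flat list of strings, and the real tokenisation of blanks
in the stream may differ. It assumes input blanks are queued in order,
each <b/> spot takes the next queued blank (or a space if none is left),
an empty input blank is stored as a space, and leftover blanks are flushed
after the rule output.

```python
from collections import deque

# Hypothetical marker standing in for a <b/> spot in the rule output.
B = object()


def render_rule_output(input_blanks, template):
    """Model of the queue semantics from the announcement (a sketch only).

    input_blanks: blanks found between the matched input LUs, in order.
    template: output items; strings are emitted as-is, B marks a <b/> spot.
    """
    # An empty input blank is stored as a single space.
    queue = deque(blank if blank != "" else " " for blank in input_blanks)

    parts = []
    for item in template:
        if item is B:
            # Take the next queued blank, or a plain space if none is left.
            parts.append(queue.popleft() if queue else " ")
        else:
            parts.append(item)

    # Blanks that found no <b/> spot are flushed after the rule output.
    return "".join(parts) + "".join(queue)


# One input blank "-", two <b/> spots: the hyphen lands in the first spot.
print(render_rule_output(["-"], ["una", B, "Franco", B, "Japonêsa"]))
# prints: una-Franco Japonêsa

# Two input blanks, one <b/> spot: the second blank is flushed at the end.
print(render_rule_output([";", "_"], ["a", B, "b"]))
# prints: a;b_
```

The first call shows the mechanism guessed at earlier in the thread: with a
two-word input and a three-word output, the stored "-" is consumed by the
first <b/> spot instead of the one between the prefix and the adjective.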