Hey,
It solved the Franco-Japanese issue as well :D Can you check the diff once and see if there are any more issues, Hèctor? (After updating apertium.)

*तन्मय खन्ना*
*Tanmai Khanna*

On Sun, Aug 30, 2020 at 2:35 PM Tanmai Khanna <khanna.tan...@gmail.com> wrote:

> Hi Hèctor,
> I'm dealing with the issues I see one by one.
> 1. I was flushing the remaining blanks after processOut because I thought
> we usually have only one <out>..</out> block in a rule, but some of your
> rules have several. So in the latest commit to apertium/apertium, I made
> them flush after the rule has finished outputting entirely. This solves
> some of the issues, such as:
>
> $ echo "au lycée Louis-le-Grand" | apertium -d .. fra-frp
>
> u licê Louis-lo-Grant.
>
> 2. The spaces between numbers in your output are probably there because
> you have <b/> in the rules. If you remove those, the spaces will go away.
>
> I'm still evaluating some other issues.
>
> *तन्मय खन्ना*
> *Tanmai Khanna*
>
> On Sun, Aug 30, 2020 at 1:21 PM Hèctor Alòs i Font <hectora...@gmail.com> wrote:
>
>> Message from Tanmai Khanna <khanna.tan...@gmail.com> on Sun, 30 Aug 2020 at 9:49:
>>
>>> My guess is that the transfer rule for Franco-Japanese has a two-word
>>> input, so the stored blank is "-". Now the output has 3 words, "una
>>> Franco-Japonêsa", and since the blanks are printed in order, they're
>>> printed in the first available <b/> spot in the output rules.
>>
>> Yes, that's it. "Franco" is a prefix and it is analysed as such. I have
>> some tens of prefixes to avoid having hundreds of words in the
>> dictionaries and, more importantly, to be able to deal with unknown pairs
>> like "franco-tibétain" or "franco-silésien".
>>
>>> There are a few possible solutions for this. One idea is to have two
>>> kinds of blank markers: one that will always print a space, and one that
>>> will print available input blanks. This can also be implemented by having
>>> a <lit v=" "/> in the output rule and then <b/> in the next spot. If this
>>> seems too hacky a solution, we can discuss other options.
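
To make the two-marker idea above a bit more concrete, here is a rough Python model of it. This is not Apertium code: `SPACE`, `BLANK` and `render` are hypothetical names invented for this sketch, and the word list is a simplified stand-in for the real rule output. It only illustrates the difference between a spot that always prints a space and a spot that consumes the next stored input blank.

```python
from collections import deque

SPACE = object()  # hypothetical marker: always prints a single space
BLANK = object()  # hypothetical marker: prints the next stored input blank


def render(template, input_blanks):
    """Join the output words, filling marker spots as described in the thread."""
    queue = deque(input_blanks)
    out = []
    for item in template:
        if item is SPACE:
            out.append(" ")
        elif item is BLANK:
            # Take the next input blank; use a plain space if none are left.
            out.append(queue.popleft() if queue else " ")
        else:
            out.append(item)
    # Input blanks that found no spot are emitted after the rule output.
    out.extend(queue)
    return "".join(out)


# Only queue-consuming spots: the stored "-" lands in the first spot.
only_blank = render(["una", BLANK, "Franco", BLANK, "Japonêsa"], ["-"])
# A fixed space first, then a queue-consuming spot: the "-" lands where intended.
mixed = render(["una", SPACE, "Franco", BLANK, "Japonêsa"], ["-"])

print(only_blank)  # una-Franco Japonêsa
print(mixed)       # una Franco-Japonêsa
```

The first call shows the same kind of misplacement as in the diff (the exact spacing in the real output differs, since the real pipeline has more going on than this model). The second call shows why a fixed space followed by a <b/> would put the hyphen back between "Franco" and "Japonêsa".
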
>>>
>>> *तन्मय खन्ना*
>>> *Tanmai Khanna*
>>>
>>> On Sun, Aug 30, 2020 at 12:09 PM Tanmai Khanna <khanna.tan...@gmail.com> wrote:
>>>
>>>> Hèctor,
>>>> No worries, I'll look into this. Can you send the input sentences? I
>>>> want to see the transfer rules that are applying to the erroneous parts.
>>>> They might need some changes.
>>>>
>>>> तन्मय खन्ना
>>>> Tanmai Khanna
>>>>
>>>> ------------------------------
>>>> *From:* Hèctor Alòs i Font <hectora...@gmail.com>
>>>> *Sent:* Sunday, August 30, 2020 11:57:16 AM
>>>> *To:* [apertium-stuff] <apertium-stuff@lists.sourceforge.net>
>>>> *Subject:* Re: [Apertium-stuff] Update about superblanks in transfer
>>>>
>>>> Unfortunately, I found a lot of problems caused by superblanks,
>>>> especially with the handling of hyphens. See a couple of differences in
>>>> translations of my French test corpus into Arpitan before and after the
>>>> update:
>>>>
>>>> < 00607. Tandis que les Tétes Broulâyes sont en *permission sur *Espritos Marcos, tomba amouerox de Yvonne, una Franco-Japonêsa.
>>>> ---
>>>> > 00607. Tandis que les Tétes Broulâyes sont en *permission sur *Espritos Marcos, tomba amouerox de Yvonne, una- Franco Japonêsa.
>>>>
>>>> < 00748. On povêt per ègzemplo parlar, sot Charlo-lo-Pelâ, de la "*foresta" des pêrches de la Sêna.
>>>> ---
>>>> > 00748. On povêt per ègzemplo parlar, sot Charlo-lo- Pelâ, de la "*foresta" des pêrches de la Sêna.
>>>>
>>>> Hèctor
>>>>
>>>> Message from Tanmai Khanna <khanna.tan...@gmail.com> on Sat, 29 Aug 2020 at 16:50:
>>>>
>>>> Hey guys!
>>>> The wordbound blanks project handles blanks that are supposed to be
>>>> reordered. Therefore, users no longer need to worry about blank
>>>> positions in transfer rules. With the latest update to the apertium code,
>>>> <b pos="X"/> is now the same as <b/>. You can change the <b pos="X"/> in
>>>> your transfer rules to just <b/> and it'll work.
>>>>
>>>> Now, the only thing you need to worry about when writing transfer rules
>>>> is whether you want a blank between the two LUs or not. *Input blanks
>>>> will be stored as a queue and will be printed in order in all
>>>> available <b/> spots in the rule output.*
>>>>
>>>> *Note:*
>>>> - If the output rule has more blank spots than input blanks, then the
>>>>   remaining blank spots will be spaces.
>>>> - If the output rule has fewer blank spots than input blanks, then the
>>>>   remaining input blanks will be output after the rule output.
>>>> - If the input blank is an empty string, it is stored as a space.
>>>>
>>>> In some transfer rules, there are input patterns which don't have a
>>>> space between them. In the output section of these transfer rules,
>>>> <b pos="1"/> used to give an empty string, but it will now give a space.
>>>> To remove the blank from the output, you will need to remove the
>>>> <b pos="1"/> from the transfer rule.
>>>>
>>>> Here are some examples from the tests.
>>>>
>>>> EXAMPLE 1:
>>>> Input:
>>>>
>>>> [blank1] ^worda<det>/wordta<det>$ ;[blank2]; ^wordb<adj>/wordtb<adj>$ [blank3]; ^hun<n><acr>/ho<n><acr>$ [blank4]
>>>>
>>>> There's no <b/> in the rule output, so all blanks are flushed after the
>>>> rule output.
>>>>
>>>> Output:
>>>>
>>>> [blank1] ^test1<adj>{^wordta<det>$^wordtb<adj>$^ho<n><acr>$}$ ;[blank2]; [blank3]; [blank4]
>>>>
>>>> EXAMPLE 2:
>>>> Input:
>>>>
>>>> [blank1] ^wordb<adj>/wordtb<adj>$ ;[blank2]; ^worda<det>/wordta<det>$ [blank3]; ^hun<n><acr>/ho<n><acr>$ [blank4]
>>>>
>>>> There's one <b/> in the rule output, so it prints one blank and flushes
>>>> the rest.
>>>>
>>>> Output:
>>>>
>>>> [blank1] ^test1<det>{^wordta<det>$ ;[blank2]; ^ho<n><acr>$}$ [blank3]; [blank4]
>>>>
>>>> This has been implemented for the chunker, interchunk, and postchunk.
>>>>
>>>> If you have any questions, suggestions, comments, etc., I'll be happy
>>>> to respond to them.
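
The queue behaviour and the three notes above can be modelled in a few lines of Python. This is only a sketch under assumptions, not the actual apertium-transfer implementation: `B` and `apply_rule` are hypothetical names, the way the blanks are split into strings is my own guess, and the blanks outside the matched pattern ([blank1], [blank4]) and the chunk wrapper are left out.

```python
from collections import deque

B = object()  # stands in for a <b/> spot in the rule output (hypothetical name)


def apply_rule(template, input_blanks):
    """Return (rule_output, flushed_blanks) for one rule application."""
    # Note 3: an empty input blank is stored as a single space.
    queue = deque(blank if blank != "" else " " for blank in input_blanks)
    parts = []
    for item in template:
        if item is B:
            # Note 1: once the queue is empty, extra spots become spaces.
            parts.append(queue.popleft() if queue else " ")
        else:
            parts.append(item)
    # Note 2: blanks that found no <b/> spot come out after the rule output.
    return "".join(parts), "".join(queue)


# Assumed segmentation of the blanks between the three input LUs.
blanks = [" ;[blank2]; ", " [blank3]; "]

# EXAMPLE 1: no <b/> in the output, so both blanks are flushed afterwards.
example1 = apply_rule(["^wordta<det>$", "^wordtb<adj>$", "^ho<n><acr>$"], blanks)
# EXAMPLE 2: one <b/>, so one blank is printed in place and one is flushed.
example2 = apply_rule(["^wordta<det>$", B, "^ho<n><acr>$"], blanks)
# More spots than blanks, with an empty input blank: every spot prints a space.
extra = apply_rule(["a", B, "b", B, "c"], [""])

print(example2[0])  # ^wordta<det>$ ;[blank2]; ^ho<n><acr>$
print(extra[0])     # a b c
```

The flushed part is returned separately here so it is easy to see which blanks stayed inside the rule output and which ones were pushed after it.
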
>>>>
>>>> Thanks and Regards,
>>>> *तन्मय खन्ना*
>>>> *Tanmai Khanna*
>>>> _______________________________________________
>>>> Apertium-stuff mailing list
>>>> Apertium-stuff@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/apertium-stuff