Hèctor, what I mean is that if you don't want a space in the output of a rule, you have to remove the <b/>. For example:

$ echo "f.75v." | apertium -d .. fra-frp
*f.75 v..

The space between 75 and v is there now because the output rule has a <b/>, so if you want the output to come without a space, you should remove the <b/>:

<rule comment="REGLA: NUM NOM">
  <pattern>
    <pattern-item n="num"/>
    <pattern-item n="nom"/>
  </pattern>
  <action>
    <call-macro n="f_concord2">
      <with-param pos="2"/>
      <with-param pos="1"/>
    </call-macro>
    <call-macro n="f_chunk_tags">
      <with-param pos="2"/>
    </call-macro>
    <out>
      <chunk name="num_n">
        <tags>
          <tag><lit-tag v="SN"/></tag>
          <tag><clip pos="2" side="tl" part="gen"/></tag>
          <tag><clip pos="2" side="tl" part="nbr"/></tag>
          <tag><var n="genero_sl"/></tag>
          <tag><var n="numero_sl"/></tag>
        </tags>
        <lu>
          <clip pos="1" side="tl" part="lemh"/>
          <clip pos="1" side="tl" part="tags"/>
          <clip pos="1" side="tl" part="lemq"/>
        </lu>
        <b pos="1"/>
        <lu>
          <clip pos="2" side="tl" part="lemh"/>
          <clip pos="2" side="tl" part="tags"/>
          <clip pos="2" side="tl" part="lemq"/>
        </lu>
      </chunk>
    </out>
  </action>
</rule>

This is because an empty input blank isn't counted as a blank, so the <b/> has no input blank to print and outputs a space instead. Does that work?

तन्मय खन्ना
Tanmai Khanna

On Sun, Aug 30, 2020 at 2:58 PM Hèctor Alòs i Font <hectora...@gmail.com> wrote:

> Message from Tanmai Khanna <khanna.tan...@gmail.com> on Sun, 30 Aug 2020
> at 12:06:
>
>> Hi Hèctor,
>> I'm dealing with the issues I see one by one.
>> 1. I was flushing the remaining blanks after processOut because I thought
>> we usually have only one <out>..</out> block in a rule, but some of
>> your rules have several, so in the latest commit to apertium/apertium
>> I made them flush after the rule has finished outputting entirely. This
>> solves some of the issues, such as:
>>
>> $ echo "au lycée Louis-le-Grand" | apertium -d .. fra-frp
>>
>> u licê Louis-lo-Grant.
>>
>
> It's too difficult to have a single <out> when dealing with complex
> structures.
> For instance, in French there is "not + verb + secondary-not",
> but in Arpitan I have "verb + not". Furthermore, the verb can be in a past
> tense in the source language but needs "aux + participle" in the target
> language (and I have to deal with which of the auxiliaries to use). Moreover,
> the verb can be pronominal in the output language, but not in the source.
> So I use macros that deal with each of these issues and add or remove
> stuff. The result is a kind of multi-step output (and I'm not the only one
> who does it).
>
>> 2. The spaces between numbers in your output are probably there because
>> you have <b/> in the rules. If you remove those, the spaces will go away.
>>
>
> I can't remove <b/> in the rules. They are added when a new word is added,
> so I must add a blank too, at its beginning or its end.
>
>>
>> I'm still evaluating some other issues.
>>
>> तन्मय खन्ना
>> Tanmai Khanna
>>
>> On Sun, Aug 30, 2020 at 1:21 PM Hèctor Alòs i Font <hectora...@gmail.com>
>> wrote:
>>
>>> Message from Tanmai Khanna <khanna.tan...@gmail.com> on Sun, 30 Aug
>>> 2020 at 9:49:
>>>
>>>> My guess is that the transfer rule for Franco-Japanese has a two-word
>>>> input, so the stored blank is "-". Now the output has 3 words, "una
>>>> Franco-Japonêsa"; since the blanks are printed in order, the "-" is
>>>> printed in the first available <b/> spot in the output rule.
>>>>
>>>
>>> Yes, that's it. "Franco" is a prefix and it is analysed as such. I have
>>> some tens of prefixes to avoid having hundreds of words in the
>>> dictionaries and, more importantly, to be able to deal with unknown pairs
>>> like "franco-tibétain" or "franco-silésien".
>>>
>>>>
>>>> There are a few possible solutions for this. One idea is to have two
>>>> kinds of blank markers - one that will always print a space, and one that
>>>> will print available input blanks. This can also be implemented by having a
>>>> <lit v=" "/> in the output rule and then <b/> in the next spot.
>>>> If this seems too hacky a solution, we can discuss other options.
>>>>
>>>> तन्मय खन्ना
>>>> Tanmai Khanna
>>>>
>>>> On Sun, Aug 30, 2020 at 12:09 PM Tanmai Khanna <khanna.tan...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hèctor,
>>>>> No worries, I'll look into this. Can you send the input sentences? I
>>>>> want to see the transfer rules that are applying to the erroneous parts.
>>>>> They might need some changing.
>>>>>
>>>>> तन्मय खन्ना
>>>>> Tanmai Khanna
>>>>>
>>>>> ------------------------------
>>>>> From: Hèctor Alòs i Font <hectora...@gmail.com>
>>>>> Sent: Sunday, August 30, 2020 11:57:16 AM
>>>>> To: [apertium-stuff] <apertium-stuff@lists.sourceforge.net>
>>>>> Subject: Re: [Apertium-stuff] Update about superblanks in transfer
>>>>>
>>>>> Unfortunately, I found a lot of problems caused by superblanks,
>>>>> especially with the handling of hyphens. See a couple of differences in
>>>>> translations of my French test corpus into Arpitan before and after the
>>>>> update:
>>>>>
>>>>> < 00607. Tandis que les Tétes Broulâyes sont en *permission sur *Espritos Marcos, tomba amouerox de Yvonne, una Franco-Japonêsa.
>>>>> ---
>>>>> > 00607. Tandis que les Tétes Broulâyes sont en *permission sur *Espritos Marcos, tomba amouerox de Yvonne, una- Franco Japonêsa.
>>>>>
>>>>> < 00748. On povêt per ègzemplo parlar, sot Charlo-lo-Pelâ, de la "*foresta" des pêrches de la Sêna.
>>>>> ---
>>>>> > 00748. On povêt per ègzemplo parlar, sot Charlo-lo- Pelâ, de la "*foresta" des pêrches de la Sêna.
>>>>>
>>>>> Hèctor
>>>>>
>>>>> Message from Tanmai Khanna <khanna.tan...@gmail.com> on Sat, 29 Aug
>>>>> 2020 at 16:50:
>>>>>
>>>>> Hey guys!
>>>>> The wordbound blanks project handles blanks that are supposed to be
>>>>> reordered. Therefore, users no longer need to worry about blank
>>>>> positions in transfer rules. With the latest update to the apertium code,
>>>>> <b pos="X"/> is now the same as <b/>.
>>>>> You can change the <b pos="X"/> in your transfer rules to just <b/> and
>>>>> it'll work.
>>>>>
>>>>> Now, the only thing you need to worry about when writing transfer
>>>>> rules is whether you want a blank between the two LUs or not. Input
>>>>> blanks will be stored as a queue and will be printed in order in all
>>>>> available <b/> spots in the rule output.
>>>>>
>>>>> Note:
>>>>> - If the output rule has more blank spots than input blanks, then the
>>>>> remaining blank spots will be spaces.
>>>>> - If the output rule has fewer blank spots than input blanks, then the
>>>>> remaining input blanks will be output after the rule output.
>>>>> - If the input blank is an empty string, it is stored as a space.
>>>>>
>>>>> In some transfer rules, there are input patterns which don't have a
>>>>> space between them. In the output section of these transfer rules,
>>>>> <b pos="1"/> used to give an empty string, but it will now give a space.
>>>>> To remove the blank from the output, you will need to remove the
>>>>> <b pos="1"/> from the transfer rule and it will be fine.
>>>>>
>>>>> Here are some examples from the tests.
>>>>>
>>>>> EXAMPLE 1:
>>>>> Input:
>>>>>
>>>>> [blank1] ^worda<det>/wordta<det>$ ;[blank2]; ^wordb<adj>/wordtb<adj>$ [blank3]; ^hun<n><acr>/ho<n><acr>$ [blank4]
>>>>>
>>>>> There's no <b/> in the rule output, so all blanks are flushed after the
>>>>> rule output.
>>>>>
>>>>> Output:
>>>>>
>>>>> [blank1] ^test1<adj>{^wordta<det>$^wordtb<adj>$^ho<n><acr>$}$ ;[blank2]; [blank3]; [blank4]
>>>>>
>>>>> EXAMPLE 2:
>>>>> Input:
>>>>>
>>>>> [blank1] ^wordb<adj>/wordtb<adj>$ ;[blank2]; ^worda<det>/wordta<det>$ [blank3]; ^hun<n><acr>/ho<n><acr>$ [blank4]
>>>>>
>>>>> There's one <b/> in the rule output, so it prints one blank and flushes
>>>>> the rest.
>>>>>
>>>>> Output:
>>>>>
>>>>> [blank1] ^test1<det>{^wordta<det>$ ;[blank2]; ^ho<n><acr>$}$ [blank3]; [blank4]
>>>>>
>>>>> This has been implemented for the chunker, interchunk, and postchunk.
>>>>>
>>>>> If you have any questions, suggestions, comments, etc., I'll be happy
>>>>> to respond to them.
>>>>>
>>>>> Thanks and Regards,
>>>>> तन्मय खन्ना
>>>>> Tanmai Khanna
>>>>> _______________________________________________
>>>>> Apertium-stuff mailing list
>>>>> Apertium-stuff@lists.sourceforge.net
>>>>> https://lists.sourceforge.net/lists/listinfo/apertium-stuff
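
For readers who want to experiment with the blank-queue behaviour described in the announcement quoted above, here is a rough Python sketch of the three notes (blanks printed in order into <b/> spots, spare spots becoming spaces, leftover blanks flushed after the rule output). It is only a toy model and not the apertium implementation: the function name render_rule_output and the marker B are made up for illustration, and it follows the announcement's note that an empty input blank is stored as a space.

```python
from collections import deque

# Hypothetical marker standing in for a <b/> spot in a rule's output.
B = object()


def render_rule_output(input_blanks, output_items):
    """Toy model of the blank queue (an assumption, not apertium's code).

    input_blanks: blanks found between the words matched by the rule.
    output_items: strings for the output words, with B marking each <b/> spot.
    """
    # Assumption taken from the announcement: an empty blank is stored as a space.
    queue = deque(blank if blank != "" else " " for blank in input_blanks)
    parts = []
    for item in output_items:
        if item is B:
            # Print the next queued input blank; if none is left, print a space.
            parts.append(queue.popleft() if queue else " ")
        else:
            parts.append(item)
    # Input blanks that found no <b/> spot are flushed after the rule output.
    parts.extend(queue)
    return "".join(parts)


# One input blank ("-") but two <b/> spots: the hyphen lands in the first spot.
print(render_rule_output(["-"], ["una", B, "Franco", B, "Japonêsa"]))
# -> una-Franco Japonêsa

# Empty input blank with one <b/> spot: a space is printed.
print(render_rule_output([""], ["75", B, "v."]))
# -> 75 v.
```

The first call shows in miniature why a hyphen can end up in the wrong place when a rule outputs more words than it matched: the queue has no notion of which blank belongs to which word.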