In response to this issue, empty blanks are considered blanks again, so a
<b/> or a <b pos="..."/> in the output will emit an empty blank if the
input had an empty blank in that position.
Once again, <b/> and <b pos="..."/> now do the same thing. Earlier,
<b pos="..."/> was used to print a blank from the input and <b/> was used
to print a literal space in the output. Now, a <b/> will print a blank from
the input, in the order in which the blanks were read. In transfer rules
written in the future there's no need to add a pos attribute to <b/>, and
the ones that exist already will act the same as a plain <b/>.
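
To illustrate the behaviour described above, here is a toy Python model of
the blank queue. It is only a sketch and not the actual apertium-transfer
code: the function name `render`, the list-of-strings representation of a
rule's output, and the fallback to a single space when the queue runs out
are all assumptions made for the example.

```python
from collections import deque


def render(output_items, input_blanks):
    """Toy model: every <b/> or <b pos="..."/> in the rule output consumes
    the next blank read from the input, in order. The pos value is ignored."""
    queue = deque(input_blanks)
    pieces = []
    for item in output_items:
        if item == "<b/>" or item.startswith("<b pos="):
            # Both forms behave identically: take the next queued blank.
            # Assumption for this sketch: fall back to " " if none is left.
            pieces.append(queue.popleft() if queue else " ")
        else:
            pieces.append(item)
    return "".join(pieces)


# Input "two cows": the blank between the words is a single space.
print(render(["two", "<b/>", "cows"], [" "]))
# Input had an empty blank between the words, so the output has one too.
print(render(["3", '<b pos="1"/>', "dogs"], [""]))
```

Because blanks are always taken from the front of the queue, a rule can
leave a blank out by not writing <b/>, but it cannot choose which input
blank it gets.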
This means that there's no way to reorder blanks from transfer rules now,
but that is by design. Hèctor, let me know if this solved your issue :)
Thanks and Regards,
*तन्मय खन्ना *
*Tanmai Khanna*
On Fri, Sep 4, 2020 at 12:15 PM Hèctor Alòs i Font wrote:
> Message from Tanmai Khanna on Fri, 4 Sep 2020 at 9:22:
>
>> Hèctor,
>> Yes, the new improvements aren't backwards compatible, but that's because
>> they're better than the system we had earlier. Here are the changes:
>>
>>> So, you are saying that the new stuff is not backwards compatible, aren't
>>> you? There aren't any <b/> in the rule, but <b pos="..."/>, which is not
>>> the same. Until now, <b/> means explicitly putting a blank, while
>>> <b pos="1,2..."/> means copying to the output whatever is in the input at
>>> a given point.
>>>
>>
>> <b pos="..."/> and <b/> now do exactly the same thing. You don't need
>> to replace all of the former with the latter, and whether you do or not, it
>> won't change anything. Until now it meant what you said, but now it means
>> that if you see a <b/> or a <b pos="..."/>, then print one blank from
>> the blank queue in the output.
>>
>>> Superblanks most of the time are blanks, but, as you now probably know
>>> better than anyone else, they can be lots of things; they can even contain
>>> no blanks at all. Even in some cases, like Romance-language enclitics,
>>> where we know there shouldn't be any blank at all before them, we had to
>>> add <b pos="..."/> so as not to lose information on italics, bold letters,
>>> etc.
>>>
>>
>> You're right, except now we have a completely different system to deal
>> with italics, bold letters, and all markup, i.e. wordbound blanks, which
>> aren't considered blanks. Now that there is no information to lose, we
>> didn't want to burden the people who write transfer rules to explicitly
>> define positions of blanks. In cases where you don't want a space in the
>> output, you just don't put a <b/> in the output rule.
>>
>>
>>> I'm not really ready to change all <b pos="..."/> in the hundreds of
>>> rules I've been writing in several language pairs. Specifically for
>>> apertium-fra-frp, I hope to be able to publish it before the new
>>> version of the Apertium core you are preparing, so they are needed right
>>> now.
>>>
>>
>> You won't have to change all of them. Most of them will work as they are.
>> The new system prints blanks in the same order as they were input, so it
>> won't harm most of the rules. The *only thing* you'll have to change is
>> rules where you don't want a space in the output between LUs: there, you
>> remove the <b/> from those rules. This is because now, an empty blank
>> isn't considered a blank anymore. We did this because we want the users to
>> have control over whether they want a blank or not between their output
>> LUs, regardless of the input blanks. If we consider an empty blank a blank,
>> your problem will be solved, but other problems will come up, where empty
>> blanks will appear in the output regardless of <b/>s in the output.
>>
>> So to conclude, the only thing you need to remove is the <b/>
>> from rules where you know you don't want a space in the output, like num_n,
>> and maybe some enclitics. Apart from that, everything will work as it is.
>> To improve the system, at some point we'll have to add a change that isn't
>> strictly backwards compatible, and several people agree that after
>> wordbound blanks, we should stop handling blank positions in transfer rules.
>>
>
> The problem is that in 99% of the cases I want a blank in num_n, that is,
> between the numeral and the noun. In most of the cases we have "two cows",
> "3 dogs", etc. In Romance languages, the rule is needed mostly for gender
> agreement. The problem is that sometimes, as we see, we get something else.
> So the question is not whether I want a blank there or not. I want whatever
> was there. So, let me try to formulate it in another way. If I want to
> preserve what was written between two words, I shouldn't write
> <b pos="1,2..."/>, but if I want to add a blank, I have to add <b/>. Am I
> right? If this is correct, it comes down to removing all
> <b pos="1,2..."/>. It seems it would be easier if they were simply not
> taken into account, thus avoiding any change in the language pairs. Am I
> missing something?
>
> Hèctor
>
>
>>
>> If this isn't acceptable, we can discuss other possible solutions :)
>>
>> *तन्मय खन्ना *
>> *Tanmai Khanna*
>> ___
>> Apertium-stuff mailing list
>> Apertium-stuff@lists.sourceforge.net
>> https://lists.sourc