Re: [Apertium-stuff] Update about superblanks in transfer

2020-09-04 Thread Tanmai Khanna
Hey guys, there's a few more important updates about blanks in Apertium.

1. Empty strings are now counted as blanks as well.
2. A blank is popped from the blank queue only if the <b/> or
<b pos="..."/> is inside <out>..</out>, so that you can do checks with
blanks (the front one in the queue at least) in other places, such as
macros, or within .. blocks.
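
The two points above can be pictured with a small queue model. This is only an illustrative Python sketch, not the actual transfer code; the class name BlankQueue and its methods are made up for the example, and the fallback to a single space when the queue is empty is an assumption.

```python
from collections import deque


class BlankQueue:
    """Toy model of the input blank queue (hypothetical, not Apertium's API)."""

    def __init__(self, blanks):
        # Empty strings are kept: they count as blanks too (point 1).
        self._queue = deque(blanks)

    def peek(self):
        """Look at the front blank without consuming it, e.g. for a check
        made outside the output section. Returns None if the queue is empty."""
        return self._queue[0] if self._queue else None

    def pop(self):
        """Consume the front blank; meant for a blank element in the output.
        Falling back to a single space on an empty queue is an assumption."""
        return self._queue.popleft() if self._queue else " "


blanks = BlankQueue([" ", "", "\n"])
front = blanks.peek()   # a check elsewhere does not consume anything
first = blanks.pop()    # output blank: consumes " "
second = blanks.pop()   # output blank: consumes the empty blank ""
```

After these calls the newline blank is still at the front of the queue, because only the two output blanks consumed anything.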

Thanks and Regards,
*तन्मय खन्ना *
*Tanmai Khanna*


On Sun, Aug 30, 2020 at 5:59 PM Tanmai Khanna 
wrote:

>> Are the changes being implemented going to alter the behavior of the
>> punctuation marks that are not analyzed as tokens?
>>
>
> Yes, as was discussed in the thread about markup handling in Apertium,
> input blanks are now read as a queue and output in order in the available
> <b/> spots in the rule output. So it might not be possible to strictly
> control the position of the blanks in the output as was done earlier, but
> that was pretty much the intention of the change.
>
> *तन्मय खन्ना *
> *Tanmai Khanna*
>
___
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff


Re: [Apertium-stuff] We now have markup handling and reordering in Apertium!

2020-09-04 Thread Tanmai Khanna
In response to this issue, empty blanks are considered blanks again, so a
<b/> or a <b pos="..."/> in the output will put an empty blank if the
input had an empty blank in that position.

Once again, a <b pos="..."/> and a <b/> now do the same thing. Earlier,
<b pos="..."/> was used to print a blank from the input and <b/> was used
to print a literal space in the output. Now, a <b/> will print a blank from
the input in order of the blanks read. In transfer rules written in the
future there's no need to add a pos attribute to <b/>, and the ones that
exist already will act the same as a <b/>.

This means that there's no way to reorder blanks from transfer rules now,
but that is by design. Hèctor, let me know if this solved your issue :)
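
As a rough illustration of the behaviour described above, here is a Python sketch in which the pos value on a blank is simply ignored and blanks are emitted in input order. The function render and the tuple encoding of the rule output are hypothetical and exist only for this example; the fallback to a plain space when no input blanks are left is an assumption.

```python
from collections import deque


def render(output_items, input_blanks):
    """Fill blank slots in a rule's output from the input blanks, in order.

    output_items: list of ("lu", text) or ("b", pos) tuples; pos may be None.
    The pos value is ignored, mirroring the statement that both forms of the
    blank element now behave the same. Hypothetical helper, not Apertium code.
    """
    queue = deque(input_blanks)
    parts = []
    for kind, value in output_items:
        if kind == "lu":
            parts.append(value)
        else:
            # Assumption: fall back to a plain space if the queue runs dry.
            parts.append(queue.popleft() if queue else " ")
    return "".join(parts)


# Input: "A", blank " ", "B", empty blank "", "C"; the rule outputs C B A.
with_pos = render(
    [("lu", "C"), ("b", 2), ("lu", "B"), ("b", 1), ("lu", "A")], [" ", ""]
)
without_pos = render(
    [("lu", "C"), ("b", None), ("lu", "B"), ("b", None), ("lu", "A")], [" ", ""]
)
```

Both calls give the same string: the words are reordered, but the space still comes first and the empty blank second, so the rule cannot move the blanks around.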

Thanks and Regards,
*तन्मय खन्ना *
*Tanmai Khanna*


On Fri, Sep 4, 2020 at 12:15 PM Hèctor Alòs i Font 
wrote:

> Message from Tanmai Khanna on Fri, 4 Sep
> 2020 at 9:22:
>
>> Hèctor,
>> Yes, the new improvements aren't backwards compatible but that's because
>> they're better than the system we had earlier. Here's the changes:
>>
>>> So, you are saying that the new stuff is not backwards compatible, aren't
>>> you? There aren't any <b/> in the rule, but <b pos="..."/>, which is not
>>> the same. Until now, <b/> means explicitly putting a blank, while
>>> <b pos="1,2..."/> means copying to the output whatever is in the input in a
>>> given point.
>>>
>>
>> <b pos="..."/> and <b/> now do exactly the same thing. You don't need
>> to replace all of the former with the latter, and whether you do or don't,
>> it won't change anything. Until now it meant what you said, but now it
>> means that if you see a <b/> or a <b pos="..."/> then print one blank from
>> the blank queue in the output.
>>
>>> Superblanks most of the time are blanks, but, as you now probably know
>>> better than anyone else, they can be lots of things; they can even contain
>>> no blanks at all. Even in some cases, like in Romance-language enclitics,
>>> we know there shouldn't be any blank at all before them, but we had to
>>> add <b pos="..."/> so as not to lose information on italics, bold letters,
>>> etc.
>>>
>>
>> You're right, except now we have a completely different system to deal
>> with italics, bold letters, and all markup, i.e. wordbound blanks, which
>> aren't considered blanks. Now that there is no information to lose, we
>> didn't want to burden the people who write transfer rules to explicitly
>> define positions of blanks. In cases where you don't want a space in the
>> output, you just don't put a <b/> in the output rule.
>>
>>
>>> I'm not really ready to change all the <b pos="..."/> in the hundreds of
>>> rules I've been writing in several language pairs. Specifically for
>>> apertium-fra-frp, I hope to be able to publish it before the new
>>> version of the Apertium core you are preparing, so they are needed right
>>> now.
>>>
>>
>> You won't have to change all of them. Most of them will work as they are.
>> The new system prints blanks in the same order as they were input, so it
>> won't harm most of the rules. The *only thing* you'll have to change is
>> rules where you don't want a space in the output between LUs: you remove
>> the <b pos="..."/> from those rules. This is because now, an empty blank
>> isn't considered a blank anymore. This was because we want the users to
>> have control over whether they want a blank or not between their output
>> LUs, regardless of the input blanks. If we consider an empty blank, your
>> problem will be solved, but other problems will come up, where empty blanks
>> will appear in the output regardless of <b/>s in the output.
>>
>> So to conclude, the only thing you need to remove is the <b pos="..."/>
>> from rules where you know you don't want a space in the output, like num_n,
>> and maybe some enclitics. Apart from that, everything will work as it is.
>> To improve the system, at some point we'll have to add a change that isn't
>> strictly backwards compatible, and several people agree that after
>> wordbound blanks, we should stop handling blank positions in transfer rules.
>>
>
> The problem is that in 99% of the cases I want a blank in num_n, that is,
> between the numeral and the noun. In most of the cases we have "two cows",
> "3 dogs", etc. In Romance languages, the rule is needed mostly for gender
> agreement. The problem is that sometimes, as we see, we get something else.
> So the question is not whether I want a blank there or not. I want whatever
> was there. So, let me try to formulate it in another way. If I want to
> preserve what was written between two words, I shouldn't write
> <b pos="1,2..."/>, but if I want to add a blank, I have to add <b/>. Am I
> right? If this is correct, it amounts to removing all <b pos="1,2..."/>. It
> seems it would be easier if they simply weren't taken into account, thus
> avoiding any change in the language pairs. Am I missing something?
>
> Hèctor
>
>
>>
>> If this isn't acceptable, we can discuss other possible solutions :)
>>
>> *तन्मय खन्ना *
>> *Tanmai Khanna*