
Re: [Apertium-stuff] Update about superblanks in transfer

2020-09-04 Thread Tanmai Khanna

Hey guys, there are a few more important updates about blanks in Apertium.

1. Empty strings are now counted as blanks as well.
2. A blank is popped from the blank queue only if the <b/> or <b pos="X"/> is
inside <out>..</out>, so that you can do checks with blanks (the front one in
the queue at least) in other places, such as macros, or within .. blocks.
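
A minimal sketch of these two points in Python, not the actual Apertium code: the class name BlankQueue and the methods peek and pop_for_output are hypothetical, chosen only for illustration. It assumes that a check made outside the rule's output block only looks at the front blank, that output consumes it, and that a spot with no blank left falls back to a space, as described earlier in the thread.

```python
from collections import deque


class BlankQueue:
    """Hypothetical model of the blank queue used while a rule runs."""

    def __init__(self, blanks):
        # Empty strings are kept: they count as blanks too.
        self._blanks = deque(blanks)

    def peek(self):
        """Look at the front blank without consuming it (e.g. in a macro)."""
        return self._blanks[0] if self._blanks else None

    def pop_for_output(self):
        """Consume the front blank; fall back to a space if none is left."""
        return self._blanks.popleft() if self._blanks else " "


queue = BlankQueue(["", "-"])
front = queue.peek()            # a check elsewhere leaves the queue intact
first = queue.pop_for_output()  # the empty blank is output as-is
second = queue.pop_for_output()
third = queue.pop_for_output()  # queue exhausted, so a plain space
print(repr(front), repr(first), repr(second), repr(third))
```

Under these assumptions an empty input blank occupies a slot in the queue like any other blank, so it fills an output spot with nothing rather than with a space.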

Thanks and Regards,
*तन्मय खन्ना *
*Tanmai Khanna*

___
Apertium-stuff mailing list
Apertium-stuff@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/apertium-stuff

Re: [Apertium-stuff] Update about superblanks in transfer

2020-08-30 Thread Tanmai Khanna

> Are the changes being implemented going to alter the behavior of the
> punctuation marks that are not analyzed as tokens?
>

Yes, as was discussed in the thread about markup handling in Apertium,
input blanks are now read as a queue and output in order in the available
<b/> spots in the rule output. So it might not be possible to strictly
control the position of the blanks in the output as was done earlier, but
that was pretty much the intention of the change.
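
The queue behaviour can be sketched roughly as follows. This is an illustrative Python model, not the transfer code itself: the render function and the B sentinel (standing in for a <b/> spot) are made-up names, and the chunk wrapper around the rule output is left out. The blank strings are borrowed from the test examples in the original announcement.

```python
from collections import deque

B = object()  # stands in for a <b/> spot in the rule output


def render(output_items, input_blanks):
    """Fill each B spot from the queue of input blanks, in order."""
    queue = deque(input_blanks)
    parts = []
    for item in output_items:
        if item is B:
            # Spots beyond the number of input blanks fall back to a space.
            parts.append(queue.popleft() if queue else " ")
        else:
            parts.append(item)
    # Blanks that found no spot are emitted after the rule output.
    return "".join(parts) + "".join(queue)


blanks = [" ;[blank2]; ", " [blank3]; "]
# No spot at all: both blanks are flushed after the words.
print(repr(render(["^wordta$", "^wordtb$", "^ho$"], blanks)))
# One spot: the first blank fills it, the second is flushed.
print(repr(render(["^wordta$", B, "^ho$"], blanks)))
# Three spots, two blanks: the last spot becomes a plain space.
print(repr(render(["^a$", B, "^b$", B, "^c$", B, "^d$"], blanks)))
```

The sketch makes the trade-off visible: the rule author decides only how many blank spots exist, not which input blank lands in which spot.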

*तन्मय खन्ना *
*Tanmai Khanna*

Re: [Apertium-stuff] Update about superblanks in transfer

2020-08-30 Thread Jaume Ortolà i Font

Message from Tino Didriksen on Sun, 30 Aug 2020 at 11:15:

> Why is - a blank in the first place? If it's needed in contexts, it should
> be fully analyzed as a token.
> This goes for all Apertium languages and pairs. I don't understand why
> punctuation generally isn't analyzed. I assume it's just historic.
>

There are pros and cons. For instance, if you analyze a quotation mark (")
as a token, you need to adjust every disambiguation rule where the quote
can appear (which is everywhere, in fact), and that can be very annoying.

I don't have a definitive answer. My guess (in the languages I am familiar
with) is that most punctuation marks should interrupt the analysis, except
for quotation marks, which should not (with some exceptions in turn).

Are the changes being implemented going to alter the behavior of the
punctuation marks that are not analyzed as tokens?

Jaume

Re: [Apertium-stuff] Update about superblanks in transfer

2020-08-30 Thread Tanmai Khanna

Also, I agree with Tino, if the punctuation is important for the context
then it should probably be analysed as a token.

*तन्मय खन्ना *
*Tanmai Khanna*



Re: [Apertium-stuff] Update about superblanks in transfer

2020-08-30 Thread Tanmai Khanna

Hèctor, what I mean is: if you don't want a space in the output of rules, you
have to remove the <b/>.
For example:

$ echo "f.75v." | apertium -d .. fra-frp

*f.75 v..


This space between 75 and v is now there because the output rule has a <b/>,
and so if you want the output to come without a space, you should remove
the <b/>.

[transfer rule XML not preserved in the archive]

This is because if the input blank is an empty string then that isn't
counted as a blank. Does that work?

*तन्मय खन्ना *
*Tanmai Khanna*



Re: [Apertium-stuff] Update about superblanks in transfer

2020-08-30 Thread Hèctor Alòs i Font

Message from Tanmai Khanna on Sun, 30 Aug 2020 at 12:06:

> Hi Hèctor,
> I'm dealing with the issues I see one by one.
> 1. I was flushing the remaining blanks after processOut because I thought
> usually we only have one <out>..</out> block in the rule, but in some of
> your rules there are multiple, so in the latest commit to apertium/apertium,
> I made them flush after the rule has finished outputting entirely. This
> solves some of the issues, such as:
>
> $ echo "au lycée Louis-le-Grand" | apertium -d .. fra-frp
>
> u licê Louis-lo-Grant.
>
>
>
It's too difficult to have a single <out> when dealing with complex
structures. For instance, in French there is "not + verb + secondary-not",
but in Arpitan I have "verb + not". Furthermore, the verb can be in a past
tense in the source language but need "aux + participle" in the target
language (and I have to work out which of the auxiliaries to use). Moreover,
the verb can be pronominal in the output language but not in the source.
So I use macros that deal with each of these issues and add or remove
stuff. The result is a kind of multi-step output (and I'm not the only one
who does this).


> 2. The spaces between numbers in your output are probably coming because
> you have <b/> in the rules. If you remove those, the spaces will go away.
>

I can't remove the <b/> in the rules. They are added when a new word is added,
so I must add a blank too, at its beginning or its end.


Re: [Apertium-stuff] Update about superblanks in transfer

2020-08-30 Thread Tino Didriksen

Why is - a blank in the first place? If it's needed in contexts, it should
be fully analyzed as a token.

This goes for all Apertium languages and pairs. I don't understand why
punctuation generally isn't analyzed. I assume it's just historic.

-- Tino Didriksen



Re: [Apertium-stuff] Update about superblanks in transfer

2020-08-30 Thread Tanmai Khanna

Hey,
It solved the Franco-Japanese issue as well :D

Hèctor, can you check the diff once more and see if there are any more issues?
(After updating apertium.)
*तन्मय खन्ना *
*Tanmai Khanna*



Re: [Apertium-stuff] Update about superblanks in transfer

2020-08-30 Thread Tanmai Khanna

Hi Hèctor,
I'm dealing with the issues I see one by one.
1. I was flushing the remaining blanks after processOut because I thought
usually we only have one <out>..</out> block in the rule, but in some of
your rules there are multiple, so in the latest commit to apertium/apertium,
I made them flush after the rule has finished outputting entirely. This
solves some of the issues, such as:

$ echo "au lycée Louis-le-Grand" | apertium -d .. fra-frp

u licê Louis-lo-Grant.


2. The spaces between numbers in your output are probably coming because
you have <b/> in the rules. If you remove those, the spaces will go away.


I'm still evaluating some other issues.


*तन्मय खन्ना *
*Tanmai Khanna*


Re: [Apertium-stuff] Update about superblanks in transfer

2020-08-30 Thread Hèctor Alòs i Font

Message from Tanmai Khanna on Sun, 30 Aug 2020 at 9:49:

> My guess is, the transfer rule for Franco-Japanese has a two-word input,
> so the stored blank is "-". Now the output has 3 words, "una
> Franco-Japonêsa"; since the blanks are printed in order, they're printed in
> the first available <b/> spot in the output rules.
>

Yes, that's it. "Franco" is a prefix and it is analysed as such. I have some
tens of prefixes to avoid having hundreds of words in the
dictionaries and, more importantly, to be able to deal with unknown pairs like
"franco-tibétain" or "franco-silésien".



Re: [Apertium-stuff] Update about superblanks in transfer

2020-08-30 Thread Tanmai Khanna

My guess is, the transfer rule for Franco-Japanese has a two-word input, so
the stored blank is "-". Now the output has 3 words, "una Franco-Japonêsa";
since the blanks are printed in order, they're printed in the first
available <b/> spot in the output rules.

There are a few possible solutions for this. One idea is to have two kinds of
blank markers: one that will always print a space, and one that will print
available input blanks. This can also be implemented by having a
<lit v=" "/> in the output rule and then <b/> in the next spot. If this seems
too hacky a solution we can discuss other options.
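
To make the two-marker idea concrete, here is a rough Python sketch. The SPACE and BLANK sentinels, the render function, and the layout of the output rule are all assumptions made for illustration; the real fra-frp rule is not shown in this thread, and the sketch does not try to reproduce the exact spacing Hèctor reported.

```python
from collections import deque

SPACE = object()  # hypothetical marker: always prints a space
BLANK = object()  # hypothetical marker: prints the next queued input blank


def render(output_items, input_blanks):
    """Render rule output, feeding only BLANK spots from the blank queue."""
    queue = deque(input_blanks)
    parts = []
    for item in output_items:
        if item is SPACE:
            parts.append(" ")
        elif item is BLANK:
            parts.append(queue.popleft() if queue else " ")
        else:
            parts.append(item)
    # Unused input blanks are flushed after the rule output.
    return "".join(parts) + "".join(queue)


# Only queue-fed spots: the hyphen lands in the first spot, after "una".
print(render(["una", BLANK, "Franco", BLANK, "Japonêsa"], ["-"]))
# A fixed space after the inserted word keeps the hyphen for the next spot.
print(render(["una", SPACE, "Franco", BLANK, "Japonêsa"], ["-"]))
```

With a fixed space after the newly inserted word, the single stored "-" is still at the front of the queue when the spot between "Franco" and "Japonêsa" is reached.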

*तन्मय खन्ना *
*Tanmai Khanna*


On Sun, Aug 30, 2020 at 12:09 PM Tanmai Khanna 
wrote:

> Hèctor,
> No worries I'll look into this. Can you send the input sentences? I want
> to see the transfer rules that are applying to the erroneous parts. They
> might need some changing.
>
> तन्मय खन्ना
> Tanmai Khanna
>
> --
> *From:* Hèctor Alòs i Font 
> *Sent:* Sunday, August 30, 2020 11:57:16 AM
> *To:* [apertium-stuff] 
> *Subject:* Re: [Apertium-stuff] Update about superblanks in transfer
>
> Unfortunately, I found a lot of problems caused by superblanks, especially
> with the handling of hyphens. See a couple of differences in translations
> of my French test corpus into Arpitan before and after the update:
>
> < 00607. Tandis que les Tétes Broulâyes sont en *permission sur *Espritos
> Marcos, tomba amouerox de Yvonne, una Franco-Japonêsa.
> ---
> > 00607. Tandis que les Tétes Broulâyes sont en *permission sur *Espritos
> Marcos, tomba amouerox de Yvonne, una- Franco Japonêsa.
>
> < 00748. On povêt per ègzemplo parlar, sot Charlo-lo-Pelâ, de la
> "*foresta" des pêrches de la Sêna.
> ---
> > 00748. On povêt per ègzemplo parlar, sot Charlo-lo- Pelâ, de la
> "*foresta" des pêrches de la Sêna.
>
> Hèctor
>
> Message from Tanmai Khanna on Sat, 29 Aug 2020 at 16:50:
>
> Hey guys!
> The wordbound blanks project handles blanks that are supposed to be
> reordered. Therefore, we no longer need the user to be worried about blank
> positions in transfer rules. The latest update to the apertium code makes
> it such that <b pos="X"/> is now the same as <b/>. You can change the
> <b pos="X"/> in your transfer rules to just <b/> and it'll work.
>
> Now, the only thing you need to worry about when writing transfer rules is
> whether you want a blank between the two LUs or not. *Input blanks will
> be stored as a queue and will be printed in order in all
> available <b/> spots in the rule output.*
>
> *Note:*
> - If the output rule has more blank spots than input blanks, then the
> remaining blank spots will be spaces.
> - If the output rule has fewer blank spots than input blanks, then the
> remaining input blanks will be output after the rule output.
> - If the input blank is an empty string, it is stored as a space.
>
> In some transfer rules, there are input patterns which don't have a space
> between them. In the output section of these transfer rules, <b pos="X"/> used
> to give an empty string, but it will now give a space. To remove the blank
> from the output, you will need to remove the <b pos="X"/> from the
> transfer rule and it will be fine.
>
> Here are some examples from the tests.
>
> EXAMPLE 1:
> Input:
>
> [blank1] ^worda/wordta$ ;[blank2]; ^wordb/wordtb$ 
> [blank3];  ^hun/ho$ [blank4]
>
> There's no <b/> in the rule output, so all blanks are flushed after the
> rule output.
>
> Output:
>
> [blank1] ^test1{^wordta$^wordtb$^ho$}$ ;[blank2];  
> [blank3];   [blank4]
>
> EXAMPLE 2:
> Input:
>
> [blank1] ^wordb/wordtb$ ;[blank2]; ^worda/wordta$ 
> [blank3];  ^hun/ho$ [blank4]
>
> There's one <b/> in the rule output, so it prints one and flushes the rest.
>
> Output:
>
> [blank1] ^test1{^wordta$ ;[blank2]; ^ho$}$ [blank3];   
> [blank4]
>
> This has been implemented for the chunker, interchunk, and postchunk.
>
> If you have any questions, suggestions, comments, etc., I'll be happy to
> respond to them.
>
> Thanks and Regards,
> *तन्मय खन्ना *
> *Tanmai Khanna*

Re: [Apertium-stuff] Update about superblanks in transfer

2020-08-30 Thread Tanmai Khanna

Hèctor,
No worries, I'll look into this. Can you send the input sentences? I want to see
the transfer rules that are applying to the erroneous parts. They might need
some changing.

तन्मय खन्ना
Tanmai Khanna


From: Hèctor Alòs i Font 
Sent: Sunday, August 30, 2020 11:57:16 AM
To: [apertium-stuff] 
Subject: Re: [Apertium-stuff] Update about superblanks in transfer

Unfortunately, I found a lot of problems caused by superblanks, especially with
the handling of hyphens. See a couple of differences in translations of my 
French test corpus into Arpitan before and after the update:

< 00607. Tandis que les Tétes Broulâyes sont en *permission sur *Espritos 
Marcos, tomba amouerox de Yvonne, una Franco-Japonêsa.
---
> 00607. Tandis que les Tétes Broulâyes sont en *permission sur *Espritos 
> Marcos, tomba amouerox de Yvonne, una- Franco Japonêsa.

< 00748. On povêt per ègzemplo parlar, sot Charlo-lo-Pelâ, de la "*foresta" des 
pêrches de la Sêna.
---
> 00748. On povêt per ègzemplo parlar, sot Charlo-lo- Pelâ, de la "*foresta" 
> des pêrches de la Sêna.

Hèctor

Message from Tanmai Khanna on Sat., 29 Aug. 2020 at 16:50:
Hey guys!
The wordbound blanks project handles blanks that are supposed to be reordered. 
Therefore, we no longer need the user to be worried about blank positions in 
transfer rules. The latest update to the apertium code makes it such that
<b pos="X"/> is now the same as <b/>. You can change the <b pos="X"/> in your
transfer rules to just <b/> and it'll work.
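
For a pair with many rules, the replacement could be scripted. This is only a rough sketch in Python, assuming the elements in question are written as <b pos="N"/> in the transfer files; drop_pos is a hypothetical helper name and not part of Apertium.

```python
import re

# Matches a blank element that carries a pos attribute, e.g. <b pos="1"/>.
# Assumption: the attribute value is double-quoted, as in typical rule files.
B_POS = re.compile(r'<b\s+pos\s*=\s*"[^"]*"\s*/>')


def drop_pos(xml_text: str) -> str:
    """Return xml_text with every <b pos="..."/> rewritten as a plain <b/>."""
    return B_POS.sub("<b/>", xml_text)


if __name__ == "__main__":
    sample = '<out><lu><clip pos="1" side="tl" part="whole"/></lu><b pos="1"/></out>'
    print(drop_pos(sample))
```

The rewrite is purely textual, so rules whose input pattern has no space between the LUs would still need to be checked by hand.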

Now, the only thing you need to worry about when writing transfer rules is 
whether you want a blank between the two LUs or not. Input blanks will be 
stored as a queue and will be printed in order in all available <b/> spots in
the rule output.

Note:
- If the output rule has more blank spots than input blanks, then the remaining 
blank spots will be spaces.
- If the output rule has fewer blank spots than input blanks, then the remaining
input blanks will be output after the rule output.
- If the input blank is an empty string, it is stored as a space.

In some transfer rules, there are input patterns which don't have a space 
between them. In the output section of these transfer rules, <b pos="X"/> used
to give an empty string, but it will now give a space. To remove the blank from
the output, you will need to remove the <b pos="X"/> from the transfer rule and
it will be fine.

Here are some examples from the tests.

EXAMPLE 1:
Input:

[blank1] ^worda/wordta$ ;[blank2]; ^wordb/wordtb$ [blank3]; 
 ^hun/ho$ [blank4]


There's no <b/> in the rule output, so all blanks are flushed after the rule
output.

Output:

[blank1] ^test1{^wordta$^wordtb$^ho$}$ ;[blank2];  
[blank3];   [blank4]


EXAMPLE 2:
Input:

[blank1] ^wordb/wordtb$ ;[blank2]; ^worda/wordta$ [blank3]; 
 ^hun/ho$ [blank4]


There's one <b/> in the rule output, so it prints one and flushes the rest.

Output:

[blank1] ^test1{^wordta$ ;[blank2]; ^ho$}$ [blank3];   
[blank4]
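
The queue behaviour in the notes and the two examples can be sketched in a few lines of Python. This is a simplified model written for illustration, not the actual Apertium code: apply_rule is a hypothetical name, the rule output is modelled as a flat list of strings, and the marker "<b/>" stands for a blank spot.

```python
from collections import deque


def apply_rule(input_blanks, output_items):
    """Place input blanks into the blank spots of one rule's output.

    input_blanks: blanks found between the matched input LUs, in order.
    output_items: output pieces; the string "<b/>" marks a blank spot.
    """
    # An empty input blank is stored as a single space.
    queue = deque(b if b != "" else " " for b in input_blanks)
    out = []
    for item in output_items:
        if item == "<b/>":
            # Take the next queued blank, or a space if the queue is empty.
            out.append(queue.popleft() if queue else " ")
        else:
            out.append(item)
    # Blanks that found no spot are flushed after the rule output.
    return "".join(out) + "".join(queue)


if __name__ == "__main__":
    blanks = [" ;[blank2]; ", " [blank3]; "]
    print(apply_rule(blanks, ["^wordta$", "^wordtb$", "^ho$"]))
    print(apply_rule(blanks, ["^wordta$", "<b/>", "^ho$"]))
```

With no blank spot in the output, both blanks end up after the rule output. With one spot, the first blank fills it and the second is flushed, which mirrors examples 1 and 2.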


This has been implemented for the chunker, interchunk, and postchunk.

If you have any questions, suggestions, comments, etc., I'll be happy to 
respond to them.

Thanks and Regards,
तन्मय खन्ना
Tanmai Khanna

Re: [Apertium-stuff] Update about superblanks in transfer

2020-08-30 Thread Hèctor Alòs i Font

Unfortunately, I found a lot of problems caused by superblanks, especially
with the handling of hyphens. See a couple of differences in translations
of my French test corpus into Arpitan before and after the update:

< 00607. Tandis que les Tétes Broulâyes sont en *permission sur *Espritos
Marcos, tomba amouerox de Yvonne, una Franco-Japonêsa.
---
> 00607. Tandis que les Tétes Broulâyes sont en *permission sur *Espritos
Marcos, tomba amouerox de Yvonne, una- Franco Japonêsa.

< 00748. On povêt per ègzemplo parlar, sot Charlo-lo-Pelâ, de la "*foresta"
des pêrches de la Sêna.
---
> 00748. On povêt per ègzemplo parlar, sot Charlo-lo- Pelâ, de la
"*foresta" des pêrches de la Sêna.

Hèctor

Message from Tanmai Khanna on Sat., 29 Aug. 2020 at 16:50:

> Hey guys!
> The wordbound blanks project handles blanks that are supposed to be
> reordered. Therefore, we no longer need the user to be worried about blank
> positions in transfer rules. The latest update to the apertium code makes
> it such that <b pos="X"/> is now the same as <b/>. You can change the
> <b pos="X"/> in your transfer rules to just <b/> and it'll work.
>
> Now, the only thing you need to worry about when writing transfer rules is
> whether you want a blank between the two LUs or not. *Input blanks will
> be stored as a queue and will be printed in order in all
> available <b/> spots in the rule output.*
>
> *Note:*
> - If the output rule has more blank spots than input blanks, then the
> remaining blank spots will be spaces.
> - If the output rule has fewer blank spots than input blanks, then the
> remaining input blanks will be output after the rule output.
> - If the input blank is an empty string, it is stored as a space.
>
> In some transfer rules, there are input patterns which don't have a space
> between them. In the output section of these transfer rules, <b pos="X"/> used
> to give an empty string, but it will now give a space. To remove the blank
> from the output, you will need to remove the <b pos="X"/> from the
> transfer rule and it will be fine.
>
> Here are some examples from the tests.
>
> EXAMPLE 1:
> Input:
>
> [blank1] ^worda/wordta$ ;[blank2]; ^wordb/wordtb$ 
> [blank3];  ^hun/ho$ [blank4]
>
> There's no <b/> in the rule output, so all blanks are flushed after the
> rule output.
>
> Output:
>
> [blank1] ^test1{^wordta$^wordtb$^ho$}$ ;[blank2];  
> [blank3];   [blank4]
>
> EXAMPLE 2:
> Input:
>
> [blank1] ^wordb/wordtb$ ;[blank2]; ^worda/wordta$ 
> [blank3];  ^hun/ho$ [blank4]
>
> There's one <b/> in the rule output, so it prints one and flushes the rest.
>
> Output:
>
> [blank1] ^test1{^wordta$ ;[blank2]; ^ho$}$ [blank3];   
> [blank4]
>
> This has been implemented for the chunker, interchunk, and postchunk.
>
> If you have any questions, suggestions, comments, etc., I'll be happy to
> respond to them.
>
> Thanks and Regards,
> *तन्मय खन्ना *
> *Tanmai Khanna*

Re: [Apertium-stuff] Update about superblanks in transfer

2020-08-29 Thread Kevin Brubeck Unhammer

Tanmai Khanna wrote:

> we no longer need the user to be worried about blank
> positions in transfer rules. The latest update to the apertium code makes
> it such that <b pos="X"/> is now the same as <b/>. You can change the
> <b pos="X"/> in your transfer rules to just <b/> and it'll work.
>
> Now, the only thing you need to worry about when writing transfer rules is
> whether you want a blank between the two LUs or not. 

  



Re: [Apertium-stuff] update of wiki on xixona

2013-11-12 Thread Francis Tyers

On Tue, 12 Nov 2013 at 18:19 +, Francis Tyers wrote:
> Dear all,
>
> I have updated the versions of PHP and MediaWiki on xixona. This means
> that we are now running version 1.19, which might be less prone to
> spam attacks.
>
> Please be vigilant, and if you notice anything untoward with the Wiki.

Updating the server --- I should have anticipated this --- has caused
all the language pairs on apertium.org to stop working. :(

I am going to try and fix this. 

Fran



Re: [Apertium-stuff] update

2011-11-17 Thread Jimmy O'Regan

2011/11/16 Felipe Sánchez Martínez fsanc...@dlsi.ua.es:
> Hi all,
>
> I think the task
>
> find X rules for how to translate words with more than one possible
> translation
>
> could be misunderstood, as they could mix lexical selection problems
> with part-of-speech ambiguity problems. I see graduate students doing
> so, every year.

Each task will also include a longer description; we're talking only
about the title, which will be the first thing the student sees. The
two things I heard repeated about why students didn't choose a task
last year were "I didn't see it", which we can do little about, and "I
didn't understand it, so I didn't click it", which we can address by
using more user-friendly titles.

-- 
<Sefam> Are any of the mentors around?
<jimregan> yes, they're the ones trolling you


Re: [Apertium-stuff] update

2011-11-17 Thread Kevin Brubeck Unhammer

Felipe Sánchez Martínez
fsanc...@dlsi.ua.es writes:

> Hi all,
>
> I think the task
>
> find X rules for how to translate words with more than one possible
> translation
>
> could be misunderstood, as they could mix lexical selection problems
> with part-of-speech ambiguity problems. I see graduate students doing
> so, every year.

Agree … perhaps a link to http://wiki.apertium.org/wiki/Ambiguity in the
description would be enough? I'm not sure there's a shorter way of
saying it if you don't already know the concepts.

On 16/11/11 15:20, Francis Tyers wrote:
On Wed, 16 Nov 2011 at 14:18 +, Jimmy O'Regan wrote:
On 16 Nov 2011 14:09, Francis Tyers fty...@prompsit.com wrote:

On Wed, 16 Nov 2011 at 14:02 +, Jimmy O'Regan wrote:
On 16 Nov 2011 13:35, Francis Tyers fty...@prompsit.com wrote:

Hey all!

I've thrown all the parts together and have a working prototype of the
lexical selection module. A rule compiler, and a processor.

At the moment the rule format is like:

https://apertium.svn.sourceforge.net/svnroot/apertium/branches/apertium-lex-tools/examples/rules.txt

But we have also discussed an XML-based format, which would be like:

https://apertium.svn.sourceforge.net/svnroot/apertium/branches/apertium-lex-tools/examples/rules.xml

I would like to, as my next step, improve the rule compiler (at the
moment there is a lot of string mangling that I think could be improved
on -- e.g. for holding the pattern lengths/ids), and support the XML
format, but in order to do this, I would first like to get comments on
it. Is there anything that you would change? Do you feel comfortable
writing rules in this format?

It might be better to ask next week, when GCI tasks have been sorted
and finalised. Split focus and so on.

What a great idea! We could make some GCI tasks like come up with X
lexical selection rules for a language pair of your choice.

You'll want to rephrase that, significantly. GCI students are casually
browsing a list of titles so you should pick a title that doesn't rely
on a relatively obscure phrase - something that immediately informs
them that they probably already know this.

Yeah, how about: find X rules for how to translate words with more than
one possible translation ?


Re: [Apertium-stuff] update

2011-11-17 Thread Francis Tyers

On Thu, 17 Nov 2011 at 10:36 +0100, Kevin Brubeck Unhammer wrote:
> Felipe Sánchez Martínez
> fsanc...@dlsi.ua.es writes:
>
>> Hi all,
>>
>> I think the task
>>
>> find X rules for how to translate words with more than one possible
>> translation
>>
>> could be misunderstood, as they could mix lexical selection problems
>> with part-of-speech ambiguity problems. I see graduate students doing
>> so, every year.
>
> Agree … perhaps a link to http://wiki.apertium.org/wiki/Ambiguity in the
> description would be enough? I'm not sure there's a shorter way of
> saying it if you don't already know the concepts.
>

Nice work!

Fran



Re: [Apertium-stuff] update

2011-11-16 Thread Francis Tyers

On Wed, 16 Nov 2011 at 14:02 +, Jimmy O'Regan wrote:
On 16 Nov 2011 13:35, Francis Tyers fty...@prompsit.com wrote:

Hey all!

I've thrown all the parts together and have a working prototype of the
lexical selection module. A rule compiler, and a processor.

At the moment the rule format is like:

https://apertium.svn.sourceforge.net/svnroot/apertium/branches/apertium-lex-tools/examples/rules.txt

But we have also discussed an XML-based format, which would be like:

https://apertium.svn.sourceforge.net/svnroot/apertium/branches/apertium-lex-tools/examples/rules.xml

I would like to, as my next step, improve the rule compiler (at the
moment there is a lot of string mangling that I think could be improved
on -- e.g. for holding the pattern lengths/ids), and support the XML
format, but in order to do this, I would first like to get comments on
it. Is there anything that you would change? Do you feel comfortable
writing rules in this format?

It might be better to ask next week, when GCI tasks have been sorted
and finalised. Split focus and so on.

What a great idea! We could make some GCI tasks like "come up with X
lexical selection rules for a language pair of your choice".

I was also thinking of a GCI task to make a human interface for creating
the rules (maybe web-based?), offering the user all the possible
sentences, and asking them to mark them as ok/not ok and mark the
contextually important words. I can't think that it would be more than
a couple of days' work for someone with experience with PHP.

Fran


Re: [Apertium-stuff] update

2011-11-16 Thread Jimmy O'Regan

On 16 Nov 2011 14:09, Francis Tyers fty...@prompsit.com wrote:

On Wed, 16 Nov 2011 at 14:02 +, Jimmy O'Regan wrote:
On 16 Nov 2011 13:35, Francis Tyers fty...@prompsit.com wrote:

Hey all!

I've thrown all the parts together and have a working prototype of the
lexical selection module. A rule compiler, and a processor.

At the moment the rule format is like:

https://apertium.svn.sourceforge.net/svnroot/apertium/branches/apertium-lex-tools/examples/rules.txt

But we have also discussed an XML-based format, which would be like:

https://apertium.svn.sourceforge.net/svnroot/apertium/branches/apertium-lex-tools/examples/rules.xml

I would like to, as my next step, improve the rule compiler (at the
moment there is a lot of string mangling that I think could be improved
on -- e.g. for holding the pattern lengths/ids), and support the XML
format, but in order to do this, I would first like to get comments on
it. Is there anything that you would change? Do you feel comfortable
writing rules in this format?

It might be better to ask next week, when GCI tasks have been sorted
and finalised. Split focus and so on.

What a great idea! We could make some GCI tasks like come up with X
lexical selection rules for a language pair of your choice.

You'll want to rephrase that, significantly. GCI students are casually
browsing a list of titles so you should pick a title that doesn't rely on a
relatively obscure phrase - something that immediately informs them that
they probably already know this.

Nice idea. There are plenty of javascript draggy droppy things around, so
we could maybe also look for an interface to build basic transfer rules too.

Re: [Apertium-stuff] update

2011-11-16 Thread Jimmy O'Regan

On 16 November 2011 14:20, Francis Tyers fty...@prompsit.com wrote:
On Wed, 16 Nov 2011 at 14:18 +, Jimmy O'Regan wrote:
On 16 Nov 2011 14:09, Francis Tyers fty...@prompsit.com wrote:

On Wed, 16 Nov 2011 at 14:02 +, Jimmy O'Regan wrote:
On 16 Nov 2011 13:35, Francis Tyers fty...@prompsit.com wrote:

Hey all!

I've thrown all the parts together and have a working prototype of the
lexical selection module. A rule compiler, and a processor.

At the moment the rule format is like:

https://apertium.svn.sourceforge.net/svnroot/apertium/branches/apertium-lex-tools/examples/rules.txt

But we have also discussed an XML-based format, which would be like:

https://apertium.svn.sourceforge.net/svnroot/apertium/branches/apertium-lex-tools/examples/rules.xml

It might be better to ask next week, when GCI tasks have been sorted
and finalised. Split focus and so on.

What a great idea! We could make some GCI tasks like come up with X
lexical selection rules for a language pair of your choice.

Yeah, how about: find X rules for how to translate words with more than
one possible translation ?

Better - seems a little more inviting, so to speak.

I was also thinking of a GCI task to make a human interface for creating
the rules (maybe web-based?), offering the user all the possible
sentences, and asking them to mark them as ok/not ok and mark the
contextually important words. I can't think that it would be more than
a couple of days' work for someone with experience with PHP.

Nice idea. There are plenty of javascript draggy droppy things around,
so we could maybe also look for an interface to build basic transfer
rules too.

Exactly, also, web 2.0 stuff is hip with the kids!

Fo' shizz.

--
<Sefam> Are any of the mentors around?
<jimregan> yes, they're the ones trolling you

Re: [Apertium-stuff] update

2011-11-16 Thread Felipe Sánchez Martínez

Hi all,

I think the task

find X rules for how to translate words with more than one possible
translation

could be misunderstood, as they could mix lexical selection problems
with part-of-speech ambiguity problems. I see graduate students doing
so, every year.

Cheers
--
Felipe