2010/7/11 Jacob Nordfalk <[email protected]>
> First: Great work Jimmy, trying to improve transfer is a good thing.
>
> 2010/7/11 Jimmy O'Regan <[email protected]>
>
>>
>> > Of course, one can always achieve the same as <exception> by using
>> > <choose><when> and duplicating the contents of the single-item rules,
>> > but, well, that means duplicating content… this looks like it would be
>> > a lot simpler to maintain (and less ugly than output macros).
>> >
>>
>> Yeah; having to repeat almost all of the same rules in two layers of
>> transfer is a bit excessive.
>>
>
> So now there would be 2 solutions to the rule duplication problem:
> - macros
> - 'exceptions' - where, as far as I understand, you keep track of the previous
> 'second longest' rule match and in effect backtrack if the current
> rule 'fails'/throws an exception.
>
> Really, there is a 3rd solution which no one has used, but it might deserve
> consideration:
> Using XSLT on the transfer files to preprocess them, like the metadixes.
> In this way you could write the rule body just once and use it in several
> rules. Some smart XSLT might even let us get rid of boring but nontrivial
> duplications like ADJ NOM, ADJ ADJ NOM, ADJ ADJ ADJ NOM, ADJ ADJ ADJ ADJ
> NOM by expanding a single rule like ADJ[1-4] NOM. This would, at least in the
> Esperanto pairs, allow us to catch a lot of stuff that occurs too seldom to
> justify another rule duplication, but which would be nice to cover.
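
To make the ADJ[1-4] NOM idea concrete, here is a minimal sketch of the expansion step. It is written in Python rather than XSLT purely for illustration, and the TAG[m-n] notation and the expand_pattern name are assumptions, not anything that exists in the transfer format today.

```python
import re

# Hypothetical notation: TAG[m-n] means "between m and n repetitions of TAG".
_RANGE = re.compile(r"^(\w+)\[(\d+)-(\d+)\]$")


def expand_pattern(pattern):
    """Expand e.g. 'ADJ[1-4] NOM' into one concrete tag sequence per count."""
    expansions = [[]]
    for item in pattern.split():
        m = _RANGE.match(item)
        if m:
            tag, lo, hi = m.group(1), int(m.group(2)), int(m.group(3))
            choices = [[tag] * n for n in range(lo, hi + 1)]
        else:
            choices = [[item]]
        # Combine every prefix built so far with every choice for this item.
        expansions = [prefix + c for prefix in expansions for c in choices]
    return expansions


for seq in expand_pattern("ADJ[1-4] NOM"):
    print(" ".join(seq))
# prints:
# ADJ NOM
# ADJ ADJ NOM
# ADJ ADJ ADJ NOM
# ADJ ADJ ADJ ADJ NOM
```

A real preprocessor would also have to copy the rule body once per expansion and renumber the positions it refers to, which this sketch does not attempt.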
>
>
>
>>
>> >
>> > I'm still trying to make it break :)
>> >
>>
>> Well, as I mentioned, if the rule that it's falling back to also has
>> an exception, which also evaluates to true, then you will get
>> breakage: you'll either lose all but the first word, or it'll
>> segfault, which are both drastically bad outcomes.
>>
>
> That's because you keep track of just the previous rule match.
> Really you'd need an array of all the shorter rule matches (so a match
> of ADJ ADJ ADJ NOM would be in the 4th place in the array).
>
> If you're interested I think I can implement that reasonably fast in the
> Java version.
>
> There are some things to consider. For example, suppose there is an ADJ ADJ
> ADJ NOM rule and both an ADJ ADJ ADJ and an ADJ ADJ NOM rule, and you'd like
> ADJ ADJ NOM to be applied. Left-to-right longest match dictates ADJ ADJ ADJ,
> so you'd have to make that fail, as well as ADJ ADJ and also ADJ.
>
>
>
BTW, if you think about it, there are really a lot more possibilities in the
FSTProcessor.
For example, we could also have several rules with the same match criteria
(i.e. same length) and then choose the first one. If it 'fails' we
continue to the next one, etc.
And a 'shortcut' exception going to the shortest match (or a match of a
specified length). In this way an ADJ ADJ ADJ NOM failure could point to the
1-element rule ADJ, and after that the next match would be ADJ ADJ NOM.
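
Both ideas could look roughly like this. It is a Python sketch only, with made-up names (RuleFailed, fallback_length, apply_at, apply_rules); nothing here reflects the existing transfer code. Rules are kept as a list so that several can share one pattern, and a failing rule may name the match length it wants to jump to.

```python
class RuleFailed(Exception):
    """A rule declines its match; fallback_length optionally names the
    match length to jump to (the hypothetical 'shortcut' exception)."""

    def __init__(self, fallback_length=None):
        super().__init__()
        self.fallback_length = fallback_length


def apply_at(tags, pos, rules):
    """Try rules at `pos`: longest first, file order among equal lengths.

    `rules` is a list of (pattern_tuple, body) pairs.
    Returns (output, number_of_tags_consumed).
    """
    matching = [(p, b) for p, b in rules
                if tuple(tags[pos:pos + len(p)]) == p]
    # Python's sort is stable, so same-length rules keep their file order.
    matching.sort(key=lambda rule: len(rule[0]), reverse=True)
    wanted = None  # set once a failing rule asks for a specific length
    for pattern, body in matching:
        if wanted is not None and len(pattern) != wanted:
            continue
        try:
            return body(tags[pos:pos + len(pattern)]), len(pattern)
        except RuleFailed as exc:
            if exc.fallback_length is not None:
                wanted = exc.fallback_length
    # Assumption: with no successful rule, pass one tag through unchanged.
    return tags[pos], 1


def apply_rules(tags, rules):
    output, pos = [], 0
    while pos < len(tags):
        result, consumed = apply_at(tags, pos, rules)
        output.append(result)
        pos += consumed
    return output


def shortcut_to_one(words):
    raise RuleFailed(fallback_length=1)


def plain_fail(words):
    raise RuleFailed()


rules = [
    (("ADJ", "ADJ", "ADJ", "NOM"), shortcut_to_one),
    (("ADJ", "ADJ", "ADJ"), lambda words: "ADJ3"),
    (("ADJ", "ADJ", "NOM"), plain_fail),              # first variant fails
    (("ADJ", "ADJ", "NOM"), lambda words: "ADJ2-NOM/b"),  # second is tried next
    (("ADJ",), lambda words: "ADJ1"),
]
print(apply_rules(["ADJ", "ADJ", "ADJ", "NOM"], rules))
# prints ['ADJ1', 'ADJ2-NOM/b']
```

The shortcut skips ADJ ADJ ADJ entirely, so it does not have to be made to fail just to reach the 1-element rule.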
--
Jacob Nordfalk
एस्पेरान्तो के हो? http://www.esperanto.org.np/.
Memoraĵoj de KEF -. http://kef.saluton.dk/memorajoj/
_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff