On 24 July 2010 13:15, Kevin Brubeck Unhammer <[email protected]> wrote:
> I've noticed a lot more rules that all could do with this <exception>,
> at least a fifth of the sme-nob chunking rules have possibilities for
> mis-chunking (eg. det.loc + n.ill should not be chunked, but most
> other cases of det and n should be chunked), the same for all the
> conjunction rules (in the first interchunk) that merge two chunks.
>

Not exactly sure what you're saying here. The main point of this is not
to throw lookahead all over the place, but to stop a pattern from
'stealing' from a real chunk. If there's nothing that could follow n.ill
that would form a proper chunk, then you don't need this: just check as
normal and output two chunks from the existing rule, which is probably
what you're already doing.

> In the above example I could have just added extra almost-identical
> rules to cover all the patterns (involving a lot of redundancy), but
> if the exception depends on target-language information even that
> wouldn't do it. Eg. most verbs both in Bokmål and Sámi have adjective
> forms, so we allow Sámi <v><adj> to enter into ADJ NOM rules. But some
> Sámi verbs translate to a certain class of Bokmål verbs (lexicalised
> passives) that don't have adj forms, these get the tag <pstv> in
> bidix, but we can't know that from the <pattern>; here the <exception>
> would be great.

It sounds to me like you're just not being precise enough in the pattern
items, but the whole area of derivational morphology bores me to sleep,
so maybe I missed something.

-- 
<Leftmost> jimregan, that's because deep inside you, you are evil.
<Leftmost> Also not-so-deep inside you.

_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
