Hi

I've added a new pattern rule checker
(commit e26967dc4663283574a8d536308c13ad188b44a0)
and it reports this issue:

The Catalan rule: FORCA2:6, token [1], contains "força
                    " that contains token separators, so can't possibly be
matched.
The Catalan rule: FORCA2:7, token [1], contains "força
                    " that contains token separators, so can't possibly be
matched.

The problem is detected in
 
languagetool-language-modules/ca/target/classes/org/languagetool/resource/ca/disambiguation.xml
which looks like this:

<rule>
    <pattern>
        <marker>
            <token postag="_GN_FS">força<exception postag="_GV_"/>
            </token>
        </marker>
    </pattern>
    <disambig action="filter" postag="N.*|_GN_.*"></disambig>
</rule>

This means that the newline and spaces between
the <exception…/> element and the closing </token>
tag are slurped into the value of the token,
which I did not expect.
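
To illustrate the effect outside LanguageTool, here is a minimal sketch in
Python using the standard xml.sax module. It is not LanguageTool's loader;
the TokenTextCollector class and token_values helper are hypothetical names
made up for this example, and it assumes the token value is built by
concatenating every character chunk seen inside <token>.

```python
import xml.sax

# Trimmed copy of the rule above; note the newline and indentation
# between <exception/> and </token>.
RULE_XML = """<rule>
    <pattern>
        <marker>
            <token postag="_GN_FS">força<exception postag="_GV_"/>
            </token>
        </marker>
    </pattern>
</rule>"""


class TokenTextCollector(xml.sax.ContentHandler):
    """Hypothetical handler: collects all character data inside <token>."""

    def __init__(self):
        super().__init__()
        self.in_token = False
        self.chunks = []
        self.tokens = []

    def startElement(self, name, attrs):
        if name == "token":
            self.in_token = True
            self.chunks = []

    def characters(self, content):
        # Whitespace after <exception/> is still character data of <token>.
        if self.in_token:
            self.chunks.append(content)

    def endElement(self, name):
        if name == "token":
            self.tokens.append("".join(self.chunks))
            self.in_token = False


def token_values(xml_text):
    """Return the raw text value of every <token> element."""
    handler = TokenTextCollector()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.tokens


raw = token_values(RULE_XML)[0]
print(repr(raw))          # raw value, including the trailing whitespace
print(repr(raw.strip()))  # value after trimming
```

In this sketch the raw value keeps the trailing newline and indentation,
while the trimmed value is just the word itself, so a loader that trimmed
token text would treat both layouts of the rule the same way.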

Removing the spaces and newline after the
exception, as follows, silences the error, but I
wonder whether that whitespace should have been
stripped automatically:

<rule>
    <pattern>
        <marker>
            <token postag="_GN_FS">força<exception postag="_GV_"/></token>
        </marker>
    </pattern>
    <disambig action="filter" postag="N.*|_GN_.*"></disambig>
</rule>

Regards
Dominique