On 2014-09-03 10:57, Jaume Ortolà i Font wrote:
> 2014-09-03 9:59 GMT+02:00 Marcin Miłkowski <list-addr...@wp.pl>:
>
> > We could, in principle, try to add this kind of test to the
> > disambiguator action but I'm not sure if it won't break something.
>
> Hi,
>
> I agree with Dominique. This behavior can generate errors which are very
> hard to detect.
>
> If we change it, it's more than probable that something will break, as
> we have a lot of disambiguator rules now. Perhaps some filter action is
> to be replaced by a replace action. But I think it's worth a try.

As I expected, it's not trivial. It works fine for Polish and English but
not for Catalan. There must be a filter somewhere in the Catalan
disambiguator file that expects this behavior.

To test, please use this code in DisambiguationPatternRuleReplacer.java,
from line 346:

    boolean newPOSmatches = false;
    // only apply filter rule when it matches previous tags:
    for (int i = 0; i < whTokens[fromPos].getReadingsLength(); i++) {
      if (!whTokens[fromPos].getAnalyzedToken(i).hasNoTag()
          && whTokens[fromPos].getAnalyzedToken(i).getPOSTag().matches(disambiguatedPOS)) {
        newPOSmatches = true;
        break;
      }
    }
    if (newPOSmatches) {
      final MatchState matchState = tmpMatchToken.createState(
          rule.getLanguage().getSynthesizer(), whTokens[fromPos]);
      final String prevValue = whTokens[fromPos].toString();
      final String prevAnot = whTokens[fromPos].getHistoricalAnnotations();
      whTokens[fromPos] = matchState.filterReadings();
      annotateChange(whTokens[fromPos], prevValue, prevAnot);
    }

(replace everything until the break statement).

You will see that the Catalan pattern rule breaks then. Please fix it, and
I'll see if that's everything we need.

Regards,
Marcin
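
P.S. For anyone who wants to look at the matching check in isolation, here is a minimal standalone sketch. It is not the LanguageTool code: `PosFilterSketch`, `Reading`, and `anyReadingMatches` are hypothetical stand-ins I made up for illustration, and a null tag plays the role of `hasNoTag()`. It only shows the idea that the filter should be applied when at least one existing reading already carries a POS tag matching the disambiguated POS regex.

```java
import java.util.Arrays;
import java.util.List;

public class PosFilterSketch {

    // Hypothetical stand-in for a token reading: a lemma plus a POS tag.
    // A null posTag models a reading without a tag.
    static final class Reading {
        final String lemma;
        final String posTag;

        Reading(String lemma, String posTag) {
            this.lemma = lemma;
            this.posTag = posTag;
        }
    }

    // Returns true if at least one tagged reading has a POS tag that
    // fully matches the given regular expression.
    static boolean anyReadingMatches(List<Reading> readings, String posRegex) {
        for (Reading r : readings) {
            if (r.posTag != null && r.posTag.matches(posRegex)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Reading> readings = Arrays.asList(
            new Reading("walk", "VB"),
            new Reading("walk", "NN"),
            new Reading("walk", null));

        // A filter for "NN.*" would be applied: one reading is tagged NN.
        System.out.println(anyReadingMatches(readings, "NN.*")); // prints "true"

        // A filter for "JJ.*" would be skipped: no reading matches.
        System.out.println(anyReadingMatches(readings, "JJ.*")); // prints "false"
    }
}
```

Note that `String.matches` requires the whole tag to match the regex, so a filter written as a partial pattern would no longer fire under this check. That may be the kind of thing the Catalan rule relies on, but I have not verified it.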