To comment on the following update, log in, then open the issue: http://www.openoffice.org/issues/show_bug.cgi?id=35725
------- Additional comments from [EMAIL PROTECTED] Sat Aug 27 13:35:20 -0700 2005 -------

I have fixed the problem in Hunspell 1.0.9 (http://sourceforge.net/projects/hunspell).

(Changelog)

* src/hunspell/{suggestmgr,affixmgr,hunspell}.cxx: improve ngram suggestion.
  Fix http://qa.openoffice.org/issues/show_bug.cgi?id=35725.
  See release notes for examples.
  This problem was reported by beccablain at OpenOffice.org.

- ngram suggestions are now case insensitive (see the `Permenant' bug in Issuezilla)

- weight ngram suggestions (with the longest common subsequence algorithm, also considering the lengths of the bad word and the suggestion, identical first letters, and almost completely identical character positions)

- set strict affix congruency in expand_rootword(). Ngram suggestions are now good for languages with rich morphology and also better for English.
  Rationale: affixed forms of the first ngram suggestion very often suppress the second and subsequent root word suggestions. But faults in affixes are less common, and can be fixed without suggestions. We must prefer the more informative second and subsequent root word suggestions over suggestions for bad affixes.

- a better suggestion may not be a substring of a less good suggestion
  Rationale: Suggesting affixed forms of a root word is unnecessary when the root word has a better weighted ngram value. (Checking substrings is a good approximation for this refinement.)

- fewer ngram suggestions (default maximum is 3 instead of 10)
  Rationale: Users need a lot of extra effort to check many bad ngram suggestions, nine times out of ten unnecessarily. It is very distracting, because ngram suggestions can be very different from each other. With the old suggestion algorithms, Myspell and Hunspell usually give one or two suggestions (the maximum is 15), but the ngram algorithm often gives the maximum number of suggestions. With strict affix congruency and the other refinements, the good suggestion is usually among the first three elements.

- new affix parameter: MAXNGRAMSUG

(Release notes)

------ examples for ngram improvement (O = old, N = new ngram suggestions) --

1. Permenant (instead of Permanent)
   O: Endangerment, Ferment, Fermented, Deferment's, Empowerment, Ferment's, Ferments, Fermenting, Countermen, Weathermen
   N: Permanent, Supermen, Preferment
   Note: Ngram suggestions were case sensitive.

2. permenant (instead of permanent)
   O: supermen, newspapermen, empowerment, endangerment, preferments, preferment, permanent, preferment's, permanently, impermanent
   N: permanent, supermen, preferment
   Note: new suggestions are also weighted with longest common subsequence, first letter and common character positions.

3. pernemant (instead of permanent)
   O: pimpernel's, pimpernel, pimpernels, permanently, permanents, permanent, supernatant, impermanent, semipermanent, impermanently
   N: permanent, supernatant, pimpernel
   Note: the new method also prefers the root word over irrelevant affixes ('s, s and ly).

4. pernament (instead of permanent)
   O: tournaments, tournament, ornaments, ornament's, ornamenting, ornamented, ornament, ornamentals, ornamental, ornamentally
   N: ornamental, ornament, tournament
   Note: Both ngram methods miss here.

5. obvus (instead of obvious)
   O: obvious, Corvus, obverse, obviously, Jacobus, obtuser, obtuse, obviates, obviate, Travus
   N: obvious, obtuse, obverse
   Note: the new method also prefers common first letters.

6. unambigus (instead of unambiguous)
   O: unambiguous, unambiguity, unambiguously, ambiguously, ambiguous, unambitious, ambiguities, ambiguousness
   N: unambiguous, unambiguity, unambitious

7. consecvence (instead of consequence)
   O: consecutive, consecutively, consecutiveness, nonconsecutive, consequence, consecutiveness's, convenience's, consistences, consistence
   N: consequence, consecutive, consecrates

An example in a language with rich morphology:

8. Misisipiben (instead of Mississippiben [`in Mississippi' in Hungarian])
   O: Misikédéiben, Pisisedéiben, Misikéiéiben, Pisisekéiben, Misikéiben, Misikéidéiben, Misikékéiben, Misikéikéiben, Misikéiméiben, Mississippiiben
   N: Mississippiben, Mississippiiben, Misiiben
   Note: Suggesting irrelevant affixes was the biggest fault in ngram suggestion for languages with a lot of affixes.

---------------------------------------------------------------------
Please do not reply to this automatically generated notification from Issue Tracker. Please log onto the website and enter your comments.
http://qa.openoffice.org/issue_handling/project_issues.html#notification

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
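
The weighting and filtering steps listed in the changelog can be illustrated with a small sketch. This is not the code from suggestmgr.cxx: the function names (to_lower_ascii, lcs_length, weight, rank_suggestions) and the particular score formula are assumptions made up for illustration, and the resulting order may differ from what Hunspell 1.0.9 returns. It only shows the ideas named above: case-insensitive comparison, a longest-common-subsequence score adjusted for length difference, first letter and matching character positions, dropping a weaker candidate that contains a better one as a substring, and a default cap of three suggestions.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// ASCII-only lowercasing; a real spell checker needs encoding-aware folding.
std::string to_lower_ascii(std::string s) {
    for (char& c : s)
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    return s;
}

// Length of the longest common subsequence (classic dynamic programming).
int lcs_length(const std::string& a, const std::string& b) {
    std::vector<std::vector<int>> t(a.size() + 1,
                                    std::vector<int>(b.size() + 1, 0));
    for (std::size_t i = 1; i <= a.size(); ++i)
        for (std::size_t j = 1; j <= b.size(); ++j)
            t[i][j] = (a[i - 1] == b[j - 1])
                          ? t[i - 1][j - 1] + 1
                          : std::max(t[i - 1][j], t[i][j - 1]);
    return t[a.size()][b.size()];
}

// Hypothetical weight: the factors come from the changelog, the
// coefficients are invented for this sketch.
int weight(const std::string& bad, const std::string& candidate) {
    const std::string b = to_lower_ascii(bad);
    const std::string c = to_lower_ascii(candidate);
    int score = 2 * lcs_length(b, c);
    // Penalize a difference in length between bad word and suggestion.
    score -= std::abs(static_cast<int>(b.size()) - static_cast<int>(c.size()));
    // Reward an identical first letter.
    if (!b.empty() && !c.empty() && b[0] == c[0]) score += 1;
    // Reward identical characters at identical positions.
    const std::size_t n = std::min(b.size(), c.size());
    for (std::size_t i = 0; i < n; ++i)
        if (b[i] == c[i]) score += 1;
    return score;
}

// Sort candidates by weight, skip any candidate that contains an already
// accepted (better) suggestion as a substring, and keep at most
// max_suggestions entries.
std::vector<std::string> rank_suggestions(const std::string& bad,
                                          std::vector<std::string> candidates,
                                          std::size_t max_suggestions = 3) {
    assert(max_suggestions > 0);
    std::stable_sort(candidates.begin(), candidates.end(),
                     [&](const std::string& x, const std::string& y) {
                         return weight(bad, x) > weight(bad, y);
                     });
    std::vector<std::string> kept;
    for (const std::string& cand : candidates) {
        if (kept.size() == max_suggestions) break;
        const std::string lc = to_lower_ascii(cand);
        const bool contains_better = std::any_of(
            kept.begin(), kept.end(), [&](const std::string& k) {
                return lc.find(to_lower_ascii(k)) != std::string::npos;
            });
        if (!contains_better) kept.push_back(cand);
    }
    return kept;
}
```

With the candidates supermen, permanently, preferment and permanent for the bad word permenant, this sketch keeps permanent first and drops permanently, because it contains the better-weighted permanent as a substring.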