Hi, Marcin. I have tested the current code (1.8.0-SNAPSHOT) and everything is OK, all the changes are there. Thank you.
Now we need a release with the changes, and we'll be able to adapt the tests.

Regards,
Jaume Ortolà
www.riuraueditors.cat

2013/7/15 Marcin Miłkowski <list-addr...@wp.pl>:
> On 2013-07-15 12:41, Jaume Ortolà i Font wrote:
>> Thanks, Marcin.
>>
>> Some remarks. The improvements I sent to the list 15 days ago have not
>> been added, and moreover I have found more bugs.
>
> I'm really sorry but there are 200 mails from the mailing list over the
> last two weeks and I have been away from my e-mail. Could you please add
> your changes as issues on github for morfologik-stemming? That would
> make it much easier for us to track these things.
>
>>
>> I attach the code I'm using now and explain briefly the reasons for the
>> changes.
>>
>> - In the getAllReplacements method we need to make sure that the
>> replacements are done from left to right. We must complete the
>> for-loop of the replacement pairs, choose the first possible
>> replacement (from left to right) and then start the two new branches
>> (with and without replacement). Otherwise, some replacements are not
>> done.
>
> OK, this sounds OK. I integrated your changes.
>
>> - If there is "ss" as a key in the replacement pairs, and somebody
>> uses a long string of "s" characters ("ssssssssss...") as input text,
>> this can cause the method to consume all the memory, as the algorithm
>> is exponential (2^(number of replacements)). This happened to us in an
>> online server, and the LT server crashed. The depth of the recursive
>> algorithm should be limited to 4 or 5 levels at most.
>
> Is that in getAllReplacements()?
>
>> - It is possible that different "words to check" give the same
>> suggestion. So at some point we need to remove duplicates. I do this
>> at the end of findReplacements().
>
> You are right. We could probably write the same code in a slightly more
> elegant way, without converting this to a LinkedHashSet but simply by
> adding to a set when iterating the list.
>
>>
>> - The conditions around line 238 (current github version 1.7) are not
>> correct. The first isInDictionary makes the lower case conversion
>> useless:
>>
>> if (isInDictionary(wordChecked)
>>     && dictionaryMetadata.isConvertingCase()
>>     && isMixedCase(wordChecked)
>>     && isInDictionary(wordChecked.toLowerCase(dictionaryMetadata.getLocale())))
>>
>> I think they should be something like:
>>
>> if (isInDictionary(wordChecked)
>>     || (dictionaryMetadata.convertCase
>>         && isMixedCase(wordChecked)
>>         && isInDictionary(wordChecked
>>             .toLowerCase(dictionaryMetadata.dictionaryLocale))))
>
> Fixed!
>
> I tried to add your fixes but your code is now quite far away from ours,
> so diff does not give any meaningful output. Please review the code on
> github, and if needed, file an issue over changes that need to be done.
>
> Regards,
> Marcin
>
>>
>> Regards,
>> Jaume Ortolà
>> www.riuraueditors.cat
>>
>>
>>
>> 2013/7/15 Marcin Miłkowski <list-addr...@wp.pl>:
>>> On 2013-07-15 10:56, Marcin Miłkowski wrote:
>>>> Hi,
>>>>
>>>> Dawid just released morfologik 1.7 on Maven. So we can actually go on
>>>> and include a newer version in LT.
>>>>
>>>> The new version still does not support compounding but it has all the
>>>> features required for getting better diacritic suggestions.
>>>
>>> Here's the documentation:
>>>
>>> http://wiki.languagetool.org/hunspell-support#toc5
>>>
>>> Best,
>>> Marcin
>>>
>>>
>>>> Best,
>>>> Marcin
>>>>
>>>> On 2013-07-02 08:59, Marcin Miłkowski wrote:
>>>>> On 2013-07-02 01:11, Jaume Ortolà i Font wrote:
>>>>>> Hi Marcin,
>>>>>>
>>>>>> I have been using the still unreleased code of morfologik-stemming and I
>>>>>> have made improvements to Speller.java for some previously unforeseen
>>>>>> cases. See the attachment.
>>>>>>
>>>>>> In order to complete the development, and test & debug with all
>>>>>> languages, perhaps we could temporarily include the morfologik module
>>>>>> inside LanguageTool.
>>>>>> This will make things easier. What do you think?
>>>>>
>>>>> No. I should make a release, forking morfologik makes no sense to me.
>>>>>
>>>>> The only thing that stops me is the lack of time to work on compounds.
>>>>>
>>>>> Best,
>>>>> Marcin
>>>>>
>>>>> _______________________________________________
>>>>> Languagetool-devel mailing list
>>>>> Languagetool-devel@lists.sourceforge.net
>>>>> https://lists.sourceforge.net/lists/listinfo/languagetool-devel
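
The left-to-right branching and the depth limit discussed in the thread for getAllReplacements could look roughly like the sketch below. It is not the morfologik-stemming Speller code: the class name ReplacementSketch, the method allReplacements, the constant MAX_DEPTH and the sample replacement pairs are all hypothetical and chosen only for illustration. It assumes the original word is included in the result, and it collects results in a LinkedHashSet so that duplicates are dropped as they are produced.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; not the actual morfologik-stemming implementation.
final class ReplacementSketch {
    // Assumed cap on recursion depth (the thread suggests 4 or 5).
    static final int MAX_DEPTH = 4;

    // Returns the word plus its variants; insertion order is kept, duplicates are dropped.
    static Set<String> allReplacements(String word, Map<String, String> pairs) {
        Set<String> out = new LinkedHashSet<>();
        expand(word, 0, 0, pairs, out);
        return out;
    }

    private static void expand(String word, int from, int depth,
                               Map<String, String> pairs, Set<String> out) {
        out.add(word);
        if (depth >= MAX_DEPTH) {
            return; // bounds the total work to at most 2^(MAX_DEPTH + 1) - 1 calls
        }
        // Complete the loop over all pairs first, then pick the leftmost match.
        int bestIdx = -1;
        String bestKey = null;
        for (String key : pairs.keySet()) {
            if (key.isEmpty()) {
                continue; // an empty key would match everywhere
            }
            int idx = word.indexOf(key, from);
            if (idx >= 0 && (bestIdx < 0 || idx < bestIdx)) {
                bestIdx = idx;
                bestKey = key;
            }
        }
        if (bestKey == null) {
            return; // nothing left to replace at or after 'from'
        }
        String value = pairs.get(bestKey);
        String replaced = word.substring(0, bestIdx) + value
                + word.substring(bestIdx + bestKey.length());
        // Branch 1: apply the replacement and continue after the inserted text.
        expand(replaced, bestIdx + value.length(), depth + 1, pairs, out);
        // Branch 2: skip this occurrence and continue one character further right.
        expand(word, bestIdx + 1, depth + 1, pairs, out);
    }

    public static void main(String[] args) {
        Map<String, String> pairs = new LinkedHashMap<>();
        pairs.put("ss", "z"); // made-up pair, for illustration only
        System.out.println(allReplacements("assa", pairs)); // prints [assa, aza]

        // A long run of "s" stays cheap because of the depth cap.
        StringBuilder longInput = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            longInput.append('s');
        }
        System.out.println(allReplacements(longInput.toString(), pairs).size());
    }
}
```

With the cap in place, a long input such as "ssssssssss..." produces at most 31 recursive calls for MAX_DEPTH = 4, instead of a number of branches that grows exponentially with the number of possible replacements. Adding to the set while recursing is one way to get the deduplication mentioned for findReplacements() without a separate conversion step at the end.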