Hi Jaume,

On 2013-07-15 21:16, Jaume Ortolà i Font wrote:
> Hi, Marcin.
>
> I have tested the current code (1.8.0-SNAPSHOT) and everything is OK,
> all the changes are there. Thank you.

Great. We'll release 1.7.1; this is just a minor bug fix.

BTW, when you see something you want to fix, just make a fork on github to fix it, then file an issue, and then make a pull request associated with that issue. That way, it will be much easier to develop the library with your changes.

Also, if you find the time, it would be good to use a proper way of removing duplicates: right now we lose information from CandidateData that might be significant for something. I know this is me being fussy; the code is quite clean as it is.

Regards,
Marcin

> Now we need a release with the changes, and we'll be able to adapt the tests.
>
> Regards,
> Jaume
> Salutacions,
> Jaume Ortolà
> www.riuraueditors.cat
>
> 2013/7/15 Marcin Miłkowski <list-addr...@wp.pl>:
>> On 2013-07-15 12:41, Jaume Ortolà i Font wrote:
>>> Thanks, Marcin.
>>>
>>> Some remarks. The improvements I sent to the list 15 days ago have not
>>> been added, and moreover I have found more bugs.
>> I'm really sorry but there are 200 mails from the mailing list over the
>> last two weeks and I have been away from my e-mail. Could you please add
>> your changes as issues on github for morfologik-stemming? That would
>> make it much easier for us to track these things.
>>
>>> I attach the code I'm using now and explain briefly the reasons for the
>>> changes.
>>>
>>> - In the getAllReplacements method we need to make sure that the
>>> replacements are done from left to right. We must complete the
>>> for-loop of the replacement pairs, choose the first possible
>>> replacement (from left to right) and then start the two new branches
>>> (with and without replacement). Otherwise, some replacements are not
>>> done.
>> OK, this sounds OK. I integrated your changes.
>>
>>> - If there is "ss" as a key in the replacement pairs, and somebody
>>> uses a long string of s ("ssssssssss...") as input text, this can
>>> cause the method to consume all the memory, as the algorithm is
>>> exponential (2^(number of replacements)).
>>> This happened to us in an online server, and the LT server crashed. The
>>> depth of the recursive algorithm should be limited to 4 or 5 levels
>>> at most.
>> Is that in getAllReplacements()?
>>
>>> - It is possible that different "words to check" give the same
>>> suggestion. So at some point we need to remove duplicates. I do this
>>> at the end of findReplacements().
>> You are right. We could probably write the same code in a slightly more
>> elegant way, without converting this to a LinkedHashSet but simply by
>> adding to a set when iterating the list.
>>
>>> - The conditions around line 238 (current github version 1.7) are not
>>> correct. The first isInDictionary makes the lower case conversion
>>> useless:
>>>
>>> if (isInDictionary(wordChecked)
>>>     && dictionaryMetadata.isConvertingCase()
>>>     && isMixedCase(wordChecked)
>>>     && isInDictionary(wordChecked.toLowerCase(dictionaryMetadata.getLocale())))
>>>
>>> I think they should be something like:
>>>
>>> if (isInDictionary(wordChecked)
>>>     || (dictionaryMetadata.convertCase
>>>     && isMixedCase(wordChecked)
>>>     && isInDictionary(wordChecked
>>>     .toLowerCase(dictionaryMetadata.dictionaryLocale))))
>> Fixed!
>>
>> I tried to add your fixes but your code is now quite far away from ours,
>> so diff does not give any meaningful output. Please review the code on
>> github, and if needed, file an issue over changes that need to be done.
>>
>> Regards,
>> Marcin
>>
>>> Regards,
>>> Jaume Ortolà
>>> Salutacions,
>>> Jaume Ortolà
>>> www.riuraueditors.cat
>>>
>>> 2013/7/15 Marcin Miłkowski <list-addr...@wp.pl>:
>>>> On 2013-07-15 10:56, Marcin Miłkowski wrote:
>>>>> Hi,
>>>>>
>>>>> Dawid just released morfologik 1.7 on Maven. So we can actually go on
>>>>> and include a newer version in LT.
>>>>>
>>>>> The new version still does not support compounding but it has all the
>>>>> features required for getting better diacritic suggestions.
>>>> Here's the documentation:
>>>>
>>>> http://wiki.languagetool.org/hunspell-support#toc5
>>>>
>>>> Best,
>>>> Marcin
>>>>
>>>>> Best,
>>>>> Marcin
>>>>>
>>>>> On 2013-07-02 08:59, Marcin Miłkowski wrote:
>>>>>> On 2013-07-02 01:11, Jaume Ortolà i Font wrote:
>>>>>>> Hi Marcin,
>>>>>>>
>>>>>>> I have been using the still unreleased code of morfologik-stemming and I
>>>>>>> have made improvements to Speller.java for some previously unforeseen
>>>>>>> cases. See the attachment.
>>>>>>>
>>>>>>> In order to complete the development, and test & debug with all
>>>>>>> languages, perhaps we could temporarily include the morfologik module
>>>>>>> inside LanguageTool. This will make things easier. What do you think?
>>>>>> No. I should make a release, forking morfologik makes no sense to me.
>>>>>>
>>>>>> The only thing that stops me is the lack of time to work on compounds.
>>>>>>
>>>>>> Best,
>>>>>> Marcin
>>>>>>
>>>>>> ------------------------------------------------------------------------------
>>>>>> This SF.net email is sponsored by Windows:
>>>>>> Build for Windows Store.
>>>>>> http://p.sf.net/sfu/windows-dev2dev
>>>>>> _______________________________________________
>>>>>> Languagetool-devel mailing list
>>>>>> Languagetool-devel@lists.sourceforge.net
>>>>>> https://lists.sourceforge.net/lists/listinfo/languagetool-devel
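
For reference, the points raised in this thread about getAllReplacements (left-to-right replacement, a bounded recursion depth, and duplicate removal while collecting) can be sketched roughly as follows. This is not the morfologik Speller code: the class name, the method signatures, the MAX_DEPTH value and the way depth is counted are all assumptions made for illustration, and replacement keys are assumed to be non-empty.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only; names and limits are hypothetical, not morfologik's API.
class ReplacementSketch {
    // Assumed cap on the number of replace/skip decisions along one branch.
    static final int MAX_DEPTH = 4;

    // Returns the input word plus its variants, in discovery order, without duplicates.
    static List<String> allReplacements(String word, Map<String, String> pairs) {
        Set<String> out = new LinkedHashSet<>(); // dedups while we collect
        expand(word, 0, 0, pairs, out);
        return new ArrayList<>(out);
    }

    private static void expand(String s, int from, int depth,
                               Map<String, String> pairs, Set<String> out) {
        out.add(s);
        if (depth >= MAX_DEPTH) {
            return; // bounded recursion: at most 2^MAX_DEPTH leaves
        }
        // Complete the loop over all pairs first, then keep the leftmost match.
        // On a tie, the key that comes first in the map's iteration order wins.
        int bestPos = -1;
        String bestKey = null;
        for (String key : pairs.keySet()) {
            int pos = s.indexOf(key, from);
            if (pos >= 0 && (bestPos < 0 || pos < bestPos)) {
                bestPos = pos;
                bestKey = key;
            }
        }
        if (bestKey == null) {
            return; // nothing left to replace at or after 'from'
        }
        String value = pairs.get(bestKey);
        String replaced = s.substring(0, bestPos) + value
                + s.substring(bestPos + bestKey.length());
        // Branch 1: apply the replacement and continue after the inserted text.
        expand(replaced, bestPos + value.length(), depth + 1, pairs, out);
        // Branch 2: skip this occurrence and continue one character further.
        expand(s, bestPos + 1, depth + 1, pairs, out);
    }

    public static void main(String[] args) {
        Map<String, String> pairs = new LinkedHashMap<>();
        pairs.put("ss", "z");
        System.out.println(allReplacements("sss", pairs)); // prints [sss, zs, sz]
    }
}
```

With the cap in place, a long run such as "ssssssssss..." yields at most 2^MAX_DEPTH distinct candidates instead of growing with the input length. Collecting into a LinkedHashSet keeps the first occurrence of each suggestion and its order, but it works on plain strings, so any extra data attached to a candidate (the CandidateData concern above) would need its own handling.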