Hi Jaume,

On 2013-07-15 21:16, Jaume Ortolà i Font wrote:
> Hi, Marcin.
>
> I have tested the current code (1.8.0-SNAPSHOT) and everything is OK,
> all the changes are there. Thank you.

Great. We'll release 1.7.1, as this is just a minor bug fix.

BTW, when you see something you want to fix, just make a fork on github 
to fix it, then file an issue, and then make a pull request associated 
with that issue. That way, it will be much easier to develop the library 
with your changes.

Also, if you find the time, it would be good to use a proper way of 
removing duplicates. Right now we lose information from CandidateData 
that might be significant for something. I know this is me being fussy, 
as the current code is quite clean.
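
To make the point about duplicates concrete, here is a minimal sketch of one way to drop repeated suggestions while keeping the data attached to them. It assumes the information worth preserving is an edit distance. `Candidate`, `dedupe` and the class name are hypothetical stand-ins, not the actual CandidateData class or the Speller code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CandidateDedupSketch {

    /** Hypothetical stand-in for CandidateData: a suggestion plus its edit distance. */
    static final class Candidate {
        final String word;
        final int distance;

        Candidate(String word, int distance) {
            this.word = word;
            this.distance = distance;
        }
    }

    /**
     * Keeps one candidate per word, preferring the lowest distance.
     * The order of first appearance is preserved by the LinkedHashMap.
     */
    static List<Candidate> dedupe(List<Candidate> candidates) {
        Map<String, Candidate> best = new LinkedHashMap<>();
        for (Candidate c : candidates) {
            // merge() stores c if the word is new, otherwise keeps the closer one.
            best.merge(c.word, c, (old, cur) -> cur.distance < old.distance ? cur : old);
        }
        return new ArrayList<>(best.values());
    }
}
```

With this approach the distance survives de-duplication, so later sorting or filtering of the suggestions can still use it.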

Regards,
Marcin

>
> Now we need a release with the changes, and we'll be able to adapt the tests.
>
> Regards,
> Jaume
> Salutacions,
> Jaume Ortolà
> www.riuraueditors.cat
>
>
>
> 2013/7/15 Marcin Miłkowski <list-addr...@wp.pl>:
>> On 2013-07-15 12:41, Jaume Ortolà i Font wrote:
>>> Thanks, Marcin.
>>>
>>> Some remarks. The improvements I sent to the list 15 days ago have not
>>> been added, and moreover I have found more bugs.
>> I'm really sorry, but there have been 200 mails on the mailing list over the
>> last two weeks and I have been away from my e-mail. Could you please add
>> your changes as issues on github for morfologik-stemming? That
>> would make it much easier for us to track these things.
>>
>>> I attach the code I'm using now and explain briefly the reasons for the 
>>> changes.
>>>
>>> - In the getAllReplacements method we need to make sure that the
>>> replacements are done from left to right. We must complete the
>>> for-loop of the replacement pairs, choose the first possible
>>> replacement (from left to right) and then start the two new branches
>>> (with and without replacement). Otherwise, some replacements are not
>>> done.
>> OK, this sounds OK. I integrated your changes.
>>
>>> - If there is "ss" as a key in the replacement pairs, and somebody
>>> uses a long string of s ("ssssssssss...") as input text, this can
>>> cause the method to consume all the memory, as the algorithm is
>>> exponential (2^(number of replacements)). This happened to us on an
>>> online server, and the LT server crashed. The depth of the recursive
>>> algorithm should be limited to 4 or 5 levels at most.
>> Is that in getAllReplacements()?
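
The two points above (leftmost-first replacement and a cap on recursion depth) can be sketched roughly as follows. This is not the actual getAllReplacements() code. The class name, `allReplacements`, `expand`, the `MAX_DEPTH` value and the map-based representation of replacement pairs are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class ReplacementSketch {

    /** Assumed cap on recursion depth; the thread suggests 4 or 5 levels. */
    static final int MAX_DEPTH = 4;

    /** Returns the word variants produced by optional left-to-right replacements. */
    static List<String> allReplacements(String word, Map<String, List<String>> pairs) {
        List<String> out = new ArrayList<>();
        expand(word, 0, 0, pairs, out);
        return out;
    }

    private static void expand(String word, int from, int depth,
                               Map<String, List<String>> pairs, List<String> out) {
        if (depth >= MAX_DEPTH) {
            out.add(word); // stop branching, keep what we have so far
            return;
        }
        // Complete the loop over all keys first, then pick the leftmost match.
        // Simplification: if two keys match at the same index, only the first
        // one encountered is tried.
        int bestIndex = -1;
        String bestKey = null;
        for (String key : pairs.keySet()) {
            if (key.isEmpty()) {
                continue;
            }
            int i = word.indexOf(key, from);
            if (i >= 0 && (bestIndex < 0 || i < bestIndex)) {
                bestIndex = i;
                bestKey = key;
            }
        }
        if (bestKey == null) {
            out.add(word); // nothing left to replace
            return;
        }
        // Branch 1: apply the replacement and continue to the right of it.
        for (String rep : pairs.get(bestKey)) {
            String replaced = word.substring(0, bestIndex) + rep
                    + word.substring(bestIndex + bestKey.length());
            expand(replaced, bestIndex + rep.length(), depth + 1, pairs, out);
        }
        // Branch 2: leave this occurrence alone and look further right.
        expand(word, bestIndex + 1, depth + 1, pairs, out);
    }
}
```

With a single pair "ss" -> "s" and an input of fifty "s" characters, the number of variants in this sketch is bounded by 2^MAX_DEPTH instead of growing with the length of the input.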
>>
>>> - It is possible that different "words to check" give the same
>>> suggestion. So at some point we need to remove duplicates. I do this
>>> at the end of findReplacements().
>> You are right. We could probably write the same code in a slightly more
>> elegant way, without converting this to a LinkedHashSet but simply by
>> adding to a set when iterating the list.
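
A minimal sketch of that idea, assuming the suggestions are plain strings at this point. `DedupSketch` and `withoutDuplicates` are hypothetical names, not part of Speller.java.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class DedupSketch {

    /** Returns the suggestions in their original order, without repeats. */
    static List<String> withoutDuplicates(List<String> suggestions) {
        Set<String> seen = new HashSet<>();
        List<String> unique = new ArrayList<>();
        for (String s : suggestions) {
            // Set.add() returns false if the element was already present.
            if (seen.add(s)) {
                unique.add(s);
            }
        }
        return unique;
    }
}
```

The check could equally be done inline in the loop that collects suggestions, so that no second pass over the list is needed.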
>>
>>> - The conditions around line 238 (current github version 1.7) are not
>>> correct. The first isInDictionary makes the lower case conversion
>>> useless:
>>>
>>>                       if (isInDictionary(wordChecked)
>>>                               && dictionaryMetadata.isConvertingCase()
>>>                               && isMixedCase(wordChecked)
>>>                               && isInDictionary(wordChecked.toLowerCase(dictionaryMetadata.getLocale())))
>>>
>>> I think they should be something like:
>>>
>>>             if (isInDictionary(wordChecked)
>>>                 || (dictionaryMetadata.convertCase
>>>                 && isMixedCase(wordChecked)
>>>                 && isInDictionary(wordChecked
>>>                     .toLowerCase(dictionaryMetadata.dictionaryLocale))))
>> Fixed!
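
To illustrate why the first form never benefits from the lower-case lookup, here is a small self-contained sketch that compares both conditions. The dictionary is modelled as a plain `Set<String>`, and `isMixedCase` is a simplified assumption (at least one upper-case and one lower-case letter). Neither is the real morfologik implementation.

```java
import java.util.Locale;
import java.util.Set;

final class CaseCheckSketch {

    /** Simplified assumption: true if the word has both upper- and lower-case letters. */
    static boolean isMixedCase(String s) {
        boolean upper = false;
        boolean lower = false;
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            if (Character.isUpperCase(ch)) {
                upper = true;
            } else if (Character.isLowerCase(ch)) {
                lower = true;
            }
        }
        return upper && lower;
    }

    /** Original shape: the leading && means the lower-case lookup cannot rescue a word. */
    static boolean originalCondition(String word, Set<String> dict, boolean convertCase, Locale locale) {
        return dict.contains(word)
                && convertCase
                && isMixedCase(word)
                && dict.contains(word.toLowerCase(locale));
    }

    /** Proposed shape: accept the word as is, or fall back to its lower-case form. */
    static boolean fixedCondition(String word, Set<String> dict, boolean convertCase, Locale locale) {
        return dict.contains(word)
                || (convertCase
                    && isMixedCase(word)
                    && dict.contains(word.toLowerCase(locale)));
    }
}
```

For a dictionary containing only "word", the input "WoRd" fails the original condition because the first lookup is false, while the proposed condition accepts it when case conversion is enabled.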
>>
>> I tried to add your fixes but your code is now quite far away from ours,
>> so diff does not give any meaningful output. Please review the code on
>> github, and if needed, file an issue over changes that need to be done.
>>
>> Regards,
>> Marcin
>>
>>> Regards,
>>> Jaume Ortolà
>>> Salutacions,
>>> Jaume Ortolà
>>> www.riuraueditors.cat
>>>
>>>
>>>
>>> 2013/7/15 Marcin Miłkowski <list-addr...@wp.pl>:
>>>> On 2013-07-15 10:56, Marcin Miłkowski wrote:
>>>>> Hi,
>>>>>
>>>>> Dawid just released morfologik 1.7 on Maven. So we can actually go on
>>>>> and include a newer version in LT.
>>>>>
>>>>> The new version still does not support compounding but it has all the
>>>>> features required for getting better diacritic suggestions.
>>>> Here's the documentation:
>>>>
>>>> http://wiki.languagetool.org/hunspell-support#toc5
>>>>
>>>> Best,
>>>> Marcin
>>>>
>>>>
>>>>> Best,
>>>>> Marcin
>>>>>
>>>>> On 2013-07-02 08:59, Marcin Miłkowski wrote:
>>>>>> On 2013-07-02 01:11, Jaume Ortolà i Font wrote:
>>>>>>> Hi Marcin,
>>>>>>>
>>>>>>> I have been using the still unreleased code of morfologik-stemming and I
>>>>>>> have made improvements to Speller.java for some previously unforeseen
>>>>>>> cases. See the attachment.
>>>>>>>
>>>>>>> In order to complete the development, and test & debug with all
>>>>>>> languages, perhaps we could temporarily include the morfologik module
>>>>>>> inside LanguageTool. This will make things easier. What do you think?
>>>>>> No. I should make a release, forking morfologik makes no sense to me.
>>>>>>
>>>>>> The only thing that stops me is the lack of time to work on compounds.
>>>>>>
>>>>>> Best,
>>>>>> Marcin
>>>>>>
>>>>>> ------------------------------------------------------------------------------
>>>>>>
>>>>>> This SF.net email is sponsored by Windows:
>>>>>>
>>>>>> Build for Windows Store.
>>>>>>
>>>>>> http://p.sf.net/sfu/windows-dev2dev
>>>>>> _______________________________________________
>>>>>> Languagetool-devel mailing list
>>>>>> Languagetool-devel@lists.sourceforge.net
>>>>>> https://lists.sourceforge.net/lists/listinfo/languagetool-devel
>>>>>>
>>>>
>>>> ------------------------------------------------------------------------------
>>>> See everything from the browser to the database with AppDynamics
>>>> Get end-to-end visibility with application monitoring from AppDynamics
>>>> Isolate bottlenecks and diagnose root cause in seconds.
>>>> Start your free trial of AppDynamics Pro today!
>>>> http://pubads.g.doubleclick.net/gampad/clk?id=48808831&iu=/4140/ostg.clktrk
>>>> _______________________________________________
>>>> Languagetool-devel mailing list
>>>> Languagetool-devel@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/languagetool-devel
>>>>
>>>>

