Hi, Marcin.

I have tested the current code (1.8.0-SNAPSHOT) and everything is OK,
all the changes are there. Thank you.

Now we need a release with the changes, and we'll be able to adapt the tests.

Regards,
Jaume Ortolà
www.riuraueditors.cat



2013/7/15 Marcin Miłkowski <list-addr...@wp.pl>:
> On 2013-07-15 12:41, Jaume Ortolà i Font wrote:
>> Thanks, Marcin.
>>
>> Some remarks. The improvements I sent to the list 15 days ago have not
>> been added, and moreover I have found more bugs.
>
> I'm really sorry, but there have been 200 mails from the mailing list over
> the last two weeks and I have been away from my e-mail. Could you please
> add your changes as issues on GitHub for morfologik-stemming? That would
> make it much easier for us to track these things.
>
>>
>> I attach the code I'm using now and explain briefly the reasons for the 
>> changes.
>>
>> - In the getAllReplacements method we need to make sure that the
>> replacements are done from left to right. We must complete the
>> for-loop of the replacement pairs, choose the first possible
>> replacement (from left to right) and then start the two new branches
>> (with and without replacement). Otherwise, some replacements are not
>> done.
>
> OK, this sounds OK. I integrated your changes.
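
A minimal sketch of the left-to-right strategy described in the point above, not the actual Speller.java code. The class name LeftToRightReplacements, the method names, and the map-of-lists representation of the replacement pairs are assumptions made for illustration; keys are assumed to be non-empty.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; names and data layout are not taken from morfologik.
final class LeftToRightReplacements {

    static List<String> allReplacements(String word, Map<String, List<String>> pairs) {
        List<String> out = new ArrayList<>();
        expand(word, 0, pairs, out);
        return out;
    }

    private static void expand(String word, int from,
                               Map<String, List<String>> pairs, List<String> out) {
        // Complete the loop over all keys first, remembering the leftmost match.
        int first = -1;
        for (String key : pairs.keySet()) {
            int idx = word.indexOf(key, from);
            if (idx >= 0 && (first < 0 || idx < first)) {
                first = idx;
            }
        }
        if (first < 0) {
            out.add(word); // nothing left to replace: this candidate is complete
            return;
        }
        // Branch "with replacement": apply every key that matches at the leftmost
        // position, then continue to the right of the inserted text.
        for (Map.Entry<String, List<String>> e : pairs.entrySet()) {
            String key = e.getKey();
            if (word.startsWith(key, first)) {
                for (String rep : e.getValue()) {
                    String replaced = word.substring(0, first) + rep
                            + word.substring(first + key.length());
                    expand(replaced, first + rep.length(), pairs, out);
                }
            }
        }
        // Branch "without replacement": skip this position and keep scanning.
        expand(word, first + 1, pairs, out);
    }
}
```

With a single pair "ss" -> "z" (an ASCII stand-in for a real diacritic pair), allReplacements("strasse", pairs) yields the replaced form followed by the unchanged word. The sketch does not bound the recursion, so it still grows exponentially on long runs of matching characters.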
>
>> - If there is "ss" as a key in the replacement pairs, and somebody
>> uses a long string of s ("ssssssssss...") as input text, this can
>> cause the method to consume all the memory, as the algorithm is
>> exponential (2^(number of replacements)). This happened to us on an
>> online server, and the LT server crashed. The depth of the recursive
>> algorithm should be limited to 4 or 5 levels at most.
>
> Is that in getAllReplacements()?
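
One way the depth limit suggested above could look, as a rough sketch rather than the real getAllReplacements(). The class BoundedReplacements, the maxDepth parameter, and the simplified one-replacement-per-key map are all assumptions. Each recursion level splits into at most two branches, so the number of candidates is capped at 2^maxDepth regardless of input length.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of capping the recursion depth; not morfologik code.
final class BoundedReplacements {

    static List<String> candidates(String word, Map<String, String> pairs, int maxDepth) {
        List<String> out = new ArrayList<>();
        expand(word, 0, 0, pairs, maxDepth, out);
        return out;
    }

    private static void expand(String word, int from, int depth,
                               Map<String, String> pairs, int maxDepth, List<String> out) {
        if (depth >= maxDepth) {
            out.add(word); // stop branching: keep the candidate built so far
            return;
        }
        // Find the leftmost key occurrence at or after 'from'.
        int first = -1;
        String firstKey = null;
        for (String key : pairs.keySet()) {
            int idx = word.indexOf(key, from);
            if (idx >= 0 && (first < 0 || idx < first)) {
                first = idx;
                firstKey = key;
            }
        }
        if (firstKey == null) {
            out.add(word);
            return;
        }
        String rep = pairs.get(firstKey);
        String replaced = word.substring(0, first) + rep
                + word.substring(first + firstKey.length());
        // Two branches per level: with and without the replacement.
        expand(replaced, first + rep.length(), depth + 1, pairs, maxDepth, out);
        expand(word, first + 1, depth + 1, pairs, maxDepth, out);
    }
}
```

With the pair "ss" -> "z" and an input of forty "s" characters, a limit of 4 produces 16 candidates instead of a number that grows with the input length. The list may still contain duplicates.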
>
>> - It is possible that different "words to check" give the same
>> suggestion. So at some point we need to remove duplicates. I do this
>> at the end of findReplacements().
>
> You are right. We could probably write the same code in a slightly more
> elegant way, without converting this to a LinkedHashSet but simply by
> adding to a set when iterating the list.
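
The "add to a set while iterating" idea might look roughly like this. SuggestionDedup and collectUnique are made-up names, and the list-of-lists input stands in for whatever findReplacements() gathers per word to check.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration; not the code that was committed to morfologik.
final class SuggestionDedup {

    static List<String> collectUnique(List<List<String>> suggestionsPerWord) {
        Set<String> seen = new HashSet<>();
        List<String> unique = new ArrayList<>();
        for (List<String> suggestions : suggestionsPerWord) {
            for (String s : suggestions) {
                // Set.add returns false when the element is already present,
                // so each suggestion is kept only the first time it appears.
                if (seen.add(s)) {
                    unique.add(s);
                }
            }
        }
        return unique;
    }
}
```

The first occurrence of each suggestion keeps its position, so the original ranking order is preserved without a conversion to LinkedHashSet at the end.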
>
>>
>> - The conditions around line 238 (current github version 1.7) are not
>> correct. The first isInDictionary makes the lower case conversion
>> useless:
>>
>>                      if (isInDictionary(wordChecked)
>>                              && dictionaryMetadata.isConvertingCase()
>>                              && isMixedCase(wordChecked)
>>                              && isInDictionary(wordChecked
>>                                  .toLowerCase(dictionaryMetadata.getLocale())))
>>
>> I think they should be something like:
>>
>>            if (isInDictionary(wordChecked)
>>                || (dictionaryMetadata.convertCase
>>                && isMixedCase(wordChecked)
>>                && isInDictionary(wordChecked
>>                    .toLowerCase(dictionaryMetadata.dictionaryLocale))))
>
> Fixed!
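
To make the difference between the two conditions concrete, here is a small self-contained sketch. The dictionary is modeled as a plain Set, and isMixedCase below is a simplified stand-in (at least one upper-case and one lower-case letter); neither is the real morfologik implementation, and the class and method names are invented.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration of the two conditions; not morfologik code.
final class CaseCheckSketch {

    // Simplified stand-in: true if the word has both upper- and lower-case letters.
    static boolean isMixedCase(String s) {
        boolean upper = false;
        boolean lower = false;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            upper |= Character.isUpperCase(c);
            lower |= Character.isLowerCase(c);
        }
        return upper && lower;
    }

    // Original shape: the leading lookup must succeed, so the lower-case
    // lookup can never rescue a word that is missing in its given case.
    static boolean acceptsOriginal(String word, Set<String> dict,
                                   boolean convertCase, Locale locale) {
        return dict.contains(word)
                && convertCase
                && isMixedCase(word)
                && dict.contains(word.toLowerCase(locale));
    }

    // Proposed shape: accept the word as-is, or fall back to its lower-case form.
    static boolean acceptsProposed(String word, Set<String> dict,
                                   boolean convertCase, Locale locale) {
        return dict.contains(word)
                || (convertCase
                    && isMixedCase(word)
                    && dict.contains(word.toLowerCase(locale)));
    }
}
```

With a dictionary containing only "hello" and case conversion enabled, acceptsOriginal rejects "Hello" while acceptsProposed accepts it.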
>
> I tried to add your fixes, but your code is now quite far away from ours,
> so diff does not give any meaningful output. Please review the code on
> GitHub and, if needed, file an issue for any changes that still need to
> be made.
>
> Regards,
> Marcin
>
>>
>> Regards,
>> Jaume Ortolà
>> www.riuraueditors.cat
>>
>>
>>
>> 2013/7/15 Marcin Miłkowski <list-addr...@wp.pl>:
>>> On 2013-07-15 10:56, Marcin Miłkowski wrote:
>>>> Hi,
>>>>
>>>> Dawid just released morfologik 1.7 on Maven. So we can actually go on
>>>> and include a newer version in LT.
>>>>
>>>> The new version still does not support compounding but it has all the
>>>> features required for getting better diacritic suggestions.
>>>
>>> Here's the documentation:
>>>
>>> http://wiki.languagetool.org/hunspell-support#toc5
>>>
>>> Best,
>>> Marcin
>>>
>>>
>>>> Best,
>>>> Marcin
>>>>
>>>> On 2013-07-02 08:59, Marcin Miłkowski wrote:
>>>>> On 2013-07-02 01:11, Jaume Ortolà i Font wrote:
>>>>>> Hi Marcin,
>>>>>>
>>>>>> I have been using the still unreleased code of morfologik-stemming and I
>>>>>> have made improvements to Speller.java for some previously unforeseen
>>>>>> cases. See the attachment.
>>>>>>
>>>>>> In order to complete the development, and test & debug with all
>>>>>> languages, perhaps we could temporarily include the morfologik module
>>>>>> inside LanguageTool. This will make things easier. What do you think?
>>>>>
>>>>> No. I should make a release; forking morfologik makes no sense to me.
>>>>>
>>>>> The only thing that stops me is the lack of time to work on compounds.
>>>>>
>>>>> Best,
>>>>> Marcin
>>>>>
>>>>> ------------------------------------------------------------------------------
>>>>>
>>>>> This SF.net email is sponsored by Windows:
>>>>>
>>>>> Build for Windows Store.
>>>>>
>>>>> http://p.sf.net/sfu/windows-dev2dev
>>>>> _______________________________________________
>>>>> Languagetool-devel mailing list
>>>>> Languagetool-devel@lists.sourceforge.net
>>>>> https://lists.sourceforge.net/lists/listinfo/languagetool-devel
>>>>>