On 02/22/2015 01:18 PM, Marcin Miłkowski wrote:
> On 2015-02-22 at 15:24, Andriy Rysin wrote:
>> On 02/22/2015 04:45 AM, Marcin Miłkowski wrote:
>>> Hi,
>>>
>>>
>>> On 2015-02-21 at 19:22, Andriy Rysin wrote:
So the main problem with this performance improvement is that we read
acro
scussion for LanguageTool
Sent: Sunday, February 22, 2015 11:00 AM
Subject: Re: MultiThreadedJLanguageTool
Well, the condition on overlapping wasn't changed; we only changed
how many overlapping matches are removed.
So I guess in this case we either have to change the positions for
e
Well, the condition on overlapping wasn't changed; we only changed
how many overlapping matches are removed.
So I guess in this case we either have to change the positions for
each bracket rule (so that they don't overlap, i.e. position for first
should not be 0-1 but 0-0) or give them different
On 2015-02-22 at 15:24, Andriy Rysin wrote:
> On 02/22/2015 04:45 AM, Marcin Miłkowski wrote:
>> Hi,
>>
>>
>> On 2015-02-21 at 19:22, Andriy Rysin wrote:
>>> So the main problem with this performance improvement is that we read
>>> across paragraphs. There are two problems with this:
>>> 1) e
2015-02-22 15:04 GMT+01:00 Andriy Rysin :
> No, the only thing I pushed that will lead to regressions was removing
> more than one consecutive overlapping match in SameRuleGroupFilter
> (and also making sure we remove consecutive overlaps produced by
> multiple threads). The regressions above seem
On 02/22/2015 04:45 AM, Marcin Miłkowski wrote:
> Hi,
>
>
> On 2015-02-21 at 19:22, Andriy Rysin wrote:
>> So the main problem with this performance improvement is that we read
>> across paragraphs. There are two problems with this:
>> 1) error context shows sentences from another paragraph:
>>
No, the only thing I pushed that will lead to regressions was removing
more than one consecutive overlapping match in SameRuleGroupFilter
(and also making sure we remove consecutive overlaps produced by
multiple threads). The regressions above seem to be all removals (the
other change would actuall
On 2015-02-21 19:22, Andriy Rysin wrote:
> So the main problem with this performance improvement is that we read
> across paragraphs. There are two problems with this:
> 1) error context shows sentences from another paragraph:
> I almost worked out a solution for that by adjusting ContextTools but
Hi,
On 2015-02-21 at 19:22, Andriy Rysin wrote:
> So the main problem with this performance improvement is that we read
> across paragraphs. There are two problems with this:
> 1) error context shows sentences from another paragraph:
> I almost worked out a solution for that by adjusting Conte
Thanks, I've pushed suggested cleanups.
Andriy
2015-02-20 8:10 GMT-05:00 Daniel Naber :
> On 2015-02-19 22:16, Andriy Rysin wrote:
>
>> I've merged multithreading branch into master. Please try it out when
>> you have a chance and let me know if you see any issues.
>
> Thanks. Some small cleanup
So the main problem with this performance improvement is that we read
across paragraphs. There are two problems with this:
1) error context shows sentences from another paragraph:
I almost worked out a solution for that by adjusting ContextTools but
then I found the next one:
2) the cross-sentence
So before wrapping these optimizations up I decided to take a last
look at the thread graph in jvisualvm and it showed that the worker
threads spend more time in the parked state than running. But the graph
was really not showing why, it was more like a noodle soup. So I
brought one of my past optimiz
On 2015-02-19 22:16, Andriy Rysin wrote:
> I've merged multithreading branch into master. Please try it out when
> you have a chance and let me know if you see any issues.
Thanks. Some small cleanup ideas:
-setThreadPoolSize should probably be a parameter of the constructor, as
calling it after
On 2015-02-20 00:58, Andriy Rysin wrote:
> Also with this we run SameRuleGroupFilter twice for both modes - one
> time (per thread) inside performCheck() and once at the end of check()
> after sorting. I feel like it's redundant and we can remove the first
> one.
Yes, I think so.
BTW, Ukrainian
uagetool/MultiThreadedJLanguageToolTest.java
@@ -18,7 +18,6 @@
*/
package org.languagetool;
-import static junit.framework.TestCase.assertTrue;
import static org.hamcrest.CoreMatchers.is;
import static org.hamcrest.MatcherAssert.assertThat;
@@ -31,7 +30,10 @@
import org.junit.Assert;
import org.junit.Test;
import or
Daniel
I took a look at the problem of SameRuleGroupFilter missing rules on
multithreaded execution due to rules with same id being split across
threads. So I've added a SameRuleGroupFilter.filter() after all
threads return.
But to my surprise the tests that compare single-threaded run with
multit
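
The idea of running one more filter pass after all threads return can be sketched roughly as follows. This is only an illustration, not the actual SameRuleGroupFilter code: the Match class, the mergeAndFilter name, and the inclusive overlap test (start <= previous end) are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class OverlapFilterSketch {

    // Hypothetical stand-in for a rule match: character span plus rule group id.
    static final class Match {
        final int start;
        final int end;
        final String group;

        Match(int start, int end, String group) {
            this.start = start;
            this.end = end;
            this.group = group;
        }
    }

    // Merge the per-thread result lists, sort by start position, then drop
    // every match that overlaps the last kept match of the same rule group.
    static List<Match> mergeAndFilter(List<List<Match>> perThread) {
        List<Match> all = new ArrayList<>();
        for (List<Match> part : perThread) {
            all.addAll(part);
        }
        all.sort(Comparator.comparingInt((Match m) -> m.start));

        List<Match> kept = new ArrayList<>();
        Map<String, Match> lastKeptByGroup = new HashMap<>();
        for (Match m : all) {
            Match prev = lastKeptByGroup.get(m.group);
            // Assumption: positions are inclusive, so equal start/end counts as overlap.
            if (prev != null && m.start <= prev.end) {
                continue; // overlaps an already kept match of the same group
            }
            kept.add(m);
            lastKeptByGroup.put(m.group, m);
        }
        return kept;
    }
}
```

Because the filter runs on the merged and sorted list, it no longer matters which thread produced which match, and several consecutive overlaps in one group collapse to the first one.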
I've merged multithreading branch into master. Please try it out when
you have a chance and let me know if you see any issues.
Thanks
Andriy
2015-02-18 14:10 GMT-05:00 Andriy Rysin :
> That makes sense, change pushed.
>
> Andriy
>
> 2015-02-18 11:48 GMT-05:00 Daniel Naber :
>> On 2015-02-18 15:19
That makes sense, change pushed.
Andriy
2015-02-18 11:48 GMT-05:00 Daniel Naber :
> On 2015-02-18 15:19, Andriy Rysin wrote:
>
>> 2) remove it completely, but I think it would be nice to have in case
>> somebody (maybe me again :)) will want to do more performance
>> profiling
>
> What about chan
On 2015-02-18 15:19, Andriy Rysin wrote:
> 2) remove it completely, but I think it would be nice to have in case
> somebody (maybe me again :)) will want to do more performance
> profiling
What about changing the property name to something that makes clear it's
internal and nobody should rely on
So we have two options then:
1) move it to a parameter, but this is an option for the developer so
I don't think this makes a lot of sense
2) remove it completely, but I think it would be nice to have in case
somebody (maybe me again :)) will want to do more performance
profiling
E.g. I used this
On 2015-02-18 00:15, Andriy Rysin wrote:
> I don't have much explanation for this so I introduced a system
> property (org.languagetool.thread_count) if you want to force
> different # of threads.
We don't use system properties anywhere else in the core code (only once
to get the temp directory,
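
For reference, reading such a property with a fallback could look roughly like this. The property name is the one quoted above; the class and method names are made up for the example, and falling back to the number of available processors is an assumption, not necessarily what the branch does.

```java
class ThreadCountSketch {

    // Property name as mentioned in the message above.
    static final String PROPERTY = "org.languagetool.thread_count";

    // Returns the configured thread count, or the number of available
    // processors when the property is missing, unparsable, or not positive.
    static int resolveThreadCount() {
        int fallback = Runtime.getRuntime().availableProcessors();
        // Integer.getInteger returns null if the property is unset or not a number.
        Integer configured = Integer.getInteger(PROPERTY);
        return (configured != null && configured > 0) ? configured : fallback;
    }
}
```

It would be set on the command line with -Dorg.languagetool.thread_count=3. A constructor parameter would avoid the hidden global state that a system property brings.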
Ok, I worked on this a bit more and didn't get anything as good as
in the first run.
As the main thread, which reads the file and tokenizes sentences, is
always single-threaded, I tested some improvements there:
1) in commandline.Main we do call handleLine (and all the heavy
processing using threads) on d
Great performance achievement!
> I've pushed a new branch "multithreading" into git. There are 3
> changes right now:
> 1) Don't recreate thread pool
> 2) Analyze sentences in threads
> 3) Optimize some code on main thread (as all coordination goes through
> a main thread it is a bottleneck and an
On 2015-02-14 21:58, Andriy Rysin wrote:
> I've pushed a new branch "multithreading" into git. There are 3
> changes right now:
I can confirm it helps quite a bit: For German, testing now takes 4.4ms
per sentence on average compared to 5.8ms before (measured with
org.languagetool.rules.patterns
I've pushed a new branch "multithreading" into git. There are 3
changes right now:
1) Don't recreate thread pool
2) Analyze sentences in threads
3) Optimize some code on main thread (as all coordination goes through
a main thread it is a bottleneck and any improvement there helps a
lot)
On my prof
So I've played with this a bit today and here's what I found:
with 3 relatively small changes:
1) reuse the thread pool rather than recreate it every time (this is
probably the least important from a performance point of view, but it's
easier to profile 4 worker threads than hundreds)
2) run sentence analyzer in
On 2015-02-11 05:07, Andriy Rysin wrote:
> 1) it seems like we're currently creating and destroying the thread pool
> every time we check sentences, would it not make more sense to create
> pool once and keep threads in the pool and reuse them?
I think so. The number of threads should then probably b
I have 2 questions about MultiThreadedJLanguageTool:
1) it seems like we're currently creating and destroying the thread pool
every time we check sentences, would it not make more sense to create
pool once and keep threads in the pool and reuse them? It probably
would not improve performance muc
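
A minimal sketch of the "create the pool once and reuse it" idea, assuming a fixed-size pool whose size is passed to the constructor. The class name, checkAll, and the per-sentence work (just the string length) are placeholders for illustration and not the MultiThreadedJLanguageTool API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ReusablePoolSketch implements AutoCloseable {

    // Created once; every checkAll() call reuses the same worker threads.
    private final ExecutorService pool;

    ReusablePoolSketch(int threads) {
        this.pool = Executors.newFixedThreadPool(threads);
    }

    // Runs one task per sentence on the shared pool; results keep input order.
    List<Integer> checkAll(List<String> sentences) {
        List<Callable<Integer>> tasks = new ArrayList<>();
        for (String sentence : sentences) {
            tasks.add(() -> sentence.length()); // placeholder for the real check
        }
        List<Integer> results = new ArrayList<>();
        try {
            for (Future<Integer> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
        return results;
    }

    // The caller has to shut the pool down explicitly once it is done.
    @Override
    public void close() {
        pool.shutdown();
    }
}
```

The cost of this design is that the object now owns threads, so callers must close it; otherwise the non-daemon workers keep the JVM alive.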