I've pushed a new branch "multithreading" into git. There are 3
changes right now:
1) Don't recreate the thread pool (see the sketch after this list)
2) Analyze sentences in threads
3) Optimize some code on the main thread (as all coordination goes
through the main thread, it is a bottleneck and any improvement there
helps a lot)
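
To illustrate 1), here is a minimal sketch of the idea (the class and
method names are made up, not the actual LT API): the pool is created
once and reused for every check instead of being recreated each time.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch only: a checker that keeps one fixed pool for its whole lifetime
// instead of creating and shutting down a pool on every check() call.
class PooledCheckerSketch implements AutoCloseable {

  private final ExecutorService pool;

  PooledCheckerSketch(int threadCount) {
    // created once, reused by every call to check()
    this.pool = Executors.newFixedThreadPool(threadCount);
  }

  void check(Runnable analysisTask) {
    pool.submit(analysisTask);  // runs on the reused worker threads
  }

  @Override
  public void close() throws InterruptedException {
    pool.shutdown();
    pool.awaitTermination(1, TimeUnit.MINUTES);
  }
}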

On my profiling case (big text file) the performance improvement is ~40%:
< Time: 321253ms for 610758 sentences (1901.2 sentences/sec)
---
> Time: 241689ms for 610758 sentences (2527.0 sentences/sec)

The main thread now only takes ~6%, pretty much all of it in the
sentence tokenizer (according to jmc).
My four i3 cores now idle ~50%, down from ~60%. Much better, though not perfect.

It looks like if we want to make it truly parallelized there are two
things to do:
1) tokenize sentences in parallel - this is probably not trivial
2) streamline the whole process: right now we read the file, sentence
tokenizing happens in the main thread, then we analyze sentences in
threads (split by sentences), then the main thread collects the
results, then we check rules in threads (split by rules), then the
main thread collects the results - we would need to remove these
"checkpoints" on the main thread to make it truly parallelized (maybe
when we switch to Java 8 we could use its streams to simplify this;
see the sketch below)
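
Roughly what I have in mind for the Java 8 streams version (just a
sketch; analyze() and check() here are placeholders for the real
per-sentence analysis and rule-check steps, not actual LT methods):

import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

class StreamedCheckSketch {

  // placeholders for the real sentence analysis and rule check steps
  static String analyze(String sentence) { return sentence; }
  static List<String> check(String analyzedSentence) { return Collections.emptyList(); }

  static List<String> checkAll(List<String> sentences) {
    // one parallel pipeline from analysis to rule check, with no
    // collect-on-the-main-thread checkpoint in between
    return sentences.parallelStream()
        .map(StreamedCheckSketch::analyze)
        .map(StreamedCheckSketch::check)
        .flatMap(List::stream)
        .collect(Collectors.toList());
  }
}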

I didn't add any unit tests, as there is no new API or functionality,
the existing unit tests pass, and my big test on a 7 million word text
does not show any regression.

Please take a look and let me know if this works for you and if
there's anything else we need to do to merge this into master.

Andriy

2015-02-12 22:35 GMT-05:00 Andriy Rysin <ary...@gmail.com>:
> So I've played with this a bit today and here's what I found:
> with 3 relatively small changes:
> 1) reuse the thread pool rather than recreating it every time (this is
> probably the least important change performance-wise, but it's easier
> to profile 4 worker threads than hundreds)
> 2) run the sentence analyzer in parallel (using the same pool as for rules)
> 3) start one callable for each rule check instead of one per group of
> rules - this way we spread the checks more evenly (should help if e.g.
> rules in one group take much longer than those in another; see the
> sketch after this list)
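>
> A minimal sketch of what I mean by 3) (the Rule interface and the names
> here are stand-ins, not the real classes):
>
> import java.util.ArrayList;
> import java.util.List;
> import java.util.concurrent.Callable;
> import java.util.concurrent.ExecutorService;
> import java.util.concurrent.Future;
>
> // sketch only: "Rule" is a stand-in for the real rule class
> interface Rule {
>   List<String> check(List<String> sentences);
> }
>
> class PerRuleSubmitSketch {
>   static List<Future<List<String>>> submit(
>       ExecutorService pool, List<Rule> rules, final List<String> sentences) {
>     List<Future<List<String>>> futures = new ArrayList<Future<List<String>>>();
>     for (final Rule rule : rules) {
>       // one callable per rule instead of one per group of rules,
>       // so a slow rule no longer holds back a whole group
>       futures.add(pool.submit(new Callable<List<String>>() {
>         public List<String> call() {
>           return rule.check(sentences);
>         }
>       }));
>     }
>     return futures;
>   }
> }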
>
> I was able to get the cpu utilization in multi-threaded LT from 40% to
> 60%, and the time to check a text of 7 million words went down by
> 20% (1900 sentences/sec before, 2378 after).
>
> There still seems to be room for improvement; I can see two things:
> 1) run the sentence tokenizer in parallel as well (ideally everything
> after reading a paragraph should be parallel) - LT still spends ~13%
> of the time in the main thread, which means it can make other threads
> starve
> 2) stream the whole end-to-end unit of work into one callable so we
> don't have to wait between tokenizing, analyzing, and checking; the
> problem here seems to be that we can only split sentence analysis by
> sentences, while for rules we split by rules. We need to check whether
> splitting the rule check by sentences performs worse than splitting by
> rules (but if it performs well we can get rid of the rule group filter
> problem); a rough sketch of that is below.
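>
> Rough shape of what one end-to-end callable per sentence could look
> like (analyze() and checkAllRules() are placeholders, not the real
> methods):
>
> import java.util.Collections;
> import java.util.List;
> import java.util.concurrent.Callable;
>
> // sketch only: the whole per-sentence pipeline in a single callable,
> // so nothing goes back to the main thread between the steps
> class SentenceTaskSketch implements Callable<List<String>> {
>
>   private final String sentence;
>
>   SentenceTaskSketch(String sentence) {
>     this.sentence = sentence;
>   }
>
>   public List<String> call() {
>     String analyzed = analyze(sentence);   // tokenize + tag
>     return checkAllRules(analyzed);        // run every rule on it
>   }
>
>   // placeholders for the real analysis and rule check steps
>   private String analyze(String s) { return s; }
>   private List<String> checkAllRules(String s) { return Collections.emptyList(); }
> }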
>
> There was one interesting side effect though: when I split each rule
> into its own callable there was a regression for the rule group filter.
> It seems that if we split the rules, the group filter does not work. But
> theoretically the current code can fail here too: if two rules are in
> the same rule group and have the same result, but those two rules run in
> separate threads, the rule group filter will not work. I must say I
> didn't run any tests to prove this theory, and the chances of this
> happening are low (at least with a big number of rules and a low CPU
> count), but it should still be possible.
>
> I'll do a bit more research when I have time.
>
> Andriy
>
> 2015-02-11 7:39 GMT-05:00 Daniel Naber <daniel.na...@languagetool.org>:
>> On 2015-02-11 05:07, Andriy Rysin wrote:
>>
>>> 1) it seems like we're currently creating and destroying the thread pool
>>> every time we check sentences; would it not make more sense to create the
>>> pool once and keep the threads in the pool and reuse them?
>>
>> I think so. The number of threads should then probably be specified via
>> constructor, not via a set method, so it cannot be changed.
>>
>>> So I am wondering if somebody can shed a bit more light on what the
>>> critical parts are that are not thread-safe, and whether it's worth
>>> exploring if they can be parallelized or it's too much work for not much gain?
>>
>> I don't think there's a better way than carefully looking at the code to
>> make sure it's thread-safe *and* having extensive tests that run with
>> several threads (not as active unit tests, as they run too long). But I
>> think it makes sense to make more code run in parallel.
>>
>> Regards
>>   Daniel
