[ 
https://issues.apache.org/jira/browse/LUCENE-9662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17409182#comment-17409182
 ] 

Zach Chen edited comment on LUCENE-9662 at 9/3/21, 1:06 AM:
------------------------------------------------------------

{quote}Of course, this is on [ridiculously concurrent (256 cores with 
hyperthreading) 
hardware|https://blog.mikemccandless.com/2021/01/apache-lucene-performance-on-128-core.html],
 but still it is only using the default 4 concurrent threads right?  I'll add 
an annotation, and increase its concurrency some!
{quote}
Yes, it's indeed capped at 4 threads by default, and the result was impressive 
with just a few more threads! On my not-so-fast 6-core MacBook Pro, I got 
about a 73% reduction in processing time when using '-threadCount 12' versus 
sequential. To increase its concurrency for the nightly benchmark, I assume a 
change can be made in 
[luceneutil|https://github.com/mikemccand/luceneutil/blob/0084387e001b426075eb828f43ad0c4e955e9280/src/python/nightlyBench.py#L695-L704]
 to pass in the flag? If so, I can open a PR for it as well!

 
{quote}Hmm, it looks like we didn't fix the {{Usage: ...}} output to advertise 
the new {{-threadCount}} option.  [~zacharymorn] could you open a quick 
followup PR?  Thanks!
{quote}
Ah yes, sorry for missing that. I've opened a PR to update it: 
[https://github.com/apache/lucene/pull/281]



> CheckIndex should be concurrent
> -------------------------------
>
>                 Key: LUCENE-9662
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9662
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Michael McCandless
>            Priority: Major
>          Time Spent: 18h 10m
>  Remaining Estimate: 0h
>
> I am watching a nightly benchmark run slowly execute its {{CheckIndex}} step, 
> using a single core out of the 128 cores the box has.
> It seems like this is an embarrassingly parallel problem, if the index has 
> multiple segments, and would finish much more quickly on concurrent hardware 
> if we did "thread per segment".
> If we wanted to get even further concurrency, each part of the Lucene index 
> that is checked is also independent, so it could be "thread per segment per 
> part".
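
The "thread per segment" idea in the description can be illustrated with a minimal Java sketch. This is not the actual {{CheckIndex}} implementation: the class name, {{checkSegment}}, {{findBrokenSegments}}, and the segment names are all hypothetical stand-ins, and the per-segment "check" is a placeholder. It only shows the shape of the approach: submit one independent task per segment to a fixed-size pool, then collect the results in segment order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; none of these names come from Lucene.
class PerSegmentCheckSketch {

  // Placeholder for the real per-segment work (postings, doc values, etc.).
  // Here a segment counts as clean unless its name contains "corrupt".
  static boolean checkSegment(String segmentName) {
    return !segmentName.contains("corrupt");
  }

  // Runs one task per segment on a pool of threadCount threads and returns
  // the names of segments that failed, in the same order as the input.
  static List<String> findBrokenSegments(List<String> segmentNames, int threadCount)
      throws InterruptedException, ExecutionException {
    ExecutorService pool = Executors.newFixedThreadPool(threadCount);
    try {
      // Segments are independent, so each one becomes its own task.
      List<Future<Boolean>> futures = new ArrayList<>();
      for (String name : segmentNames) {
        Callable<Boolean> task = () -> checkSegment(name);
        futures.add(pool.submit(task));
      }
      // Wait for results in submission order so the report is deterministic.
      List<String> broken = new ArrayList<>();
      for (int i = 0; i < futures.size(); i++) {
        if (!futures.get(i).get()) {
          broken.add(segmentNames.get(i));
        }
      }
      return broken;
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    List<String> segments = Arrays.asList("_0", "_1", "_2_corrupt", "_3");
    System.out.println("broken: " + findBrokenSegments(segments, 4));
  }
}
```

Collecting the futures in submission order keeps the output stable regardless of which thread finishes first. Going further to "thread per segment per part" would, under the same assumptions, mean submitting one task per (segment, part) pair instead of one per segment.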


