[ https://issues.apache.org/jira/browse/LUCENE-9662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17409182#comment-17409182 ]
Zach Chen edited comment on LUCENE-9662 at 9/3/21, 1:06 AM:
------------------------------------------------------------

{quote}Of course, this is on [ridiculously concurrent (256 cores with hyperthreading) hardware|https://blog.mikemccandless.com/2021/01/apache-lucene-performance-on-128-core.html], but still it is only using the default 4 concurrent threads right? I'll add an annotation, and increase its concurrency some!
{quote}
Yes, it's capped at 4 threads by default, and the result was indeed impressive with just a few more threads! On my not-so-fast 6-core MacBook Pro, I got about a 73% reduction in processing time when using '-threadCount 12' versus sequential.

To increase its concurrency for the nightly benchmark, I assume a change can be made in [luceneutil|https://github.com/mikemccand/luceneutil/blob/0084387e001b426075eb828f43ad0c4e955e9280/src/python/nightlyBench.py#L695-L704] to pass in the flag? If so, I can open a PR for it as well!

{quote}Hmm, it looks like we didn't fix the {{Usage: ...}} output to advertise the new {{-threadCount}} option. [~zacharymorn] could you open a quick followup PR? Thanks!
{quote}
Ah yes, sorry for missing that. I've opened a PR to update it: [https://github.com/apache/lucene/pull/281]

> CheckIndex should be concurrent
> -------------------------------
>
>                 Key: LUCENE-9662
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9662
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Michael McCandless
>            Priority: Major
>          Time Spent: 18h 10m
>  Remaining Estimate: 0h
>
> I am watching a nightly benchmark slowly run its {{CheckIndex}} step,
> using a single core out of the 128 cores the box has.
> It seems like this is an embarrassingly parallel problem, if the index has
> multiple segments, and would finish much more quickly on concurrent hardware
> if we did "thread per segment".
> If we wanted to get even further concurrency, each part of the Lucene index that
> is checked is also independent, so it could be "thread per segment per part".
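
The "thread per segment" idea from the issue description can be sketched roughly as below. This is only an illustration under stated assumptions, not the actual {{CheckIndex}} implementation: the class {{PerSegmentCheckSketch}}, the methods {{checkSegment}} and {{checkAll}}, and the segment names are all hypothetical, and the per-segment work is a placeholder. Only the standard {{java.util.concurrent}} API is used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PerSegmentCheckSketch {

    /** Hypothetical stand-in for checking one segment; this is NOT a Lucene API. */
    static String checkSegment(String segmentName) {
        return segmentName + ": OK";
    }

    /**
     * Submits one task per segment to a fixed-size pool and collects the
     * results in submission order, so the report stays deterministic even
     * though the checks themselves may finish in any order.
     */
    static List<String> checkAll(List<String> segmentNames, int threadCount)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threadCount);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String name : segmentNames) {
                Callable<String> task = () -> checkSegment(name);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                // get() blocks until that segment's check has completed.
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Segment names here are made up for the example.
        for (String line : checkAll(List.of("_0", "_1", "_2"), 4)) {
            System.out.println(line);
        }
    }
}
```

Because segments are independent, the pool size is the only knob in this sketch; it plays the role that the '-threadCount' value plays in the discussion above. The finer-grained "thread per segment per part" variant would submit one task per (segment, part) pair instead.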