[ https://issues.apache.org/jira/browse/LUCENE-7299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15298342#comment-15298342 ]
Michael McCandless commented on LUCENE-7299:
--------------------------------------------

I think there is likely a real speedup here ... 3 runs each on trunk (before) vs patch (after), indexing full wikipedia:

{noformat}
mike@beast2:/l$ grep finished trunk/lucene/before?.log
trunk/lucene/before1.log:Indexer: finished (126265 msec), excluding commit
trunk/lucene/before2.log:Indexer: finished (124435 msec), excluding commit
trunk/lucene/before3.log:Indexer: finished (122559 msec), excluding commit
mike@beast2:/l$ grep finished radix/lucene/after?.log
radix/lucene/after1.log:Indexer: finished (116234 msec), excluding commit
radix/lucene/after2.log:Indexer: finished (120537 msec), excluding commit
radix/lucene/after3.log:Indexer: finished (121848 msec), excluding commit
{noformat}

And when I look specifically at time to flush postings (first 30 segments flushed):

Before:

{noformat}
IW 0 [2016-05-24T14:54:59.531Z; Index #0]: 3947 msec to write postings and finish vectors
IW 0 [2016-05-24T14:55:00.087Z; Index #10]: 3719 msec to write postings and finish vectors
IW 0 [2016-05-24T14:55:00.825Z; Index #16]: 3878 msec to write postings and finish vectors
IW 0 [2016-05-24T14:55:01.777Z; Index #12]: 3970 msec to write postings and finish vectors
IW 0 [2016-05-24T14:55:02.792Z; Index #13]: 3943 msec to write postings and finish vectors
IW 0 [2016-05-24T14:55:03.121Z; Index #5]: 3514 msec to write postings and finish vectors
IW 0 [2016-05-24T14:55:04.591Z; Index #19]: 3611 msec to write postings and finish vectors
IW 0 [2016-05-24T14:55:06.264Z; Index #15]: 4315 msec to write postings and finish vectors
IW 0 [2016-05-24T14:55:07.184Z; Index #3]: 4404 msec to write postings and finish vectors
IW 0 [2016-05-24T14:55:08.093Z; Index #21]: 4408 msec to write postings and finish vectors
IW 0 [2016-05-24T14:55:09.199Z; Index #23]: 4812 msec to write postings and finish vectors
IW 0 [2016-05-24T14:55:10.735Z; Index #17]: 5275 msec to write postings and finish vectors
IW 0 [2016-05-24T14:55:12.325Z; Index #4]: 5796 msec to write postings and finish vectors
IW 0 [2016-05-24T14:55:13.888Z; Index #9]: 6017 msec to write postings and finish vectors
IW 0 [2016-05-24T14:55:14.873Z; Index #3]: 5742 msec to write postings and finish vectors
IW 0 [2016-05-24T14:55:15.738Z; Index #10]: 5394 msec to write postings and finish vectors
IW 0 [2016-05-24T14:55:18.286Z; Index #7]: 6125 msec to write postings and finish vectors
IW 0 [2016-05-24T14:55:20.156Z; Index #2]: 6424 msec to write postings and finish vectors
IW 0 [2016-05-24T14:55:21.925Z; Index #20]: 6640 msec to write postings and finish vectors
IW 0 [2016-05-24T14:55:23.447Z; Index #12]: 6827 msec to write postings and finish vectors
IW 0 [2016-05-24T14:55:25.251Z; Index #21]: 6977 msec to write postings and finish vectors
IW 0 [2016-05-24T14:55:27.426Z; Index #15]: 7459 msec to write postings and finish vectors
IW 0 [2016-05-24T14:55:29.341Z; Index #16]: 7550 msec to write postings and finish vectors
IW 0 [2016-05-24T14:55:30.548Z; Index #11]: 6981 msec to write postings and finish vectors
IW 0 [2016-05-24T14:55:30.973Z; Index #14]: 5968 msec to write postings and finish vectors
IW 0 [2016-05-24T14:55:32.538Z; Index #5]: 6209 msec to write postings and finish vectors
IW 0 [2016-05-24T14:55:33.962Z; Index #21]: 6256 msec to write postings and finish vectors
IW 0 [2016-05-24T14:55:35.243Z; Index #1]: 6219 msec to write postings and finish vectors
IW 0 [2016-05-24T14:55:36.786Z; Index #7]: 6089 msec to write postings and finish vectors
IW 0 [2016-05-24T14:55:38.149Z; Index #16]: 5999 msec to write postings and finish vectors
{noformat}

After:

{noformat}
IW 0 [2016-05-24T14:40:16.296Z; Index #8]: 2706 msec to write postings and finish vectors
IW 0 [2016-05-24T14:40:17.174Z; Index #19]: 2977 msec to write postings and finish vectors
IW 0 [2016-05-24T14:40:17.651Z; Index #3]: 2717 msec to write postings and finish vectors
IW 0 [2016-05-24T14:40:18.731Z; Index #10]: 3010 msec to write postings and finish vectors
IW 0 [2016-05-24T14:40:19.624Z; Index #6]: 2975 msec to write postings and finish vectors
IW 0 [2016-05-24T14:40:20.264Z; Index #9]: 2770 msec to write postings and finish vectors
IW 0 [2016-05-24T14:40:21.297Z; Index #3]: 2559 msec to write postings and finish vectors
IW 0 [2016-05-24T14:40:22.598Z; Index #5]: 2958 msec to write postings and finish vectors
IW 0 [2016-05-24T14:40:23.757Z; Index #19]: 3046 msec to write postings and finish vectors
IW 0 [2016-05-24T14:40:24.554Z; Index #22]: 3098 msec to write postings and finish vectors
IW 0 [2016-05-24T14:40:25.727Z; Index #9]: 3412 msec to write postings and finish vectors
IW 0 [2016-05-24T14:40:26.980Z; Index #14]: 3728 msec to write postings and finish vectors
IW 0 [2016-05-24T14:40:28.440Z; Index #1]: 4251 msec to write postings and finish vectors
IW 0 [2016-05-24T14:40:29.723Z; Index #12]: 4361 msec to write postings and finish vectors
IW 0 [2016-05-24T14:40:31.040Z; Index #8]: 4516 msec to write postings and finish vectors
IW 0 [2016-05-24T14:40:31.935Z; Index #2]: 4107 msec to write postings and finish vectors
IW 0 [2016-05-24T14:40:33.631Z; Index #20]: 4414 msec to write postings and finish vectors
IW 0 [2016-05-24T14:40:34.688Z; Index #4]: 4165 msec to write postings and finish vectors
IW 0 [2016-05-24T14:40:36.060Z; Index #18]: 4158 msec to write postings and finish vectors
IW 0 [2016-05-24T14:40:37.625Z; Index #23]: 4287 msec to write postings and finish vectors
IW 0 [2016-05-24T14:40:39.622Z; Index #7]: 4861 msec to write postings and finish vectors
IW 0 [2016-05-24T14:40:41.147Z; Index #0]: 4976 msec to write postings and finish vectors
IW 0 [2016-05-24T14:40:42.833Z; Index #8]: 5074 msec to write postings and finish vectors
IW 0 [2016-05-24T14:40:44.399Z; Index #21]: 5254 msec to write postings and finish vectors
IW 0 [2016-05-24T14:40:44.734Z; Index #14]: 3862 msec to write postings and finish vectors
IW 0 [2016-05-24T14:40:46.047Z; Index #0]: 4107 msec to write postings and finish vectors
IW 0 [2016-05-24T14:40:47.227Z; Index #2]: 4151 msec to write postings and finish vectors
IW 0 [2016-05-24T14:40:48.884Z; Index #20]: 4571 msec to write postings and finish vectors
IW 0 [2016-05-24T14:40:49.918Z; Index #9]: 4350 msec to write postings and finish vectors
IW 0 [2016-05-24T14:40:51.088Z; Index #13]: 4221 msec to write postings and finish vectors
{noformat}

> BytesRefHash.sort() should use radix sort?
> ------------------------------------------
>
> Key: LUCENE-7299
> URL: https://issues.apache.org/jira/browse/LUCENE-7299
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Adrien Grand
> Assignee: Adrien Grand
> Priority: Minor
> Attachments: LUCENE-7299.patch
>
>
> Switching DocIdSetBuilder to radix sort helped make things significantly
> faster. We should be able to do the same with BytesRefHash.sort()?
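
As a rough cross-check of the end-to-end indexing times quoted in the comment, the three trunk runs and three patched runs can be averaged. This is only a sketch: the numbers are copied from the grep output above, while the `summarize` helper and the variable names are made up for illustration and are not part of the patch or of any benchmark tooling.

```python
from statistics import mean

# Wall-clock indexing times in msec, copied from the "Indexer: finished" lines.
before_msec = [126265, 124435, 122559]  # trunk (before1..3.log)
after_msec = [116234, 120537, 121848]   # radix patch (after1..3.log)


def summarize(before, after):
    """Return (mean before, mean after, relative reduction of the mean)."""
    mean_before = mean(before)
    mean_after = mean(after)
    reduction = (mean_before - mean_after) / mean_before
    return mean_before, mean_after, reduction


mean_before, mean_after, reduction = summarize(before_msec, after_msec)
print(f"before: {mean_before:.0f} msec, after: {mean_after:.0f} msec, "
      f"reduction: {reduction:.1%}")
# prints: before: 124420 msec, after: 119540 msec, reduction: 3.9%
```

Every patched run finished faster than every trunk run (121848 msec at worst vs 122559 msec at best), but with only three samples per side this supports "likely a real speedup" rather than a precise figure.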