[ 
https://issues.apache.org/jira/browse/LUCENE-7299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15298342#comment-15298342
 ] 

Michael McCandless commented on LUCENE-7299:
--------------------------------------------

I think there is likely a real speedup here. Three runs each on trunk 
(before) vs. patch (after), indexing all of Wikipedia:

{noformat}
mike@beast2:/l$ grep finished trunk/lucene/before?.log
trunk/lucene/before1.log:Indexer: finished (126265 msec), excluding commit
trunk/lucene/before2.log:Indexer: finished (124435 msec), excluding commit
trunk/lucene/before3.log:Indexer: finished (122559 msec), excluding commit

mike@beast2:/l$ grep finished radix/lucene/after?.log
radix/lucene/after1.log:Indexer: finished (116234 msec), excluding commit
radix/lucene/after2.log:Indexer: finished (120537 msec), excluding commit
radix/lucene/after3.log:Indexer: finished (121848 msec), excluding commit
{noformat}
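
To put a rough number on the end-to-end difference, here is a minimal sketch that averages the three runs on each side and reports the relative reduction. The six timings are copied from the grep output above; the variable names are only illustrative, and three runs per side is too few to say much about variance.

```python
from statistics import mean

# Indexing wall-clock times in msec, copied from the grep output above.
before_msec = [126265, 124435, 122559]  # trunk
after_msec = [116234, 120537, 121848]   # radix sort patch

before_mean = mean(before_msec)
after_mean = mean(after_msec)

# Relative reduction in mean indexing time, as a percentage of the trunk mean.
reduction_pct = 100.0 * (before_mean - after_mean) / before_mean

print(f"trunk mean: {before_mean:.0f} msec")
print(f"patch mean: {after_mean:.0f} msec")
print(f"reduction:  {reduction_pct:.1f}%")
```

One thing worth noticing in the raw numbers: the slowest patched run (121848 msec) is still faster than the fastest trunk run (122559 msec), though the gap between them is small.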

And when I look specifically at time to flush postings (first 30 segments 
flushed):

Before:

{noformat}
IW 0 [2016-05-24T14:54:59.531Z; Index #0]: 3947 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:55:00.087Z; Index #10]: 3719 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:55:00.825Z; Index #16]: 3878 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:55:01.777Z; Index #12]: 3970 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:55:02.792Z; Index #13]: 3943 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:55:03.121Z; Index #5]: 3514 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:55:04.591Z; Index #19]: 3611 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:55:06.264Z; Index #15]: 4315 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:55:07.184Z; Index #3]: 4404 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:55:08.093Z; Index #21]: 4408 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:55:09.199Z; Index #23]: 4812 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:55:10.735Z; Index #17]: 5275 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:55:12.325Z; Index #4]: 5796 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:55:13.888Z; Index #9]: 6017 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:55:14.873Z; Index #3]: 5742 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:55:15.738Z; Index #10]: 5394 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:55:18.286Z; Index #7]: 6125 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:55:20.156Z; Index #2]: 6424 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:55:21.925Z; Index #20]: 6640 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:55:23.447Z; Index #12]: 6827 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:55:25.251Z; Index #21]: 6977 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:55:27.426Z; Index #15]: 7459 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:55:29.341Z; Index #16]: 7550 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:55:30.548Z; Index #11]: 6981 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:55:30.973Z; Index #14]: 5968 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:55:32.538Z; Index #5]: 6209 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:55:33.962Z; Index #21]: 6256 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:55:35.243Z; Index #1]: 6219 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:55:36.786Z; Index #7]: 6089 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:55:38.149Z; Index #16]: 5999 msec to write postings and 
finish vectors
{noformat}

After:

{noformat}
IW 0 [2016-05-24T14:40:16.296Z; Index #8]: 2706 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:40:17.174Z; Index #19]: 2977 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:40:17.651Z; Index #3]: 2717 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:40:18.731Z; Index #10]: 3010 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:40:19.624Z; Index #6]: 2975 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:40:20.264Z; Index #9]: 2770 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:40:21.297Z; Index #3]: 2559 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:40:22.598Z; Index #5]: 2958 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:40:23.757Z; Index #19]: 3046 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:40:24.554Z; Index #22]: 3098 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:40:25.727Z; Index #9]: 3412 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:40:26.980Z; Index #14]: 3728 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:40:28.440Z; Index #1]: 4251 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:40:29.723Z; Index #12]: 4361 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:40:31.040Z; Index #8]: 4516 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:40:31.935Z; Index #2]: 4107 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:40:33.631Z; Index #20]: 4414 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:40:34.688Z; Index #4]: 4165 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:40:36.060Z; Index #18]: 4158 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:40:37.625Z; Index #23]: 4287 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:40:39.622Z; Index #7]: 4861 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:40:41.147Z; Index #0]: 4976 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:40:42.833Z; Index #8]: 5074 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:40:44.399Z; Index #21]: 5254 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:40:44.734Z; Index #14]: 3862 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:40:46.047Z; Index #0]: 4107 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:40:47.227Z; Index #2]: 4151 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:40:48.884Z; Index #20]: 4571 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:40:49.918Z; Index #9]: 4350 msec to write postings and 
finish vectors
IW 0 [2016-05-24T14:40:51.088Z; Index #13]: 4221 msec to write postings and 
finish vectors
{noformat}
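
The per-flush numbers above are easier to compare once they are pulled out of the log and summarized. The following is a sketch of one way to do that, assuming the infoStream lines look exactly like the ones quoted here; `flush_times_msec` and `summarize` are hypothetical helper names, not part of Lucene.

```python
import re
from statistics import mean

# Matches the "N msec to write postings" part of each flush line.
# Whitespace is matched with \s+ so lines re-wrapped by a mail client still match.
FLUSH_RE = re.compile(r"\]:\s+(\d+)\s+msec\s+to\s+write\s+postings")

def flush_times_msec(log_text):
    """Return every postings-flush time (in msec) found in the log text."""
    return [int(m) for m in FLUSH_RE.findall(log_text)]

def summarize(log_text):
    """Count, mean and max of the flush times found in the log text."""
    times = flush_times_msec(log_text)
    return {"count": len(times), "mean": mean(times), "max": max(times)}

# Two lines copied from the "Before" block above, including the mail wrap.
sample = """IW 0 [2016-05-24T14:54:59.531Z; Index #0]: 3947 msec to write postings and
finish vectors
IW 0 [2016-05-24T14:55:00.087Z; Index #10]: 3719 msec to write postings and
finish vectors
"""

print(summarize(sample))
```

Running `summarize` over the full before and after logs and comparing the two means gives a flush-only view of the change, separate from the end-to-end indexing time.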

> BytesRefHash.sort() should use radix sort?
> ------------------------------------------
>
>                 Key: LUCENE-7299
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7299
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-7299.patch
>
>
> Switching DocIdSetBuilder to radix sort helped make things significantly 
> faster. We should be able to do the same with BytesRefHash.sort()?
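
For readers unfamiliar with the technique the issue title refers to, below is a generic sketch of a most-significant-byte radix sort over byte strings. It is not the code in LUCENE-7299.patch and does not use any Lucene API; the function name, the `cutoff` parameter and its default value are assumptions made for illustration.

```python
def msb_radix_sort(keys, depth=0, cutoff=16):
    """Sort bytes objects in unsigned-byte lexicographic order.

    Invariant: every key in `keys` shares the same first `depth` bytes.
    Recursion depth is bounded by the longest shared prefix, so very long
    common prefixes would need an iterative variant.
    """
    if len(keys) <= cutoff:
        # Small buckets: a comparison sort is cheaper than bucketing again.
        return sorted(keys)

    # Bucket 0 holds keys that end exactly at `depth`;
    # bucket b + 1 holds keys whose byte at `depth` is b.
    buckets = [[] for _ in range(257)]
    for key in keys:
        index = key[depth] + 1 if depth < len(key) else 0
        buckets[index].append(key)

    # Keys that ended here equal the shared prefix, so they sort first.
    result = list(buckets[0])
    for bucket in buckets[1:]:
        if bucket:
            result.extend(msb_radix_sort(bucket, depth + 1, cutoff))
    return result


terms = [b"lucene", b"radix", b"sort", b"luc", b"", b"index", b"sorted"]
print(msb_radix_sort(terms, cutoff=2))
```

The appeal for sorting terms is that each pass only looks at one byte per key instead of comparing whole keys, and keys that differ early are separated early. Falling back to a comparison sort for small buckets is a common way to avoid the per-bucket overhead once the buckets get small.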


