[jira] [Updated] (SOLR-11443) Remove the usage of workqueue for Overseer

2017-10-13 Thread Cao Manh Dat (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat updated SOLR-11443:

Attachment: SOLR-11443.patch

Updated patch for master:
1. Added a fallbackQueue concept. At startup, workQueue is treated as the 
fallbackQueue; its messages must be processed one by one. If a message causes 
an exception when the new clusterstate is written to ZK, it is considered a 
bad message and polled out of the fallbackQueue.
2. After that, stateUpdateQueue is used as the fallbackQueue. Because we write 
in batches, an exception thrown while writing the new clusterstate does not 
tell us which message is bad, so we go back to the beginning of the loop and 
repeat step 1.
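
The two steps above can be sketched roughly as follows. This is a minimal in-memory model, not the code in the patch: the class name, the {{writeClusterState}} helper, and the rule that messages starting with "bad" fail to write are all assumptions made for illustration, and the sketch simply returns once both queues are empty.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

// Hypothetical in-memory model of the fallbackQueue idea; not Solr's Overseer.
final class FallbackQueueSketch {
    final Deque<String> workQueue = new ArrayDeque<>();        // leftovers found at startup
    final Deque<String> stateUpdateQueue = new ArrayDeque<>(); // normal incoming messages
    final List<String> written = new ArrayList<>();            // stands in for clusterstate in ZK
    final List<String> dropped = new ArrayList<>();            // messages judged to be bad

    // Simulated write: fails as a whole if any message in the batch is "bad" (assumption).
    void writeClusterState(List<String> batch) {
        for (String message : batch) {
            if (message.startsWith("bad")) {
                throw new IllegalStateException("cannot write " + message);
            }
        }
        written.addAll(batch);
    }

    // Step 1: process the fallback queue one message at a time, so a failing
    // write identifies exactly one bad message, which is then polled out.
    void drainOneByOne(Deque<String> fallbackQueue) {
        while (!fallbackQueue.isEmpty()) {
            String message = fallbackQueue.peek();
            try {
                writeClusterState(Collections.singletonList(message));
            } catch (IllegalStateException e) {
                dropped.add(message);
            }
            fallbackQueue.poll();
        }
    }

    void run() {
        Deque<String> fallbackQueue = workQueue; // at startup, workQueue is the fallback
        while (true) {
            drainOneByOne(fallbackQueue);
            if (stateUpdateQueue.isEmpty()) {
                return; // the sketch stops here; a real loop would wait for new messages
            }
            List<String> batch = new ArrayList<>(stateUpdateQueue); // peek the whole batch
            try {
                writeClusterState(batch);
                for (int i = 0; i < batch.size(); i++) {
                    stateUpdateQueue.poll(); // remove only after a successful write
                }
            } catch (IllegalStateException e) {
                // Step 2: the bad message is unknown, so retry the batch one by one.
                fallbackQueue = stateUpdateQueue;
            }
        }
    }

    public static void main(String[] args) {
        FallbackQueueSketch overseer = new FallbackQueueSketch();
        overseer.workQueue.add("w1");
        overseer.workQueue.add("bad-w2");
        overseer.stateUpdateQueue.add("s1");
        overseer.stateUpdateQueue.add("bad-s2");
        overseer.stateUpdateQueue.add("s3");
        overseer.run();
        System.out.println("written=" + overseer.written + " dropped=" + overseer.dropped);
    }
}
```

In this model the batch path stays fast when all messages are good, and the slower one-by-one path is only taken after a failed batch write.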

> Remove the usage of workqueue for Overseer
> --
>
> Key: SOLR-11443
> URL: https://issues.apache.org/jira/browse/SOLR-11443
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
> Attachments: SOLR-11443.patch, SOLR-11443.patch, SOLR-11443.patch
>
>
> If we can remove the usage of workqueue, we can save a lot of blocking IO in 
> Overseer, and hence boost performance a lot.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-11443) Remove the usage of workqueue for Overseer

2017-10-10 Thread Cao Manh Dat (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat updated SOLR-11443:

Attachment: SOLR-11443.patch

Updated patch:
- One more optimization: instead of using {{poll()}}, we can remove nodes in 
batch by using {{Zk.multi()}}, which saves a lot of communication time 
between Overseer and ZK. I introduced a new method, {{void 
DistributedQueue.remove(Collection paths)}}, which removes nodes in 
batch. ([~dragonsinth]: what do you think?)
- Increased the cap of the Overseer queue to 10, because I think Overseer can 
handle more messages.
- The patch includes the changes from SOLR-11447.

I ran the benchmark with the number of state messages set to 10; here are the 
results.
Before optimization:
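
To illustrate why batch removal helps, here is a small sketch that counts simulated round trips. {{FakeZkQueue}} is a hypothetical in-memory stand-in, not {{DistributedQueue}}, and the {{qn-}} node names are made up for the example. In a real client, the batch path would presumably be built from ZooKeeper's {{ZooKeeper.multi(Iterable<Op>)}} with one {{Op.delete(path, version)}} per node; that mapping is an assumption about the patch, not a quote from it.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model: each method call on FakeZkQueue counts as one round trip to ZK.
final class BatchRemoveSketch {
    static final class FakeZkQueue {
        final TreeMap<String, String> nodes = new TreeMap<>(); // path -> payload, sorted by path
        int roundTrips = 0;

        void offer(String path, String data) {
            nodes.put(path, data);
        }

        // Removes and returns the head payload: one request per removed node.
        String poll() {
            roundTrips++;
            Map.Entry<String, String> head = nodes.pollFirstEntry();
            return head == null ? null : head.getValue();
        }

        // Removes every given path in a single request, like one multi() of deletes.
        void remove(Collection<String> paths) {
            roundTrips++;
            nodes.keySet().removeAll(paths);
        }
    }

    // Builds a queue holding n messages under illustrative sequential node names.
    static FakeZkQueue filled(int n) {
        FakeZkQueue queue = new FakeZkQueue();
        for (int i = 0; i < n; i++) {
            queue.offer(String.format("qn-%010d", i), "msg" + i);
        }
        return queue;
    }

    public static void main(String[] args) {
        FakeZkQueue onePerCall = filled(1000);
        while (!onePerCall.nodes.isEmpty()) {
            onePerCall.poll();
        }

        FakeZkQueue batched = filled(1000);
        batched.remove(new ArrayList<>(batched.nodes.keySet()));

        System.out.println("poll round trips:  " + onePerCall.roundTrips);
        System.out.println("batch round trips: " + batched.roundTrips);
    }
}
```

The model ignores request size limits and failure handling; it only shows that the number of requests drops from one per message to one per batch.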
{code}
72804 INFO  (TEST-OverseerTest.testPerformance-seed#[CA3B138EDD8B0BD5]) [] o.a.s.c.OverseerTest op: state, success: 11, failure: 0
72810 INFO  (TEST-OverseerTest.testPerformance-seed#[CA3B138EDD8B0BD5]) [] o.a.s.c.OverseerTest  avgRequestsPerSecond: 2332.563593723015
{code}
After optimization:
{code}
42739 INFO  (TEST-OverseerTest.testPerformance-seed#[7E989E665D42FFFB]) [] o.a.s.c.OverseerTest op: state, success: 11, failure: 0
42742 INFO  (TEST-OverseerTest.testPerformance-seed#[7E989E665D42FFFB]) [] o.a.s.c.OverseerTest  avgRequestsPerSecond: 11454.952022695608
{code}
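
For reference, the ratio between the two {{avgRequestsPerSecond}} figures above can be computed as below. The class and method names are only illustrative, and since the two runs used different test seeds the ratio should be read as a rough indication rather than a precise measurement.

```java
// Illustrative helper; the two inputs are the avgRequestsPerSecond values logged above.
final class SpeedupSketch {
    static double speedup(double before, double after) {
        return after / before;
    }

    public static void main(String[] args) {
        double before = 2332.563593723015;
        double after = 11454.952022695608;
        // Prints the throughput ratio, roughly 4.9x for these two runs.
        System.out.printf("speedup: %.2fx%n", speedup(before, after));
    }
}
```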







[jira] [Updated] (SOLR-11443) Remove the usage of workqueue for Overseer

2017-10-06 Thread Cao Manh Dat (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat updated SOLR-11443:

Attachment: SOLR-11443.patch

Patch for this ticket. The idea is simple: we peek at up to 1000 messages in 
the queue, process them, write the new clusterstate to ZK, and then poll out 
those messages. Processed messages are therefore only polled out once the new 
clusterstate has been written.

If the Overseer gets restarted, all uncommitted messages are still in the 
queue, so we will reprocess them and still reach the desired state.

Here are some benchmark numbers ({{OverseerTest.testPerformance()}}):
Before optimization: {{avgRequestsPerSecond: 1551.8934622998179}}
After optimization: {{avgRequestsPerSecond: 3425.594762960455}}
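
The peek, process, write, then poll order described above can be sketched as follows. This is a hypothetical in-memory model, not the patch itself: the {{replica=state}} message format, the class name, and the {{crashBeforeWrite}} flag used to simulate a restart are assumptions for illustration. The sketch also assumes that replaying a message is harmless, which is what makes reprocessing after a restart safe.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model; none of these names come from Solr.
final class PeekThenPollSketch {
    static final int BATCH_SIZE = 1000; // batch size mentioned in the comment above

    // Stands in for the ZK-backed queue; messages look like "replica=state" (assumption).
    final Deque<String> queue = new ArrayDeque<>();
    // Stands in for the clusterstate stored in ZK.
    final Map<String, String> committedState = new LinkedHashMap<>();

    // Reads up to `limit` messages from the head without removing them.
    List<String> peek(int limit) {
        List<String> batch = new ArrayList<>();
        for (String message : queue) {
            if (batch.size() == limit) {
                break;
            }
            batch.add(message);
        }
        return batch;
    }

    // One pass of the loop. `crashBeforeWrite` simulates a restart after the
    // messages were processed but before the new state was written.
    // Returns true only if a batch was written and then removed from the queue.
    boolean runOnce(boolean crashBeforeWrite) {
        List<String> batch = peek(BATCH_SIZE);
        if (batch.isEmpty()) {
            return false;
        }
        Map<String, String> newState = new LinkedHashMap<>(committedState);
        for (String message : batch) {
            String[] parts = message.split("=", 2);
            newState.put(parts[0], parts[1]); // replaying a message yields the same state
        }
        if (crashBeforeWrite) {
            return false; // nothing written, so nothing is removed from the queue
        }
        committedState.clear();
        committedState.putAll(newState); // "write new clusterstate"
        for (int i = 0; i < batch.size(); i++) {
            queue.poll(); // poll out only what has been committed
        }
        return true;
    }

    public static void main(String[] args) {
        PeekThenPollSketch overseer = new PeekThenPollSketch();
        overseer.queue.add("core1=down");
        overseer.queue.add("core2=active");
        overseer.queue.add("core1=active");

        overseer.runOnce(true);  // restart before the write: the queue is untouched
        overseer.runOnce(false); // the same messages are reprocessed and committed
        System.out.println(overseer.committedState + " queue=" + overseer.queue.size());
    }
}
```

For the two benchmark figures quoted above, the ratio is roughly 2.2x.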



