[jira] [Commented] (SOLR-10983) Fix DOWNNODE -> queue-work explosion

2017-08-21 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16134957#comment-16134957
 ] 

ASF subversion and git services commented on SOLR-10983:


Commit f031a85f50902cfc0b54422b35f60effb7353b05 in lucene-solr's branch 
refs/heads/branch_6_6 from [~dragonsinth]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=f031a85 ]

SOLR-10983: Fix DOWNNODE -> queue-work explosion


> Fix DOWNNODE -> queue-work explosion
> 
>
> Key: SOLR-10983
> URL: https://issues.apache.org/jira/browse/SOLR-10983
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: SolrCloud
> Reporter: Scott Blum
> Assignee: Scott Blum
> Fix For: 7.0, 6.6.1, master (8.0)
>
> Attachments: SOLR-10983.patch
>
>
> Every DOWNNODE command enqueues N copies of itself into queue-work, where N 
> is the number of collections affected by the DOWNNODE.
> This rarely matters in practice, because queue-work is normally cleared 
> immediately. However, if anything throws an exception (such as a ZK bad 
> version), we don't clear queue-work. Then, the next time through the loop, we 
> run the expensive DOWNNODE command potentially hundreds of times.
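
The failure mode described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the class and method names (QueueWorkExplosionSketch, processDownNode, replayCount) are hypothetical, an in-memory deque stands in for queue-work, and an IllegalStateException stands in for a ZK bad version error. None of it is taken from the Overseer code or from the attached patch.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model of the reported behaviour; hypothetical names, not Solr code. */
final class QueueWorkExplosionSketch {

    /**
     * Handles one DOWNNODE that affects {@code collections} collections.
     * One copy of the message is recorded per affected collection, and
     * queueWork is cleared only when the write succeeds.
     */
    static void processDownNode(Deque<String> queueWork, int collections, boolean writeFails) {
        for (int i = 0; i < collections; i++) {
            queueWork.add("DOWNNODE"); // N copies of the same message
        }
        if (writeFails) {
            // Stand-in for a failure such as a ZK bad version error.
            throw new IllegalStateException("simulated write failure");
        }
        queueWork.clear(); // normal path: the copies are discarded right away
    }

    /** Number of DOWNNODE executions the next loop iteration would replay. */
    static int replayCount(int collections, boolean writeFails) {
        Deque<String> queueWork = new ArrayDeque<>();
        try {
            processDownNode(queueWork, collections, writeFails);
        } catch (IllegalStateException e) {
            // The exception skips the clear() call, so the copies stay behind.
        }
        return queueWork.size();
    }

    public static void main(String[] args) {
        System.out.println(replayCount(300, false)); // prints 0
        System.out.println(replayCount(300, true));  // prints 300
    }
}
```

With 300 affected collections, a single failed write leaves 300 copies of the same expensive command waiting for the next pass, which matches the "potentially hundreds of times" in the description.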



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10983) Fix DOWNNODE -> queue-work explosion

2017-08-07 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117562#comment-16117562
 ] 

Erick Erickson commented on SOLR-10983:
---

I backported this to 6x (future 6.7) as I really expect there to be a final 
release of the 6x code line and didn't want this to be omitted. No harm if 
there's _not_ a 6.7.




[jira] [Commented] (SOLR-10983) Fix DOWNNODE -> queue-work explosion

2017-08-07 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117560#comment-16117560
 ] 

ASF subversion and git services commented on SOLR-10983:


Commit d704796a785aa0d8e455661e519bb2f0c67b7311 in lucene-solr's branch 
refs/heads/branch_6x from Erick
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=d704796 ]

SOLR-10983: Fix DOWNNODE -> queue-work explosion, backporting to 6x as per the 
comments in the JIRA





[jira] [Commented] (SOLR-10983) Fix DOWNNODE -> queue-work explosion

2017-07-05 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16075666#comment-16075666
 ] 

Scott Blum commented on SOLR-10983:
---

BTW: this issue most likely affects all 6.x releases (and even some late 5.x), 
so it should be considered if we do any 6.x point releases later.




[jira] [Commented] (SOLR-10983) Fix DOWNNODE -> queue-work explosion

2017-07-05 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16075663#comment-16075663
 ] 

ASF subversion and git services commented on SOLR-10983:


Commit 17245c2e5a93bca59572c09af78a6ad6045e75eb in lucene-solr's branch 
refs/heads/branch_7x from [~dragonsinth]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=17245c2 ]

SOLR-10983: Fix DOWNNODE -> queue-work explosion





[jira] [Commented] (SOLR-10983) Fix DOWNNODE -> queue-work explosion

2017-07-05 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16075664#comment-16075664
 ] 

ASF subversion and git services commented on SOLR-10983:


Commit 51638c09bf4f5457650ab40c60b5f98512f9ca1d in lucene-solr's branch 
refs/heads/branch_7_0 from [~dragonsinth]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=51638c0 ]

SOLR-10983: Fix DOWNNODE -> queue-work explosion





[jira] [Commented] (SOLR-10983) Fix DOWNNODE -> queue-work explosion

2017-07-05 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16075662#comment-16075662
 ] 

ASF subversion and git services commented on SOLR-10983:


Commit 380eed838d6646ec02592a9d2e6649e6aa1b5d9b in lucene-solr's branch 
refs/heads/master from [~dragonsinth]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=380eed8 ]

SOLR-10983: Fix DOWNNODE -> queue-work explosion





[jira] [Commented] (SOLR-10983) Fix DOWNNODE -> queue-work explosion

2017-07-04 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16074225#comment-16074225
 ] 

Scott Blum commented on SOLR-10983:
---

Thanks!  Will do




[jira] [Commented] (SOLR-10983) Fix DOWNNODE -> queue-work explosion

2017-07-04 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16074176#comment-16074176
 ] 

Shalin Shekhar Mangar commented on SOLR-10983:
--

On second thought, creating a batch enqueue command is not so straightforward, 
and the callback is called once per enqueue as per the contract of 
ZkWriteCallback, so it is technically not a bug. So I am fine with your solution 
as it exists. +1 to commit. Please make sure it is backported to branch_7x and 
branch_7_0 so that it makes it into the 7.0 release.




[jira] [Commented] (SOLR-10983) Fix DOWNNODE -> queue-work explosion

2017-07-04 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16074173#comment-16074173
 ] 

Shalin Shekhar Mangar commented on SOLR-10983:
--

Nice catch!

Your patch solves another problem: today, if an exception happens, we run 
through the items in the work-queue and then the last item from the 
state-update-queue (the one during which the exception happened), so we run the 
same item twice.

Considering that DOWNNODE is the only command that enqueues multiple 
ZkWriteCommands, I think we should add a method to ZkStateWriter which calls 
enqueue only once for the entire batch. That and your patch together solve all 
of these problems nicely:
# DOWNNODE creating multiple work queue items
# Exceptions not clearing the work queue
# Overseer executing the same item twice, from the work queue and the state 
update queue, on an exception
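
One possible way to address the three items above can be sketched with a toy loop. This is a hedged illustration only: OverseerLoopSketch, processHead and pendingExecutions are hypothetical names, the two deques stand in for the state update queue and the work queue, and the approach shown (record the message once, clear the work queue on failure) is an assumption for illustration. It is not the attached patch and it does not use the real ZkStateWriter or ZkWriteCallback API.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model of a processing loop; hypothetical names, not Solr code. */
final class OverseerLoopSketch {
    final Deque<String> stateUpdateQueue = new ArrayDeque<>();
    final Deque<String> queueWork = new ArrayDeque<>();

    /** Processes the head message, which yields one write command per collection. */
    void processHead(int collections, boolean writeFails) {
        String message = stateUpdateQueue.peek();
        if (message == null) {
            return; // nothing to process
        }
        boolean recorded = false;
        try {
            for (int i = 0; i < collections; i++) {
                // The per-command callback still fires once per command, but the
                // message is recorded in queueWork only on the first call (item 1).
                if (!recorded) {
                    queueWork.add(message);
                    recorded = true;
                }
            }
            if (writeFails) {
                // Stand-in for a failure such as a ZK bad version error.
                throw new IllegalStateException("simulated write failure");
            }
            // Success: the message is finished in both queues.
            stateUpdateQueue.poll();
            queueWork.clear();
        } catch (IllegalStateException e) {
            // Failure: clear the work queue (item 2) so that only the entry still
            // sitting in stateUpdateQueue is retried, once (item 3).
            queueWork.clear();
        }
    }

    /** Executions that the next loop iteration would perform. */
    int pendingExecutions() {
        return queueWork.size() + stateUpdateQueue.size();
    }

    public static void main(String[] args) {
        OverseerLoopSketch loop = new OverseerLoopSketch();
        loop.stateUpdateQueue.add("DOWNNODE");
        loop.processHead(300, true);
        System.out.println(loop.pendingExecutions()); // prints 1
        loop.processHead(300, false);
        System.out.println(loop.pendingExecutions()); // prints 0
    }
}
```

In this model a failed DOWNNODE affecting 300 collections leaves exactly one pending execution, instead of one per collection plus the original message.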




[jira] [Commented] (SOLR-10983) Fix DOWNNODE -> queue-work explosion

2017-06-29 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16069503#comment-16069503
 ] 

Scott Blum commented on SOLR-10983:
---

[~shalinmangar] [~jhump]
