[jira] [Commented] (SOLR-10420) Solr 6.x leaking one SolrZkClient instance per second

2017-04-19 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15974402#comment-15974402
 ] 

Markus Jelsma commented on SOLR-10420:
--

Thanks! Great work!

> Solr 6.x leaking one SolrZkClient instance per second
> -
>
> Key: SOLR-10420
> URL: https://issues.apache.org/jira/browse/SOLR-10420
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.5.2, 6.4.2, 6.5
>Reporter: Markus Jelsma
>Assignee: Scott Blum
> Fix For: 5.5.5, 5.6, 6.5.1, 6.6, master (7.0)
>
> Attachments: OverseerTest.106.stdout, OverseerTest.119.stdout, 
> OverseerTest.80.stdout, OverseerTest.DEBUG.43.stdout, 
> OverseerTest.DEBUG.48.stdout, OverseerTest.DEBUG.58.stdout, 
> SOLR-10420-dragonsinth.patch, SOLR-10420.patch, SOLR-10420.patch, 
> SOLR-10420.patch, SOLR-10420.patch, SOLR-10420.patch
>
>
> One of our nodes became berzerk after a restart, Solr went completely nuts! 
> So i opened VisualVM to keep an eye on it and spotted a different problem 
> that occurs in all our Solr 6.4.2 and 6.5.0 nodes.
> It appears Solr is leaking one SolrZkClient instance per second via 
> DistributedQueue$ChildWatcher. That one per second is quite accurate for all 
> nodes, there are about the same amount of instances as there are seconds 
> since Solr started. I know VisualVM's instance count includes 
> objects-to-be-collected, the instance count does not drop after a forced 
> garbed collection round.
> It doesn't matter how many cores or collections the nodes carry or how heavy 
> traffic is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10420) Solr 6.x leaking one SolrZkClient instance per second

2017-04-18 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15973453#comment-15973453
 ] 

ASF subversion and git services commented on SOLR-10420:


Commit 89beee8d61346d50dbbf02f0cc9cfc5032e46eee in lucene-solr's branch 
refs/heads/branch_5_5 from [~dragonsinth]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=89beee8 ]

SOLR-10420: fix watcher leak in DistributedQueue


> Solr 6.x leaking one SolrZkClient instance per second
> -
>
> Key: SOLR-10420
> URL: https://issues.apache.org/jira/browse/SOLR-10420
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.5.2, 6.4.2, 6.5
>Reporter: Markus Jelsma
>Assignee: Scott Blum
> Fix For: 5.5.5, 5.6, 6.5.1, 6.6, master (7.0)
>
> Attachments: OverseerTest.106.stdout, OverseerTest.119.stdout, 
> OverseerTest.80.stdout, OverseerTest.DEBUG.43.stdout, 
> OverseerTest.DEBUG.48.stdout, OverseerTest.DEBUG.58.stdout, 
> SOLR-10420-dragonsinth.patch, SOLR-10420.patch, SOLR-10420.patch, 
> SOLR-10420.patch, SOLR-10420.patch, SOLR-10420.patch
>
>
> One of our nodes became berzerk after a restart, Solr went completely nuts! 
> So i opened VisualVM to keep an eye on it and spotted a different problem 
> that occurs in all our Solr 6.4.2 and 6.5.0 nodes.
> It appears Solr is leaking one SolrZkClient instance per second via 
> DistributedQueue$ChildWatcher. That one per second is quite accurate for all 
> nodes, there are about the same amount of instances as there are seconds 
> since Solr started. I know VisualVM's instance count includes 
> objects-to-be-collected, the instance count does not drop after a forced 
> garbed collection round.
> It doesn't matter how many cores or collections the nodes carry or how heavy 
> traffic is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10420) Solr 6.x leaking one SolrZkClient instance per second

2017-04-18 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15973382#comment-15973382
 ] 

ASF subversion and git services commented on SOLR-10420:


Commit e3beb61a72efbce37710ce3cc48b24093070d052 in lucene-solr's branch 
refs/heads/branch_6_5 from [~dragonsinth]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=e3beb61 ]

SOLR-10420: fix watcher leak in DistributedQueue


> Solr 6.x leaking one SolrZkClient instance per second
> -
>
> Key: SOLR-10420
> URL: https://issues.apache.org/jira/browse/SOLR-10420
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.5.2, 6.4.2, 6.5
>Reporter: Markus Jelsma
>Assignee: Scott Blum
> Fix For: 5.5.5, 5.6, 6.5.1, 6.6, master (7.0)
>
> Attachments: OverseerTest.106.stdout, OverseerTest.119.stdout, 
> OverseerTest.80.stdout, OverseerTest.DEBUG.43.stdout, 
> OverseerTest.DEBUG.48.stdout, OverseerTest.DEBUG.58.stdout, 
> SOLR-10420-dragonsinth.patch, SOLR-10420.patch, SOLR-10420.patch, 
> SOLR-10420.patch, SOLR-10420.patch, SOLR-10420.patch
>
>
> One of our nodes became berzerk after a restart, Solr went completely nuts! 
> So i opened VisualVM to keep an eye on it and spotted a different problem 
> that occurs in all our Solr 6.4.2 and 6.5.0 nodes.
> It appears Solr is leaking one SolrZkClient instance per second via 
> DistributedQueue$ChildWatcher. That one per second is quite accurate for all 
> nodes, there are about the same amount of instances as there are seconds 
> since Solr started. I know VisualVM's instance count includes 
> objects-to-be-collected, the instance count does not drop after a forced 
> garbed collection round.
> It doesn't matter how many cores or collections the nodes carry or how heavy 
> traffic is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10420) Solr 6.x leaking one SolrZkClient instance per second

2017-04-18 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15973354#comment-15973354
 ] 

ASF subversion and git services commented on SOLR-10420:


Commit 42d08dd28c6609a2c70a691e6a88725c9aa31377 in lucene-solr's branch 
refs/heads/branch_5x from [~dragonsinth]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=42d08dd ]

SOLR-10420: fix watcher leak in DistributedQueue


> Solr 6.x leaking one SolrZkClient instance per second
> -
>
> Key: SOLR-10420
> URL: https://issues.apache.org/jira/browse/SOLR-10420
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.5.2, 6.4.2, 6.5
>Reporter: Markus Jelsma
>Assignee: Scott Blum
> Fix For: 5.5.5, 5.6, 6.5.1, 6.6, master (7.0)
>
> Attachments: OverseerTest.106.stdout, OverseerTest.119.stdout, 
> OverseerTest.80.stdout, OverseerTest.DEBUG.43.stdout, 
> OverseerTest.DEBUG.48.stdout, OverseerTest.DEBUG.58.stdout, 
> SOLR-10420-dragonsinth.patch, SOLR-10420.patch, SOLR-10420.patch, 
> SOLR-10420.patch, SOLR-10420.patch, SOLR-10420.patch
>
>
> One of our nodes became berzerk after a restart, Solr went completely nuts! 
> So i opened VisualVM to keep an eye on it and spotted a different problem 
> that occurs in all our Solr 6.4.2 and 6.5.0 nodes.
> It appears Solr is leaking one SolrZkClient instance per second via 
> DistributedQueue$ChildWatcher. That one per second is quite accurate for all 
> nodes, there are about the same amount of instances as there are seconds 
> since Solr started. I know VisualVM's instance count includes 
> objects-to-be-collected, the instance count does not drop after a forced 
> garbed collection round.
> It doesn't matter how many cores or collections the nodes carry or how heavy 
> traffic is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10420) Solr 6.x leaking one SolrZkClient instance per second

2017-04-18 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15973351#comment-15973351
 ] 

ASF subversion and git services commented on SOLR-10420:


Commit ae55dfc10fd3843d35df9096b7626aad36735670 in lucene-solr's branch 
refs/heads/branch_6x from [~dragonsinth]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=ae55dfc ]

SOLR-10420: fix watcher leak in DistributedQueue


> Solr 6.x leaking one SolrZkClient instance per second
> -
>
> Key: SOLR-10420
> URL: https://issues.apache.org/jira/browse/SOLR-10420
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.5.2, 6.4.2, 6.5
>Reporter: Markus Jelsma
>Assignee: Scott Blum
> Fix For: 5.5.5, 5.6, 6.5.1, 6.6, master (7.0)
>
> Attachments: OverseerTest.106.stdout, OverseerTest.119.stdout, 
> OverseerTest.80.stdout, OverseerTest.DEBUG.43.stdout, 
> OverseerTest.DEBUG.48.stdout, OverseerTest.DEBUG.58.stdout, 
> SOLR-10420-dragonsinth.patch, SOLR-10420.patch, SOLR-10420.patch, 
> SOLR-10420.patch, SOLR-10420.patch, SOLR-10420.patch
>
>
> One of our nodes became berzerk after a restart, Solr went completely nuts! 
> So i opened VisualVM to keep an eye on it and spotted a different problem 
> that occurs in all our Solr 6.4.2 and 6.5.0 nodes.
> It appears Solr is leaking one SolrZkClient instance per second via 
> DistributedQueue$ChildWatcher. That one per second is quite accurate for all 
> nodes, there are about the same amount of instances as there are seconds 
> since Solr started. I know VisualVM's instance count includes 
> objects-to-be-collected, the instance count does not drop after a forced 
> garbed collection round.
> It doesn't matter how many cores or collections the nodes carry or how heavy 
> traffic is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10420) Solr 6.x leaking one SolrZkClient instance per second

2017-04-18 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15973346#comment-15973346
 ] 

Scott Blum commented on SOLR-10420:
---

Got it.  I do think it would be a mistake.  In that case, after I've committed 
to 5x and 6x, I'll also commit to 6_5 and 5_5.

> Solr 6.x leaking one SolrZkClient instance per second
> -
>
> Key: SOLR-10420
> URL: https://issues.apache.org/jira/browse/SOLR-10420
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.5.2, 6.4.2, 6.5
>Reporter: Markus Jelsma
>Assignee: Scott Blum
> Fix For: 5.5.5, 5.6, 6.5.1, 6.6, master (7.0)
>
> Attachments: OverseerTest.106.stdout, OverseerTest.119.stdout, 
> OverseerTest.80.stdout, OverseerTest.DEBUG.43.stdout, 
> OverseerTest.DEBUG.48.stdout, OverseerTest.DEBUG.58.stdout, 
> SOLR-10420-dragonsinth.patch, SOLR-10420.patch, SOLR-10420.patch, 
> SOLR-10420.patch, SOLR-10420.patch, SOLR-10420.patch
>
>
> One of our nodes became berzerk after a restart, Solr went completely nuts! 
> So i opened VisualVM to keep an eye on it and spotted a different problem 
> that occurs in all our Solr 6.4.2 and 6.5.0 nodes.
> It appears Solr is leaking one SolrZkClient instance per second via 
> DistributedQueue$ChildWatcher. That one per second is quite accurate for all 
> nodes, there are about the same amount of instances as there are seconds 
> since Solr started. I know VisualVM's instance count includes 
> objects-to-be-collected, the instance count does not drop after a forced 
> garbed collection round.
> It doesn't matter how many cores or collections the nodes carry or how heavy 
> traffic is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10420) Solr 6.x leaking one SolrZkClient instance per second

2017-04-18 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15973331#comment-15973331
 ] 

Ishan Chattopadhyaya commented on SOLR-10420:
-

I think you should do master, branch_6x and branch_6_5. branch_5x is optional.

> Solr 6.x leaking one SolrZkClient instance per second
> -
>
> Key: SOLR-10420
> URL: https://issues.apache.org/jira/browse/SOLR-10420
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.5.2, 6.4.2, 6.5
>Reporter: Markus Jelsma
>Assignee: Scott Blum
> Fix For: 5.5.5, 6.4.3, 6.5.1, 6.6, master (7.0)
>
> Attachments: OverseerTest.106.stdout, OverseerTest.119.stdout, 
> OverseerTest.80.stdout, OverseerTest.DEBUG.43.stdout, 
> OverseerTest.DEBUG.48.stdout, OverseerTest.DEBUG.58.stdout, 
> SOLR-10420-dragonsinth.patch, SOLR-10420.patch, SOLR-10420.patch, 
> SOLR-10420.patch, SOLR-10420.patch, SOLR-10420.patch
>
>
> One of our nodes became berzerk after a restart, Solr went completely nuts! 
> So i opened VisualVM to keep an eye on it and spotted a different problem 
> that occurs in all our Solr 6.4.2 and 6.5.0 nodes.
> It appears Solr is leaking one SolrZkClient instance per second via 
> DistributedQueue$ChildWatcher. That one per second is quite accurate for all 
> nodes, there are about the same amount of instances as there are seconds 
> since Solr started. I know VisualVM's instance count includes 
> objects-to-be-collected, the instance count does not drop after a forced 
> garbed collection round.
> It doesn't matter how many cores or collections the nodes carry or how heavy 
> traffic is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10420) Solr 6.x leaking one SolrZkClient instance per second

2017-04-18 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15973330#comment-15973330
 ] 

Steve Rowe commented on SOLR-10420:
---

bq.

I think at a minimum you should also commit to branch_6_5, since it's a known 
release, since you'll be better equipped to handle conflicts if there are any.

I noticed you set fixVersion to releases on branches you won't be committing 
to: 6.4.3, 5.5.5.  I don't think you should do that.  If you're going to commit 
to branch_5x, then the fixVersion would be 5.6, since that's the next release 
on that branch.  Unless you commit to branch_6_4, you shouldn't include 6.4.3 
as a fixVersion.

More generally, if you think it's essential that any X.Y.Z release includes 
this fix, i.e. that it would be a mistake to release without it, then you 
should commit to the branch from which that release will be made.  Otherwise 
you and others may/will forget to backport when such a release materializes.

> Solr 6.x leaking one SolrZkClient instance per second
> -
>
> Key: SOLR-10420
> URL: https://issues.apache.org/jira/browse/SOLR-10420
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.5.2, 6.4.2, 6.5
>Reporter: Markus Jelsma
>Assignee: Scott Blum
> Fix For: 5.5.5, 6.4.3, 6.5.1, 6.6, master (7.0)
>
> Attachments: OverseerTest.106.stdout, OverseerTest.119.stdout, 
> OverseerTest.80.stdout, OverseerTest.DEBUG.43.stdout, 
> OverseerTest.DEBUG.48.stdout, OverseerTest.DEBUG.58.stdout, 
> SOLR-10420-dragonsinth.patch, SOLR-10420.patch, SOLR-10420.patch, 
> SOLR-10420.patch, SOLR-10420.patch, SOLR-10420.patch
>
>
> One of our nodes became berzerk after a restart, Solr went completely nuts! 
> So i opened VisualVM to keep an eye on it and spotted a different problem 
> that occurs in all our Solr 6.4.2 and 6.5.0 nodes.
> It appears Solr is leaking one SolrZkClient instance per second via 
> DistributedQueue$ChildWatcher. That one per second is quite accurate for all 
> nodes, there are about the same amount of instances as there are seconds 
> since Solr started. I know VisualVM's instance count includes 
> objects-to-be-collected, the instance count does not drop after a forced 
> garbed collection round.
> It doesn't matter how many cores or collections the nodes carry or how heavy 
> traffic is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10420) Solr 6.x leaking one SolrZkClient instance per second

2017-04-18 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15973320#comment-15973320
 ] 

Scott Blum commented on SOLR-10420:
---

So my plan is to commit this to master, branch_6x, and branch_5x, and let the 
release managers pull it into the actual release branches.  SG?

> Solr 6.x leaking one SolrZkClient instance per second
> -
>
> Key: SOLR-10420
> URL: https://issues.apache.org/jira/browse/SOLR-10420
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.5.2, 6.4.2, 6.5
>Reporter: Markus Jelsma
>Assignee: Scott Blum
> Fix For: 5.5.5, 6.4.3, 6.5.1, 6.6, master (7.0)
>
> Attachments: OverseerTest.106.stdout, OverseerTest.119.stdout, 
> OverseerTest.80.stdout, OverseerTest.DEBUG.43.stdout, 
> OverseerTest.DEBUG.48.stdout, OverseerTest.DEBUG.58.stdout, 
> SOLR-10420-dragonsinth.patch, SOLR-10420.patch, SOLR-10420.patch, 
> SOLR-10420.patch, SOLR-10420.patch, SOLR-10420.patch
>
>
> One of our nodes became berzerk after a restart, Solr went completely nuts! 
> So i opened VisualVM to keep an eye on it and spotted a different problem 
> that occurs in all our Solr 6.4.2 and 6.5.0 nodes.
> It appears Solr is leaking one SolrZkClient instance per second via 
> DistributedQueue$ChildWatcher. That one per second is quite accurate for all 
> nodes, there are about the same amount of instances as there are seconds 
> since Solr started. I know VisualVM's instance count includes 
> objects-to-be-collected, the instance count does not drop after a forced 
> garbed collection round.
> It doesn't matter how many cores or collections the nodes carry or how heavy 
> traffic is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10420) Solr 6.x leaking one SolrZkClient instance per second

2017-04-18 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15973261#comment-15973261
 ] 

ASF subversion and git services commented on SOLR-10420:


Commit 43c2b2320dcf344c42086ceb782e0fc53c439952 in lucene-solr's branch 
refs/heads/master from [~dragonsinth]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=43c2b23 ]

SOLR-10420: fix watcher leak in DistributedQueue


> Solr 6.x leaking one SolrZkClient instance per second
> -
>
> Key: SOLR-10420
> URL: https://issues.apache.org/jira/browse/SOLR-10420
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.5.2, 6.4.2, 6.5
>Reporter: Markus Jelsma
>Assignee: Scott Blum
> Fix For: 5.5.5, 6.4.3, 6.5.1, master (7.0)
>
> Attachments: OverseerTest.106.stdout, OverseerTest.119.stdout, 
> OverseerTest.80.stdout, OverseerTest.DEBUG.43.stdout, 
> OverseerTest.DEBUG.48.stdout, OverseerTest.DEBUG.58.stdout, 
> SOLR-10420-dragonsinth.patch, SOLR-10420.patch, SOLR-10420.patch, 
> SOLR-10420.patch, SOLR-10420.patch, SOLR-10420.patch
>
>
> One of our nodes became berzerk after a restart, Solr went completely nuts! 
> So i opened VisualVM to keep an eye on it and spotted a different problem 
> that occurs in all our Solr 6.4.2 and 6.5.0 nodes.
> It appears Solr is leaking one SolrZkClient instance per second via 
> DistributedQueue$ChildWatcher. That one per second is quite accurate for all 
> nodes, there are about the same amount of instances as there are seconds 
> since Solr started. I know VisualVM's instance count includes 
> objects-to-be-collected, the instance count does not drop after a forced 
> garbed collection round.
> It doesn't matter how many cores or collections the nodes carry or how heavy 
> traffic is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10420) Solr 6.x leaking one SolrZkClient instance per second

2017-04-18 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15973246#comment-15973246
 ] 

Scott Blum commented on SOLR-10420:
---

Great!  Thanks [~steve_rowe]!  I'll get this committed.

> Solr 6.x leaking one SolrZkClient instance per second
> -
>
> Key: SOLR-10420
> URL: https://issues.apache.org/jira/browse/SOLR-10420
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.5.2, 6.4.2, 6.5
>Reporter: Markus Jelsma
>Assignee: Scott Blum
> Attachments: OverseerTest.106.stdout, OverseerTest.119.stdout, 
> OverseerTest.80.stdout, OverseerTest.DEBUG.43.stdout, 
> OverseerTest.DEBUG.48.stdout, OverseerTest.DEBUG.58.stdout, 
> SOLR-10420-dragonsinth.patch, SOLR-10420.patch, SOLR-10420.patch, 
> SOLR-10420.patch, SOLR-10420.patch, SOLR-10420.patch
>
>
> One of our nodes became berzerk after a restart, Solr went completely nuts! 
> So i opened VisualVM to keep an eye on it and spotted a different problem 
> that occurs in all our Solr 6.4.2 and 6.5.0 nodes.
> It appears Solr is leaking one SolrZkClient instance per second via 
> DistributedQueue$ChildWatcher. That one per second is quite accurate for all 
> nodes, there are about the same amount of instances as there are seconds 
> since Solr started. I know VisualVM's instance count includes 
> objects-to-be-collected, the instance count does not drop after a forced 
> garbed collection round.
> It doesn't matter how many cores or collections the nodes carry or how heavy 
> traffic is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10420) Solr 6.x leaking one SolrZkClient instance per second

2017-04-18 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15973207#comment-15973207
 ] 

Steve Rowe commented on SOLR-10420:
---

bq. I'm running the full Solr core test suite a couple times now, I'll report 
back when it finishes (should be less than half an hour).

I ran the solr-core and solrj suites three times each with [~dragonsinth]'s 
latest patch on master, and there were zero failures.

> Solr 6.x leaking one SolrZkClient instance per second
> -
>
> Key: SOLR-10420
> URL: https://issues.apache.org/jira/browse/SOLR-10420
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.5.2, 6.4.2, 6.5
>Reporter: Markus Jelsma
>Assignee: Scott Blum
> Attachments: OverseerTest.106.stdout, OverseerTest.119.stdout, 
> OverseerTest.80.stdout, OverseerTest.DEBUG.43.stdout, 
> OverseerTest.DEBUG.48.stdout, OverseerTest.DEBUG.58.stdout, 
> SOLR-10420-dragonsinth.patch, SOLR-10420.patch, SOLR-10420.patch, 
> SOLR-10420.patch, SOLR-10420.patch, SOLR-10420.patch
>
>
> One of our nodes became berzerk after a restart, Solr went completely nuts! 
> So i opened VisualVM to keep an eye on it and spotted a different problem 
> that occurs in all our Solr 6.4.2 and 6.5.0 nodes.
> It appears Solr is leaking one SolrZkClient instance per second via 
> DistributedQueue$ChildWatcher. That one per second is quite accurate for all 
> nodes, there are about the same amount of instances as there are seconds 
> since Solr started. I know VisualVM's instance count includes 
> objects-to-be-collected, the instance count does not drop after a forced 
> garbed collection round.
> It doesn't matter how many cores or collections the nodes carry or how heavy 
> traffic is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10420) Solr 6.x leaking one SolrZkClient instance per second

2017-04-18 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15973118#comment-15973118
 ] 

Steve Rowe commented on SOLR-10420:
---

I did 500 beasting iterations of OverseerTest using your latest patch 
[~dragonsinth], zero failures.

I'm running the full Solr core test suite a couple times now, I'll report back 
when it finishes (should be less than half an hour).

> Solr 6.x leaking one SolrZkClient instance per second
> -
>
> Key: SOLR-10420
> URL: https://issues.apache.org/jira/browse/SOLR-10420
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.5.2, 6.4.2, 6.5
>Reporter: Markus Jelsma
>Assignee: Scott Blum
> Attachments: OverseerTest.106.stdout, OverseerTest.119.stdout, 
> OverseerTest.80.stdout, OverseerTest.DEBUG.43.stdout, 
> OverseerTest.DEBUG.48.stdout, OverseerTest.DEBUG.58.stdout, 
> SOLR-10420-dragonsinth.patch, SOLR-10420.patch, SOLR-10420.patch, 
> SOLR-10420.patch, SOLR-10420.patch, SOLR-10420.patch
>
>
> One of our nodes became berzerk after a restart, Solr went completely nuts! 
> So i opened VisualVM to keep an eye on it and spotted a different problem 
> that occurs in all our Solr 6.4.2 and 6.5.0 nodes.
> It appears Solr is leaking one SolrZkClient instance per second via 
> DistributedQueue$ChildWatcher. That one per second is quite accurate for all 
> nodes, there are about the same amount of instances as there are seconds 
> since Solr started. I know VisualVM's instance count includes 
> objects-to-be-collected, the instance count does not drop after a forced 
> garbed collection round.
> It doesn't matter how many cores or collections the nodes carry or how heavy 
> traffic is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10420) Solr 6.x leaking one SolrZkClient instance per second

2017-04-18 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15973093#comment-15973093
 ] 

Scott Blum commented on SOLR-10420:
---

Agreed.  It passes for me.  Anyone on this issue want to do any extensive 
testing before I commit?  Otherwise I'll commit this today to master and then 
start backporting it to a number of branches.

> Solr 6.x leaking one SolrZkClient instance per second
> -
>
> Key: SOLR-10420
> URL: https://issues.apache.org/jira/browse/SOLR-10420
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.5.2, 6.4.2, 6.5
>Reporter: Markus Jelsma
>Assignee: Scott Blum
> Attachments: OverseerTest.106.stdout, OverseerTest.119.stdout, 
> OverseerTest.80.stdout, OverseerTest.DEBUG.43.stdout, 
> OverseerTest.DEBUG.48.stdout, OverseerTest.DEBUG.58.stdout, 
> SOLR-10420-dragonsinth.patch, SOLR-10420.patch, SOLR-10420.patch, 
> SOLR-10420.patch, SOLR-10420.patch, SOLR-10420.patch
>
>
> One of our nodes became berzerk after a restart, Solr went completely nuts! 
> So i opened VisualVM to keep an eye on it and spotted a different problem 
> that occurs in all our Solr 6.4.2 and 6.5.0 nodes.
> It appears Solr is leaking one SolrZkClient instance per second via 
> DistributedQueue$ChildWatcher. That one per second is quite accurate for all 
> nodes, there are about the same amount of instances as there are seconds 
> since Solr started. I know VisualVM's instance count includes 
> objects-to-be-collected, the instance count does not drop after a forced 
> garbed collection round.
> It doesn't matter how many cores or collections the nodes carry or how heavy 
> traffic is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10420) Solr 6.x leaking one SolrZkClient instance per second

2017-04-18 Thread Cao Manh Dat (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15973070#comment-15973070
 ] 

Cao Manh Dat commented on SOLR-10420:
-

[~dragonsinth] This issue is blocker for Solr 6.5.1. So I think we (you) should 
commit soon.

> Solr 6.x leaking one SolrZkClient instance per second
> -
>
> Key: SOLR-10420
> URL: https://issues.apache.org/jira/browse/SOLR-10420
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.5.2, 6.4.2, 6.5
>Reporter: Markus Jelsma
>Assignee: Scott Blum
> Attachments: OverseerTest.106.stdout, OverseerTest.119.stdout, 
> OverseerTest.80.stdout, OverseerTest.DEBUG.43.stdout, 
> OverseerTest.DEBUG.48.stdout, OverseerTest.DEBUG.58.stdout, 
> SOLR-10420-dragonsinth.patch, SOLR-10420.patch, SOLR-10420.patch, 
> SOLR-10420.patch, SOLR-10420.patch, SOLR-10420.patch
>
>
> One of our nodes became berzerk after a restart, Solr went completely nuts! 
> So i opened VisualVM to keep an eye on it and spotted a different problem 
> that occurs in all our Solr 6.4.2 and 6.5.0 nodes.
> It appears Solr is leaking one SolrZkClient instance per second via 
> DistributedQueue$ChildWatcher. That one per second is quite accurate for all 
> nodes, there are about the same amount of instances as there are seconds 
> since Solr started. I know VisualVM's instance count includes 
> objects-to-be-collected, the instance count does not drop after a forced 
> garbed collection round.
> It doesn't matter how many cores or collections the nodes carry or how heavy 
> traffic is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10420) Solr 6.x leaking one SolrZkClient instance per second

2017-04-17 Thread Cao Manh Dat (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15971964#comment-15971964
 ] 

Cao Manh Dat commented on SOLR-10420:
-

[~dragonsinth] The patch LGTM!

> Solr 6.x leaking one SolrZkClient instance per second
> -
>
> Key: SOLR-10420
> URL: https://issues.apache.org/jira/browse/SOLR-10420
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.5.2, 6.4.2, 6.5
>Reporter: Markus Jelsma
>Assignee: Scott Blum
> Attachments: OverseerTest.106.stdout, OverseerTest.119.stdout, 
> OverseerTest.80.stdout, OverseerTest.DEBUG.43.stdout, 
> OverseerTest.DEBUG.48.stdout, OverseerTest.DEBUG.58.stdout, 
> SOLR-10420-dragonsinth.patch, SOLR-10420.patch, SOLR-10420.patch, 
> SOLR-10420.patch, SOLR-10420.patch, SOLR-10420.patch
>
>
> One of our nodes became berzerk after a restart, Solr went completely nuts! 
> So i opened VisualVM to keep an eye on it and spotted a different problem 
> that occurs in all our Solr 6.4.2 and 6.5.0 nodes.
> It appears Solr is leaking one SolrZkClient instance per second via 
> DistributedQueue$ChildWatcher. That one per second is quite accurate for all 
> nodes, there are about the same amount of instances as there are seconds 
> since Solr started. I know VisualVM's instance count includes 
> objects-to-be-collected, the instance count does not drop after a forced 
> garbed collection round.
> It doesn't matter how many cores or collections the nodes carry or how heavy 
> traffic is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10420) Solr 6.x leaking one SolrZkClient instance per second

2017-04-17 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15971259#comment-15971259
 ] 

Scott Blum commented on SOLR-10420:
---

[~caomanhdat] I didn't literally mean that we should bring back the isDirty 
bit.  I meant that clearly the last time around, there was a hole in the design 
that led to this leak.  I want to take the opportunity to re-look at the design 
again as a whole and make sure everything seems good, and we're not just 
putting a band-aid on it.  You may have already done this, so just give me a 
little bit to catch up. :D


> Solr 6.x leaking one SolrZkClient instance per second
> -
>
> Key: SOLR-10420
> URL: https://issues.apache.org/jira/browse/SOLR-10420
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.5.2, 6.4.2, 6.5
>Reporter: Markus Jelsma
> Attachments: OverseerTest.106.stdout, OverseerTest.119.stdout, 
> OverseerTest.80.stdout, OverseerTest.DEBUG.43.stdout, 
> OverseerTest.DEBUG.48.stdout, OverseerTest.DEBUG.58.stdout, SOLR-10420.patch, 
> SOLR-10420.patch, SOLR-10420.patch, SOLR-10420.patch, SOLR-10420.patch
>
>
> One of our nodes became berzerk after a restart, Solr went completely nuts! 
> So i opened VisualVM to keep an eye on it and spotted a different problem 
> that occurs in all our Solr 6.4.2 and 6.5.0 nodes.
> It appears Solr is leaking one SolrZkClient instance per second via 
> DistributedQueue$ChildWatcher. That one per second is quite accurate for all 
> nodes, there are about the same amount of instances as there are seconds 
> since Solr started. I know VisualVM's instance count includes 
> objects-to-be-collected, the instance count does not drop after a forced 
> garbed collection round.
> It doesn't matter how many cores or collections the nodes carry or how heavy 
> traffic is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10420) Solr 6.x leaking one SolrZkClient instance per second

2017-04-15 Thread Cao Manh Dat (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970215#comment-15970215
 ] 

Cao Manh Dat commented on SOLR-10420:
-

[~dragonsinth] As Steve said, Overseer.external.. test failed frequently before 
SOLR-9191 got committed. So I doubt that "isDirty" in retro-code will fix the 
problem.

> Solr 6.x leaking one SolrZkClient instance per second
> -
>
> Key: SOLR-10420
> URL: https://issues.apache.org/jira/browse/SOLR-10420
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.5.2, 6.4.2, 6.5
>Reporter: Markus Jelsma
> Attachments: OverseerTest.106.stdout, OverseerTest.119.stdout, 
> OverseerTest.80.stdout, OverseerTest.DEBUG.43.stdout, 
> OverseerTest.DEBUG.48.stdout, OverseerTest.DEBUG.58.stdout, SOLR-10420.patch, 
> SOLR-10420.patch, SOLR-10420.patch, SOLR-10420.patch
>
>
> One of our nodes became berzerk after a restart, Solr went completely nuts! 
> So i opened VisualVM to keep an eye on it and spotted a different problem 
> that occurs in all our Solr 6.4.2 and 6.5.0 nodes.
> It appears Solr is leaking one SolrZkClient instance per second via 
> DistributedQueue$ChildWatcher. That one per second is quite accurate for all 
> nodes, there are about the same amount of instances as there are seconds 
> since Solr started. I know VisualVM's instance count includes 
> objects-to-be-collected, the instance count does not drop after a forced 
> garbed collection round.
> It doesn't matter how many cores or collections the nodes carry or how heavy 
> traffic is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10420) Solr 6.x leaking one SolrZkClient instance per second

2017-04-15 Thread Cao Manh Dat (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970205#comment-15970205
 ] 

Cao Manh Dat commented on SOLR-10420:
-

[~dragonsinth] that's correct.

The last patch pass all the tests. 

> Solr 6.x leaking one SolrZkClient instance per second
> -
>
> Key: SOLR-10420
> URL: https://issues.apache.org/jira/browse/SOLR-10420
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.5.2, 6.4.2, 6.5
>Reporter: Markus Jelsma
> Attachments: OverseerTest.106.stdout, OverseerTest.119.stdout, 
> OverseerTest.80.stdout, OverseerTest.DEBUG.43.stdout, 
> OverseerTest.DEBUG.48.stdout, OverseerTest.DEBUG.58.stdout, SOLR-10420.patch, 
> SOLR-10420.patch, SOLR-10420.patch, SOLR-10420.patch
>
>
> One of our nodes became berzerk after a restart, Solr went completely nuts! 
> So i opened VisualVM to keep an eye on it and spotted a different problem 
> that occurs in all our Solr 6.4.2 and 6.5.0 nodes.
> It appears Solr is leaking one SolrZkClient instance per second via 
> DistributedQueue$ChildWatcher. That one per second is quite accurate for all 
> nodes, there are about the same amount of instances as there are seconds 
> since Solr started. I know VisualVM's instance count includes 
> objects-to-be-collected, the instance count does not drop after a forced 
> garbed collection round.
> It doesn't matter how many cores or collections the nodes carry or how heavy 
> traffic is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10420) Solr 6.x leaking one SolrZkClient instance per second

2017-04-15 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970030#comment-15970030
 ] 

Scott Blum commented on SOLR-10420:
---

Let me try to unpack what you said..

1) We want a synchronous offer() -> peek() on the same thread to return the 
item offered without delay.
2) This works on master, but the original patch to fix the leak breaks #1.

Is that correct?

Let me look at this on Monday with [~jhump].  I'm pretty sure there's a 
simplification to be made in DQ with how we're handling the watcher and dirty 
tracking.  There used to be an explicit "isDirty" bit that we traded out for 
watcher nullability, which in retrospect I'm not sure was the best choice.

> Solr 6.x leaking one SolrZkClient instance per second
> -
>
> Key: SOLR-10420
> URL: https://issues.apache.org/jira/browse/SOLR-10420
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.5.2, 6.4.2, 6.5
>Reporter: Markus Jelsma
> Attachments: OverseerTest.106.stdout, OverseerTest.119.stdout, 
> OverseerTest.80.stdout, OverseerTest.DEBUG.43.stdout, 
> OverseerTest.DEBUG.48.stdout, OverseerTest.DEBUG.58.stdout, SOLR-10420.patch, 
> SOLR-10420.patch, SOLR-10420.patch, SOLR-10420.patch
>
>
> One of our nodes became berzerk after a restart, Solr went completely nuts! 
> So i opened VisualVM to keep an eye on it and spotted a different problem 
> that occurs in all our Solr 6.4.2 and 6.5.0 nodes.
> It appears Solr is leaking one SolrZkClient instance per second via 
> DistributedQueue$ChildWatcher. That one per second is quite accurate for all 
> nodes, there are about the same amount of instances as there are seconds 
> since Solr started. I know VisualVM's instance count includes 
> objects-to-be-collected, the instance count does not drop after a forced 
> garbed collection round.
> It doesn't matter how many cores or collections the nodes carry or how heavy 
> traffic is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10420) Solr 6.x leaking one SolrZkClient instance per second

2017-04-15 Thread Cao Manh Dat (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15969966#comment-15969966
 ] 

Cao Manh Dat commented on SOLR-10420:
-

[~dragonsinth] Can you review the patch?

> Solr 6.x leaking one SolrZkClient instance per second
> -
>
> Key: SOLR-10420
> URL: https://issues.apache.org/jira/browse/SOLR-10420
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.5.2, 6.4.2, 6.5
>Reporter: Markus Jelsma
> Attachments: OverseerTest.106.stdout, OverseerTest.119.stdout, 
> OverseerTest.80.stdout, OverseerTest.DEBUG.43.stdout, 
> OverseerTest.DEBUG.48.stdout, OverseerTest.DEBUG.58.stdout, SOLR-10420.patch, 
> SOLR-10420.patch, SOLR-10420.patch, SOLR-10420.patch
>
>
> One of our nodes became berzerk after a restart, Solr went completely nuts! 
> So i opened VisualVM to keep an eye on it and spotted a different problem 
> that occurs in all our Solr 6.4.2 and 6.5.0 nodes.
> It appears Solr is leaking one SolrZkClient instance per second via 
> DistributedQueue$ChildWatcher. That one per second is quite accurate for all 
> nodes, there are about the same amount of instances as there are seconds 
> since Solr started. I know VisualVM's instance count includes 
> objects-to-be-collected, the instance count does not drop after a forced 
> garbed collection round.
> It doesn't matter how many cores or collections the nodes carry or how heavy 
> traffic is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10420) Solr 6.x leaking one SolrZkClient instance per second

2017-04-15 Thread Cao Manh Dat (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15969843#comment-15969843
 ] 

Cao Manh Dat commented on SOLR-10420:
-

I'm thinking about two different way to solve this problem :
1. DQ will set lastWatcher = null when we run {{DQ.offer(byte[] data)}} 
sucessfully. Because the overseer.workqueue is locally offered by overseer so 
that will fix the problem. But we changed {{DQ.peek()}} from true positive ( if 
ZK contain new item, we will return that ) to false positive ( if ZK contain 
new item, we may not return that ) so this may inflict other parts as well.
2. each time DQ.peek() is called we will look at the ZK nodes without using the 
watcher.

> Solr 6.x leaking one SolrZkClient instance per second
> -
>
> Key: SOLR-10420
> URL: https://issues.apache.org/jira/browse/SOLR-10420
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.5.2, 6.4.2, 6.5
>Reporter: Markus Jelsma
> Attachments: OverseerTest.106.stdout, OverseerTest.119.stdout, 
> OverseerTest.80.stdout, OverseerTest.DEBUG.43.stdout, 
> OverseerTest.DEBUG.48.stdout, OverseerTest.DEBUG.58.stdout, SOLR-10420.patch, 
> SOLR-10420.patch, SOLR-10420.patch
>
>
> One of our nodes became berzerk after a restart, Solr went completely nuts! 
> So i opened VisualVM to keep an eye on it and spotted a different problem 
> that occurs in all our Solr 6.4.2 and 6.5.0 nodes.
> It appears Solr is leaking one SolrZkClient instance per second via 
> DistributedQueue$ChildWatcher. That one per second is quite accurate for all 
> nodes, there are about the same amount of instances as there are seconds 
> since Solr started. I know VisualVM's instance count includes 
> objects-to-be-collected, the instance count does not drop after a forced 
> garbed collection round.
> It doesn't matter how many cores or collections the nodes carry or how heavy 
> traffic is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10420) Solr 6.x leaking one SolrZkClient instance per second

2017-04-15 Thread Cao Manh Dat (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15969839#comment-15969839
 ] 

Cao Manh Dat commented on SOLR-10420:
-

I kinda find out the reason why the test failure. There are some notice here
- In current DQ version, for each time we peek() if in-memory queue is empty, 
we will actually look at the ZK to get new elements ( watcher are useless in 
this scenario )
- With the patch, for each time we peek()  if in-memory queue is empty, we will 
only look at the ZK nodes when watcher tell us that there are change in our 
queue.

So this is the reason why the test failure
- overseer.queue <- set a replica down
- overseer run the command successfully
- overseer.queue <- set a replica active
- overseer delay this command ( overseer.workqueue <- set a replica active )
- touch /clusterstate.json to change its version
- overseer.queue <- some ZKWriteCommand, let's call this one ZK1
- overseer change the clusterstate to set replica active
- overseer meet badversion exception
- overseer fetch last element from overseer.workqueue. Here are where problem 
happen, overseer.workqueue.peek() return empty because the watcher is not fired.
- overseer process ZK1, it success -> overseer.workqueue is emptied.

> Solr 6.x leaking one SolrZkClient instance per second
> -
>
> Key: SOLR-10420
> URL: https://issues.apache.org/jira/browse/SOLR-10420
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.5.2, 6.4.2, 6.5
>Reporter: Markus Jelsma
> Attachments: OverseerTest.106.stdout, OverseerTest.119.stdout, 
> OverseerTest.80.stdout, OverseerTest.DEBUG.43.stdout, 
> OverseerTest.DEBUG.48.stdout, OverseerTest.DEBUG.58.stdout, SOLR-10420.patch, 
> SOLR-10420.patch, SOLR-10420.patch
>
>
> One of our nodes became berzerk after a restart, Solr went completely nuts! 
> So i opened VisualVM to keep an eye on it and spotted a different problem 
> that occurs in all our Solr 6.4.2 and 6.5.0 nodes.
> It appears Solr is leaking one SolrZkClient instance per second via 
> DistributedQueue$ChildWatcher. That one per second is quite accurate for all 
> nodes, there are about the same amount of instances as there are seconds 
> since Solr started. I know VisualVM's instance count includes 
> objects-to-be-collected, the instance count does not drop after a forced 
> garbed collection round.
> It doesn't matter how many cores or collections the nodes carry or how heavy 
> traffic is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10420) Solr 6.x leaking one SolrZkClient instance per second

2017-04-14 Thread Cao Manh Dat (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968869#comment-15968869
 ] 

Cao Manh Dat commented on SOLR-10420:
-

It's actually fix the problem even without reusing the same object. But It 
makes the OverseerTest fail.

> Solr 6.x leaking one SolrZkClient instance per second
> -
>
> Key: SOLR-10420
> URL: https://issues.apache.org/jira/browse/SOLR-10420
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.5.2, 6.5, 6.4.2
>Reporter: Markus Jelsma
> Attachments: OverseerTest.106.stdout, OverseerTest.119.stdout, 
> OverseerTest.80.stdout, OverseerTest.DEBUG.43.stdout, 
> OverseerTest.DEBUG.48.stdout, OverseerTest.DEBUG.58.stdout, SOLR-10420.patch, 
> SOLR-10420.patch, SOLR-10420.patch
>
>
> One of our nodes became berzerk after a restart, Solr went completely nuts! 
> So i opened VisualVM to keep an eye on it and spotted a different problem 
> that occurs in all our Solr 6.4.2 and 6.5.0 nodes.
> It appears Solr is leaking one SolrZkClient instance per second via 
> DistributedQueue$ChildWatcher. That one per second is quite accurate for all 
> nodes, there are about the same amount of instances as there are seconds 
> since Solr started. I know VisualVM's instance count includes 
> objects-to-be-collected, the instance count does not drop after a forced 
> garbed collection round.
> It doesn't matter how many cores or collections the nodes carry or how heavy 
> traffic is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10420) Solr 6.x leaking one SolrZkClient instance per second

2017-04-13 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15968618#comment-15968618
 ] 

Scott Blum commented on SOLR-10420:
---

Fix LGTM.

Is this actual fix this?

```
  // we're not in a dirty state, and we do not have in-memory children
  if (lastWatcher != null) return null;
```

IE, if you just do that, would that fix the leak even without reusing the same 
object?


> Solr 6.x leaking one SolrZkClient instance per second
> -
>
> Key: SOLR-10420
> URL: https://issues.apache.org/jira/browse/SOLR-10420
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.5.2, 6.5, 6.4.2
>Reporter: Markus Jelsma
> Attachments: OverseerTest.106.stdout, OverseerTest.119.stdout, 
> OverseerTest.80.stdout, OverseerTest.DEBUG.43.stdout, 
> OverseerTest.DEBUG.48.stdout, OverseerTest.DEBUG.58.stdout, SOLR-10420.patch, 
> SOLR-10420.patch, SOLR-10420.patch
>
>
> One of our nodes became berzerk after a restart, Solr went completely nuts! 
> So i opened VisualVM to keep an eye on it and spotted a different problem 
> that occurs in all our Solr 6.4.2 and 6.5.0 nodes.
> It appears Solr is leaking one SolrZkClient instance per second via 
> DistributedQueue$ChildWatcher. That one per second is quite accurate for all 
> nodes, there are about the same amount of instances as there are seconds 
> since Solr started. I know VisualVM's instance count includes 
> objects-to-be-collected, the instance count does not drop after a forced 
> garbed collection round.
> It doesn't matter how many cores or collections the nodes carry or how heavy 
> traffic is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10420) Solr 6.x leaking one SolrZkClient instance per second

2017-04-13 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967470#comment-15967470
 ] 

Steve Rowe commented on SOLR-10420:
---

bq. Next I'll try the patch with the OverseerTest changes.

I got 4 failures from 100 beasting iterations using this version of the patch.  
The failures looked the same as the previously posted logs, so I won't add them.

Next I'll try beasting the latest patch again, this time with DEBUG level on 
{{org.apache.solr.common.cloud}}.

> Solr 6.x leaking one SolrZkClient instance per second
> -
>
> Key: SOLR-10420
> URL: https://issues.apache.org/jira/browse/SOLR-10420
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.5.2, 6.5, 6.4.2
>Reporter: Markus Jelsma
> Attachments: OverseerTest.106.stdout, OverseerTest.119.stdout, 
> OverseerTest.80.stdout, SOLR-10420.patch, SOLR-10420.patch, SOLR-10420.patch
>
>
> One of our nodes became berzerk after a restart, Solr went completely nuts! 
> So i opened VisualVM to keep an eye on it and spotted a different problem 
> that occurs in all our Solr 6.4.2 and 6.5.0 nodes.
> It appears Solr is leaking one SolrZkClient instance per second via 
> DistributedQueue$ChildWatcher. That one per second is quite accurate for all 
> nodes, there are about the same amount of instances as there are seconds 
> since Solr started. I know VisualVM's instance count includes 
> objects-to-be-collected, the instance count does not drop after a forced 
> garbed collection round.
> It doesn't matter how many cores or collections the nodes carry or how heavy 
> traffic is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10420) Solr 6.x leaking one SolrZkClient instance per second

2017-04-12 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967158#comment-15967158
 ] 

Ishan Chattopadhyaya commented on SOLR-10420:
-

Ran the test with original patch as well as 15s timeout patch 500 times each. I 
saw no failures. I can run this on better hardware early next week (my AMD 
Ryzen is arriving soon!).

> Solr 6.x leaking one SolrZkClient instance per second
> -
>
> Key: SOLR-10420
> URL: https://issues.apache.org/jira/browse/SOLR-10420
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.5.2, 6.5, 6.4.2
>Reporter: Markus Jelsma
> Attachments: OverseerTest.80.stdout, SOLR-10420.patch, 
> SOLR-10420.patch, SOLR-10420.patch
>
>
> One of our nodes became berzerk after a restart, Solr went completely nuts! 
> So i opened VisualVM to keep an eye on it and spotted a different problem 
> that occurs in all our Solr 6.4.2 and 6.5.0 nodes.
> It appears Solr is leaking one SolrZkClient instance per second via 
> DistributedQueue$ChildWatcher. That one per second is quite accurate for all 
> nodes, there are about the same amount of instances as there are seconds 
> since Solr started. I know VisualVM's instance count includes 
> objects-to-be-collected, the instance count does not drop after a forced 
> garbed collection round.
> It doesn't matter how many cores or collections the nodes carry or how heavy 
> traffic is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10420) Solr 6.x leaking one SolrZkClient instance per second

2017-04-12 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15965790#comment-15965790
 ] 

Markus Jelsma commented on SOLR-10420:
--

This patch appears to solve the problem. 

> Solr 6.x leaking one SolrZkClient instance per second
> -
>
> Key: SOLR-10420
> URL: https://issues.apache.org/jira/browse/SOLR-10420
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.5, 6.4.2
>Reporter: Markus Jelsma
> Fix For: master (7.0), branch_6x
>
> Attachments: SOLR-10420.patch
>
>
> One of our nodes became berzerk after a restart, Solr went completely nuts! 
> So i opened VisualVM to keep an eye on it and spotted a different problem 
> that occurs in all our Solr 6.4.2 and 6.5.0 nodes.
> It appears Solr is leaking one SolrZkClient instance per second via 
> DistributedQueue$ChildWatcher. That one per second is quite accurate for all 
> nodes, there are about the same amount of instances as there are seconds 
> since Solr started. I know VisualVM's instance count includes 
> objects-to-be-collected, the instance count does not drop after a forced 
> garbed collection round.
> It doesn't matter how many cores or collections the nodes carry or how heavy 
> traffic is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10420) Solr 6.x leaking one SolrZkClient instance per second

2017-04-04 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15955575#comment-15955575
 ] 

Markus Jelsma commented on SOLR-10420:
--

So it seems. Forced GC does not remove the object instances in >= 6.1.0. In 
6.0.x regular GC and forced GC does remove the instances from the object count. 
I think almost everyone should be able to see it for themselves, almost all our 
Solr instances show this problem immediately after restart, some don't in some 
occasions.

Although they don't consume a lot of bytes, the problem appears to cause more 
CPU time being used up.

Filtering the memory sampler for org.apache.solr.common reveals it right away.

> Solr 6.x leaking one SolrZkClient instance per second
> -
>
> Key: SOLR-10420
> URL: https://issues.apache.org/jira/browse/SOLR-10420
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.5, 6.4.2
>Reporter: Markus Jelsma
> Fix For: master (7.0), branch_6x
>
>
> One of our nodes became berzerk after a restart, Solr went completely nuts! 
> So i opened VisualVM to keep an eye on it and spotted a different problem 
> that occurs in all our Solr 6.4.2 and 6.5.0 nodes.
> It appears Solr is leaking one SolrZkClient instance per second via 
> DistributedQueue$ChildWatcher. That one per second is quite accurate for all 
> nodes, there are about the same amount of instances as there are seconds 
> since Solr started. I know VisualVM's instance count includes 
> objects-to-be-collected, the instance count does not drop after a forced 
> garbed collection round.
> It doesn't matter how many cores or collections the nodes carry or how heavy 
> traffic is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10420) Solr 6.x leaking one SolrZkClient instance per second

2017-04-04 Thread Walter Underwood (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15955538#comment-15955538
 ] 

Walter Underwood commented on SOLR-10420:
-

To be clear, these are uncollectable objects?

> Solr 6.x leaking one SolrZkClient instance per second
> -
>
> Key: SOLR-10420
> URL: https://issues.apache.org/jira/browse/SOLR-10420
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.5, 6.4.2
>Reporter: Markus Jelsma
> Fix For: master (7.0), branch_6x
>
>
> One of our nodes became berzerk after a restart, Solr went completely nuts! 
> So i opened VisualVM to keep an eye on it and spotted a different problem 
> that occurs in all our Solr 6.4.2 and 6.5.0 nodes.
> It appears Solr is leaking one SolrZkClient instance per second via 
> DistributedQueue$ChildWatcher. That one per second is quite accurate for all 
> nodes, there are about the same amount of instances as there are seconds 
> since Solr started. I know VisualVM's instance count includes 
> objects-to-be-collected, the instance count does not drop after a forced 
> garbed collection round.
> It doesn't matter how many cores or collections the nodes carry or how heavy 
> traffic is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10420) Solr 6.x leaking one SolrZkClient instance per second

2017-04-04 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15955293#comment-15955293
 ] 

Markus Jelsma commented on SOLR-10420:
--

To note another oddity, some nodes of our regular search cluster (6.5.0) do not 
show increased counts. Some nodes with other roles (but running Solr) show the 
problem immediately after each restart every time i restarted them today. So it 
could be 6.0.1 and 6.0.0 also show the problem, although they didn't when i 
just tested them.

> Solr 6.x leaking one SolrZkClient instance per second
> -
>
> Key: SOLR-10420
> URL: https://issues.apache.org/jira/browse/SOLR-10420
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.5, 6.4.2
>Reporter: Markus Jelsma
> Fix For: master (7.0), branch_6x
>
>
> One of our nodes became berzerk after a restart, Solr went completely nuts! 
> So i opened VisualVM to keep an eye on it and spotted a different problem 
> that occurs in all our Solr 6.4.2 and 6.5.0 nodes.
> It appears Solr is leaking one SolrZkClient instance per second via 
> DistributedQueue$ChildWatcher. That one per second is quite accurate for all 
> nodes, there are about the same amount of instances as there are seconds 
> since Solr started. I know VisualVM's instance count includes 
> objects-to-be-collected, the instance count does not drop after a forced 
> garbed collection round.
> It doesn't matter how many cores or collections the nodes carry or how heavy 
> traffic is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10420) Solr 6.x leaking one SolrZkClient instance per second

2017-04-04 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15955287#comment-15955287
 ] 

Markus Jelsma commented on SOLR-10420:
--

Well, actually DistributedQueue$ChildWatcher is being leaked well, so leaking 
of SolrZkClient could be a consequence of that.



> Solr 6.x leaking one SolrZkClient instance per second
> -
>
> Key: SOLR-10420
> URL: https://issues.apache.org/jira/browse/SOLR-10420
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.5, 6.4.2
>Reporter: Markus Jelsma
> Fix For: master (7.0), branch_6x
>
>
> One of our nodes became berzerk after a restart, Solr went completely nuts! 
> So i opened VisualVM to keep an eye on it and spotted a different problem 
> that occurs in all our Solr 6.4.2 and 6.5.0 nodes.
> It appears Solr is leaking one SolrZkClient instance per second via 
> DistributedQueue$ChildWatcher. That one per second is quite accurate for all 
> nodes, there are about the same amount of instances as there are seconds 
> since Solr started. I know VisualVM's instance count includes 
> objects-to-be-collected, the instance count does not drop after a forced 
> garbed collection round.
> It doesn't matter how many cores or collections the nodes carry or how heavy 
> traffic is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10420) Solr 6.x leaking one SolrZkClient instance per second

2017-04-04 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15955283#comment-15955283
 ] 

Scott Blum commented on SOLR-10420:
---

Hard to see how the problem could be localized to 
DistributedQueue$ChildWatcher.. it doesn't create any ZkClients, it's passed in 
from the outside.

> Solr 6.x leaking one SolrZkClient instance per second
> -
>
> Key: SOLR-10420
> URL: https://issues.apache.org/jira/browse/SOLR-10420
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.5, 6.4.2
>Reporter: Markus Jelsma
> Fix For: master (7.0), branch_6x
>
>
> One of our nodes became berzerk after a restart, Solr went completely nuts! 
> So i opened VisualVM to keep an eye on it and spotted a different problem 
> that occurs in all our Solr 6.4.2 and 6.5.0 nodes.
> It appears Solr is leaking one SolrZkClient instance per second via 
> DistributedQueue$ChildWatcher. That one per second is quite accurate for all 
> nodes, there are about the same amount of instances as there are seconds 
> since Solr started. I know VisualVM's instance count includes 
> objects-to-be-collected, the instance count does not drop after a forced 
> garbed collection round.
> It doesn't matter how many cores or collections the nodes carry or how heavy 
> traffic is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10420) Solr 6.x leaking one SolrZkClient instance per second

2017-04-04 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15955282#comment-15955282
 ] 

Markus Jelsma commented on SOLR-10420:
--

Ah, i found it, the problem appeared in 6.1.0. Versions 6.0.0 and 6.0.1 do not 
show this problem, the instances are eaten by GC.

> Solr 6.x leaking one SolrZkClient instance per second
> -
>
> Key: SOLR-10420
> URL: https://issues.apache.org/jira/browse/SOLR-10420
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.5, 6.4.2
>Reporter: Markus Jelsma
> Fix For: master (7.0), branch_6x
>
>
> One of our nodes became berzerk after a restart, Solr went completely nuts! 
> So i opened VisualVM to keep an eye on it and spotted a different problem 
> that occurs in all our Solr 6.4.2 and 6.5.0 nodes.
> It appears Solr is leaking one SolrZkClient instance per second via 
> DistributedQueue$ChildWatcher. That one per second is quite accurate for all 
> nodes, there are about the same amount of instances as there are seconds 
> since Solr started. I know VisualVM's instance count includes 
> objects-to-be-collected, the instance count does not drop after a forced 
> garbed collection round.
> It doesn't matter how many cores or collections the nodes carry or how heavy 
> traffic is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10420) Solr 6.x leaking one SolrZkClient instance per second

2017-04-04 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15955259#comment-15955259
 ] 

Ishan Chattopadhyaya commented on SOLR-10420:
-

The ant resolve could hang due to lock files. You could try this: {{find ~ 
-name "*lck" | xargs rm}}.

> Solr 6.x leaking one SolrZkClient instance per second
> -
>
> Key: SOLR-10420
> URL: https://issues.apache.org/jira/browse/SOLR-10420
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.5, 6.4.2
>Reporter: Markus Jelsma
> Fix For: master (7.0), branch_6x
>
>
> One of our nodes became berzerk after a restart, Solr went completely nuts! 
> So i opened VisualVM to keep an eye on it and spotted a different problem 
> that occurs in all our Solr 6.4.2 and 6.5.0 nodes.
> It appears Solr is leaking one SolrZkClient instance per second via 
> DistributedQueue$ChildWatcher. That one per second is quite accurate for all 
> nodes, there are about the same amount of instances as there are seconds 
> since Solr started. I know VisualVM's instance count includes 
> objects-to-be-collected, the instance count does not drop after a forced 
> garbed collection round.
> It doesn't matter how many cores or collections the nodes carry or how heavy 
> traffic is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10420) Solr 6.x leaking one SolrZkClient instance per second

2017-04-04 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15955253#comment-15955253
 ] 

Markus Jelsma commented on SOLR-10420:
--

I only have 6.5.0 and a not-yet upgraded 6.4.2, both suffer the same.

But i just built a 6.3.0, ran it in cloud mode without registering a collection 
or core using the built-in Zookeeper. After two minutes, i had ~120 client 
objects, now i have more.

6.0.0 doesn't show increased instance counts. Can't test 6.1 and 6.2, ant keeps 
hanging on resolve for whatever reason.


> Solr 6.x leaking one SolrZkClient instance per second
> -
>
> Key: SOLR-10420
> URL: https://issues.apache.org/jira/browse/SOLR-10420
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.5, 6.4.2
>Reporter: Markus Jelsma
> Fix For: master (7.0), branch_6x
>
>
> One of our nodes became berzerk after a restart, Solr went completely nuts! 
> So i opened VisualVM to keep an eye on it and spotted a different problem 
> that occurs in all our Solr 6.4.2 and 6.5.0 nodes.
> It appears Solr is leaking one SolrZkClient instance per second via 
> DistributedQueue$ChildWatcher. That one per second is quite accurate for all 
> nodes, there are about the same amount of instances as there are seconds 
> since Solr started. I know VisualVM's instance count includes 
> objects-to-be-collected, the instance count does not drop after a forced 
> garbed collection round.
> It doesn't matter how many cores or collections the nodes carry or how heavy 
> traffic is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-10420) Solr 6.x leaking one SolrZkClient instance per second

2017-04-04 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15955222#comment-15955222
 ] 

Ishan Chattopadhyaya commented on SOLR-10420:
-

Nothing changed in that code in the last few releases. Do you know if this 
worked fine in a prior 6x release?
FYI, [~dragonsinh] and [~shalinmangar] <-- experts in that code.

> Solr 6.x leaking one SolrZkClient instance per second
> -
>
> Key: SOLR-10420
> URL: https://issues.apache.org/jira/browse/SOLR-10420
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.5, 6.4.2
>Reporter: Markus Jelsma
> Fix For: master (7.0), branch_6x
>
>
> One of our nodes became berzerk after a restart, Solr went completely nuts! 
> So i opened VisualVM to keep an eye on it and spotted a different problem 
> that occurs in all our Solr 6.4.2 and 6.5.0 nodes.
> It appears Solr is leaking one SolrZkClient instance per second via 
> DistributedQueue$ChildWatcher. That one per second is quite accurate for all 
> nodes, there are about the same amount of instances as there are seconds 
> since Solr started. I know VisualVM's instance count includes 
> objects-to-be-collected, the instance count does not drop after a forced 
> garbed collection round.
> It doesn't matter how many cores or collections the nodes carry or how heavy 
> traffic is.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org