[ https://issues.apache.org/jira/browse/SOLR-9290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Shalin Shekhar Mangar updated SOLR-9290:
----------------------------------------
    Attachment: SOLR-9290-debug.patch

This patch applies on 5.3.2.

This patch adds a monitor thread for the pool created in UpdateShardHandler. With it applied, I cannot reproduce this problem anymore.

My hypothesis is that:
We have a large limit for maxConnections and maxConnectionsPerHost. As long as the limit isn't reached and the servers are decently busy, new connections will continue to be created from the pool. In 5.x and 6.x, we do not have a policy of closing idle connections, so httpclient will keep these connections in CLOSE_WAIT for reuse. So we must periodically close such connections once they're idle to avoid the number of such connections increasing to absurd limits.

Also, I think the reason this wasn't reproducible on master is that SOLR-4509 enabled eviction of idle connections by calling HttpClientBuilder#evictIdleConnections with a 50 second limit.

> TCP-connections in CLOSE_WAIT spikes during heavy indexing when SSL is enabled
> ------------------------------------------------------------------------------
>
>                 Key: SOLR-9290
>                 URL: https://issues.apache.org/jira/browse/SOLR-9290
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>   Affects Versions: 5.5.1, 5.5.2
>            Reporter: Anshum Gupta
>            Priority: Critical
>         Attachments: SOLR-9290-debug.patch, SOLR-9290-debug.patch,
> setup-solr.sh
>
>
> Heavy indexing on Solr with SSL leads to a lot of connections in CLOSE_WAIT
> state.
> At my workplace, we have seen this issue only with 5.5.1 and could not
> reproduce it with 5.4.1 but from my conversation with Shalin, he knows of
> users with 5.3.1 running into this issue too.
> Here's an excerpt from the email [~shaie] sent to the mailing list about
> what we see:
> {quote}
> 1) It consistently reproduces on 5.5.1, but *does not* reproduce on 5.4.1
> 2) It does not reproduce when SSL is disabled
> 3) After restarting the Solr process (sometimes both need to be restarted),
> the count drops to 0, but if indexing continues, it climbs up again
> When it does happen, Solr seems stuck. The leader cannot talk to the
> replica, or vice versa, the replica is usually put in DOWN state and
> there's no way to fix it besides restarting the JVM.
> {quote}
> Here's the mail thread:
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201607.mbox/%3c46cc66220a8143dc903fa34e79205...@vp-exc01.dips.local%3E
> Creating this issue so we could track this and have more people comment on
> what they see.
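
For readers who want a picture of the monitor-thread idea from the comment at the top of this message, here is a minimal sketch. It is not the code in SOLR-9290-debug.patch. The class name IdleConnectionMonitor, the EvictablePool interface, and the timing values are all hypothetical. EvictablePool is a stand-in that mirrors the closeExpiredConnections() and closeIdleConnections(long, TimeUnit) methods found on HttpClient 4.x connection managers, so the sketch runs without the HttpClient jar.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class IdleConnectionMonitor implements AutoCloseable {

    /** Hypothetical stand-in for the eviction methods of an HttpClient 4.x connection manager. */
    public interface EvictablePool {
        void closeExpiredConnections();
        void closeIdleConnections(long idleTime, TimeUnit unit);
    }

    private final EvictablePool pool;
    private final long maxIdle;
    private final TimeUnit unit;
    private final ScheduledExecutorService scheduler;

    public IdleConnectionMonitor(EvictablePool pool, long maxIdle, long period, TimeUnit unit) {
        this.pool = pool;
        this.maxIdle = maxIdle;
        this.unit = unit;
        // Daemon thread, so the monitor never keeps the JVM alive on its own.
        this.scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "idle-connection-monitor");
            t.setDaemon(true);
            return t;
        });
        this.scheduler.scheduleWithFixedDelay(() -> {
            try {
                sweepOnce();
            } catch (RuntimeException e) {
                // Swallowed on purpose: an uncaught exception would cancel all future sweeps.
            }
        }, period, period, unit);
    }

    /** One sweep: close expired connections, then those idle longer than maxIdle. */
    public void sweepOnce() {
        pool.closeExpiredConnections();
        pool.closeIdleConnections(maxIdle, unit);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger sweeps = new AtomicInteger();
        // Fake pool that only counts how often it is asked to evict idle connections.
        EvictablePool fake = new EvictablePool() {
            @Override public void closeExpiredConnections() { }
            @Override public void closeIdleConnections(long idleTime, TimeUnit u) {
                sweeps.incrementAndGet();
            }
        };
        try (IdleConnectionMonitor monitor =
                 new IdleConnectionMonitor(fake, 50, 100, TimeUnit.MILLISECONDS)) {
            Thread.sleep(350);
        }
        System.out.println("idle sweeps performed: " + sweeps.get());
    }
}
```

In a real deployment the EvictablePool calls would presumably be forwarded to the connection manager behind the UpdateShardHandler client, with an idle limit chosen to suit the cluster. The 50 and 100 millisecond values above exist only to make the demo finish quickly.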