Maybe this is the issue: https://github.com/eclipse/jetty.project/issues/2169

I have noticed that as the number of HTTP requests per second increases,
the number of CLOSE_WAIT connections grows linearly until Solr stops
accepting socket connections. The netstat output is:
$ netstat -ptan | awk '{print $6 " " $7 }' | sort | uniq -c
   9784 CLOSE_WAIT 1/java
      1 ESTABLISHED -
      3 ESTABLISHED 1/java
      1 Foreign Address
      2 LISTEN -
      4 LISTEN 1/java
     27 TIME_WAIT -
      1 established)


The custom client code is:
            RequestConfig config = RequestConfig.custom()
                    .setConnectionRequestTimeout(4000)
                    .setConnectTimeout(4000)
                    .setSocketTimeout(4000)
                    .build();

            ConnectingIOReactor ioReactor = new DefaultConnectingIOReactor();
            PoolingNHttpClientConnectionManager cmAsync =
                    new PoolingNHttpClientConnectionManager(ioReactor);
            cmAsync.setMaxTotal(10000);
            cmAsync.setDefaultMaxPerRoute(1000);

            asynClient = HttpAsyncClients.custom()
                    .setDefaultRequestConfig(config)
                    .setConnectionManager(cmAsync).build();
            asynClient.start();

            executor = Executors.newScheduledThreadPool(1);
            idleConnectionFuture = executor.scheduleAtFixedRate(() -> {
                cmAsync.closeExpiredConnections();
                cmAsync.closeIdleConnections(1, TimeUnit.SECONDS);
            }, 1, 1, TimeUnit.SECONDS);
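
One thing worth ruling out with the eviction task above: scheduleAtFixedRate
suppresses all later executions once a single run throws, so an exception in
the cleanup lambda would silently stop idle-connection eviction. I have not
confirmed that this happens here. The following is only a minimal sketch of
guarding such a task, using a stand-in Runnable instead of the cmAsync calls;
GuardedEvictor, guarded and runPeriodically are hypothetical names made up for
illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class GuardedEvictor {
    // Wraps a periodic task so that a RuntimeException is logged instead of
    // propagating to the scheduler (which would cancel all later runs).
    static Runnable guarded(Runnable task) {
        return () -> {
            try {
                task.run();
            } catch (RuntimeException e) {
                System.err.println("eviction pass failed: " + e);
            }
        };
    }

    // Schedules `task` at a fixed rate and waits until it has started `runs`
    // times or `timeoutMillis` has elapsed. Returns the number of starts seen.
    static int runPeriodically(Runnable task, int runs, long periodMillis, long timeoutMillis) {
        ScheduledExecutorService executor = Executors.newScheduledThreadPool(1);
        AtomicInteger count = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(runs);
        try {
            executor.scheduleAtFixedRate(() -> {
                count.incrementAndGet();
                done.countDown();
                task.run(); // stand-in for closeExpiredConnections()/closeIdleConnections()
            }, 0, periodMillis, TimeUnit.MILLISECONDS);
            done.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            executor.shutdownNow();
        }
        return count.get();
    }

    public static void main(String[] args) {
        Runnable failing = () -> { throw new IllegalStateException("simulated failure"); };
        System.out.println("unguarded runs: " + runPeriodically(failing, 3, 20, 500));
        System.out.println("guarded runs:   " + runPeriodically(guarded(failing), 3, 20, 2000));
    }
}
```

In this sketch the unguarded task starts once and is never rescheduled, while
the guarded one keeps running. Applied to the client code, that would mean
wrapping the two cmAsync calls in a try/catch inside the lambda.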


Also, /solr/admin/cores takes a very long time to respond (QTime = 300+
seconds), so a curl call with a timeout also leaves an additional CLOSE_WAIT
(as expected):
curl -m 5 'http://<solrhost>:<port>/solr/admin/cores' 
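
To spell out why a client-side timeout is expected to produce a CLOSE_WAIT:
the client gives up and closes its end, the server receives the FIN, and the
server-side socket stays in CLOSE_WAIT until the server itself calls close().
Here is a rough, self-contained sketch of that sequence over loopback. It
assumes a toy server that never answers; CloseWaitDemo and the /slow path are
made up for illustration and have nothing to do with Solr or Jetty internals.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;

class CloseWaitDemo {
    // Returns true if the client timed out waiting for a reply and the server
    // then observed end-of-stream on a socket it had not yet closed itself.
    static boolean clientTimeoutLeavesServerHalfClosed(int clientTimeoutMillis) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            Socket client = new Socket(loopback, server.getLocalPort());
            Socket accepted = server.accept();
            try {
                boolean timedOut = false;
                client.setSoTimeout(clientTimeoutMillis);
                accepted.setSoTimeout(5000); // safety net so the sketch cannot hang
                client.getOutputStream().write(
                        "GET /slow HTTP/1.0\r\n\r\n".getBytes(StandardCharsets.US_ASCII));
                try {
                    client.getInputStream().read(); // the toy server never answers
                } catch (SocketTimeoutException e) {
                    timedOut = true; // same situation as curl -m expiring
                }
                client.close(); // client gives up and sends FIN

                // Server side: drain the request, then read() returns -1 (peer closed).
                InputStream in = accepted.getInputStream();
                while (in.read() != -1) {
                    // discard request bytes
                }
                // The peer's FIN has arrived but close() has not been called here,
                // which is the window the OS reports as CLOSE_WAIT.
                boolean stillOpen = !accepted.isClosed();
                return timedOut && stillOpen;
            } finally {
                accepted.close(); // only this moves the socket out of CLOSE_WAIT
                client.close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("half-closed after client timeout: "
                + clientTimeoutLeavesServerHalfClosed(200));
    }
}
```

A single CLOSE_WAIT from this is harmless on its own. The problem in the
netstat output above is that they accumulate, which suggests the server side is
not getting around to closing them.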

The custom handler's QTime stays at ~150 ms or lower for the active cores,
even under max load.
 
Note that there are hundreds of Solr cores on each Solr JVM, but only a few
need to be open at any given time in each JVM, to avoid heap memory bloat.
Because we're not running SolrCloud, solr.xml has this setting:
  <int name="transientCacheSize">${transientCacheSize:30}</int>

Also, <updateLog> was removed from solrconfig.xml because we saw thousands of
threads BLOCKED on VersionBucket, even with:
      <int name="numVersionBuckets">${solr.ulog.numVersionBuckets:655360}</int>

The side effect is lots of merges [ we'll tackle that once Solr stops
dying :-) ]



