Hello,

In our non-cloud Solr environment we are seeing lots of connections in CLOSE_WAIT, which causes the JVM to stop "working" within 3 minutes of Solr start.

    solr [ /opt/solr ]$ netstat -anp | grep 8983 | grep CLOSE_WAIT | grep 10.xxx.xxx.xxx | wc -l
    9453

The only option then is `kill -9`, because even `jcmd <pid> Thread.print` is unable to connect to the JVM. The problem can be reproduced at will.

Any suggestions on what could be causing this, or on a fix?

Details of the system are as follows; it has been set up for "bulk indexing".

-------------

Solr / server:
v6.2.2 non-solrcloud in a docker with kubernetes
java: 1.8.0_151 25.151-b12 HotSpot 64bit | Oracle
jvm: heap 30GB
os: Linux 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux
os memory: 230GB | no swap configured
os cpu: 32vCPU
jvm:
    "-XX:+UseLargePages",
    "-XX:LargePageSizeInBytes=2m",
    "-Xms512m",
    "-Xmx512m",
    "-XX:NewRatio=3",
    "-XX:SurvivorRatio=4",
    "-XX:TargetSurvivorRatio=90",
    "-XX:MaxTenuringThreshold=8",
    "-XX:+UseConcMarkSweepGC",
    "-XX:+UseParNewGC",
    "-XX:ConcGCThreads=4",
    "-XX:ParallelGCThreads=4",
    "-XX:+CMSScavengeBeforeRemark",
    "-XX:PretenureSizeThreshold=64m",
    "-XX:+UseCMSInitiatingOccupancyOnly",
    "-XX:CMSInitiatingOccupancyFraction=50",
    "-XX:CMSMaxAbortablePrecleanTime=6000",
    "-XX:+CMSParallelRemarkEnabled",
    "-XX:+ParallelRefProcEnabled",

non-cloud solr.xml:
transientCacheSize = 30
shareSchema = true
Also, only 4 cores are POSTed to.

Client / java8 app:
An AsyncHTTPClient POST-ing gzip payloads.
PoolingNHttpClientConnectionManager (maxtotal=10,000 and maxperroute=1000)
ConnectionRequestTimeout = ConnectTimeout = SocketTimeout = 4000 (4 secs)

Gzip payloads:
About 800 json messages like this.

    [
      {id:"abcdefxxxxx", datetimestamp:"xxxxxx", key1:"xxxxxx", key2:"zzzzz", ....},
      ....
    ]

POST rate:
Each of the 4 solr cores receives ~32 payloads per second from the custom java app (plugin handler metrics in solr report the same).
Approx ~102,000 docs per sec in total (32 payloads x 800 docs x 4 solr cores)

Document uniqueness:
No doc or id is ever repeated or concurrently sent.
No atomic updates needed (overwrite=false in AddUpdateCommand was set in solr handler)

Solrconfig.xml:
For the bulk indexing requirement, updatelog and softcommit were minimized / removed.

    <indexConfig>
      <lockType>none</lockType>
      <ramBufferSizeMB>200</ramBufferSizeMB>
      <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
        <int name="maxThreadCount">1</int>
        <int name="maxMergeCount">6</int>
      </mergeScheduler>
    </indexConfig>
    <updateHandler class="solr.DirectUpdateHandler2">
      <autoCommit>
        <maxTime>${solr.autoCommit.maxTime:10000}</maxTime>
        <openSearcher>true</openSearcher>
      </autoCommit>
      <autoSoftCommit>
        <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
        <maxDocs>${solr.autoSoftCommit.maxDocs:-1}</maxDocs>
        <openSearcher>false</openSearcher>
      </autoSoftCommit>
    </updateHandler>

-M

--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
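
P.S. In case it helps anyone looking at this, the netstat count at the top of this post could probably be broken down by TCP state and remote address with something like the sketch below. The function name `tcp_states_by_peer` is made up for illustration, and it assumes the Linux net-tools column layout (proto, recv-q, send-q, local address, foreign address, state), so treat it as a rough starting point rather than a tested tool.

```shell
# Tally connections on one local port by TCP state and remote address.
# Reads netstat-style lines on stdin; $1 is the local port (e.g. 8983).
# Assumed columns: proto recv-q send-q local-addr foreign-addr state
# (tcp_states_by_peer is a hypothetical helper name, not an existing tool.)
tcp_states_by_peer() {
  awk -v port="$1" '
    # keep only TCP rows whose local address ends in ":<port>"
    $1 ~ /^tcp/ && $4 ~ (":" port "$") {
      peer = $5
      sub(/:[0-9]+$/, "", peer)   # drop the remote port, keep the address
      count[$6 " " peer]++        # key is "<STATE> <remote address>"
    }
    END {
      for (k in count) print count[k], k
    }
  ' | sort -k1,1nr -k2            # highest counts first
}
```

It would be run as `netstat -ant | tcp_states_by_peer 8983`, printing one line per state and remote address with the connection count in front.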