How much total physical memory is on your machine? Lucene holds a lot of the index in MMapDirectory space. My starting point is to allocate no more than 50% of physical memory to the Java heap. You're allocating 31G; if you don't have at _least_ 64G on these machines, you're probably swapping.
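As a quick sanity check, the 50% rule of thumb can be sketched like this (assumes Linux, since `/proc/meminfo` is Linux-specific; the other half of RAM is left for the OS page cache that MMapDirectory relies on):

```shell
# Sketch: apply the "heap <= ~50% of physical RAM" rule of thumb.
# Assumes Linux (/proc/meminfo); adjust for other platforms.
total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
heap_gb=$(( total_kb / 1024 / 1024 / 2 ))
echo "Physical RAM: $(( total_kb / 1024 / 1024 ))G; suggested max heap: ${heap_gb}g"
echo "e.g. -Xms${heap_gb}g -Xmx${heap_gb}g"
```

With 31G heaps, that math only works out if the boxes have 64G or more.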
See: http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html

Best,
Erick

> On Aug 5, 2019, at 10:58 AM, dinesh naik <dineshkumarn...@gmail.com> wrote:
>
> Hi Shawn,
> Yes, I am running Solr in cloud mode, and even after adding the params
> rows=0 and distrib=false, the query response is more than 15 sec due to a
> doc set of more than a billion.
> Also, the soft commit setting cannot be changed to a higher number due to
> a requirement from the business team.
>
> http://hostname:8983/solr/parts/select?indent=on&q=*:*&rows=0&wt=json&distrib=false
> always takes more than 10 sec.
>
> Here are the Java heap and G1GC settings I have:
>
> /usr/java/default/bin/java -server -Xmx31g -Xms31g -XX:+UseG1GC
> -XX:MaxGCPauseMillis=250 -XX:ConcGCThreads=5
> -XX:ParallelGCThreads=10 -XX:+UseLargePages -XX:+AggressiveOpts
> -XX:+PerfDisableSharedMem -XX:+ParallelRefProcEnabled
> -XX:InitiatingHeapOccupancyPercent=50 -XX:G1ReservePercent=18
> -XX:MaxNewSize=6G -XX:PrintFLSStatistics=1
> -XX:+PrintPromotionFailure -XX:+HeapDumpOnOutOfMemoryError
> -XX:HeapDumpPath=/solr7/logs/heapdump
> -verbose:gc -XX:+PrintHeapAtGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps
> -XX:+PrintGCTimeStamps
> -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime
>
> The JVM heap has never crossed 20GB in my setup, and young G1GC pauses
> are well within milliseconds (in the range of 25-200 ms).
>
> On Mon, Aug 5, 2019 at 6:37 PM Shawn Heisey <apa...@elyograg.org> wrote:
>
>> On 8/4/2019 10:15 PM, dinesh naik wrote:
>>> My question is regarding the custom query being used. Here I am
>>> querying for the field _root_, which is available across my cluster
>>> and defined as a string field. The query _root_:abc might not get me
>>> any match (I am OK with not finding any matches; the query just should
>>> not take 10-15 seconds to respond).
>>
>> Typically the *:* query is the fastest option. It is special syntax
>> that means "all documents" and it usually executes very quickly.
>> It will be faster than querying for a value in a specific field, which
>> is what you have defined currently.
>>
>> I will typically add a "rows" parameter to the ping handler with a value
>> of 1, so Solr will not be retrieving a large amount of data. If you are
>> running Solr in cloud mode, you should experiment with setting the
>> distrib parameter to false, which will hopefully limit the query to the
>> receiving node only.
>>
>> Erick has already mentioned GC pauses as a potential problem. With a
>> 10-15 second response time, I think that has high potential to be the
>> underlying cause.
>>
>> The response you included at the beginning of the thread indicates there
>> are 1.3 billion documents, which is going to require a fair amount of
>> heap memory. If seeing such long ping times with a *:* query is
>> something that happens frequently, your heap may be too small, which
>> will cause frequent full garbage collections.
>>
>> The very low autoSoftCommit time can contribute to system load. I think
>> it's very likely, especially with such a large index, that in many cases
>> those automatic commits are taking far longer than 5 seconds to
>> complete. If that's the case, you're not achieving a 5-second
>> visibility interval and you are putting a lot of load on Solr, so I
>> would consider increasing it.
>>
>> Thanks,
>> Shawn
>>
>
>
> --
> Best Regards,
> Dinesh Naik
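The ping-handler and soft-commit changes Shawn describes would look roughly like this in solrconfig.xml (a sketch only; the handler name /admin/ping matches Solr's stock PingRequestHandler, but the 60-second soft-commit interval is illustrative, not a value from this thread):

```xml
<!-- Sketch of the suggested changes; values are illustrative. -->
<!-- Ping handler: fetch at most 1 row and keep the query on the
     receiving node only (distrib=false). -->
<requestHandler name="/admin/ping" class="solr.PingRequestHandler">
  <lst name="invariants">
    <str name="q">*:*</str>
    <str name="rows">1</str>
    <str name="distrib">false</str>
  </lst>
</requestHandler>

<!-- A larger soft-commit interval (here 60s) reduces the load caused
     by very frequent commits on a large index. -->
<updateHandler class="solr.DirectUpdateHandler2">
  <autoSoftCommit>
    <maxTime>60000</maxTime>
  </autoSoftCommit>
</updateHandler>
```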