On 3/28/2018 9:44 PM, mganeshs wrote:
Regarding auto commit, we discussed this a lot with our product owners and
in the end we were forced to keep it at 1 second; we couldn't increase it
further. Even as it is, customers sometimes say they have to refresh their
pages a couple of times to get the update from Solr. So we can't increase
it further.

I understand the pressure from non-technical departments to have changes visible almost instantly. Executives, sales, and marketing are usually the ones making those kinds of demands, but I think you should push back on that particular requirement on technical grounds.

A soft commit interval that low *can* contribute to performance issues.  It doesn't always cause them; I'm just saying that it *can*.  Increasing it to five or ten seconds might help performance, or it might make no real difference at all.

Yes, as of now only Solr is running on that machine. Initially we were
running it alongside the HBase region servers and it worked fine, but due
to CPU spikes and OS disk cache contention we were forced to move Solr to
a separate machine. I just checked, and our Solr data folder is only about
17GB: two collections are around 5GB and the others are 2 to 3 GB each.
You have said that only about 2/3 of the total index size needs to fit in
the OS disk cache, but the VIRT column in top always shows 28G, which is
more than the total data we have. Why is that? Please check the top output
and the GC settings we used in this doc
<https://docs.google.com/document/d/1SaKPbGAKEPP8bSbdvfX52gaLsYWnQfDqfmV802hWIiQ/edit?usp=sharing>

The VIRT size should be roughly the RES size plus the size of all the index data on the system, so that looks about right.  The actual amount of memory allocated by Java for the heap and other memory structures is approximately RES minus SHR.
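By that rule of thumb, the numbers you posted work out like this:

    VIRT ~= RES + mapped index files
    28GB ~= 11GB + 17GB

So RES for the Solr process should be somewhere around 11GB.  If the RES column in your screenshot is in that ballpark, the 28G figure is exactly what I would expect, not a sign of a problem.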

I am not sure whether the SHR size gets counted in VIRT, but it probably does.  On some Linux systems SHR grows to a very high number, but when that happens, it typically doesn't reflect actual memory usage.  I do not know why this sometimes happens; that is a question for Oracle, since they are the current owners of Java.

Only 5GB is in the buff/cache area.  The system has 13GB of free memory.  That system is NOT low on memory.

With 4 CPUs, a load average in the 3-4 range indicates that the server is busy, but I can't say for sure whether it means the server is overloaded.  Sometimes the load average on a system that's working well goes higher than the CPU count, and sometimes a system with major performance issues shows a load average well below the CPU count.  The instantaneous CPU usage of the Solr process in that screenshot is 384 percent, which means it is exercising the CPUs hard, but that might be perfectly OK.  96.3 percent of the CPU is being used by user processes, a VERY small amount is being used by system, and the iowait percentage is zero.  Servers that are struggling typically show a higher percentage in system and/or iowait, and I don't see that here.
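For reference, load average scales with the CPU count, so on your 4-CPU machine the numbers read roughly like this:

    load average 4.0 on 4 CPUs  ~= run queue just saturating the cores
    384% CPU for one process     = that process keeping ~3.8 of 4 cores busy

A load average of 3-4 there means the machine is near full utilization, but by itself that doesn't tell us whether anything is actually waiting too long.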

Queries are quite fast; most of the time they are simple queries with fq.
Regarding indexing, during peak hours we index around 100 documents per
second on average.

That's good.  And not surprising, given how little memory pressure and how much free memory there is.  An indexing rate of 100 per second doesn't seem like a lot of indexing to me, but for some indexes, it might be very heavy.  If your general performance is good, I wouldn't be too concerned about it.

Regarding the release: initially we tried 6.4.1, and since many discussions
here mentioned that moving to 6.5.x would solve a lot of performance
issues, we moved to 6.5.1. We will move to 6.6.3 in the near future.

Version 6.4.1 had a really bad bug (SOLR-10130, fixed in 6.4.2) that killed performance for most users, though some might never have noticed a problem.  It's difficult to say for sure whether it is something you would have noticed, or whether you would see a performance increase by upgrading.

I hope I have given enough information. One strange thing: the CPU and
memory spikes are not seen when we move from r4.xlarge to r4.2xlarge
(which is 8 cores with 60 GB RAM), but that would not be cost effective.
What is making CPU and memory go so high in this new version (is it due to
docValues)? If I switch off docValues, will the CPU and memory spikes be
reduced?

Overall memory usage (outside of the Java heap) looks great to me.  CPU usage is high, but I can't tell if it's TOO high. As a proof of concept, I think you should try raising autoSoftCommit to five seconds.  If maxDocs is configured on either autoCommit or autoSoftCommit, remove it so that only maxTime is there, regardless of whether you actually change maxTime.  If raising autoSoftCommit makes no real difference, then the 1 second autoSoftCommit probably isn't a worry.  I bet if you raised it to five seconds, most users would never notice anything different.
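Concretely, the experiment would look something like this in the updateHandler section of solrconfig.xml.  The 60-second hard commit interval is just a placeholder; the parts that matter here are the 5000ms soft commit and the absence of any maxDocs element:

    <autoCommit>
      <!-- hard commit: flushes to disk, does not open a new searcher -->
      <maxTime>60000</maxTime>
      <openSearcher>false</openSearcher>
    </autoCommit>

    <autoSoftCommit>
      <!-- soft commit: makes new documents visible to searches -->
      <maxTime>5000</maxTime>
    </autoSoftCommit>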

If you want to provide a GC log to us that covers a relatively long timeframe, we can analyze that and let you know whether your heap is sized appropriately, or whether it might be too big or too small, and whether garbage collection pauses are keeping your CPU usage high.  The standard Solr startup in most current versions always logs GC activity.  It will usually be in the same directory as solr.log.
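On a 6.x install started with the bin/solr script, the GC log normally ends up next to solr.log, typically:

    server/logs/solr_gc.log

If you don't see it there, check that the GC_LOG_OPTS setting in solr.in.sh hasn't been overridden or commented out.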

Do you know what typical and peak queries per second are on your Solr servers?  If your query rate is high, handling that will probably require more servers and a higher replica count.
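If you aren't already measuring this, the Metrics API that was added in 6.4 can give you a rough idea.  Something like this (adjust the host, port, and handler name to match your setup) returns per-core request-rate statistics, and the 1minRate/5minRate values in the response approximate recent queries per second:

    curl "http://localhost:8983/solr/admin/metrics?group=core&prefix=QUERY./select.requestTimes"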

Thanks,
Shawn
