On 11/21/2013 6:41 PM, Dave Seltzer wrote:
> In digging a little deeper and looking at the config I see that
> <nrtMode>true</nrtMode> is commented out. I believe this is the default
> setting. So I don't know if NRT is enabled or not. Maybe just a red herring.

I had never seen this setting before. The default is true. SolrCloud
requires that it be set to true. Looks like it's a new parameter in 4.5,
added by SOLR-4909. From what I can tell reading the issue, turning it
off effectively disables soft commits.

https://issues.apache.org/jira/browse/SOLR-4909

You've said that you are adding about 3 documents per second, but you
haven't said anything about how often you are doing commits. Erick's
question basically boils down to this: How quickly after indexing do you
expect the changes to be visible on a search, and how often are you
doing commits?

Generally speaking (and ignoring the fact that nrtMode now exists), NRT
is not something you enable, it's something you try to achieve, by using
soft commits quickly and often, and by adjusting the configuration to
make the commits go faster.

If you are trying to keep the interval between indexing and document
visibility down to less than a few seconds (especially if it's less than
one second), then you are trying to achieve NRT.

There's a lot of information on the following wiki page about
performance problems. This specific link is to the last part of that
page, which deals with slow commits:

http://wiki.apache.org/solr/SolrPerformanceProblems#Slow_commits

> I don't know what Garbage Collector we're using. In this test I'm running
> Solr 4.5.1 using Jetty from the example directory.

If you aren't using any tuning parameters beyond setting the max heap,
then you are using the default parallel collector. It's a poor choice
for Solr unless your heap is very small. At 6GB, yours isn't very small.
It's not particularly huge either, but not small.

> The CPU on the 8 nodes all stay around 70% use during the test. The nodes
> have 28GB of RAM. Java is using about 6GB and the rest is being used by OS
> cache.

How big is your index? If it's larger than about 30 GB, you probably
need more memory. If it's much larger than about 40 GB, you definitely
need more memory.
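
As a rough illustration of the arithmetic behind those thresholds, here
is a minimal Python sketch. The function name cache_coverage is made up
for this example, and it optimistically assumes that every byte of RAM
not taken by the Java heap is available to the OS disk cache, which
overstates the real figure. It is a back-of-the-envelope check, not a
sizing tool.

```python
def cache_coverage(total_ram_gb: float, heap_gb: float, index_gb: float) -> float:
    """Return the fraction of the index that could sit in the OS disk cache.

    Assumes (optimistically) that all RAM not taken by the Java heap is
    available for caching index files. The result is capped at 1.0.
    """
    if index_gb <= 0:
        raise ValueError("index_gb must be positive")
    cache_gb = max(0.0, total_ram_gb - heap_gb)
    return min(1.0, cache_gb / index_gb)


if __name__ == "__main__":
    # 28 GB of RAM and a 6 GB heap, as described in this thread.
    for index_gb in (20, 30, 40):
        fraction = cache_coverage(28, 6, index_gb)
        print(f"{index_gb} GB index: about {fraction:.0%} could be cached")
```

With 28 GB of RAM and a 6 GB heap there is at most 22 GB left for
caching, so a 30 GB index could be roughly 73% cached and a 40 GB index
only about 55%.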

> To perform the test we're running 200 concurrent threads in JMeter. The
> threads hit HAProxy which loadbalances the requests among the nodes. Each
> query is for a random word out of a list of about 10,000 words. Some of the
> queries have faceting turned on.

That's a pretty high query load. If you want to get anywhere near top
performance out of it, you'll want to have enough memory to fit your
entire index into RAM. You'll also need to reduce the load introduced by
indexing. A large part of the load from indexing comes from commits.

> Because we're heavily loading the system the queries are returning quite
> slowly. For a simple search, the average response time was 300ms. The peak
> response time was 11,000ms. The spikes in latency seem to occur about every
> 2.5 minutes.

I would bet that you're having one or both of the following issues:

1) Garbage collection issues from one or more of the following:
 a) Heap too small.
 b) Using the default GC instead of CMS with tuning.
2) General performance issues from one or more of the following:
 a) Not enough cache memory for your index size.
 b) Too-frequent commits.
 c) Commits taking a lot of time and resources due to cache warming.

With a high query and index load, any problems become magnified.

> I haven't spent that much time messing with SolrConfig, so most of the
> settings are the out-of-the-box defaults.

The defaults are very good for small to medium indexes and low to medium
query load. If you have a big index and/or high query load, you'll
generally need to tune.

Thanks,
Shawn