Hi All,

TL;DR version: We think we want to explore Lucene/Solr 4.0 and SolrCloud, but I’m not sure whether there is any good doco or any articles on how to make architecture choices when chopping up big indexes… and what other general considerations are part of the equation?
====

I’m throwing this post out to the public to see if any kind and knowledgeable individuals can provide some educated feedback on the options our team is currently considering for the future architecture of our Solr indexes.

We have a loose collection of Solr indexes, each with a specific purpose and differing schemas and document makeup, containing just over 300 million documents with varying degrees of full-text. Our existing architecture is showing its age: it is really just the setup used for small/medium indexes scaled upwards.

The biggest individual index is around 140 million documents and currently exists as a Master/Slave setup, with the Master receiving all writes in the background and the 3 load-balanced slaves updating on a 5 minute poll interval. The master index is 451gb on disk, and the 3 slaves are running JVMs with RAM allocations of 21gb (right now, anyway). We are struggling under the traffic load and/or the scale of our indexes (mainly the latter, I think). We know this isn’t the best way to run things, but the index in question is a fairly new addition, and each time we run into issues we tend to make small changes to improve things in the short term: bumping the RAM allocation up, toying with poll intervals, garbage collection config, etc.

We’ve historically run into issues with facet queries generating a lot of bloat on some types of fields. These had to be solved through internal modifications, but I expect we’ll have to review this with the new version anyway. Related to that, there are some question marks over generating good facet data from a sharded approach.

In particular, though, we are really struggling with garbage collection on the slave machines around the time the slave/master sync occurs, because multiple copies of the index are held in memory until all searchers have de-referenced the old index. The machines typically either crash from OOM when a third and/or fourth copy of the index appears because really old searchers won’t ‘let go’ (hence we play with widening poll intervals), or, more rarely, they become perpetually locked in GC and have to be restarted (not 100% sure why, but large heap allocations aren’t helping, and cache warming may be a culprit).

The team has lots of things we want to try to improve matters, but given the scale of the systems it is very hard to just try things out without considerable resourcing implications. The entire ecosystem is spread across 7 machines resourced in the 64gb-100gb RAM range (this is just me poking around our servers… not a thorough assessment). Each machine runs several JVMs, so that for each ‘type’ of index there are typically 2-4 load-balanced slaves available at any given time. One of those machines is used exclusively as the Master for all indexes and receives no search traffic… just lots of write traffic.

I believe the answers to some of these questions will depend heavily on schemas and documents, so I don’t imagine anyone can answer them better than we can after testing and benchmarking… but right now we are still trying to choose where to start, so broad ideas are very welcome.
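For concreteness, the master/slave sync is just Solr’s standard ReplicationHandler; the slave side of our solrconfig.xml is roughly this shape (hostname and core name are placeholders, not our real ones):

  <requestHandler name="/replication" class="solr.ReplicationHandler">
    <lst name="slave">
      <!-- placeholder URL; ours points at the dedicated master box -->
      <str name="masterUrl">http://our-master:8983/solr/bigindex/replication</str>
      <!-- hh:mm:ss - this is the 5 minute poll mentioned above -->
      <str name="pollInterval">00:05:00</str>
    </lst>
  </requestHandler>

Nothing exotic there; the pain is all in what happens on the slaves around each pull.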
The kinds of things we are currently thinking about:

- Moving to v4.0 (we’ve only just completed our v3.5 upgrade) to take advantage of the reduced RAM consumption from https://issues.apache.org/jira/browse/LUCENE-2380. We are hoping this has the double-whammy impact of improving garbage collection as well: lots of full-text data should mean lots of Strings, and thus lots of savings from this change.

- Moving to a basic sharded approach. We’ve only just started testing this, and I’m not involved, so I’m not sure what early results we’ve got yet.

- Given that we’d like to move to v4.0 anyway, I believe this opens up the option of a SolrCloud implementation… my suspicion is that this is where the money is at… but I’d be happy to hear feedback (good or bad) from people who are using it in production.

- Hardware: we are not certain that the current approach of a few colossal machines is any better than lots of smaller clustered machines… and it is prohibitively expensive to experiment here. We don’t think our current setup of SSDs and fibre-channel connections is creating too many bottlenecks on I/O, and we rarely see other hardware-related issues, but again I’d be curious whether people have observed contradictory evidence. My suspicion is that with the changes above, our current hardware would handle the load far better than it currently does.

- Are there any pros and cons documented out there for making decisions on sharding and resource allocation? Like:
  - Good guidelines for choosing the number of shards versus individual shard sizes.
  - Garbage collection implications of high search load with nearly non-stop write activity? A few big JVMs versus many small JVMs?
  - Load balancing in SolrCloud during commits? This very much overlaps with the above point. We had toyed with the idea of scripting our own method of removing a machine from the load balancer, running a ‘commit’ (pull from master) on it, then putting it back into rotation with the highest priority to receive searches (there’s a rough sketch of what we had in mind in the P.S. below)… not knowing whether SolrCloud already does this, though. Our current pull interval setup is fairly primitive, but changing it while avoiding stale search results is complicated.
  - When we shard, what are we sacrificing? Performance/quality of faceting? This one’s a real noob question from me, since I still have to RTFM. It is, however, our primary concern with sharding.

- There are some more technical optimisations being considered as well, such as playing with Linux page sizes. I also noticed a reference here: http://blog.mikemccandless.com/2010/07/lucenes-ram-usage-for-searching.html on ‘tuning swappiness down to 0’.

- There are also some questions over how efficient Solr/Lucene 4 is while a Slave is pulling the Master’s index segments, i.e. is there much improvement here? For example, per-segment searching and per-segment faceting (are these just goals, or have they already been met?), which should make reopening an index with only minor segment changes extremely cheap.

- Related to the above point, there was some recollection amongst the team here of Solr’s field caches causing issues for auto-warming. Is this still current, or even correct? I could be misunderstanding it myself.

Anyway, I realise this is quite a brain dump, but I’m hoping there are others who’ve looked at these sorts of things and are willing to share. If there are any useful outcomes from our changes and benchmarking that others would be interested in, I’m also happy to follow up with contributions to public doco. Particularly in relation to the last two points (where I’m passing on information I don’t 100% understand myself right now), we can even devote some development resources to problems the community is already aware of that just need some dev time, if there are performance gains to be had.

Ta,
Greg
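P.S. To make the load-balancer rotation idea above a bit more concrete, below is roughly the sketch we had in mind. It is just that, a sketch: the two load balancer calls are placeholders for whatever our LB’s API actually exposes, I’m assuming the slave’s /replication handler reports isReplicating in its ‘details’ response (as the admin page seems to use), and it doesn’t yet account for searcher warming after the pull, which is part of what bites us now.

  import java.io.InputStream;
  import java.net.URL;
  import java.util.Scanner;

  /**
   * Sketch: take one slave out of rotation, force an immediate
   * replication pull, wait for the pull to finish, then put the
   * slave back into the load balancer.
   */
  public class SlaveRotation {

      void rotateAndSync(String slaveCoreUrl) throws Exception {
          removeFromLoadBalancer(slaveCoreUrl);  // placeholder, LB-specific
          // Standard ReplicationHandler command to trigger a pull right now:
          httpGet(slaveCoreUrl + "/replication?command=fetchindex");
          // Poll the handler until it says the pull has finished
          // (assumes the default XML response format):
          while (httpGet(slaveCoreUrl + "/replication?command=details")
                  .contains("<str name=\"isReplicating\">true</str>")) {
              Thread.sleep(5000);
          }
          addToLoadBalancer(slaveCoreUrl);       // placeholder, LB-specific
      }

      // Minimal HTTP GET returning the response body as a string.
      private String httpGet(String url) throws Exception {
          try (InputStream in = new URL(url).openStream();
               Scanner s = new Scanner(in, "UTF-8").useDelimiter("\\A")) {
              return s.hasNext() ? s.next() : "";
          }
      }

      private void removeFromLoadBalancer(String url) { /* LB-specific */ }
      private void addToLoadBalancer(String url) { /* LB-specific */ }
  }

If SolrCloud already handles the equivalent of this during commits, that alone would be a strong argument for it from our side.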