It's time to start working on the next major evolution of Solr (much
as we did years ago for the SolrCloud effort).  To kick things off,
I've started a project on GitHub and implemented "off-heap" filters
as a first step toward taking performance to the next level.

For a number of reasons, we felt it best to incubate this project at
GitHub, where we could have a community dedicated solely to its
advancement.  The plan is to bring it back to the ASF once it has
stabilized and gained enough traction.

Off-Heap Filters:
JVMs have never been good at dealing with large heaps. Large heaps
mean the JVM needs to do a lot of garbage collection work, and often
mean some pretty long stop-the-world GC pauses.

Filters (Solr DocSets) stored in the filterCache are now allocated
off-heap and reference counted so they can be freed as soon as they
are no longer needed.  The JVM no longer needs to waste time copying
around these potentially long-lived blocks of memory. This should both
help eliminate the long GC pauses and increase request throughput.
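
To make the idea concrete, here is a minimal sketch of a reference-counted
bit set whose bits live outside the Java heap. It is not the Heliosearch
code: the class name OffHeapDocSet and its methods are hypothetical, and it
uses a direct ByteBuffer as a stand-in for off-heap storage. With a direct
buffer the native memory is only reclaimed once the buffer object itself is
collected, so a real implementation would more likely call an explicit
native free at the point where this sketch drops its reference.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch, not the actual Heliosearch implementation.
final class OffHeapDocSet {
    // Direct buffer: the backing bytes are allocated outside the Java heap.
    private volatile ByteBuffer bits;
    private final int maxDoc;
    // The creator holds the first reference.
    private final AtomicInteger refCount = new AtomicInteger(1);

    OffHeapDocSet(int maxDoc) {
        this.maxDoc = maxDoc;
        // One bit per document, rounded up to whole bytes (zero-filled).
        this.bits = ByteBuffer.allocateDirect((maxDoc + 7) >>> 3);
    }

    void add(int doc) {
        checkDoc(doc);
        ByteBuffer b = live();
        int idx = doc >>> 3;
        b.put(idx, (byte) (b.get(idx) | (1 << (doc & 7))));
    }

    boolean contains(int doc) {
        checkDoc(doc);
        return (live().get(doc >>> 3) & (1 << (doc & 7))) != 0;
    }

    // Take an extra reference, e.g. while a request is using the set.
    OffHeapDocSet incRef() {
        for (;;) {
            int c = refCount.get();
            if (c <= 0) throw new IllegalStateException("doc set already released");
            if (refCount.compareAndSet(c, c + 1)) return this;
        }
    }

    // Drop a reference; the last one out releases the storage.
    void decRef() {
        int c = refCount.decrementAndGet();
        if (c == 0) {
            // An explicit native free would go here in a real implementation.
            bits = null;
        } else if (c < 0) {
            throw new IllegalStateException("decRef called too many times");
        }
    }

    boolean isReleased() {
        return bits == null;
    }

    private ByteBuffer live() {
        ByteBuffer b = bits;
        if (b == null) throw new IllegalStateException("doc set already released");
        return b;
    }

    private void checkDoc(int doc) {
        if (doc < 0 || doc >= maxDoc) throw new IndexOutOfBoundsException("doc " + doc);
    }
}
```

In this sketch a cache would hold one reference and each request would
call incRef() before using the set and decRef() when done, so evicting an
entry from the cache never pulls the memory out from under a request that
is still running.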

Performance Results:
I'm still putting together a blog post on the results, but they look good!
It was pretty trivial to reproduce >1s stop-the-world GC pauses with a
4GB heap, and then see those pauses completely go away when I switched
to off-heap filters.  Throughput also increased since much less time
was spent doing GC.

Next major feature: Native Code Optimizations.
In addition to moving more large data structures off-heap (like
UnInvertedField?), I am planning to implement native code
optimizations for certain hotspots.  Native code faceting would be an
obvious first choice since faceting can often be a CPU bottleneck.

Project resources:

https://github.com/Heliosearch/heliosearch

https://groups.google.com/forum/#!forum/heliosearch
https://groups.google.com/forum/#!forum/heliosearch-dev

Freenode IRC: #heliosearch #heliosearch-dev

-Yonik
