RE: ANN: Solr Next

2014-06-10 Thread Jean-Sebastien Vachon
Hi Yonik,

Very impressive results. Looking forward to using this on our systems. Any idea
what the plan is for this feature? Will it make its way into Solr 4.9, or do we
have to switch to Heliosearch to be able to use it?

Thanks

 -Original Message-
 From: Yonik Seeley [mailto:ysee...@gmail.com]
 Sent: June-09-14 10:50 AM
 To: solr-user@lucene.apache.org
 Subject: Re: ANN: Solr Next
 
 On Tue, Jan 7, 2014 at 1:53 PM, Yonik Seeley ysee...@gmail.com wrote:
 [...]
  Next major feature: Native Code Optimizations.
  In addition to moving more large data structures off-heap(like
  UnInvertedField?), I am planning to implement native code
  optimizations for certain hotspots.  Native code faceting would be an
  obvious first choice since it can often be a CPU bottleneck.
 
 It's in!  Abbreviated report: 2x performance increase over stock solr faceting
 (which is already fast!) http://heliosearch.org/native-code-faceting/
 
 -Yonik
 http://heliosearch.org -- making solr shine
 
 


Re: ANN: Solr Next

2014-06-09 Thread Yonik Seeley
On Tue, Jan 7, 2014 at 1:53 PM, Yonik Seeley ysee...@gmail.com wrote:
[...]
 Next major feature: Native Code Optimizations.
 In addition to moving more large data structures off-heap(like
 UnInvertedField?), I am planning to implement native code
 optimizations for certain hotspots.  Native code faceting would be an
 obvious first choice since it can often be a CPU bottleneck.

It's in!  Abbreviated report: 2x performance increase over stock Solr
faceting (which is already fast!)
http://heliosearch.org/native-code-faceting/

-Yonik
http://heliosearch.org -- making solr shine



Re: ANN: Solr Next

2014-01-13 Thread Yonik Seeley
Update on my initial performance findings for off-heap filters:
http://heliosearch.org/off-heap-filters/

-Yonik
http://heliosearch.org -- making solr shine


On Tue, Jan 7, 2014 at 1:53 PM, Yonik Seeley ysee...@gmail.com wrote:
 Off-Heap Filters:
 JVMs have never been good at dealing with large heaps. Large heaps
 mean the JVM needs to do a lot of garbage collection work, and often
 means some pretty long stop-the-world GC pauses.

 Filters (Solr DocSets) stored in the filterCache are now allocated
 off-heap and reference counted so they can be freed as soon as they
 are no longer needed.  The JVM no longer needs to waste time copying
 around these potentially long-lived blocks of memory. This should both
 help eliminate the long GC pauses as well as increase request
 throughput.

 Performance Results:
   I'm still putting together a blog on the results, but they look good!
 It was pretty trivial to reproduce 1s stop-the-world GC pauses with a
 4GB heap, and then see those pauses completely go away when I switched
 to off-heap filters.  Throughput also increased since much less time
 was spent doing GC.


Re: ANN: Solr Next

2014-01-13 Thread Mikhail Khludnev
Yonik,
Don't you think that a proper codec format could give a comparable gain
without changes to the design?
https://issues.apache.org/jira/browse/LUCENE-5052


On Mon, Jan 13, 2014 at 9:15 PM, Yonik Seeley ysee...@gmail.com wrote:

 Update on my initial performance findings for off-heap filters:
 http://heliosearch.org/off-heap-filters/

 -Yonik
 http://heliosearch.org -- making solr shine






-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

http://www.griddynamics.com
 mkhlud...@griddynamics.com


Re: ANN: Solr Next

2014-01-13 Thread Yonik Seeley
That would be cool, but it seems it would only work for simple term queries.
I guess having both would be best.

http://heliosearch.org -- off-heap filters for solr
-Yonik


On Mon, Jan 13, 2014 at 2:21 PM, Mikhail Khludnev
mkhlud...@griddynamics.com wrote:
 Yonik,
 Don't you think that a proper codec format could give a comparable gain
 without changes to the design?
 https://issues.apache.org/jira/browse/LUCENE-5052




ANN: Solr Next

2014-01-07 Thread Yonik Seeley
It's time to start working on the next major evolution of Solr (much
as we did years ago for the SolrCloud effort).  To kick things off,
I've started a project on GitHub and implemented off-heap filters
as a first step toward taking performance to the next level.

For a number of reasons, we felt it best to incubate this project on
GitHub, where we could have a community dedicated solely to its
advancement.  The plan is to bring it back to the ASF once it has
stabilized and gained enough traction.

Off-Heap Filters:
JVMs have never been good at dealing with large heaps. Large heaps
mean the JVM needs to do a lot of garbage collection work, and often
mean some pretty long stop-the-world GC pauses.

Filters (Solr DocSets) stored in the filterCache are now allocated
off-heap and reference counted so they can be freed as soon as they
are no longer needed.  The JVM no longer needs to waste time copying
around these potentially long-lived blocks of memory. This should
both help eliminate the long GC pauses and increase request
throughput.
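
To make the idea above concrete, here is a minimal sketch of a
reference-counted bit set whose bits live outside the Java heap. It is
an illustration only: the class name OffHeapBitSet and the incref/decref
method names are assumptions, and this is not the Heliosearch DocSet
code. It also uses a direct ByteBuffer, which standard Java cannot free
eagerly, so the sketch only drops the buffer on the last decref; an
implementation that wants immediate frees would need a native allocator.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration; not the actual Heliosearch implementation.
final class OffHeapBitSet {
    // The creator starts out holding one reference.
    private final AtomicInteger refCount = new AtomicInteger(1);
    // Direct buffer: the bit data is stored outside the garbage-collected heap.
    private ByteBuffer bits;

    OffHeapBitSet(int numDocs) {
        int numWords = (numDocs + 63) >>> 6; // 64 documents per long
        bits = ByteBuffer.allocateDirect(numWords * Long.BYTES)
                         .order(ByteOrder.nativeOrder());
    }

    void set(int doc) {
        int idx = (doc >>> 6) * Long.BYTES;
        bits.putLong(idx, bits.getLong(idx) | (1L << (doc & 63)));
    }

    boolean get(int doc) {
        int idx = (doc >>> 6) * Long.BYTES;
        return (bits.getLong(idx) & (1L << (doc & 63))) != 0;
    }

    // Take a reference before using the set (for example from a cache lookup).
    void incref() {
        int current;
        do {
            current = refCount.get();
            if (current <= 0) {
                throw new IllegalStateException("already released");
            }
        } while (!refCount.compareAndSet(current, current + 1));
    }

    // Drop a reference; returns true if this call released the last one.
    boolean decref() {
        int remaining = refCount.decrementAndGet();
        if (remaining < 0) {
            throw new IllegalStateException("released too many times");
        }
        if (remaining == 0) {
            bits = null; // assumption: a native allocator would free the block here
            return true;
        }
        return false;
    }
}
```

A cache would hold one reference per cached entry and each request
would pair incref with decref, so eviction while a request is still
running does not release the data out from under it.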

Performance Results:
I'm still putting together a blog post on the results, but they look good!
It was pretty trivial to reproduce 1s stop-the-world GC pauses with a
4GB heap, and then see those pauses completely go away when I switched
to off-heap filters.  Throughput also increased since much less time
was spent doing GC.
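
For anyone who wants to check time spent in GC on their own setup, the
following is a rough sketch using the standard java.lang.management
API. The class name GcTimeProbe and the allocation loop are
placeholders, not the benchmark behind the numbers above, and the
accumulated collection time reported by the JVM is only an
approximation of pause time (concurrent collectors may count work that
does not stop the application).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper for eyeballing GC overhead around a workload.
final class GcTimeProbe {
    // Sum of accumulated collection time (ms) across all collectors.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long gcBefore = totalGcMillis();
        long start = System.nanoTime();

        // Stand-in workload: churn through short-lived arrays.
        long checksum = 0;
        for (int i = 0; i < 200_000; i++) {
            long[] garbage = new long[1024];
            garbage[0] = i;
            checksum += garbage[0];
        }

        long wallMillis = (System.nanoTime() - start) / 1_000_000;
        long gcMillis = totalGcMillis() - gcBefore;
        System.out.println("wall=" + wallMillis + "ms gc=" + gcMillis
                + "ms checksum=" + checksum);
    }
}
```

Comparing the gc figure against wall time before and after a
configuration change gives a first impression; GC logs are the better
source for individual pause lengths.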

Next major feature: Native Code Optimizations.
In addition to moving more large data structures off-heap (like
UnInvertedField?), I am planning to implement native code
optimizations for certain hotspots.  Native code faceting would be an
obvious first choice since it can often be a CPU bottleneck.
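
As an illustration of why faceting can be CPU bound, here is a
simplified sketch of the kind of counting loop involved for a
single-valued field. The names (FacetCountSketch, countFacets, ords)
are assumptions made for the example; this is plain Java, not the
planned native code and not Solr's actual faceting implementation.

```java
// Hypothetical sketch of a facet-counting hot loop for a single-valued field.
final class FacetCountSketch {
    // ords[doc] is the ordinal of that document's field value, or -1 if it has none.
    // matchingDocs lists the document ids that matched the query and filters.
    static int[] countFacets(int[] ords, int[] matchingDocs, int numValues) {
        int[] counts = new int[numValues];
        for (int doc : matchingDocs) {
            int ord = ords[doc];
            if (ord >= 0) {
                counts[ord]++; // one array lookup and one increment per matching doc
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] ords = {0, 1, 1, -1, 2};
        int[] matching = {0, 1, 2, 3};
        int[] counts = countFacets(ords, matching, 3);
        System.out.println(java.util.Arrays.toString(counts));
    }
}
```

The loop does very little work per document, so for large result sets
its cost is dominated by memory access and loop overhead, which is the
sort of hotspot where hand-tuned code may help.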

Project resources:

https://github.com/Heliosearch/heliosearch

https://groups.google.com/forum/#!forum/heliosearch
https://groups.google.com/forum/#!forum/heliosearch-dev

Freenode IRC: #heliosearch #heliosearch-dev

-Yonik