Solr's use of Lucene's Compression field

2008-09-03 Thread Grant Ingersoll
Thinking about
<http://lucene.markmail.org/message/mef4cdo7m3s6i3fc?q=background+merge+exception>,
it occurred to me that we should probably refactor Solr's compression
support.  Currently we rely on Lucene's Field.Store.COMPRESS, but that
isn't considered best practice (see
<http://www.nabble.com/Need-Lucene-Compression-helpcan-pay-nominal-fee-to11001907.html#a11013878>)
because it offers only the highest level of compression, which is also
the slowest.


Obviously, the compression has to happen on the server side.  I think
Solr should do the compression itself, letting users set the compression
level (and maybe even making it pluggable, so you can drop in your own
compression technique), and then simply store the result using Lucene's
binary field capability.  Granted, this is lower priority, since I doubt
many people use compression to begin with, but it would still be useful.
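
To make the idea concrete, here is a minimal sketch of level-configurable
compression using only java.util.zip.  The class and method names
(LevelledCompressor, compress, decompress) are hypothetical and not part
of Solr or Lucene; the assumption is that the resulting byte[] would be
handed to a stored binary field, and that the level would come from
configuration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical helper: DEFLATE with a caller-chosen level. Not a Solr or Lucene API. */
public class LevelledCompressor {

    /**
     * Compresses input at the given level, from Deflater.BEST_SPEED (1)
     * to Deflater.BEST_COMPRESSION (9). The result is what would be stored
     * as the value of a binary field.
     */
    public static byte[] compress(byte[] input, int level) {
        Deflater deflater = new Deflater(level);
        try {
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native zlib resources
        }
    }

    /** Reverses compress(); the level is not needed to decompress. */
    public static byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                // No progress and not finished: the data is truncated or unusable.
                if (n == 0 && !inflater.finished()
                        && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or invalid compressed data");
                }
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("invalid compressed data", e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 500; i++) {
            sb.append("the quick brown fox jumps over the lazy dog ");
        }
        byte[] raw = sb.toString().getBytes(StandardCharsets.UTF_8);

        // Compare the fastest and the slowest level on the same input.
        byte[] fast = compress(raw, Deflater.BEST_SPEED);
        byte[] best = compress(raw, Deflater.BEST_COMPRESSION);
        System.out.println("raw=" + raw.length
                + " fast=" + fast.length + " best=" + best.length);
    }
}
```

Because the level only affects how hard the compressor works, data written
at any level decompresses the same way, so the level could be changed
later without touching already-stored values.  Making the technique itself
pluggable would additionally need some way to record which one was used.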


-Grant


Re: Solr's use of Lucene's Compression field

2008-09-03 Thread Mike Klaas
Agreed.  It was the simplest thing to do at the time, but it would
definitely be preferable to offer the much faster, lower levels of
compression.


-Mike



Re: Solr's use of Lucene's Compression field

2008-09-03 Thread Mike Klaas
Also, I see that another Lucene bug (LUCENE-1374) was found relating to
compressed fields in Lucene.  (When we first added compressed field
support to Solr, a Lucene bug involving lazy-loaded fields and
compression was uncovered, too.)


It would be good to change the implementation simply to avoid relying
on a deprecated Lucene feature that isn't well exercised in development.


-Mike
