Hi Glen,

> Pluggable compression allowing for alternatives to gzip for text
> compression for storing.
> Specifically I am interested in bzip2 [1] as implemented in Apache
> Commons Compress [2].
> While bzip2 compression is considerably slower than gzip (although
> decompression is not too much slower than gzip), it compresses much
> better than gzip (especially text).
>
> Having the choice would be helpful, and for Lucene usage for non-text
> indexing, content-specific compression algorithms may outperform the
> default gzip.
Since version 2.9 / 3.0 of Lucene, compression support has been removed entirely (in 2.9 it is still available, but deprecated). All you now have to do is store your compressed stored fields as a byte[] (see the Field javadocs). That way you can use any compression scheme you like. The problems with gzip and the other available compression algorithms are what led us to remove compression support from Lucene (it had lots of problems).

In general the way to go is: create a ByteArrayOutputStream, wrap it with any compression filter, feed your data in, and use "new Field(name, stream.toByteArray())". On the client side, just do the inverse: call Document.getBinaryValue(), create an input stream on top of the byte[], and decompress.

Uwe

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
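A minimal sketch of the round-trip described above, using java.util.zip's GZIPOutputStream/GZIPInputStream from the JDK. The Lucene-specific calls (new Field(...), Document.getBinaryValue(...)) appear only in comments, since their exact signatures depend on your Lucene version; the field name "body" is just an illustrative placeholder.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class CompressedFieldExample {

    // Compress raw bytes into a byte[] suitable for a binary stored field,
    // e.g. doc.add(new Field("body", compress(text.getBytes("UTF-8"))))
    public static byte[] compress(byte[] input) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        GZIPOutputStream gzip = new GZIPOutputStream(baos);
        gzip.write(input);
        gzip.close(); // finishes the gzip stream; data is incomplete without this
        return baos.toByteArray();
    }

    // The inverse: take the byte[] you would get back from
    // Document.getBinaryValue("body") and decompress it.
    public static byte[] decompress(byte[] stored) throws IOException {
        GZIPInputStream gzip =
                new GZIPInputStream(new ByteArrayInputStream(stored));
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = gzip.read(buf)) != -1) {
            baos.write(buf, 0, n);
        }
        return baos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "some stored field text".getBytes("UTF-8");
        byte[] roundTripped = decompress(compress(original));
        System.out.println(new String(roundTripped, "UTF-8"));
    }
}
```

Because the index only ever sees an opaque byte[], swapping in another codec is just a matter of replacing the stream wrappers, e.g. with BZip2CompressorOutputStream/BZip2CompressorInputStream from Apache Commons Compress.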