Hi Glen,

> Pluggable compression allowing for alternatives to gzip for text
> compression for storing.
> Specifically I am interested in bzip2 [1] as implemented in Apache
> Commons Compress [2].
> While bzip2 compression is considerably slower than gzip (although
> decompression is not too much slower than gzip), it compresses much
> better than gzip (especially text).
>
> Having the choice would be helpful, and for Lucene usage for non-text
> indexing, content-specific compression algorithms may outperform the
> default gzip.
Since version 2.9 / 3.0 of Lucene, compression support has been removed entirely (in 2.9 it is still available, but deprecated). All you now have to do is store your compressed stored fields as a byte[] (see the Field javadocs). That way you can use any compression scheme you like. The problems with gzip and the other available compression algorithms are what led us to remove compression support from Lucene (it had lots of problems).

In general the way to go is: create a ByteArrayOutputStream, wrap it with any compression filter, feed your data in, and use "new Field(name, stream.toByteArray())". On the client side, just do the inverse: call Document.getBinaryValue(), create an input stream on top of the byte[], and decompress.

Uwe

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
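A minimal sketch of the round-trip described above, using java.util.zip's GZIPOutputStream/GZIPInputStream from the JDK. The Lucene-specific calls (new Field(...), Document.getBinaryValue(...)) appear only in comments, since their exact signatures depend on your Lucene version; the field name "body" is just an illustrative placeholder.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class CompressedFieldExample {

    // Compress raw bytes into a byte[] suitable for a binary stored field,
    // e.g. doc.add(new Field("body", compress(text.getBytes("UTF-8"))))
    public static byte[] compress(byte[] input) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        GZIPOutputStream gzip = new GZIPOutputStream(baos);
        gzip.write(input);
        gzip.close(); // finishes the gzip stream; data is incomplete without this
        return baos.toByteArray();
    }

    // The inverse: take the byte[] you would get back from
    // Document.getBinaryValue("body") and decompress it.
    public static byte[] decompress(byte[] stored) throws IOException {
        GZIPInputStream gzip =
                new GZIPInputStream(new ByteArrayInputStream(stored));
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = gzip.read(buf)) != -1) {
            baos.write(buf, 0, n);
        }
        return baos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "some stored field text".getBytes("UTF-8");
        byte[] roundTripped = decompress(compress(original));
        System.out.println(new String(roundTripped, "UTF-8"));
    }
}
```

Because the index only ever sees an opaque byte[], swapping in another codec is just a matter of replacing the stream wrappers, e.g. with BZip2CompressorOutputStream/BZip2CompressorInputStream from Apache Commons Compress.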