It's generally considered best practice to compress the data in your
application first and then add it as a binary field. That said, I
don't see why Lucene's compression would blow up on its own. Have you
tried compressing the string outside of Lucene to see what happens? If
you can reproduce the problem as a Lucene test case, that would be great.
From FieldsWriter, Lucene's compression code looks like:
  private final byte[] compress (byte[] input) {

    // Create the compressor with highest level of compression
    Deflater compressor = new Deflater();
    compressor.setLevel(Deflater.BEST_COMPRESSION);

    // Give the compressor the data to compress
    compressor.setInput(input);
    compressor.finish();

    /*
     * Create an expandable byte array to hold the compressed data.
     * You cannot use an array that's the same size as the original because
     * there is no guarantee that the compressed data will be smaller than
     * the uncompressed data.
     */
    ByteArrayOutputStream bos = new ByteArrayOutputStream(input.length);

    // Compress the data
    byte[] buf = new byte[1024];
    while (!compressor.finished()) {
      int count = compressor.deflate(buf);
      bos.write(buf, 0, count);
    }

    compressor.end();

    // Get the compressed data
    return bos.toByteArray();
  }
There is an interesting comment in that code about how the compressed
data won't necessarily be smaller, so maybe you have entered the
compression twilight zone.
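
To check this outside of Lucene, something along these lines should do.
It is only a rough sketch: the class name CompressionCheck and the
shortened sample record are made up for illustration, and it mirrors the
FieldsWriter approach rather than calling Lucene itself. Keep in mind that
the default Deflater output carries a few bytes of fixed zlib header and
checksum, so very short values can come out larger than they went in.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

// Hypothetical helper class, not part of Lucene.
public class CompressionCheck {

    // Deflate at BEST_COMPRESSION into a growable buffer, the same
    // approach as the FieldsWriter method quoted above.
    static byte[] compress(byte[] input) {
        Deflater compressor = new Deflater(Deflater.BEST_COMPRESSION);
        try {
            compressor.setInput(input);
            compressor.finish();
            ByteArrayOutputStream bos = new ByteArrayOutputStream(input.length);
            byte[] buf = new byte[1024];
            while (!compressor.finished()) {
                int count = compressor.deflate(buf);
                bos.write(buf, 0, count);
            }
            return bos.toByteArray();
        } finally {
            // Release the native zlib resources even if something throws.
            compressor.end();
        }
    }

    public static void main(String[] args) {
        // Pass your own value as the first argument; the default is just
        // the first few fields of the record from the question.
        String record = args.length > 0
                ? args[0]
                : "II0264.D05|00022745|ABCDE|03/01/2008 00:23:12|00035|";
        byte[] raw = record.getBytes(StandardCharsets.UTF_8);
        byte[] packed = compress(raw);
        System.out.println("original:   " + raw.length + " bytes");
        System.out.println("compressed: " + packed.length + " bytes");
    }
}
```

If the compressed size is not smaller for your real values, storing them
uncompressed (or compressing several records together in your app) is
probably the better choice.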
HTH
-Grant
On Apr 2, 2008, at 12:51 AM, Sebastin wrote:
Hi All,
Is there any way to store strings like the following in compressed
form in a Lucene index?
String str = "II0264.D05|00022745|ABCDE|03/01/2008 00:23:12|00035|
9840836588| 129382152520| 04F4243B600408|04F4243B600408|
|11919898456123|354943011025810L| "CPTBS2I"| "ABCD3E"|11|
1234510003243219I|"
I tried to store these fields with Field.Store.COMPRESS, but the index
ends up larger than the original file. Why?
--
View this message in context:
http://www.nabble.com/Lucene-Compression-tp16442112p16442112.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
--------------------------
Grant Ingersoll
http://www.lucenebootcamp.com
Next Training: April 7, 2008 at ApacheCon Europe in Amsterdam
Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ