[ https://issues.apache.org/jira/browse/LUCENE-4509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13610625#comment-13610625 ]
Commit Tag Bot commented on LUCENE-4509:
----------------------------------------

[branch_4x commit] Adrien Grand
http://svn.apache.org/viewvc?view=revision&revision=1403032

LUCENE-4509: improve test coverage of CompressingStoredFieldsFormat (merged from r1403027).

> Make CompressingStoredFieldsFormat the new default StoredFieldsFormat impl
> --------------------------------------------------------------------------
>
>                 Key: LUCENE-4509
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4509
>             Project: Lucene - Core
>          Issue Type: Wish
>          Components: core/store
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>             Fix For: 4.1
>
>         Attachments: LUCENE-4509.patch, LUCENE-4509.patch
>
>
> What would you think of making CompressingStoredFieldsFormat the new default
> StoredFieldsFormat?
> Stored fields compression has many benefits:
> - it makes the I/O cache work for us,
> - file-based index replication/backup becomes cheaper.
> Things to know:
> - even with incompressible data, there is less than 0.5% overhead with LZ4,
> - LZ4 compression requires ~ 16 kB of memory and LZ4 HC compression requires
> ~ 256 kB,
> - LZ4 decompression has almost no memory overhead,
> - on my low-end laptop, the LZ4 impl in Lucene decompresses at ~ 300 MB/s.
> I think we could use the same default parameters as in CompressingCodec:
> - LZ4 compression,
> - in-memory stored fields index that is very memory-efficient (less than 12
> bytes per block of compressed docs) and uses binary search to locate
> documents in the fields data file,
> - 16 kB blocks (small enough that there is no major slowdown when the
> whole index would fit into the I/O cache anyway, and large enough to provide
> interesting compression ratios; for example Robert got a 0.35 compression
> ratio with the geonames.org database).
> Any concerns?

--
This message is automatically generated by JIRA.
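
The quoted issue describes an in-memory stored fields index that uses binary search to locate documents in the fields data file. A minimal sketch of that lookup idea follows, assuming the index keeps one (first doc ID, file offset) pair per compressed block. The function name, the two parallel lists, and the sample values are hypothetical and only illustrate the technique; this is not Lucene's actual data structure or API.

```python
from bisect import bisect_right


def locate_block(block_first_doc_ids, block_start_offsets, doc_id):
    """Return (block_index, file_offset) of the block that would hold doc_id.

    block_first_doc_ids: sorted list with the first doc ID of each block
                         (hypothetical layout, one entry per block).
    block_start_offsets: file offset at which each block starts.
    """
    if not block_first_doc_ids or doc_id < block_first_doc_ids[0]:
        raise ValueError("doc_id precedes the first indexed block")
    # bisect_right returns the insertion point after any equal entries, so
    # subtracting 1 gives the last block whose first doc ID is <= doc_id.
    i = bisect_right(block_first_doc_ids, doc_id) - 1
    return i, block_start_offsets[i]


# Hypothetical index with four blocks of compressed documents.
first_docs = [0, 120, 250, 400]
offsets = [0, 9000, 17500, 26100]

block, offset = locate_block(first_docs, offsets, 260)
print(f"doc 260 is in block {block}, starting at offset {offset}")
```

With this layout a lookup costs O(log n) comparisons over the number of blocks, after which only the one block at the returned offset needs to be read and decompressed.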