markharwood opened a new pull request #1234: Add compression for Binary doc value fields
URL: https://github.com/apache/lucene-solr/pull/1234
 
 
   This PR stores groups of 32 doc values in LZ4 compressed blocks. 
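
The block layout described above can be sketched roughly as follows. This is not the code from the PR: `BlockCompressedValues` and all of its methods are hypothetical names chosen for illustration, and `java.util.zip.Deflater` stands in for LZ4 so that the sketch runs on the JDK alone. It assumes values are appended in doc ID order and that a reader keeps only the most recently decompressed block.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical sketch: binary values grouped into compressed blocks of 32 docs. */
public class BlockCompressedValues {
  static final int DOCS_PER_BLOCK = 32; // block size stated in the PR description

  private final List<byte[]> compressedBlocks = new ArrayList<>();
  // For each block: start offset of every value in the uncompressed bytes, plus the total length.
  private final List<int[]> blockOffsets = new ArrayList<>();
  private final List<byte[]> pending = new ArrayList<>();

  private int cachedBlock = -1; // index of the last block that was decompressed
  private byte[] cachedBytes;
  private int decompressions = 0;

  /** Appends the value for the next doc ID; a full block is compressed immediately. */
  public void add(byte[] value) {
    pending.add(value);
    if (pending.size() == DOCS_PER_BLOCK) flush();
  }

  /** Compresses any trailing partial block. */
  public void finish() {
    if (!pending.isEmpty()) flush();
  }

  private void flush() {
    int[] offsets = new int[pending.size() + 1];
    ByteArrayOutputStream raw = new ByteArrayOutputStream();
    for (int i = 0; i < pending.size(); i++) {
      offsets[i] = raw.size();
      raw.write(pending.get(i), 0, pending.get(i).length);
    }
    offsets[pending.size()] = raw.size();
    compressedBlocks.add(deflate(raw.toByteArray()));
    blockOffsets.add(offsets);
    pending.clear();
  }

  /** Returns the value for docId, decompressing its block only if it is not the cached one. */
  public byte[] get(int docId) {
    int block = docId / DOCS_PER_BLOCK;
    int[] offsets = blockOffsets.get(block);
    if (block != cachedBlock) {
      cachedBytes = inflate(compressedBlocks.get(block), offsets[offsets.length - 1]);
      cachedBlock = block;
      decompressions++;
    }
    int i = docId % DOCS_PER_BLOCK;
    return Arrays.copyOfRange(cachedBytes, offsets[i], offsets[i + 1]);
  }

  public int decompressions() {
    return decompressions;
  }

  // DEFLATE is a stand-in here; the PR itself uses LZ4.
  static byte[] deflate(byte[] input) {
    Deflater deflater = new Deflater();
    deflater.setInput(input);
    deflater.finish();
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buf = new byte[4096];
    while (!deflater.finished()) {
      out.write(buf, 0, deflater.deflate(buf));
    }
    deflater.end();
    return out.toByteArray();
  }

  static byte[] inflate(byte[] input, int length) {
    Inflater inflater = new Inflater();
    inflater.setInput(input);
    byte[] out = new byte[length];
    try {
      int n = 0;
      while (n < length) {
        int r = inflater.inflate(out, n, length - n);
        if (r == 0 && (inflater.finished() || inflater.needsInput())) break;
        n += r;
      }
    } catch (DataFormatException e) {
      throw new IllegalStateException(e);
    } finally {
      inflater.end();
    }
    return out;
  }
}
```

In this sketch, consecutive `get` calls that land in the same block share one decompression, while each jump to a different block pays for a fresh one. That is the read access pattern effect listed among the performance factors in this description.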
   
   ### Write performance
   Test results for loading 680 MB of log data (1.8 million docs) are as follows:
   
   Branch | Load time (seconds, single thread) | Resulting index size (MB)
   ----|----|----
   master| 16| 680
   this PR| 11| 78
   
   ### Read performance
   Time taken to read 5,000 random doc IDs from the indices above:
   
   Branch | Read time (milliseconds, single thread) 
   ----|----
   master| 284
   this PR| 63
   
   On this particular test, read speed, write speed and resulting index size 
all improved over the current master implementation: loading took about 31% 
less time, the index was about 8.7x smaller, and random reads were about 4.5x 
faster. Performance will vary with other tests; the main factors are:
   * size of fields
   * compressibility of field contents
   * read access patterns (hitting the same compressed blocks vs. different ones)
   * variation from doc to doc in field value sizes
   
