[ https://issues.apache.org/jira/browse/LUCENE-5688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14014922#comment-14014922 ]
Shai Erera commented on LUCENE-5688:
------------------------------------

Ahh, I see now that you only wrote a DVFormat, not a Codec. In that case I agree: apps should plug it in per-field, and it doesn't need to wrap another format. Can you perhaps make the Consumer/Producer package-private? I think only the Format needs to be public.

About the Binary field: indeed it doesn't write the data if a BytesRef is missing, but it does write all the meta information, e.g. the missing bitset and the addresses (in case the BytesRefs aren't of equal length). So I think sparseness should be really sparse. But I'm fine if you leave that out for now - we first need to make sure the numeric field performs and that there are any gains (even if only during indexing).

> NumericDocValues fields with sparse data can be compressed better
> ------------------------------------------------------------------
>
>                 Key: LUCENE-5688
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5688
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Varun Thacker
>            Priority: Minor
>         Attachments: LUCENE-5688.patch, LUCENE-5688.patch
>
>
> I ran into this problem where I had a dynamic field in Solr and indexed data
> into lots of fields. For each field only a few documents had actual values,
> and for the remaining documents the default value (0) got indexed. Now when I
> merge segments, the index size jumps up.
> For example, I have 10 segments, each with 1 DV field. When I merge the segments
> into 1, that segment will contain all 10 DV fields with lots of 0s.
> This was the motivation behind trying to come up with a compression for a use
> case like this.
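To make the sparse case from the issue description concrete, here is a minimal Java sketch of the general idea: keep only the documents that have a non-default value, and fall back to 0 for everything else. It is an illustration under stated assumptions, not the encoding used in the attached patch and not Lucene API; the class name SparseNumericSketch and its methods are hypothetical.

```java
import java.util.Arrays;

// Hypothetical illustration only: NOT the LUCENE-5688 patch and NOT a Lucene class.
// It stores just the (docId, value) pairs whose value differs from the default 0.
final class SparseNumericSketch {
    private final int[] docIds;   // sorted IDs of documents that carry a non-default value
    private final long[] values;  // values[i] belongs to docIds[i]

    private SparseNumericSketch(int[] docIds, long[] values) {
        this.docIds = docIds;
        this.values = values;
    }

    /** Builds the sparse form from a dense per-document array, dropping entries equal to 0. */
    static SparseNumericSketch fromDense(long[] dense) {
        int count = 0;
        for (long v : dense) {
            if (v != 0L) {
                count++;
            }
        }
        int[] ids = new int[count];
        long[] vals = new long[count];
        int next = 0;
        for (int doc = 0; doc < dense.length; doc++) {
            if (dense[doc] != 0L) {
                ids[next] = doc;
                vals[next] = dense[doc];
                next++;
            }
        }
        return new SparseNumericSketch(ids, vals);
    }

    /** Returns the stored value, or the default 0 when the document has none. */
    long get(int docId) {
        int idx = Arrays.binarySearch(docIds, docId);
        return idx >= 0 ? values[idx] : 0L;
    }

    /** Number of (docId, value) pairs actually kept. */
    int storedEntries() {
        return docIds.length;
    }

    public static void main(String[] args) {
        // A field where only 2 of 1000 documents have a value, as in the merged-segment case.
        long[] dense = new long[1000];
        dense[3] = 42L;
        dense[750] = 7L;
        SparseNumericSketch sparse = fromDense(dense);
        System.out.println(sparse.storedEntries() + " of " + dense.length + " entries stored");
    }
}
```

The trade-off in this sketch is that a lookup becomes a binary search over the stored doc IDs rather than a direct array access, which is one reason read performance would need to be measured before assuming any gain.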