The really major win would be if we handled integer (especially boolean)
matrices specially.  Attacking the 4 byte cost of the index in a sparse
vector helps, but attacking the 8 byte value would be even better.  For
sparse boolean matrices, the value can go away entirely.
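
To put rough numbers on that, here is a minimal Python sketch. It assumes
a made-up layout of one 4 byte int index plus one 8 byte double per
non-zero entry; this is not the actual Mahout serialization, and the
function names are only for illustration.

```python
import struct

def encode_sparse_double(entries):
    """Hypothetical layout: 4 byte index + 8 byte double per non-zero."""
    out = bytearray()
    for i in sorted(entries):
        # ">id" = big-endian int32 followed by float64, no padding
        out += struct.pack(">id", i, entries[i])
    return bytes(out)

def encode_sparse_boolean(indexes):
    """Boolean case: a stored index means 'true', so no value is written."""
    out = bytearray()
    for i in sorted(indexes):
        out += struct.pack(">i", i)
    return bytes(out)

entries = {3: 1.0, 10: 1.0, 500: 1.0}
print(len(encode_sparse_double(entries)))        # 12 bytes per entry
print(len(encode_sparse_boolean(entries.keys())))  # 4 bytes per entry
```

Under these assumptions, dropping the value takes each entry from 12
bytes to 4 before any work is done on the index itself.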

All of these efforts will have the effect of making any downstream
compression less valuable, so its gains will look much less impressive.
The exception is delta encoding of indexes, which will probably make the
downstream compressor more effective.
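
For anyone who has not seen delta encoding of indexes, a rough Python
sketch follows. It assumes sorted, non-negative indexes and uses a
simple 7-bits-per-byte variable-length integer for each gap; the helper
names are hypothetical and this is not meant to match any existing
Mahout format.

```python
def write_varint(n, out):
    """Append n as a little-endian base-128 varint (7 data bits per byte)."""
    while n >= 0x80:
        out.append((n & 0x7F) | 0x80)  # high bit set: more bytes follow
        n >>= 7
    out.append(n)

def delta_encode(indexes):
    """Store each sorted index as the gap from the previous one."""
    out = bytearray()
    prev = 0
    for i in indexes:
        write_varint(i - prev, out)
        prev = i
    return bytes(out)

def delta_decode(data):
    """Invert delta_encode: rebuild the absolute indexes from the gaps."""
    indexes, prev, cur, shift = [], 0, 0, 0
    for b in data:
        cur |= (b & 0x7F) << shift
        if b & 0x80:
            shift += 7
        else:
            prev += cur
            indexes.append(prev)
            cur, shift = 0, 0
    return indexes

idx = [3, 10, 500, 100000]
packed = delta_encode(idx)
assert delta_decode(packed) == idx
```

Small gaps turn into one or two bytes instead of four, and the gaps
tend to repeat far more often than the absolute indexes do, which is
the kind of input a general purpose block compressor likes.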

On Sun, May 2, 2010 at 12:45 PM, Drew Farris <drew.far...@gmail.com> wrote:

> Does anyone have any idea whether greater gains are to be found by finely
> tuning the base encoding vs. relying on some form of SequenceFile block
> compression? (or do both approaches complement each other nicely?)
>
