Well, I ran a cleaner, local test and the results are more reasonable.
Writing 10000 random-access sparse vectors, 1000 entries, each a
random number up to 100000, took 5.4s before versus 4.7s with the changes.
That must be I/O savings, since the new code uses a little more CPU -- and
that's the savings when writing to an SSD. Imagine the savings over a network.

Size goes down from 120MB to 108MB, which is in line with expectations.
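
As a rough check on why that drop is plausible, here is a back-of-the-envelope
sketch. It assumes (this is not stated in the thread) that each entry was
previously stored as a 4-byte int index plus an 8-byte double, that only the
index becomes a base-128 varint, and that indices are uniform in [0, 100000).
Per-vector headers are ignored. The class and method names are made up for
illustration and are not the patch's code.

```java
class SparseVectorSizeEstimate {

    // Bytes needed for a non-negative int in base-128 varint form
    // (7 payload bits per byte, high bit marks "more bytes follow").
    static int varintSize(int value) {
        int size = 1;
        while ((value & ~0x7F) != 0) {
            value >>>= 7;
            size++;
        }
        return size;
    }

    // Average varint size over all indices in [0, maxIndex).
    static double averageIndexBytes(int maxIndex) {
        long total = 0;
        for (int i = 0; i < maxIndex; i++) {
            total += varintSize(i);
        }
        return (double) total / maxIndex;
    }

    public static void main(String[] args) {
        long entries = 10_000L * 1_000L;   // vectors * entries per vector
        // Assumed layout before: 4-byte index + 8-byte double per entry.
        double fixedMb = entries * (4 + 8) / 1e6;
        // Assumed layout after: varint index + 8-byte double per entry.
        double varintMb = entries * (averageIndexBytes(100_000) + 8) / 1e6;
        System.out.printf("fixed-width: %.1f MB, varint index: %.1f MB%n",
                fixedMb, varintMb);
    }
}
```

Under those assumptions most indices need 3 bytes instead of 4, so the
estimate lands at roughly 120MB versus roughly 108MB.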

I found and fixed one bone-headed error in my patch: it didn't use
variable-length coding for random-access sparse vectors. I think that
explains the puzzle.
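
For readers unfamiliar with the technique, here is a minimal sketch of
base-128 variable-length coding for a non-negative int. It is a generic
illustration, not the code from the patch; the class name VarintSketch and
its methods are hypothetical.

```java
import java.io.ByteArrayOutputStream;

class VarintSketch {

    // Encode a non-negative int, 7 bits at a time, low-order group first.
    // The high bit of each byte is set when more bytes follow.
    static byte[] encode(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
        return out.toByteArray();
    }

    // Decode a single varint from the start of the array.
    static int decode(byte[] bytes) {
        int result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                break;   // high bit clear: this was the last byte
            }
            shift += 7;
        }
        return result;
    }

    public static void main(String[] args) {
        for (int v : new int[] {100, 10_000, 99_999}) {
            System.out.println(v + " -> " + encode(v).length + " byte(s)");
        }
    }
}
```

Small values take one byte and values up to 100000 take at most three, which
is where the savings over a fixed 4-byte int would come from.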

So... I guess I'd like to commit. Anyone want to check my work?

On Tue, May 11, 2010 at 4:21 PM, Sean Owen <[email protected]> wrote:
> I added tests to check it outputs the expected number of bytes. I checked
> that performance is fine. That checks out.
>
> So maybe it was a bad or misleading test. I haven't constructed a new one
> yet, should be easy though.
>
> On May 11, 2010 4:17 PM, "Robin Anil" <[email protected]> wrote:
>
> Sean, did you get to explore the issue you found with Varint? Theoretically
> it should bring better savings than VInt and VLong, right?
>
> Robin
>
