Well, I did a purer, local test and the results are more reasonable. Writing 10000 random-access sparse vectors, each with 1000 entries, each a random number up to 100000, takes 5.4s before versus 4.7s with my changes. That must be I/O savings, since the new code takes a little more CPU -- and that's the savings when writing to an SSD. Imagine the savings over a network.
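For anyone who wants a rough feel for where the savings come from, here is a minimal sketch of variable-length integer coding (7 data bits per byte, high bit as a continuation flag). It is not the code from the patch; the function names are made up for illustration, and the fixed 4-byte width it compares against is an assumption.

```python
import random


def encode_unsigned_varint(value: int) -> bytes:
    """Encode a non-negative int using 7 data bits per byte (hypothetical helper)."""
    if value < 0:
        raise ValueError("value must be non-negative")
    out = bytearray()
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        value >>= 7
    out.append(value)  # last byte, continuation bit clear
    return bytes(out)


def decode_unsigned_varint(data: bytes) -> int:
    """Decode the first varint found in data (hypothetical helper)."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result
        shift += 7
    raise ValueError("truncated varint")


def index_bytes(indices):
    """Return (fixed, variable) byte counts, assuming 4 bytes per fixed-width int."""
    fixed = 4 * len(indices)
    variable = sum(len(encode_unsigned_varint(i)) for i in indices)
    return fixed, variable


if __name__ == "__main__":
    rng = random.Random(0)
    # 1000 random numbers below 100000, mirroring one vector in the test
    indices = [rng.randrange(100000) for _ in range(1000)]
    fixed, variable = index_bytes(indices)
    print(f"fixed-width: {fixed} bytes, varint: {variable} bytes")
```

Any number below 100000 fits in at most 3 bytes this way instead of 4. If each entry were otherwise stored as a 4-byte int plus an 8-byte double (an assumption about the format, not something checked here), a saving of about one byte per entry would be roughly a 10% reduction.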
Size goes down from 120MB to 108MB, which is in line with expectations. I saw and fixed one bone-headed error in my patch, which didn't use variable-length coding for random-access sparse vectors. I think that explains the puzzle.

So... I guess I'd like to commit. Anyone want to check my work?

On Tue, May 11, 2010 at 4:21 PM, Sean Owen <[email protected]> wrote:
> I added tests to check it outputs the expected number of bytes. I checked
> that performance is fine. That checks out.
>
> So maybe it was a bad or misleading test. I haven't constructed a new one
> yet, should be easy though.
>
> On May 11, 2010 4:17 PM, "Robin Anil" <[email protected]> wrote:
>
> Sean, Did you get to explore the issue you found with Varint, theoretically
> it should bring better savings than VInt and VLong right?
>
> Robin
>
