On 2012-02-15, at 7:32 AM, Stack wrote:

> On Tue, Feb 14, 2012 at 8:14 AM, Stack <st...@duboce.net> wrote:
>
>>> 2) With that same randomWrite command line above, I would expect a
>>> resulting table with 10 * (1024 * 1024) rows (so 10485700 = roughly 10M
>>> rows). Instead, what I'm seeing is that the randomWrite job reports
>>> writing exactly that many rows, but running rowcounter against the table
>>> reveals only 6549899 rows. A second attempt to build the table produces
>>> slightly different results (e.g. 6627689). I see a similar discrepancy
>>> when using 50 clients instead of 10 (~35% smaller than expected). Key
>>> collision could explain it, but it seems pretty unlikely (given I only
>>> need e.g. 10M keys from a potential 2B).
>
> I just tried it here and got a similar result. I wonder if it's the
> randomWrite? If you do sequentialWrite, do you get your 10M?
Thanks for checking into this, Stack. When using sequentialWrite I get the expected 10485700 rows. I'll hack around a bit on the PE to count the number of collisions and try to think of a reasonable solution.

Thanks,
Oliver
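For what it's worth, collisions may be less unlikely than they look: if the random keys are in fact drawn uniformly from a range equal to the total row count (rather than from the full ~2B integer space), then after N writes the expected number of distinct keys is N * (1 - (1 - 1/N)^N) ≈ N * (1 - 1/e) ≈ 63.2% of N, and 10485700 * 0.632 ≈ 6.63M lines up closely with the rowcounter numbers above. The sketch below is a standalone simulation of that occupancy estimate, not PE's actual key-generation code:

```java
import java.util.Random;

public class CollisionSim {
    /**
     * Draws totalRows keys uniformly from [0, totalRows) and returns
     * the fraction of distinct keys actually produced. For large N this
     * converges to 1 - 1/e ~= 0.632 (the classic occupancy result).
     */
    static double distinctFraction(int totalRows, long seed) {
        boolean[] seen = new boolean[totalRows];
        Random rnd = new Random(seed);
        int distinct = 0;
        for (int i = 0; i < totalRows; i++) {
            int key = rnd.nextInt(totalRows);
            if (!seen[key]) {
                seen[key] = true;
                distinct++;
            }
        }
        return (double) distinct / totalRows;
    }

    public static void main(String[] args) {
        double frac = distinctFraction(10_000_000, 42L);
        System.out.printf("distinct fraction = %.4f (1 - 1/e = %.4f)%n",
                frac, 1.0 - 1.0 / Math.E);
    }
}
```

If the per-run distinct fraction hovers around 0.632, that would point at the key-range choice in randomWrite rather than anything wrong in the write path itself.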