Re: strange PerformanceEvaluation behaviour

2012-02-16 Thread Oliver Meyn (GBIF)
On 2012-02-15, at 5:39 PM, Stack wrote: On Wed, Feb 15, 2012 at 1:53 AM, Oliver Meyn (GBIF) om...@gbif.org wrote: So hacking around reveals that key collision is indeed the problem. I thought the modulo part of the getRandomRow method was suspect but while removing it improved the

Re: strange PerformanceEvaluation behaviour

2012-02-15 Thread Oliver Meyn (GBIF)
On 2012-02-15, at 7:32 AM, Stack wrote: On Tue, Feb 14, 2012 at 8:14 AM, Stack st...@duboce.net wrote: 2) With that same randomWrite command line above, I would expect a resulting table with 10 * (1024 * 1024) rows (so 10485760 = roughly 10M rows). Instead what I'm seeing is that the

Re: strange PerformanceEvaluation behaviour

2012-02-15 Thread Oliver Meyn (GBIF)
On 2012-02-15, at 9:09 AM, Oliver Meyn (GBIF) wrote: On 2012-02-15, at 7:32 AM, Stack wrote: On Tue, Feb 14, 2012 at 8:14 AM, Stack st...@duboce.net wrote: 2) With that same randomWrite command line above, I would expect a resulting table with 10 * (1024 * 1024) rows (so 10485760 = roughly

Re: strange PerformanceEvaluation behaviour

2012-02-15 Thread yuzhihong
Oliver: Thanks for digging. Please file JIRAs for these issues. On Feb 15, 2012, at 1:53 AM, Oliver Meyn (GBIF) om...@gbif.org wrote: On 2012-02-15, at 9:09 AM, Oliver Meyn (GBIF) wrote: On 2012-02-15, at 7:32 AM, Stack wrote: On Tue, Feb 14, 2012 at 8:14 AM, Stack st...@duboce.net

Re: strange PerformanceEvaluation behaviour

2012-02-15 Thread Oliver Meyn (GBIF)
Okie: 10x # of mappers: https://issues.apache.org/jira/browse/HBASE-5401 wrong row count: https://issues.apache.org/jira/browse/HBASE-5402 Oliver On 2012-02-15, at 11:50 AM, yuzhih...@gmail.com wrote: Oliver: Thanks for digging. Please file JIRAs for these issues. On Feb 15,

Re: strange PerformanceEvaluation behaviour

2012-02-15 Thread Stack
On Wed, Feb 15, 2012 at 1:53 AM, Oliver Meyn (GBIF) om...@gbif.org wrote: So hacking around reveals that key collision is indeed the problem. I thought the modulo part of the getRandomRow method was suspect but while removing it improved the behaviour (I got ~8M rows instead of ~6.6M) it
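
The ~6.6M figure above is about what simple probability predicts for colliding random keys. The following is a rough sketch, not code from the thread or from HBase: it assumes each of the 10 * (1024 * 1024) writes picks its row key uniformly at random from that same number of possible keys, so the expected count of distinct rows is N * (1 - (1 - 1/N)^N), close to N * (1 - 1/e). The function names `expected_distinct` and `simulate_distinct` are made up for illustration.

```python
import math
import random


def expected_distinct(draws: int, keyspace: int) -> float:
    """Expected number of distinct keys after `draws` uniform picks
    from `keyspace` possible keys (assumed model, not HBase code)."""
    # A given key is missed by every draw with probability
    # (1 - 1/keyspace) ** draws; log1p/expm1 keep this accurate
    # when keyspace is large.
    log_miss = draws * math.log1p(-1.0 / keyspace)
    return keyspace * -math.expm1(log_miss)


def simulate_distinct(draws: int, keyspace: int, seed: int = 0) -> int:
    """Small Monte Carlo check of the formula with a seeded generator."""
    rng = random.Random(seed)
    return len({rng.randrange(keyspace) for _ in range(draws)})


# 10 clients times 1024 * 1024 rows each, as in the randomWrite example.
total_rows = 10 * 1024 * 1024
print("rows written:      ", total_rows)
print("expected distinct: ", round(expected_distinct(total_rows, total_rows)))

# Scaled-down simulation so it runs quickly.
print("simulated (N=100k):", simulate_distinct(100_000, 100_000))
```

Under this assumption roughly 63% of the writes land on distinct keys, which is in line with the ~6.6M rows reported out of about 10.5M writes. The ~8M observed after removing the modulo is not explained by this simple model.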

strange PerformanceEvaluation behaviour

2012-02-14 Thread Oliver Meyn (GBIF)
Hi all, I've been trying to run a battery of tests to really understand our cluster's performance, and I'm employing PerformanceEvaluation to do that (picking up where Tim Robertson left off, elsewhere on the list). I'm seeing two strange things that I hope someone can help with: 1) With a

Re: strange PerformanceEvaluation behaviour

2012-02-14 Thread Stack
On Tue, Feb 14, 2012 at 7:56 AM, Oliver Meyn (GBIF) om...@gbif.org wrote: 1) With a command line like 'hbase org.apache.hadoop.hbase.PerformanceEvaluation randomWrite 10' I see 100 mappers spawned, rather than the expected 10. I expect 10 because that's what the usage text implies, and

Re: strange PerformanceEvaluation behaviour

2012-02-14 Thread Stack
On Tue, Feb 14, 2012 at 8:14 AM, Stack st...@duboce.net wrote: 2) With that same randomWrite command line above, I would expect a resulting table with 10 * (1024 * 1024) rows (so 10485760 = roughly 10M rows). Instead what I'm seeing is that the randomWrite job reports writing that many