Re: anyway to turn off per-region metrics?

2013-07-30 Thread Oliver Meyn (GBIF)
ve all of the per table metrics. > > For 0.95 and above this will be controlled by filters in the metrics > properties file. > > On Mon, Jul 29, 2013 at 4:06 AM, Oliver Meyn (GBIF) wrote: >> Hi All, >> >> My ganglia server is being overwhelmed and I need to cut d

anyway to turn off per-region metrics?

2013-07-29 Thread Oliver Meyn (GBIF)
Hi All, My ganglia server is being overwhelmed and I need to cut down on metrics. Is there a way to turn off the hbase.RegionServerDynamicStatistics.* metrics while keeping the hbase.regionserver.* metrics? I'm using cdh4.3, so hbase 0.94.6.1. Thanks, Oliver -- Oliver Meyn Software Developer G

Re: Scan performance on compressed column families

2012-11-09 Thread Oliver Meyn (GBIF)
Hi David, I wrote that blog post and I know that Lars George has much more experience than me with tuning HBase, especially in different environments, so weight our opinions accordingly. As he says, it will "usually" help, and the unusual cases of lower spec'd hardware (that I did those tests

Re: resource usage of ResultScanner's Iterator

2012-11-02 Thread Oliver Meyn (GBIF)
On 2012-10-26, at 9:59 PM, Stack wrote: > On Thu, Oct 25, 2012 at 1:24 AM, Oliver Meyn (GBIF) wrote: >> Hi all, >> >> I'm on cdh3u3 (hbase 0.90.4) and I need to provide a bunch of row keys based >> on a column value (e.g. give me all keys where colum

resource usage of ResultScanner's Iterator

2012-10-25 Thread Oliver Meyn (GBIF)
Hi all, I'm on cdh3u3 (hbase 0.90.4) and I need to provide a bunch of row keys based on a column value (e.g. give me all keys where column "dataset" = 1234). That's straightforward using a scan and filter. The trick is that I want to return an Iterator over my key type (Integer) rather than e
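One way to picture the "Iterator over my key type" idea is a generator that decodes each row key and releases the scanner once iteration ends. This is only a sketch: `FakeScanner` is a hypothetical stand-in rather than the HBase client API, and the 4-byte big-endian key encoding is an assumption.

```python
from contextlib import closing
from typing import Iterable, Iterator


class FakeScanner:
    """Hypothetical stand-in for a scanner that holds server-side resources."""

    def __init__(self, rows: Iterable[bytes]):
        self._rows = iter(rows)
        self.closed = False

    def __iter__(self):
        return self._rows

    def close(self):
        self.closed = True


def iter_keys(scanner) -> Iterator[int]:
    """Yield integer keys, closing the scanner when iteration finishes or is closed."""
    with closing(scanner):
        for row_key in scanner:
            # Assumed encoding: signed big-endian integer row keys.
            yield int.from_bytes(row_key, "big", signed=True)
```

If the caller abandons the iterator without exhausting or closing it, the release is deferred until the generator is finalized, which is the sort of resource question the thread title raises.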

Optimizing writes/compactions/storefiles

2012-07-11 Thread Oliver Meyn (GBIF)
Hi all, We just spent some time figuring out how to get writes to work properly in our cluster on cdh3, and I wrote it up in a blog post. Might be of interest to some of you: http://gbif.blogspot.dk/2012/07/optimizing-writes-in-hbase.html Cheers, Oliver -- Oliver Meyn Software Developer Globa

Re: Pre-split table using shell

2012-06-12 Thread Oliver Meyn (GBIF)
Hi Simon, I might be wrong but I'm pretty sure the splits file you specify is assumed to be full of strings. So even though they look like bytes they're being interpreted as the string value (like '\x00') instead of the actual byte \x00. The only way I could get the byte representation of int
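The string-versus-byte distinction described above can be illustrated outside HBase. The sketch below assumes integer row keys encoded as 4-byte big-endian values; `int_to_key` is a hypothetical helper, not part of the shell or of any HBase API.

```python
import struct


def int_to_key(n: int) -> bytes:
    """Encode an int as 4 big-endian bytes (assumed row-key encoding)."""
    return struct.pack(">i", n)


# The text '\x00' read as a plain string: backslash, 'x', '0', '0'.
literal = r"\x00".encode("ascii")

# The single byte the author actually intended.
actual = b"\x00"

print(len(literal), len(actual))
print(int_to_key(1234))
```

A split point written as text therefore sorts as the characters `\`, `x`, `0`, `0`, not as the zero byte, so it would not line up with binary integer keys.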

Re: PerformanceEvaluation results

2012-06-08 Thread Oliver Meyn (GBIF)
20, 2012 9:17 AM >> Subject: Re: PerformanceEvaluation results >> >> On Tue, Mar 20, 2012 at 8:53 AM, Oliver Meyn (GBIF) >> wrote: >>> Apologies for responding to myself, but after some more testing I've >>> concluded that we had a minor network

Re: HBase Performance Improvements?

2012-05-10 Thread Oliver Meyn (GBIF)
< >> mailinglist...@gmail.com> wrote: >> >>> Hey Oliver, >>> >>> Thanks a "billion" for the response -:) I will take any code you can >>> provide even if it's a hack! I will even send you an Amazon gift card - >>>

Re: HBase Performance Improvements?

2012-05-09 Thread Oliver Meyn (GBIF)
Heya Something, I had a similar task recently and by far the best way to go about this is with bulk loading after pre-splitting your target table. As you know ImportTsv doesn't understand Avro files so I hacked together my own ImportAvro class to create the Hfiles that I eventually moved into

Re: Documentation broken

2012-04-13 Thread Oliver Meyn (GBIF)
Looks like /book got moved under another /book, so something is definitely wrong. You can try an unstyled version at: http://hbase.apache.org/book/book/book.html Cheers, Oliver On 2012-04-13, at 9:59 AM, Nitin Pawar wrote: > Hello, > > Is there any maintenance going on with hbase.apache.org?

Re: PerformanceEvaluation results

2012-03-21 Thread Oliver Meyn (GBIF)
By all means please link - I would be very happy if we could amortize the amount of time I spent digging (and learning various low-level hardware things) across other people's problems :) Oliver On 2012-03-20, at 5:57 PM, Stack wrote: > On Tue, Mar 20, 2012 at 9:55 AM, lars hofhansl wrote: >>

Re: PerformanceEvaluation results

2012-03-20 Thread Oliver Meyn (GBIF)
mance-evaluation-continued.html Cheers, Oliver On 2012-02-28, at 5:10 PM, Oliver Meyn (GBIF) wrote: > Hi all, > > I've spent the last couple of weeks working with PerformanceEvaluation, > trying to understand scan performance in our little cluster. I've written a >

ethernet channel bonding experiences

2012-03-19 Thread Oliver Meyn (GBIF)
Hi all, I've been experimenting with PerformanceEvaluation in the last weeks and on a whim thought I'd give channel bonding a try to see if it was networking bandwidth that was acting as the bottleneck. It would seem that it's not quite as trivial as it sounds, so I'm looking for other people'

PerformanceEvaluation results

2012-02-28 Thread Oliver Meyn (GBIF)
Hi all, I've spent the last couple of weeks working with PerformanceEvaluation, trying to understand scan performance in our little cluster. I've written a blog post with the results and would really welcome any input you may have. http://gbif.blogspot.com/2012/02/performance-evaluation-of-hba

Re: strange PerformanceEvaluation behaviour

2012-02-16 Thread Oliver Meyn (GBIF)
On 2012-02-15, at 5:39 PM, Stack wrote: > On Wed, Feb 15, 2012 at 1:53 AM, Oliver Meyn (GBIF) wrote: >> So hacking around reveals that key collision is indeed the problem. I >> thought the modulo part of the getRandomRow method was suspect but while >> removing it impr
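To see how much key collision alone can shrink a table, here is a rough estimate. It assumes each write picks a key uniformly at random, with replacement, from a keyspace as large as the number of writes. The thread does not establish that `getRandomRow` behaves exactly this way, so treat this as an illustration, not a model of PerformanceEvaluation.

```python
import math


def expected_distinct(draws: int, keyspace: int) -> float:
    """Expected number of distinct keys after `draws` uniform random picks.

    Computes keyspace * (1 - (1 - 1/keyspace) ** draws) in a numerically
    stable way for large values.
    """
    return keyspace * -math.expm1(draws * math.log1p(-1.0 / keyspace))


rows = 10 * 1024 * 1024  # writes requested in the thread's example
distinct = expected_distinct(rows, rows)
print(f"{distinct:.0f} distinct rows, {distinct / rows:.1%} of writes")
```

Under these assumptions roughly 1 - 1/e of the writes (about 63%) land on distinct keys, so a table noticeably smaller than the number of writes is what collisions alone would produce.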

Re: strange PerformanceEvaluation behaviour

2012-02-15 Thread Oliver Meyn (GBIF)
> > > > On Feb 15, 2012, at 1:53 AM, "Oliver Meyn (GBIF)" wrote: > >> On 2012-02-15, at 9:09 AM, Oliver Meyn (GBIF) wrote: >> >>> On 2012-02-15, at 7:32 AM, Stack wrote: >>> >>>> On Tue, Feb 14, 2012 at 8:14 AM, Sta

Re: strange PerformanceEvaluation behaviour

2012-02-15 Thread Oliver Meyn (GBIF)
On 2012-02-15, at 9:09 AM, Oliver Meyn (GBIF) wrote: > On 2012-02-15, at 7:32 AM, Stack wrote: > >> On Tue, Feb 14, 2012 at 8:14 AM, Stack wrote: >>>> 2) With that same randomWrite command line above, I would expect a >>>> resulting table with 10 * (1024 * 1

Re: strange PerformanceEvaluation behaviour

2012-02-15 Thread Oliver Meyn (GBIF)
On 2012-02-15, at 7:32 AM, Stack wrote: > On Tue, Feb 14, 2012 at 8:14 AM, Stack wrote: >>> 2) With that same randomWrite command line above, I would expect a >>> resulting table with 10 * (1024 * 1024) rows (so 10485760 = roughly 10M >>> rows). Instead what I'm seeing is that the randomWrite

strange PerformanceEvaluation behaviour

2012-02-14 Thread Oliver Meyn (GBIF)
Hi all, I've been trying to run a battery of tests to really understand our cluster's performance, and I'm employing PerformanceEvaluation to do that (picking up where Tim Robertson left off, elsewhere on the list). I'm seeing two strange things that I hope someone can help with: 1) With a co

Re: snappy error during completebulkload

2012-01-10 Thread Oliver Meyn (GBIF)
Todd Lipcon wrote: > On Mon, Jan 9, 2012 at 2:42 AM, Oliver Meyn (GBIF) wrote: >> It seems really weird that compression (native compression even moreso) >> should be required by a command that is in theory moving files from one >> place on a remote filesystem to another

snappy error during completebulkload

2012-01-09 Thread Oliver Meyn (GBIF)
Hi all, I'm trying to do bulk loading into a table with snappy compression enabled and I'm getting an exception complaining about missing native snappy library, namely: 12/01/09 11:16:53 WARN snappy.LoadSnappy: Snappy native library not loaded Exception in thread "main" java.io.IOException: jav