HBase technical support service providers?

2015-03-24 Thread Devaraja Swami
Hi all,

A company I work with is interested in talking to outfits that can provide
technical support (troubleshooting, tuning, etc.) for their HBase cluster,
either on an on-demand basis or under some kind of monthly/yearly
contract [they are flexible on the nature of the contract].

If you work for an outfit like that, or know of any such outfit, I would
appreciate hearing about it.

Cheers,
devarajaswami


performance problems during bulk load because of triggered compaction?

2015-03-24 Thread Serega Sheypak
Hi, I have low-cost hardware (2 HDDs, 10 nodes) with HBase 0.98 on CDH 5.2.1.
I have several apps that read/write to HBase using the Java API.
Sometimes I see response times rise from the normal 30-40 ms to 1000-2000
ms or even more.
There are no MapReduce jobs running at that time, but there is a bulk load
each hour.
I have noticed that the response degradation sometimes coincides with the
bulk load.

The table is 17 GB on HDFS and has 84 regions; most regions are
150-200 MB in size.
It has a single column family:
{NAME => 'd', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROWCOL',
REPLICATION_SCOPE => '0', COMPRESSION => 'SNAPPY', VERSIONS => '1', TTL =>
'691200 SECONDS (8 DAYS)', MIN_VERSIONS => '0', KEEP_DELETED_CELLS =>
'false', BLOCKSIZE => '65536', IN_MEMORY => 'true', BLOCKCACHE => 'true'}

When the bulk load runs, it mostly updates existing cell values; only about
0.01% of the rows are new.
I keep serialized objects in d:q, where d is the column family and q is the
column qualifier.

How can I find the root cause of the performance degradation and minimize it?
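One thing worth checking (my addition, not from the thread): each bulk load adds new HFiles to the stores it touches, and once a store's file count reaches hbase.hstore.compactionThreshold (3 by default in 0.98), a minor compaction is queued; on nodes with only 2 disks, that compaction I/O competes with reads and could explain the latency spikes. A minimal sketch of the trigger logic, as a simplification of HBase's compaction policy rather than the actual implementation:

```java
// Simplified sketch of HBase's minor-compaction trigger: a store whose
// HFile count reaches the threshold is selected for compaction.
// This simulates the effect of an hourly bulk load; it is not HBase code.
public class CompactionTriggerSketch {

    // Default value of hbase.hstore.compactionThreshold in HBase 0.98.
    static final int COMPACTION_THRESHOLD = 3;

    static boolean needsCompaction(int storeFileCount) {
        return storeFileCount >= COMPACTION_THRESHOLD;
    }

    public static void main(String[] args) {
        // One region's store, starting from a single flushed HFile,
        // gaining one HFile per hourly bulk load.
        int files = 1;
        for (int hour = 1; hour <= 3; hour++) {
            files++; // each bulk load adds one HFile to this store
            System.out.println("hour " + hour + ": " + files
                    + " files, compaction needed = " + needsCompaction(files));
        }
    }
}
```

If this is the cause, the RegionServer's compaction queue metrics and logs should show compactions starting right after each load.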


Re: introducing nodes w/ more storage

2015-03-24 Thread Nick Dimiduk
Good discussion, folks. I've opened HBASE-13323 so this effort can be
pursued.

https://issues.apache.org/jira/browse/HBASE-13323
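The spaceCost suggested downthread might look something like this hypothetical sketch (class and method names are mine, not the HBase API or the HBASE-13323 implementation; the real StochasticLoadBalancer combines many weighted cost functions): score each server by the fraction of its own capacity used, so a node with more disk absorbs more data before contributing cost.

```java
// Hypothetical sketch of a capacity-aware "spaceCost" in the spirit of the
// StochasticLoadBalancer's cost functions; not actual HBase code.
public class SpaceCostSketch {

    // Cost in [0, 1]: the server that is most full relative to its OWN
    // capacity dominates, pushing the balancer to move regions off it.
    static double spaceCost(long[] usedBytes, long[] capacityBytes) {
        double worst = 0.0;
        for (int i = 0; i < usedBytes.length; i++) {
            worst = Math.max(worst, (double) usedBytes[i] / capacityBytes[i]);
        }
        return worst;
    }

    public static void main(String[] args) {
        long tb = 1024L * 1024 * 1024 * 1024;
        // Two legacy 5 TB nodes and one new 8 TB node, equal absolute usage:
        long[] used = {4 * tb, 4 * tb, 4 * tb};
        long[] cap  = {5 * tb, 5 * tb, 8 * tb};
        // The 5 TB nodes are 80% full; the 8 TB node is only 50% full.
        System.out.println(spaceCost(used, cap)); // prints 0.8
    }
}
```

Under this cost, moving regions from the 80%-full legacy nodes to the roomier new node lowers the score, which is the behavior the heterogeneous-capacity discussion below is asking for.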

On Mon, Mar 23, 2015 at 7:50 AM, Michael Segel 
wrote:

> @lars,
>
> How does the HDFS load balancer impact the load balancing of HBase?
>
> Of course there are two loads… one is the number of regions managed by a
> region server; that’s HBase’s load, right?
> And then there’s the distribution of HBase’s data files, which is really
> managed by the HDFS balancer, right?
>
> OP’s question is about having a heterogeneous cluster where he would like
> to see a more even distribution of data/free space based on the capacity
> of the newer machines in the cluster.
>
> This is a storage question, not a memory/cpu core question.
>
> Or am I missing something?
>
>
> -Mike
>
> > On Mar 22, 2015, at 10:56 PM, lars hofhansl  wrote:
> >
> > Seems that it should not be too hard to add that to the stochastic load
> balancer.
> > We could add a spaceCost or something.
> >
> >
> >
> > - Original Message -
> > From: Jean-Marc Spaggiari 
> > To: user 
> > Cc: Development 
> > Sent: Thursday, March 19, 2015 12:55 PM
> > Subject: Re: introducing nodes w/ more storage
> >
> > You can extend the default balancer and assign the regions based on
> > that. But in the end, the replicated blocks might still go all over the
> > cluster, and your "small" nodes are going to fill up and will not be
> > able to get any more writes, even for the regions they are supposed to
> > get.
> >
> > I'm not sure there is a good solution for what you are looking for :(
> >
> > I built my own balancer, but because of differences in the CPUs, not
> > because of differences in the storage space...
> >
> >
> > 2015-03-19 15:50 GMT-04:00 Nick Dimiduk :
> >
> >> Seems more fantasy than fact, I'm afraid. The default load balancer [0]
> >> takes store file size into account, but has no concept of capacity. It
> >> doesn't know that nodes in a heterogeneous environment have different
> >> capacities.
> >>
> >> This would be a good feature to add though.
> >>
> >> [0]:
> >>
> >>
> https://github.com/apache/hbase/blob/branch-1.0/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java
> >>
> >> On Tue, Mar 17, 2015 at 7:26 AM, Ted Tuttle 
> wrote:
> >>
> >>> Hello-
> >>>
> >>> Some time back I asked a question about introducing new nodes w/ more
> >>> storage than existing nodes.  I was told at the time that HBase will
> not
> >> be
> >>> able to utilize the additional storage; I assumed at the time that
> >> regions
> >>> are allocated to nodes in something like a round-robin fashion and the
> >> node
> >>> with the least storage sets the limit for how much each node can
> utilize.
> >>>
> >>> My question this time around has to do with nodes w/ unequal numbers of
> >>> volumes: Does HBase allocate regions based on nodes or volumes on the
> >>> nodes?  I am hoping I can add a node with 8 volumes totaling 8X TB and
> >> all
> >>> the volumes will be filled.  This even though legacy nodes have 5
> volumes
> >>> and total storage of 5X TB.
> >>>
> >>> Fact or fantasy?
> >>>
> >>> Thanks,
> >>> Ted
> >>>
> >>>
> >>
> >
>
> The opinions expressed here are mine, while they may reflect a cognitive
> thought, that is purely accidental.
> Use at your own risk.
> Michael Segel
> michael_segel (AT) hotmail.com