Hello all,
I want to implement a difference operator using MapReduce. I read two
tables using MultiTableInputFormat. In the Map phase, I need to tag each
row with the name of its table, but how can I get the table name?
One way I can think of is to create an HTable instance for each table in the
Hi,
I'm currently writing my thesis, part of which is about HBase. I was wondering
if there are some current numbers for large deployments, e.g. Facebook or
Yahoo. I'm particularly interested in things like number of nodes, amount
of data managed and (if available) query throughput.
The most recent
Have you looked at http://www.meetup.com/hbaseusergroup/files/ ?
I think the following talks are relevant to your thesis:
HBase-at-twitter http://files.meetup.com/1350427/HBase-at-twitter.pdf
HBase Sizing Notes
http://files.meetup.com/1350427/HBase%20Sizing%20Notes.pdf
On Fri, Nov 21, 2014 at
This question has been asked a few times.
Take a look at Nick's comment in HBASE-4587
Cheers
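
To make the tagging idea in the question concrete, here is a minimal sketch in plain Java with no HBase dependency. It assumes the mapper can learn the source table name somehow (one option often used is casting the input split to TableSplit and calling getTableName(); whether that is what the referenced comment suggests is not confirmed here). The class name, method names, and table names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

public class TaggedDifference {

    // A row key paired with the name of the table it was read from.
    public static final class Tagged {
        final String rowKey;
        final String table;

        Tagged(String rowKey, String table) {
            this.rowKey = rowKey;
            this.table = table;
        }
    }

    // "Map" step: tag every row key with its source table name.
    static List<Tagged> tag(String table, List<String> rowKeys) {
        List<Tagged> out = new ArrayList<>();
        for (String key : rowKeys) {
            out.add(new Tagged(key, table));
        }
        return out;
    }

    // "Reduce" step: group tags by row key and keep the keys that were
    // seen only in the left table, i.e. left minus right.
    static SortedSet<String> difference(List<Tagged> tagged, String left) {
        Map<String, Set<String>> tablesByKey = new HashMap<>();
        for (Tagged t : tagged) {
            tablesByKey.computeIfAbsent(t.rowKey, k -> new HashSet<>()).add(t.table);
        }
        SortedSet<String> result = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e : tablesByKey.entrySet()) {
            Set<String> tables = e.getValue();
            if (tables.size() == 1 && tables.contains(left)) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Tagged> all = new ArrayList<>();
        all.addAll(tag("tableA", List.of("r1", "r2", "r3")));
        all.addAll(tag("tableB", List.of("r2", "r4")));
        System.out.println(difference(all, "tableA")); // prints [r1, r3]
    }
}
```

In a real job the grouping by row key would be done by the shuffle, and the reducer would only inspect the set of tags for one key at a time.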
On Fri, Nov 21, 2014 at 4:54 AM, yonghu yongyong...@gmail.com wrote:
Hello all,
I want to implement difference operator by using MapReduce. I read two
tables by using MultiTableInputFormat. In the
Hi,
thank you! The meetup link comes in handy. However this is not the answer
to the question I asked (or maybe I wasn't clear enough).
I am well aware of the sizing notes etc. However what I am looking for are
some hard numbers on actual scale in the real world. I can write a
lot about
Hi Julian,
I don't have an answer to your question, but I want to better understand your
question: You are looking for data on the largest HBase deployments in
practice, correct?
Regards,
Dave
-----Original Message-----
From: Julian Wissmann [mailto:julianwissm...@gmail.com]
Sent: Friday,
I think your best bet for getting the latest and most accurate data possible
would be to contact directly (through their Engineering channels) the
companies which are known to host large clusters. Most of these companies
have public blogs and such, so it should not be hard to find an appropriate
contact.
Exactly!
Regards,
Julian
2014-11-21 17:16 GMT+01:00 Birdsall, Dave dave.birds...@hp.com:
Hi Julian,
I don't have an answer to your question, but I want to better understand
your question: You are looking for data on the largest HBase deployments in
practice, correct?
Regards,
Dave
Take a look at slide #4 in this talk:
http://www.slideshare.net/ddlatham/hbase-at-flurry
Cheers
On Fri, Nov 21, 2014 at 7:43 AM, Julian Wissmann julianwissm...@gmail.com
wrote:
Hi,
thank you! The meetup link comes in handy. However this is not the answer
to the question I asked (or maybe I
Hi,
I'm using the node.js HBase Thrift client. I can use getRows() to fetch
specific rows with all their columns or getRowsWithColumns() to specify the
columns or column families to return. But I can't figure out how to
specify columns starting with a given prefix, as it seems to be possible
Hello,
HBase 0.96
I notice a couple of our tables are on only one of the region servers, and
that one is handling orders of magnitude more requests per sec than the others.
Will setting hbase.master.loadbalance.bytable to true help this situation?
Also, if that is the case, wondering why this is not set to
Please take a look at TableSkewCostFunction in StochasticLoadBalancer (the
default balancer):
private static final String TABLE_SKEW_COST_KEY =
    "hbase.master.balancer.stochastic.tableSkewCost";
private static final float DEFAULT_TABLE_SKEW_COST = 35;
You can increase the value
400mb blockcache? Ouch. What's your hbase-env.sh? Have you configured a
heap size? My guess is you're using the unconfigured default of 1G. Should
be at least 8G, and maybe more like 30G with this kind of host.
How many users are sharing it and with what kinds of tasks? If there's no
IO
Hi Ted,
You suggest this because StochasticLoadBalancer is the default in 0.96?
What about setting hbase.master.loadbalance.bytable to true?
Thanks,
Arul
On Fri, Nov 21, 2014 at 1:47 PM, Ted Yu yuzhih...@gmail.com wrote:
Please take a look at TableSkewCostFunction in StochasticLoadBalancer
bq. StochasticLoadBalancer is the default in 0.96
True.
Adjusting hbase.master.loadbalance.bytable is not recommended in 0.96+
Cheers
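
Before tuning the balancer, it can help to confirm which tables really sit on a single region server. The sketch below is plain Java and does not call any HBase API; the {table, server} pairs are hypothetical data that would have to be collected separately, for example from the master UI.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TableSkewTally {

    // Count regions per server for each table.
    // Each input row is {tableName, serverName} and stands for one region.
    static Map<String, Map<String, Integer>> tally(String[][] assignments) {
        Map<String, Map<String, Integer>> counts = new TreeMap<>();
        for (String[] a : assignments) {
            counts.computeIfAbsent(a[0], k -> new TreeMap<>())
                  .merge(a[1], 1, Integer::sum);
        }
        return counts;
    }

    // Tables whose regions are all hosted by one server.
    // Note: a table with a single region always lands in this list.
    static List<String> singleServerTables(String[][] assignments) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Map<String, Integer>> e : tally(assignments).entrySet()) {
            if (e.getValue().size() == 1) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical assignments: t1 has two regions on rs1, t2 is spread out.
        String[][] assignments = {
            {"t1", "rs1"}, {"t1", "rs1"},
            {"t2", "rs1"}, {"t2", "rs2"},
        };
        System.out.println(singleServerTables(assignments)); // prints [t1]
    }
}
```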
On Fri, Nov 21, 2014 at 3:14 PM, Arul Ramachandran arkup...@gmail.com
wrote:
Hi Ted,
You suggest this because StochasticLoadBalancer is the default in 0.96
Hi,
I have created a test case that uses
HBaseTestingUtility.startMiniCluster and .shutdownMiniCluster.
The test passes, but shutdownMiniCluster does not shut down cleanly. The
output follows. I have two questions about the errors/warnings:
a) Is there a way to fix the Error