I logged into the master
1. In hbase-ec2-init-remote.sh, this block is evaluated (on the master):
if [ "$IS_MASTER" = "true" ]; then
sed -i -e "s|\( *mcast_join *=.*\)|#\1|" \
-e "s|\( *bind *=.*\)|#\1|" \
-e "s|\( *mute *=.*\)| mute = yes|" \
-e "s|\( *location *=.*\)| l
Hello, Jean-Daniel,
Thank you for the good pointers. I was afraid I couldn't get my question
noticed by FB people and obtain reliable answers, so I asked here. One
more reason is that I'm a fan of HBase. (I'm sorry; I realize this might not
have been the best place to ask this question.)
An
> I'd have to link /lib/ etc to /mnt or something.
Yes.
Copy first. Then use bind mounts ('mount --bind ...') to overlay the additional
storage wherever you prefer in the filesystem hierarchy.
Then invoke yum.
I am sure there are other approaches but the above can be scripted easily
enough.
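A minimal sketch of that approach (paths are illustrative; it assumes the
spare volume is already mounted at /mnt and that /usr/lib is the directory
being relocated):

  cp -a /usr/lib /mnt/usr-lib          # copy the existing contents first
  mount --bind /mnt/usr-lib /usr/lib   # overlay the copy onto the original path
  yum install R                        # yum now writes into the space on /mnt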
True, and thanks for putting the scripts together. But doesn't yum demand
free space on the / device?
There is plenty of space on /mnt, but then I'd have to instruct yum to
install packages elsewhere. I'd have to link /lib/ etc to
/mnt or something.
Cheers
J
On Fri, Nov 19, 2010 at 2:27 PM, Andrew Purtell
The root device on instance-store type instances is small, but there are
several additional disk volumes supplied depending on the instance size: two
420 GB volumes for m1.xlarge and four 420 GB volumes for c1.xlarge.
Our ec2 scripts mount them as /mnt, /mnt2, /mnt3, etc. and configure the Hadoop
D
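For illustration, roughly how that looks on a c1.xlarge (the device names
here are an assumption; the actual ephemeral mapping varies by AMI):

  mount /dev/sdb /mnt    # ephemeral volume 0, ~420 GB
  mount /dev/sdc /mnt2   # ephemeral volume 1
  mount /dev/sdd /mnt3   # ephemeral volume 2
  mount /dev/sde /mnt4   # ephemeral volume 3

The scripts would then point the Hadoop data directories (dfs.data.dir) at a
path on each of these volumes.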
I think I found the answer. It appears this AMI was bundled with a 3 GB root
volume, but there is no compelling reason to do so.
I can recreate an AMI from the c1.xlarge AMI, bundling it with e.g. a 5 GB
root. That should cover my needs.
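Roughly what that rebundling looks like with the old S3-backed AMI tools (a
sketch from memory, so double-check the flags; -s is the image size in MB):

  ec2-bundle-vol -d /mnt -k pk.pem -c cert.pem -u $AWS_USER_ID -s 5120
  ec2-upload-bundle -b my-bucket/hbase-5gb -m /mnt/image.manifest.xml \
    -a $AWS_ACCESS_KEY -s $AWS_SECRET_KEY
  ec2-register my-bucket/hbase-5gb/image.manifest.xml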
I honestly don't mind 5 minute start up times - coffee and a cigarette - or
is there s
Hello,
Both packages have HBase 0.89.20100726 installed (the former is c1.xlarge
and the latter is medium).
I'm trying to install some extra packages (see [1])
By the time I come to install R, I'm almost out of space on the root
device. I would like to add some packages to the
task nodes (whic
This isn't the right forum for that kind of discussion.
I recommend going on Quora which already has a few good threads on the
subject, answered by FB folks, namely:
http://www.quora.com/Why-did-Facebook-pick-HBase-instead-of-Cassandra-for-the-new-messaging-platform
and
http://www.quora.com/How
Yes, /var/log/messages is the right place. Saw this:
Nov 19 12:25:26 ip-10-98-154-214 /usr/sbin/gmetad[1293]: Unable to
mkdir(/var/lib/ganglia/rrds/unspecified): No such file or directory
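(That usually just means the RRD storage directory is missing. A sketch of
the common fix, assuming gmetad runs as a "ganglia" user, which varies by
distro:

  mkdir -p /var/lib/ganglia/rrds
  chown -R ganglia:ganglia /var/lib/ganglia

gmetad then creates the per-cluster subdirectories itself.)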
Cheers
J
On Fri, Nov 19, 2010 at 2:15 AM, Lars George wrote:
> Yeah, this will be superseded by WHIRR-25 over the next month or two.
Thanks.
On Fri, Nov 19, 2010 at 2:15 AM, Lars George wrote:
> Yeah, this will be superseded by WHIRR-25 over the next month or two.
> The "root" name was simply a choice, no reason not to change it. As
> for Ganglia, do you see the Ganglia daemon run on each node? If not,
> please have a look into the logs on the servers, the user scripts
> usually log their progress.
Yeah, turning off the WAL would have been my next suggestion. Apart
from that, Ganglia is really easy to set up - you might want to consider
getting used to it now :)
On Fri, Nov 19, 2010 at 4:29 PM, Henning Blohm wrote:
> Hi Lars,
>
> we do not have anything like ganglia up. Unfortunately.
>
> I
Hi Lars,
we do not have anything like ganglia up. Unfortunately.
I use regular puts with autoflush turned off, with a buffer of 4MB
(could be bigger right?). We write to WAL.
I flush every 1000 recs.
I will try again - maybe over the weekend - and see if I can find out
more.
Thanks,
Hi Henning,
And what you have seen is often difficult to explain. What I
listed are the obvious contenders. But ideally you would do a post
mortem on the master and slave logs for Hadoop and HBase, since that
would give you better insight into the events. For example, when did
the system start
Hi Lars,
thanks. Yes, this is just the first test setup. Eventually the data
load will be significantly higher.
At the moment (looking at the master after the run) the number of
regions is well-distributed (684,685,685 regions). The overall
HDFS use is ~700G. (replication factor is 3 btw).
I
Hi Henning,
Could you look at the Master UI while doing the import? The issue with
a cold bulk import is that you are hitting one region server
initially, and while it is filling up its in-memory structures all is
nice and dandy. Then you start to tax the server as it has to flush
data out and it b
Hi, Lars
It's very nice of you to point me to that helpful blog. Now I see :-)
--
Pan W
We have a Hadoop 0.20.2 + HBase 0.20.6 setup with three data nodes
(12GB, 1.5TB each) and one master node (24GB, 1.5TB). We store a
relatively simple table in HBase (1 column family, 5 columns, row key
about 100 chars).
In order to better understand the load behavior, I wanted to put 5*10^8
rows
Hello, (especially Mr. Jonathan Gray, Facebook folks),
I'm sorry for mentioning particular people in a public ML.
I saw the following note from Facebook that says Facebook chose HBase, not
Cassandra, as the storage for its next messaging infrastructure.
http://www.facebook.com/notes/facebook-
Yeah, this will be superseded by WHIRR-25 over the next month or two.
The "root" name was simply a choice, no reason not to change it. As
for Ganglia, do you see the Ganglia daemon run on each node? If not,
please have a look into the logs on the servers, the user scripts
usually log their progress
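A quick way to check on each node (a sketch; the daemon is gmond, though the
exact service name varies by package):

  ps aux | grep '[g]mond'   # bracket trick stops grep matching itself

If gmond is missing on some node, the init script output in the logs should
show why.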
Have a read here:
http://outerthought.org/blog/417-ot.html
Especially: "One interesting option that is missing is the ability to
retrieve the latest version less than or equal to a given timestamp,
thus giving the 'latest' state of the record at a certain point in
time. Update: this is (obviously