I have been playing with high availability using JournalNodes and 2 masters,
both running the NameNode and HBase Master.
When I kill the NameNode and HBase Master processes on the active master,
the failover is perfect. HBase never stops and a running map-reduce job
keeps going. This is impressive!
h
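Not from the original post, but for anyone wanting to reproduce the setup, the QJM-based HA block in hdfs-site.xml roughly looks like the sketch below (the nameservice, host names, and ports are placeholders, not the poster's actual values; ha.zookeeper.quorum normally lives in core-site.xml):

  <!-- sketch only: placeholder nameservice and hosts -->
  <property><name>dfs.nameservices</name><value>mycluster</value></property>
  <property><name>dfs.ha.namenodes.mycluster</name><value>nn1,nn2</value></property>
  <property><name>dfs.namenode.rpc-address.mycluster.nn1</name><value>master1:8020</value></property>
  <property><name>dfs.namenode.rpc-address.mycluster.nn2</name><value>master2:8020</value></property>
  <property><name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://jn1:8485;jn2:8485;jn3:8485/mycluster</value></property>
  <property><name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value></property>
  <property><name>dfs.ha.automatic-failover.enabled</name><value>true</value></property>

With automatic failover enabled, the ZKFC daemons switch the standby NameNode to active when the active process dies, which matches the behaviour described above.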
Hi there,
I had a similar issue with hadoop-1.2.0: the JobTracker kept crashing until I
set HADOOP_HEAPSIZE="2048". I did not have this kind of issue with previous
versions. You can try this if you have the memory and see whether it helps.
In my case the issue was gone after I set it as above.
Thanks
Reyane OUKPEDJO
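For anyone who wants to try the same workaround, the setting goes in conf/hadoop-env.sh on the JobTracker host; a sketch follows (2048 MB is simply the value that worked above, not a recommendation):

  # conf/hadoop-env.sh (Hadoop 1.x); HADOOP_HEAPSIZE is in MB and applies to
  # the Hadoop daemons started on this host
  export HADOOP_HEAPSIZE="2048"
  # then restart the MapReduce daemons, e.g.:
  #   bin/stop-mapred.sh && bin/start-mapred.sh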
Have a look at our Vagrant Hadoop cluster, which does just that (using
Ubuntu though):
https://github.com/Cascading/vagrant-cascading-hadoop-cluster
-- André
On Sat, Oct 12, 2013 at 12:33 AM, Raj Hadoop wrote:
> All,
>
> I have a CentOS VM image and want to replicate it four times on my Mac
> co
Raj & Gary,
For setting up multiple VMs on a local computer from a VM image, I
highly recommend Vagrant (http://www.vagrantup.com/).
It lets you easily create and start up multiple VMs with unique IP
addresses and host names from a single image, save/revert to named
snapshots, etc.
Ambari Quick S
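Not from the original reply, but as a starting point for the four-CentOS-node case, a minimal Vagrantfile sketch could look like this (the box name, hostnames, IPs, and memory size are assumptions for illustration only):

  # Vagrantfile sketch: four CentOS VMs with static private IPs
  Vagrant.configure("2") do |config|
    (1..4).each do |i|
      config.vm.define "node#{i}" do |node|
        node.vm.box = "centos/6"                        # assumed box name
        node.vm.hostname = "hadoop-node#{i}"
        node.vm.network "private_network", ip: "192.168.56.10#{i}"
        node.vm.provider "virtualbox" do |vb|
          vb.memory = 2048
        end
      end
    end
  end

After `vagrant up`, each VM is reachable at its private IP, which is what you need for listing the nodes in each other's /etc/hosts and in the Hadoop slaves file.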
Hi Raj
I want to do the same. Can we collaborate?
Thanks
Gary
7327636549
On Oct 11, 2013 6:34 PM, "Raj Hadoop" wrote:
> All,
>
> I have a CentOS VM image and want to replicate it four times on my Mac
> computer. How
> can I set it up so that I can have 4 individual machines that can be used
> as
All,
I have a CentOS VM image and want to replicate it four times on my Mac
computer. How can I set it up so that I can have 4 individual machines that
can be used as nodes in my Hadoop cluster?
Please advise.
Thanks,
Raj
Hi,
I am trying to separate my output from the reducer into different folders.
My driver has the following code:
FileOutputFormat.setOutputPath(job, new Path(output));
//MultipleOutputs.addNamedOutput(job, namedOutput,
outputFormatClass, keyClass, valueClass)
//MultipleOutputs
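For reference, a minimal sketch of the usual MultipleOutputs wiring looks like the following (the named output "text", the key/value types, and the class name are illustrative assumptions, not the poster's actual code).

In the driver, alongside FileOutputFormat.setOutputPath:

  MultipleOutputs.addNamedOutput(job, "text", TextOutputFormat.class,
      Text.class, Text.class);

And in the reducer:

  import java.io.IOException;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Reducer;
  import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

  // Sketch only: writes each key's records under a per-key sub-directory
  // of the job output directory.
  public class FolderSplitReducer extends Reducer<Text, Text, Text, Text> {
    private MultipleOutputs<Text, Text> mos;

    @Override
    protected void setup(Context context) {
      mos = new MultipleOutputs<Text, Text>(context);
    }

    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
        throws IOException, InterruptedException {
      for (Text value : values) {
        // the 4th argument is a base output path relative to the job output
        // directory, which is what puts records into different folders
        mos.write("text", key, value, key.toString() + "/part");
      }
    }

    @Override
    protected void cleanup(Context context)
        throws IOException, InterruptedException {
      mos.close(); // required so the extra outputs are flushed
    }
  }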
never mind.. found a bug :D
On Fri, Oct 11, 2013 at 12:54 PM, jamal sasha wrote:
> Hi..
>
> In my mapper function..
> Can i have multiple context.write()...
>
> So...
>
> public void map(LongWritable key, Text value, Context context) throws
> IOException, InterruptedException ,NullPointerExcep
Hi..
In my mapper function, can I have multiple context.write() calls?
So:
public void map(LongWritable key, Text value, Context context) throws
IOException, InterruptedException, NullPointerException {
// processing...
context.write(k1,v1);
context.write(k2,v2);
}
I thought we could do th
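Yes, a mapper can call context.write() any number of times per input record; each call just emits one more key/value pair to the shuffle. A self-contained sketch (the class name and types are illustrative, not the poster's actual code):

  import java.io.IOException;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Mapper;

  // Sketch: emits two kinds of records per input line
  public class MultiEmitMapper extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    public void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      String line = value.toString();
      // emit the whole line under one key...
      context.write(new Text("raw"), new Text(line));
      // ...and each token under another key
      for (String token : line.split("\\s+")) {
        context.write(new Text("token"), new Text(token));
      }
    }
  }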
The issue was with the /etc/hosts files. Thanks for letting me explore on my
own; I understood a lot of the internals.
On Fri, Oct 11, 2013 at 3:28 AM, Srinivas Chamarthi <
srinivas.chamar...@gmail.com> wrote:
> from the stack trace, I believe, it is trying to start/connect the
> ApplicationMaster and fails to connect
Hi,
I'm running a 14-node Hadoop cluster with datanodes and tasktrackers
running on all nodes.
Apache Hadoop: 1.2.1
It currently shows the heap size as follows:
Cluster Summary (Heap Size is 5.7/8.89 GB)
In the above summary, what does the 8.89 GB define? Does the 8.89 define the
maximum
Just a clarification: Cloudera Manager is now free for any number of nodes.
Ref:
http://www.cloudera.com/content/cloudera/en/products/cloudera-manager.html
-Sandy
On Fri, Oct 11, 2013 at 7:05 AM, DSuiter RDX wrote:
> Sagar,
>
> It sounds like you want a management console. We are using Clouder
Hi,
I'm running a 14-node Hadoop cluster with tasktrackers running on all
nodes.
I have set the jobtracker default heap size in hadoop-env.sh:
HADOOP_HEAPSIZE="1024"
I have set the mapred.child.java.opts value in mapred-site.xml as:
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx2048m</value>
  </property>
--
Regards,
Viswa.J
Sagar,
It sounds like you want a management console. We are using Cloudera
Manager, but for 200 nodes you would need to license it; it is only free up
to 50 nodes.
The FOSS version of this is Ambari, iirc.
http://incubator.apache.org/ambari/
Flume will provide a Hadoop-integrated pipeline for in
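As a concrete (and untested) starting point for that Flume route, a minimal agent that tails a TaskTracker log into HDFS could be configured along these lines (the agent name, log path, and HDFS path are assumptions):

  # flume.conf sketch: tail a local log file into HDFS
  a1.sources = r1
  a1.channels = c1
  a1.sinks = k1
  a1.sources.r1.type = exec
  a1.sources.r1.command = tail -F /var/log/hadoop/hadoop-tasktracker.log
  a1.sources.r1.channels = c1
  a1.channels.c1.type = memory
  a1.sinks.k1.type = hdfs
  a1.sinks.k1.hdfs.path = hdfs://namenode:8020/logs/tasktracker
  a1.sinks.k1.channel = c1

  # started with something like:
  #   flume-ng agent --conf conf --conf-file flume.conf --name a1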
Hi,
http://flume.apache.org
- Alex
On Oct 11, 2013, at 7:36 AM, Sagar Mehta wrote:
> Hi Guys,
>
> We have fairly decent sized Hadoop cluster of about 200 nodes and was
> wondering what is the state of art if I want to aggregate and visualize
> Hadoop ecosystem logs, particularly
> Tasktrack
I've used Splunk in the past for log aggregation. It's commercial/proprietary,
but I think there's a free version.
http://www.splunk.com/
From: Raymond Tay [mailto:raymondtay1...@gmail.com]
Sent: Friday, October 11, 2013 1:39 AM
To: user@hadoop.apache.org
Subject: Re: State of Art in Hadoop Log
It looks like you are correct, and I did not have the right solution; I
apologize. I'm not sure if the other nodes need to be involved either. Now
I'm hoping someone with deeper knowledge will step in, because I'm curious
also! Some of the most knowledgeable people on here are on US Pacific Time,
s
This line:
2013-10-11 10:24:53,033 ERROR org.apache.hadoop.security.UserGroupInformation:
PriviledgedActionException as:mapred (auth:SIMPLE)
cause:java.io.IOException: java.lang.NullPointerException
is IMHO indicating that I am using the user "mapred" for executing (FYI:
submitting the job from t
The user running the job (might not be your username depending on your
setup) does not appear to have executable permissions on the jobtracker
cluster topology python script - I'm basing this on the lines:
2013-10-11 10:24:53,035 WARN org.apache.hadoop.net.ScriptBasedMapping:
Exception running
/ru
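If lack of execute permission is indeed the cause, the usual fix is simply to make the topology script executable for the user the JobTracker runs as, for example (the path below is a placeholder, since the real one is cut off above; the actual location is whatever the topology script property, topology.script.file.name / net.topology.script.file.name, points at in core-site.xml):

  # sketch: grant execute permission on the rack-topology script
  chmod 755 /etc/hadoop/conf/topology.py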
Hey everyone, I've been supplied with a decent ten-node CDH 4.4 cluster,
only 7 days old, and someone tried some HBase stuff on it. Now I wanted to
try some MR stuff on it, but starting a job is already not possible (even
the wordcount example). The error log of the jobtracker produces a log 700k
li
Hi guys,
I am working on mounting an HDFS to a remote host (say the HDFS is in
hostA and I need to mount it to a local path on hostB).
I noticed hdfs-nfs-proxy (https://github.com/cloudera/hdfs-nfs-proxy)
could make that happen,
but I have some doubts:
1. when I mount the remote hdfs to mo
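For what it's worth, once an NFS gateway/proxy for HDFS is up on hostA, the client side on hostB is typically just a plain NFSv3 mount along these lines (the export path and options are assumptions and not verified against hdfs-nfs-proxy specifically):

  # on hostB
  mkdir -p /mnt/hdfs
  mount -t nfs -o vers=3,proto=tcp,nolock hostA:/ /mnt/hdfs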
So, perhaps this has been thought of, but perhaps not.
It is my understanding that grep usually processes things one line at a
time. As I am currently experimenting with Avro, I am finding that the
local grep does not handle an Avro file well at all, because it is
essentially one long line, so wor
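One common workaround is to decode the Avro container file to JSON first and grep that, for example with avro-tools (the jar version and file name here are assumptions):

  # sketch: one JSON record per line, then ordinary line-oriented grep works
  java -jar avro-tools-1.7.5.jar tojson part-00000.avro | grep 'pattern'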
From the stack trace, I believe it is trying to start/connect to the
ApplicationMaster and fails to connect to it. I am not sure if this is
related to the EC2 loopback adapter.
On Fri, Oct 11, 2013 at 12:22 AM, Srinivas Chamarthi <
srinivas.chamar...@gmail.com> wrote:
> I have a 2 node cluster (HDP1,
I have a 2 node cluster (HDP1, HDP2) as mentioned below.
HDP1:
1. name node
2. data node
3. node manager
4. resource manager
HDP2:
1. node manager
2. data node
When I submit the map reduce job on HDP1, the job runs on node HDP2, which
is fine.
But the job fails, and in the userlogs/syslogs of