Hello all,
I am looking to build a 5-node Hadoop cluster with the following
configuration per machine:
1. Intel Xeon E5-2609 (2.40GHz/4-core)
2. 32 GB RAM (8GB 1Rx4 PC3)
3. 5 x 900 GB 6G SAS 10K hard disks (4.5 TB total storage per machine)
4. 1 GbE Ethernet connection
I would like the
Hi,
Can someone please explain the default implementation of the grouping comparator, i.e.
if I do not specify a custom grouping comparator, which comparator is
used to decide the grouping for the reducer?
I searched a lot on the web but could not find a satisfactory explanation for its
default
Hello. I'm facing an issue when trying to configure my SecondaryNameNode on a
different machine than my NameNode. When both are on the same machine
everything works fine, but after moving the secondary to a new machine I get:
2012-05-28 09:57:36,832 ERROR
Can someone guide me on how to plug the leakage of excess water flow from Pureit on
complete consumption of chlorine?
-Original Message-
From: Sheeba George [mailto:sheeba.geo...@gmail.com]
Sent: 04 June 2012 10:59
To: common-user@hadoop.apache.org
Subject: Re: datanode security (v 1.0.3)
Hi
I am not sure what the exact issue could be, but when configuring a secondary
NN with a NN, you need to tell your SNN where the actual NN resides.
Try adding dfs.http.address to hdfs-site.xml on your secondary namenode
machine, with the value NN:port.
The port should be the one your NN web URL opens on.
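For example, a minimal hdfs-site.xml entry on the SNN could look like this (a
sketch; namenode-host is a placeholder for your actual NN hostname, and 50070
is the default NN web UI port):

  <property>
    <name>dfs.http.address</name>
    <value>namenode-host:50070</value>
  </property>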
I configured dfs.http.address in the SNN's hdfs-site.xml but still get:
/************************************************************
STARTUP_MSG: Starting SecondaryNameNode
STARTUP_MSG: host = hadoop01/192.168.0.11
STARTUP_MSG: args = [-checkpoint, force]
STARTUP_MSG: version = 1.0.3
Try giving a value to dfs.secondary.http.address in hdfs-site.xml on your SNN.
In your logs, it is starting the SNN webserver at 0.0.0.0:50090. It's better if
we specify which IP it should start on.
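For example (a sketch; 192.168.0.11 is assumed from the SNN startup log above,
and 50090 is the default SNN web port):

  <property>
    <name>dfs.secondary.http.address</name>
    <value>192.168.0.11:50090</value>
  </property>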
Also, I am assuming you do not have any firewalls enabled between these two
machines, right?
Regards,
Right. No firewalls. This is my 'toy' environment running as virtual machines
on my desktop computer. I'm playing with it here because I have the same
problem on my real cluster. I will try to explicitly configure the starting IP
for this SNN.
-Original Message-
From: praveenesh kumar
Also, can you share the /etc/hosts files of both VMs?
Regards,
Praveenesh
On Mon, Jun 4, 2012 at 5:35 PM, ramon@accenture.com wrote:
Right. No firewalls. This is my 'toy' environment running as virtual
machines on my desktop computer. I'm playing with it here because I have
the same
/etc/hosts
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
192.168.0.10 hadoop00
192.168.0.11 hadoop01
192.168.0.12 hadoop02
-Original Message-
From: praveenesh kumar [mailto:praveen...@gmail.com]
Sent: Monday, 04 June 2012
If you tell us the purpose of this cluster, it would be easier to
say exactly how good it is.
On Mon, Jun 4, 2012 at 3:57 PM, praveenesh kumar praveen...@gmail.com wrote:
Hello all,
I am looking to build a 5-node Hadoop cluster with the following
configuration per machine:
I would say not to use 127.0.0.1 in distributed mode. Comment out the first
two lines of your /etc/hosts, and structure the file like this instead.
Suppose you are on hadoop00; its /etc/hosts would look like:
192.168.0.10 hadoop00 localhost
192.168.0.11 hadoop01
192.168.0.12 hadoop02
On
Now I see the SNN machine name in the logs. It still refuses to connect to the
NN, but now I get a different message:
PriviledgedActionException as:hadoop cause:java.io.FileNotFoundException:
http://hadoop00:50030/getimage?getimage=1
Maybe something is missing in my NN configuration?
12/06/04 14:13:08 INFO
At a very high level... we would be utilizing the cluster not only for Hadoop
but also for other I/O-bound or in-memory operations.
That is the reason we are going for SAS hard disks. We also need to
perform lots of computational tasks, which is why we have kept the RAM at
32 GB, which can be increased. So on
It's trying to connect to your NN on port 50030. I think it should be
50070. In your hdfs-site.xml, for dfs.http.address, I am assuming you
have given hadoop00:50070, right?
Regards,
Praveenesh
On Mon, Jun 4, 2012 at 5:50 PM, ramon@accenture.com wrote:
Now I see SNN machine name on
You can control your map outputs based on any condition you want. I have
done that, and it worked for me.
It could be a problem in your code that it's not working for you.
Can you please share your map code, or cross-check whether your conditions
are correct?
Regards,
Praveenesh
On Mon, Jun 4, 2012 at
If you don't specify a grouping comparator for your job, it uses the output key
comparator class for grouping.
A grouping comparator should be provided if the equivalence rules for sorting
the intermediate keys are different from those for grouping keys.
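To illustrate, here is a minimal sketch on the new (org.apache.hadoop.mapreduce)
API; MyCompositeKey and its getNaturalKey() method are hypothetical:

  import org.apache.hadoop.io.WritableComparable;
  import org.apache.hadoop.io.WritableComparator;

  // Groups composite keys by their "natural" part only, so all values
  // sharing that part arrive in a single reduce() call.
  public class NaturalKeyGroupingComparator extends WritableComparator {
      protected NaturalKeyGroupingComparator() {
          super(MyCompositeKey.class, true); // true => instantiate keys
      }

      @Override
      public int compare(WritableComparable a, WritableComparable b) {
          return ((MyCompositeKey) a).getNaturalKey()
                  .compareTo(((MyCompositeKey) b).getNaturalKey());
      }
  }

  // In the driver:
  job.setGroupingComparatorClass(NaturalKeyGroupingComparator.class);
  // If this call is omitted, grouping falls back to the sort (output key)
  // comparator, i.e. the key class's own compareTo().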
Thanks
Devaraj
Hi Murat,
As Praveenesh explained, you can control the map outputs as you want.
The map() function is called once for each input record, i.e. map() is invoked
multiple times with different inputs in the same mapper. You can check
what is happening inside it by adding logging to the map function.
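For example, a minimal sketch of a mapper (new API; the input types and the
"keep-me" condition are hypothetical) that logs every invocation and emits
output only when a condition holds:

  import java.io.IOException;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Mapper;

  public class FilteringMapper extends Mapper<LongWritable, Text, Text, Text> {
      @Override
      protected void map(LongWritable key, Text value, Context context)
              throws IOException, InterruptedException {
          // map() runs once per input record; this log line shows each call.
          System.err.println("map() called with key = " + key);

          // Write output only for records matching the condition; records
          // that do not match produce no map output at all.
          if (value.toString().contains("keep-me")) {
              context.write(new Text("matched"), value);
          }
      }
  }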
Hi,
Thanks for your answers. After reading your emails, I decided to completely
empty my mapper method to see if I could disable the output of the
mapper class altogether, but it seems it did not work.
So, here is my mapper method:
@Override
public void map(ByteBuffer key, SortedMap<ByteBuffer,
Thank you. That did the trick.
-Original Message-
From: Sheeba George [mailto:sheeba.geo...@gmail.com]
Sent: Monday, June 04, 2012 1:29 AM
To: common-user@hadoop.apache.org
Subject: Re: datanode security (v 1.0.3)
Hi Tony,
Please take a look at
Right. Silly mistake. Now using 50070, and IT WORKS!!!
Thanks a lot, Praveenesh. I will replicate this solution on my real cluster.
-Original Message-
From: praveenesh kumar [mailto:praveen...@gmail.com]
Sent: Monday, 04 June 2012 14:25
To: common-user@hadoop.apache.org
Subject: Re:
Did you configure dfs.namenode.secondary.http-address in
hdfs-site.xml?
On Mon, Jun 4, 2012 at 7:53 PM, ramon@accenture.com wrote:
Right. Silly mistake. Now using 50070, and IT WORKS!!!
Thanks a lot, Praveenesh. I will replicate this solution on my real cluster.
-Original
I am happy to announce that I was able to get the license on the Yahoo! Hadoop
tutorial updated from the Creative Commons Attribution 3.0 Unported License to
Apache 2.0. I have filed HADOOP-8477
https://issues.apache.org/jira/browse/HADOOP-8477 to pull the tutorial into
the Hadoop project, and to
OK,
for those who face the same problem, here is how I solved it:
First of all, there is a Hadoop JIRA task for this:
https://issues.apache.org/jira/browse/HADOOP-4927
and
http://hadoop.apache.org/mapreduce/docs/current/mapred_tutorial.html#Lazy+Output+Creation
explains how to
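For reference, the gist of lazy output creation on the new API (a sketch,
assuming an otherwise normal job setup):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.mapreduce.Job;
  import org.apache.hadoop.mapreduce.lib.output.LazyOutputFormat;
  import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

  Job job = new Job(new Configuration(), "lazy-output-example");
  // Wrap the real output format instead of setting it directly:
  LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class);
  // part-* files are now created only when the first record is actually
  // written, so tasks that emit nothing leave no empty files behind.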
I am trying to put a 16 GB file onto HDFS, but I got all of these messages
and I don't know why this is happening. Can someone please shed some light on
this scenario?
Thanks in advance
hduser@master:~$ hadoop fs -put ~/tests/wiki16gb.txt /user/hduser/wiki/16gb.txt
12/06/04 10:52:05 WARN
Hello Bobby,
Great news!!
Thanks for your efforts in handling those legal issues. I will assign
myself a few JIRAs.
To start off, we can divide the documentation into the same
modules as the original Yahoo! tutorials, adding the relevant features which
have been incorporated into new
My local environment: a single Ubuntu 11.10 desktop install, Oracle JDK
7u04, MIT Kerberos 5, Apache hadoop-1.0.2.
I am able to get Kerberos working; here is my key:
Hi Sean,
It seems your HDFS has not started properly. Go to your HDFS web console
to verify that the NN and all DNs are up. You can access it at http://<your
namenode ip>:50070.
Also make sure your NN has left safe mode before you start moving data
to HDFS.
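A few standard Hadoop 1.x commands that can help check this from the shell:

  hadoop dfsadmin -safemode get   # is the NN still in safe mode?
  hadoop dfsadmin -report         # live/dead datanodes and capacity
  hadoop fsck /                   # overall filesystem health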
-Original
I found these two threads in the mailing list archives:
http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201202.mbox/browser
http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201108.mbox/browser
At least they were able to get the name node up. Can someone please point
out why I am
Sorry, the links should be:
http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201202.mbox/%3CCAMAD20=oKVRy_pDX6FWm=xvpz1pal0qcfqagssaxq8xugp7...@mail.gmail.com%3E
http://lucene.472066.n3.nabble.com/Starting-datanode-in-secure-mode-td3297090.html
-Hailun Yan
-- Forwarded
I have a machine that is part of the cluster, but I'd like to dedicate it
to being the web server and running the DB, while still having access to
starting jobs and getting data out of HDFS. In other words, I'd like to have
the cores, memory, and disk only minimally affected by running jobs on the
Hi Pat,
Sounds like you would just turn off the datanode and the tasktracker.
Your config will still point to the Namenode and JT, so you can still
launch jobs and read from and write to HDFS.
You'll probably want to replicate the data off first, of course.
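Something like this, run on the node itself, should do it (a sketch; assumes
the stock Hadoop 1.x scripts under $HADOOP_HOME/bin):

  # Stop the slave daemons on this machine:
  $HADOOP_HOME/bin/hadoop-daemon.sh stop tasktracker
  $HADOOP_HOME/bin/hadoop-daemon.sh stop datanode
  # Also remove this host from conf/slaves on the master so that
  # start-all.sh does not bring the daemons back.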
Thanks,
Tom
On Mon, Jun 4, 2012 at 2:06 PM,
Hi Tom,
Sounds like the trick. This node is a slave, so its datanode and
tasktracker are started from the master.
- How do I start the cluster without starting the datanode and the
tasktracker on the mini-node slave? Remove it from slaves?
- What do I minimally need to start on the
Hi Pat,
Sounds like the trick. This node is a slave, so its datanode and tasktracker
are started from the master.
- How do I start the cluster without starting the datanode and the
tasktracker on the mini-node slave? Remove it from slaves?
There's no main cluster software; just don't start
If you are doing computations using Hadoop on a small scale, yes, this hardware
is good enough.
Normally Hadoop clusters are preoccupied with heavy loads, so they are
not shared for multiple uses, unless your utilization of Hadoop is on the lower
side and you want to reuse the hardware.
On
Check your datanode logs, or run 'hadoop fsck /' or 'hadoop dfsadmin
-report' to get more details about your HDFS.
It seems like the DN is down.
Regards,
Praveenesh
On Tue, Jun 5, 2012 at 12:13 AM, ramon@accenture.com wrote:
Hi Sean,
It seems your HDFS has not started properly. Go to your
The output files should be 0 KB in size if you use FileOutputFormat/TextOutputFormat.
I think your output format writer is writing some metadata into those files. Can
you check what data is present in those files?
Can you tell me which output format you are using?
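For instance, assuming the job's output directory is /user/hduser/output:

  hadoop fs -ls /user/hduser/output                     # names and sizes
  hadoop fs -cat /user/hduser/output/part-00000 | head  # peek at contents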
Thanks
Devaraj