Bugs while installing Apache Hadoop 2.4.0

2014-07-03 Thread Ritesh Kumar Singh
When I try to start DFS using start-dfs.sh, I get this error message: 14/07/03 11:03:21 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable Starting namenodes on [OpenJDK 64-Bit Server VM warning: You have loaded library

Re: Bugs while installing Apache Hadoop 2.4.0

2014-07-03 Thread Akira AJISAKA
It looks like the native library is not compatible with your environment. You should delete the '/usr/local/hadoop/lib/native' directory or compile the source code to build your own native library. Thanks, Akira (2014/07/03 15:05), Ritesh Kumar Singh wrote: When I try to start dfs using start-dfs.sh I
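A quick way to confirm this diagnosis before deleting anything is to check whether the bundled native library matches the JVM architecture. A minimal sketch, assuming Hadoop is installed under /usr/local/hadoop (adjust the path to your install; checknative is available in recent 2.x releases):

    # list which native libraries Hadoop can actually load
    $ hadoop checknative -a

    # check the architecture of the bundled library; a 32-bit .so
    # on a 64-bit JVM is a common cause of this warning
    $ file /usr/local/hadoop/lib/native/libhadoop.so*

If 'file' reports a 32-bit library while the JVM is 64-bit, rebuilding (or removing) the native library as Akira suggests is the fix.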

How to recover reducer task data on a different data node?

2014-07-03 Thread James Teng
First, I should say that although I am not new to Hadoop, I am not an expert on it either. I would like to ask about one issue in the MapReduce framework; below is a description of the scenario. When a reduce task fails on one datanode, the JobTracker will try to schedule another

Re: Bugs while installing Apache Hadoop 2.4.0

2014-07-03 Thread Ritesh Kumar Singh
@Akira : If I delete my native library, how exactly do I generate my own copy of it? @Chris : This is the content of my /etc/hosts file : 127.0.0.1 localhost 127.0.1.1 hduser # The following lines are desirable for IPv6 capable hosts ::1 localhost ip6-localhost ip6-loopback fe00::0
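For reference, the '127.0.1.1 hduser' line is a frequent source of trouble on Ubuntu hosts: Hadoop daemons resolve the hostname to loopback and bind there instead of the real interface. A minimal sketch of a cleaned-up /etc/hosts, assuming a hypothetical address of 192.168.1.10 for this host (substitute your actual IP and hostname):

    127.0.0.1      localhost
    # map the machine's hostname to its real IP, not 127.0.1.1
    192.168.1.10   hduser

    # IPv6 entries can stay as generated by the distribution
    ::1            localhost ip6-localhost ip6-loopback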

Re: Bugs while installing Apache Hadoop 2.4.0

2014-07-03 Thread Akira AJISAKA
You can download the source code and generate your own native library by running $ mvn package -Pdist,native -Dtar -DskipTests You should then see the library in 'hadoop-dist/target/hadoop-2.4.0/lib/native' Thanks, Akira (2014/07/03 15:32), Ritesh Kumar Singh wrote: @Akira : if i delete my native
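The native build needs a few system packages first. A sketch assuming a Debian/Ubuntu box (package names are assumptions for that platform; Hadoop 2.4 is also picky about protobuf, which must be version 2.5.0):

    # build prerequisites (assumed package names)
    $ sudo apt-get install build-essential cmake zlib1g-dev libssl-dev

    # protoc must report 2.5.0 for the Hadoop 2.4 build
    $ protoc --version

    # build the distribution with native libraries
    $ mvn package -Pdist,native -Dtar -DskipTests

    # copy the freshly built libraries into the install
    $ cp -r hadoop-dist/target/hadoop-2.4.0/lib/native/* \
        /usr/local/hadoop/lib/native/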

Re: What is the correct way to get a string back from a mapper or reducer

2014-07-03 Thread Bertrand Dechoux
The Stack Overflow question doesn't add any useful information. Like I said, you can emit the string inside a record. Or, if you really want to handle lots of complexity, write it yourself to a file or a datastore from the reducer. But you will then have to consider performance issues and be

Re: How to recover reducer task data on a different data node?

2014-07-03 Thread Stanley Shi
It will start from scratch, copying all map outputs from all mapper nodes. Regards, Stanley Shi On Thu, Jul 3, 2014 at 2:28 PM, James Teng tenglinx...@outlook.com wrote: First i would like to declare that although i am not new to hadoop, but not expert on it as well. i would like to

RE: How to recover reducer task data on a different data node?

2014-07-03 Thread James Teng
Hi, thanks for your quick reply. Could you please explain in a bit more detail? For example, how does the new reducer node learn which map nodes have to transfer data to it, and how does it communicate with them to copy the data over? James. Date: Thu, 3 Jul 2014

Re: How to make hdfs data rack aware

2014-07-03 Thread Adam Kawa
You can run $ sudo -u hdfs hdfs dfsadmin -report | grep Hostname -A 1 2014-07-02 7:33 GMT+02:00 hadoop hive hadooph...@gmail.com: Try running fsck; it will also validate block placement as well as replication. On Jun 27, 2014 6:49 AM, Kilaru, Sambaiah sambaiah_kil...@intuit.com wrote:
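If the goal is to make HDFS rack-aware in the first place, the usual mechanism is a topology script that maps each node address to a rack name. A minimal sketch, assuming the Hadoop 2.x property net.topology.script.file.name in core-site.xml (1.x releases use topology.script.file.name) and a hypothetical two-rack IP layout:

    #!/bin/bash
    # rack-topology.sh - print one rack path per node argument
    for node in "$@"; do
      case $node in
        10.0.1.*) echo /rack1 ;;
        10.0.2.*) echo /rack2 ;;
        *)        echo /default-rack ;;
      esac
    done

After pointing the property at the script and restarting the namenode, placement can be verified with:

    $ hdfs dfsadmin -printTopology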

Re: How to recover reducer task data on a different data node?

2014-07-03 Thread Shahab Yunus
Adding to what Jungi Jeong said, if you can get your hands on the book *Hadoop: The Definitive Guide* by Tom White, then that would help as well, as it explains this in significant detail. Regards, Shahab On Thu, Jul 3, 2014 at 6:29 AM, Jungi Jeong jgje...@calab.kaist.ac.kr wrote: As far as

Multi-Cluster Setup

2014-07-03 Thread fab wol
hey everyone, MapR offers the ability to access another cluster's HDFS/MapRFS from one cluster (e.g. a compute-only cluster without much storage capability) (see http://doc.mapr.com/display/MapR/mapr-clusters.conf). In times of Hadoop-as-a-Service this becomes very interesting. Is this

Re: Multi-Cluster Setup

2014-07-03 Thread Nitin Pawar
Nothing is stopping you from implementing the cluster the way you want. You can have storage-only nodes for HDFS and not run TaskTrackers on them. Start a bunch of machines with high RAM and high CPU but no storage. The only thing to worry about then would be the network bandwidth to carry data from HDFS to
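In stock Apache Hadoop 1.x this split can be achieved simply by choosing which daemons to start where. A rough sketch, with hypothetical hostnames storage-01 and compute-01:

    # on a storage-only node: datanode, no tasktracker
    storage-01$ hadoop-daemon.sh start datanode

    # on a compute-only node: tasktracker, no datanode
    compute-01$ hadoop-daemon.sh start tasktracker

Every task then reads its input over the network, so, as noted above, bandwidth between the two groups of machines becomes the main constraint.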

Need to evaluate the price of a Hadoop cluster

2014-07-03 Thread YIMEN YIMGA Gael
Hello Dear all, I would like to evaluate the price of a Hadoop cluster using the characteristics below for my namenode and for my datanodes. My cluster should have one namenode and three datanodes. Could someone help me with the price of commodity hardware with these characteristics, please?

Re: Need to evaluate the price of a Hadoop cluster

2014-07-03 Thread Cristobal Giadach
Are you using Hadoop 2.x? What about your secondary namenode? On Jul 3, 2014 11:19 AM, YIMEN YIMGA Gael gael.yimen-yi...@sgcib.com wrote: Hello Dear all, I would like to evaluate the price of a Hadoop cluster using the below characteristics for my Namenode and for my Datanode. My

Re: Multi-Cluster Setup

2014-07-03 Thread fab wol
Hey Nitin, I'm not talking about the concept. I'm talking about how to actually do it technically and how to set it up. Imagine this: I have two clusters, both running fine, and they are both (setup-wise) the same, except that one has way more TaskTrackers/NodeManagers than the other. Now I

RE: Need to evaluate the price of a Hadoop cluster

2014-07-03 Thread YIMEN YIMGA Gael
Hi, Actually, I didn't plan for a secondary namenode. But if I should add one, I will give the secondary namenode the same characteristics as the primary namenode. No, I'm not using Hadoop 2.x. I'm using Hadoop 1.2.1 (a distribution with a JobTracker, not a ResourceManager). Regards From:

Re: Multi-Cluster Setup

2014-07-03 Thread Rahul Chaudhari
Fabian, I see this as a classic case of federation of Hadoop clusters. An MR job can refer to a specific hdfs:// file location as input while running on another cluster. You can refer to the following link for further details on federation.
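Concretely, nothing more than fully qualified URIs is needed to read one cluster's HDFS from a job running on another, as long as the compute cluster can reach the remote namenode and the versions are RPC-compatible. A sketch with hypothetical namenode hostnames and a hypothetical job class:

    # browse the remote cluster's HDFS from the compute cluster
    $ hadoop fs -ls hdfs://storage-nn:8020/data/input

    # run a job on the compute cluster against remote input
    $ hadoop jar job.jar MyJob \
        hdfs://storage-nn:8020/data/input \
        hdfs://compute-nn:8020/data/output

    # or copy the data across clusters first
    $ hadoop distcp hdfs://storage-nn:8020/data/input \
        hdfs://compute-nn:8020/data/input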

YARN REST API (controller for v1 not found)

2014-07-03 Thread Alex Nastetsky
Hi, Using HDP 2.0.6 (YARN 2.1), I am trying to access the REST API per the documentation here: http://hadoop.apache.org/docs/r2.3.0/hadoop-yarn/hadoop-yarn-site/MapredAppMasterRest.html When I try to access http://<rm host>:8088/ws/v1/cluster/app/application_1401899005478_2241 I get

Re: YARN REST API (controller for v1 not found)

2014-07-03 Thread Alex Nastetsky
This is actually the link I was following: http://hadoop.apache.org/docs/r2.3.0/hadoop-yarn/hadoop-yarn-site/WebServicesIntro.html On Thu, Jul 3, 2014 at 4:12 PM, Alex Nastetsky anastet...@spryinc.com wrote: Hi, Using HDP 2.0.6, Yarn 2.1 I am trying to access the REST api per the
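For what it's worth, the ResourceManager's documented application endpoint uses the plural 'apps', so the "controller for v1 not found" error may simply come from the singular 'app' path (note also that the MapredAppMasterRest page describes the MapReduce application master's API, not the RM's). A hedged sketch against a hypothetical RM host:

    # note 'apps' (plural) in the documented RM path
    $ curl -H 'Accept: application/json' \
        http://rm-host:8088/ws/v1/cluster/apps/application_1401899005478_2241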

Re: YARN creates only 1 container

2014-07-03 Thread hari
Just an update on this: it turns out that in version 2.2.0 the container count is driven by the memory resource alone, even though there are configs for vcores. Things might have changed since then. Thanks for the suggestions. yarn.xml was a typo; it should have been yarn-site.xml. About the
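As a back-of-the-envelope check of that behaviour: with the default resource calculator considering memory only, the number of containers a NodeManager can host is roughly its memory resource divided by the per-container request. A sketch with hypothetical values:

    # yarn.nodemanager.resource.memory-mb = 8192   (assumed)
    # per-container request, e.g. mapreduce.map.memory.mb = 1024   (assumed)
    $ echo $((8192 / 1024))
    8        # containers per node, regardless of vcore settings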

RE: How to recover reducer task data on a different data node?

2014-07-03 Thread James Teng
OK, got it. Thanks Shahab and Jungi for your helpful replies. :) Date: Thu, 3 Jul 2014 07:40:20 -0400 Subject: Re: How to recover reducer task data on a different data node? From: shahab.yu...@gmail.com To: user@hadoop.apache.org Adding to what Jungi Jeong said, if you can get your hands on the book

Significance of PID files

2014-07-03 Thread Vijaya Narayana Reddy Bhoomi Reddy
Hi, Can anyone please explain the significance of the pid files in Hadoop, i.e. their purpose and usage? Thanks Regards Vijay

Cluster migration from Hadoop 1.2.1 to 2.4.0

2014-07-03 Thread oc tsdb
Hi, We have our Hadoop cluster running with Hadoop 1.2.1 and HBase 0.94.14. Now we are planning to move it to Hadoop 2.4.0 and HBase 0.98.2. If we migrate to these newer versions, will there be any data loss? Do we need to take any cluster data backup before migrating to the newer version of Hadoop
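Whatever the upgrade path, taking a namenode metadata backup first is cheap insurance. A minimal sketch for a 1.2.1 cluster, assuming a hypothetical metadata directory /data/dfs/name (use your actual dfs.name.dir value):

    # quiesce the namespace and flush the fsimage to disk
    $ hadoop dfsadmin -safemode enter
    $ hadoop dfsadmin -saveNamespace

    # record cluster health before the migration
    $ hadoop fsck / -files -blocks > fsck-before-upgrade.log

    # back up the namenode metadata directory
    $ tar czf nn-meta-backup.tgz /data/dfs/name

    $ hadoop dfsadmin -safemode leave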

Working of the combiner in Hadoop

2014-07-03 Thread Chhaya Vishwakarma
Hi, If I have two map tasks running on one node, and I have written a combiner class as well, will the combiner be called once for each map task or just once for both map tasks? Can I write logic inside the map that works as a combiner? If yes, will there be any side effects? Regards, Chhaya Vishwakarma

Re: Significance of PID files

2014-07-03 Thread Vijaya Narayana Reddy Bhoomi Reddy
Vikas, "Its main use is to keep one process at a time... like only one datanode at any host" - Can you please elaborate in more detail? What is meant by one process at a time? At what level does a pid file come into the picture, i.e. is it at the daemon level, job level, task level, etc.? Thanks Vijay
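To make the mechanics concrete: each daemon's start script writes its process id to a file, and both the stop script and any later start attempt consult that file, which is what enforces "one daemon instance per host". A sketch of how this looks by default (pid files land in /tmp unless HADOOP_PID_DIR is set; 'hduser' is a hypothetical unix user):

    # one pid file per daemon per user, e.g. for the datanode
    $ cat /tmp/hadoop-hduser-datanode.pid
    12345

    # the stop scripts effectively do this
    $ kill $(cat /tmp/hadoop-hduser-datanode.pid)

    # a second 'start datanode' sees the pid file, finds the
    # process still alive, and refuses to start a duplicate

So pid files operate at the daemon level (namenode, datanode, jobtracker, ...), not at the job or task level.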