How do I recover the namenode?
The cluster is: 2 namenodes (HA cluster), 3 journalnodes, n datanodes. I regularly back up the metadata (fsimage) file via http://[namenode address]:50070/imagetransfer?getimage=1&txid=latest . How do I recover the namenode using the backed-up metadata (fsimage)?
Re: heartbeat timeout doesn't work
The timeout value is set by the following formula:

heartbeatExpireInterval = 2 * heartbeatRecheckInterval + 10 * 1000 * heartbeatIntervalSeconds

Note that heartbeatRecheckInterval is set by the dfs.namenode.heartbeat.recheck-interval property (5*60*1000 msec by default), and heartbeatIntervalSeconds is set by the dfs.heartbeat.interval property (3 by default). With the defaults, the timeout value comes to 10 minutes and 30 seconds. Thanks, Akira

(2014/07/05 1:06), MrAsanjar . wrote: In my namenode's hdfs-site.xml file, I set the following values to reduce the datanodes' heartbeat timeout to less than a minute: dfs.heartbeat.interval = 3, dfs.namenode.stale.datanode.interval = 15. However, it still takes 10 to 15 minutes before it times out. What am I doing wrong here?
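The arithmetic in the formula above can be checked directly (the property values are the defaults quoted in the reply):

```shell
# Compute the DataNode expiry timeout from the two properties named above.
recheck_ms=$((5 * 60 * 1000))   # dfs.namenode.heartbeat.recheck-interval default
heartbeat_s=3                   # dfs.heartbeat.interval default
expire_ms=$((2 * recheck_ms + 10 * 1000 * heartbeat_s))
echo "$expire_ms"               # 630000 ms = 10 min 30 s
# Note: dfs.namenode.stale.datanode.interval only marks a node "stale";
# it does not declare it dead, so it does not shorten this timeout.
```

To get the sub-minute timeout the original poster wanted, the property to lower is dfs.namenode.heartbeat.recheck-interval: for example, 15000 ms gives 2*15000 + 10*1000*3 = 60000 ms, i.e. one minute.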
Re: define node
A server with more than one hard drive is still only one node. Sam

On 7/7/14, 9:50 AM, Adaryl Bob Wakefield, MBA adaryl.wakefi...@hotmail.com wrote: If you have a server with more than one hard drive, is that one node or n nodes, where n = the number of hard drives? B.
Managed File Transfer
Hi, We used a commercial file transfer and scheduler tool in clustered mode. This was a traditional active-active cluster that supported multiple protocols such as FTPS. Now I am interested in evaluating a distributed way of crawling FTP sites and downloading files using Hadoop. Since we have to process thousands of files, I thought Hadoop jobs could do it. Are Hadoop jobs used for this type of file transfer? There is also a requirement for a scheduler. What does the forum recommend? Thanks, Mohan
Re: How do I recover the namenode?
please follow along the steps:
• Shut down all Hadoop daemons on all servers in the cluster.
• Copy the NameNode metadata (the entire directory tree) onto the secondary NameNode.
• Modify the core-site.xml file, making the secondary NameNode server the new NameNode server.
• Replicate that file to all servers in the cluster.
• Start the NameNode daemon on the secondary NameNode server.
• Restart the secondary NameNode daemon on the new NameNode server.
• Start the DataNode daemons on all DataNodes.
• Start the JobTracker daemon on the JobTracker node.
• Start the TaskTracker daemons on all the TaskTracker nodes (DataNodes).
• From any node in the cluster, use the "hadoop dfs -ls" command to verify that the data file created by TeraGen exists.
• Check the total amount of HDFS storage used.
• Run "hadoop fsck /" to compare to results recorded before the NameNode was halted.

Raj K Singh
http://in.linkedin.com/in/rajkrrsingh
http://www.rajkrrsingh.blogspot.com
Mobile Tel: +91 (0)9899821370

On Mon, Jul 7, 2014 at 1:30 PM, cho ju il tjst...@kgrid.co.kr wrote: The cluster is: 2 namenodes (HA cluster), 3 journalnodes, n datanodes. I regularly back up the metadata (fsimage) file via http://[namenode address]:50070/imagetransfer?getimage=1&txid=latest . How do I recover the namenode using the metadata (fsimage)?
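A rough command-level sketch of those steps (hostnames and paths here are hypothetical, and this assumes a 1.x-style cluster with a JobTracker, matching the steps above):

```shell
# On the current NameNode: stop all daemons across the cluster
$HADOOP_HOME/bin/stop-all.sh

# Copy the NameNode metadata directory tree to the secondary NameNode
# (the directory configured in dfs.name.dir; /data/dfs/name is an example)
scp -r /data/dfs/name/ secondarynn:/data/dfs/name/

# Edit core-site.xml so fs.default.name points at the new NameNode host,
# push it to every node, then start the daemons again.

# Verify after restart
hadoop dfs -ls /
hadoop fsck /
```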
Re: Significance of PID files
When a daemon process is started, its process ID is captured in a pid file. It is used for the following purposes:
- During daemon startup, the existence of the pid file is used to determine whether the process is already running.
- When a daemon is stopped, the Hadoop scripts send a TERM signal to the process ID captured in the pid file for a graceful shutdown. After a timeout, if the process still exists, kill -9 is sent for a forced shutdown.
For more details, see the relevant code in http://svn.apache.org/repos/asf/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop-daemon.sh

On Fri, Jul 4, 2014 at 10:00 AM, Vijaya Narayana Reddy Bhoomi Reddy vijay.bhoomire...@gmail.com wrote: Hi, Can anyone please explain the significance of the pid files in Hadoop, i.e. their purpose, usage, etc.? Thanks & Regards, Vijay
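The pid-file lifecycle described above can be illustrated with a small stand-alone sketch (the file name and the `sleep` stand-in for a daemon are hypothetical, not what Hadoop itself uses):

```shell
pidfile=/tmp/demo-daemon.pid

# "Start the daemon": record its pid, as hadoop-daemon.sh does
sleep 300 &
echo $! > "$pidfile"

# "Is it already running?": probe the recorded pid with signal 0
if kill -0 "$(cat "$pidfile")" 2>/dev/null; then
  echo "already running"
fi

# "Stop": graceful TERM first; the real script follows up with kill -9
# after a timeout if the process is still alive
kill -TERM "$(cat "$pidfile")"
wait 2>/dev/null
rm -f "$pidfile"
echo "stopped"
```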
Re: Huge text file for Hadoop Mapreduce
http://www.cs.cmu.edu/~./enron/ Not sure of the uncompressed size, but pretty sure it's over a gig. B.

From: navaz Sent: Monday, July 07, 2014 6:22 PM To: user@hadoop.apache.org Subject: Huge text file for Hadoop Mapreduce

Hi, I am running the basic word count MapReduce code. I have downloaded a file, Gettysburg.txt, which is 1486 bytes. I have 3 datanodes and the replication factor is set to 3. The data is copied onto all 3 datanodes, but only one map task runs; all the other nodes are idle. I think this is because I have only one block of data, so a single task is running. I would like to download a bigger file, say 1GB, and test the network shuffling performance. Could you please suggest where I can download a huge text file? Thanks & Regards, Abdul Navaz
Re: How do I recover the namenode?
Thank you for the answer. However, my Hadoop version is 2.4.1, and the cluster does not have a secondary namenode. How do I recover the namenode (hadoop version 2.4.1) using the metadata (fsimage)?

-----Original Message-----
From: Raj K Singh <rajkrrsi...@gmail.com>
To: <user@hadoop.apache.org>; cho ju il <tjst...@kgrid.co.kr>
Sent: 2014-07-08 (Tue) 02:05:15
Subject: Re: How do I recover the namenode?

please follow along the steps:
• Shut down all Hadoop daemons on all servers in the cluster.
• Copy the NameNode metadata (the entire directory tree) onto the secondary NameNode.
• Modify the core-site.xml file, making the secondary NameNode server the new NameNode server.
• Replicate that file to all servers in the cluster.
• Start the NameNode daemon on the secondary NameNode server.
• Restart the secondary NameNode daemon on the new NameNode server.
• Start the DataNode daemons on all DataNodes.
• Start the JobTracker daemon on the JobTracker node.
• Start the TaskTracker daemons on all the TaskTracker nodes (DataNodes).
• From any node in the cluster, use the "hadoop dfs -ls" command to verify that the data file created by TeraGen exists.
• Check the total amount of HDFS storage used.
• Run "hadoop fsck /" to compare to results recorded before the NameNode was halted.

Raj K Singh
http://in.linkedin.com/in/rajkrrsingh
http://www.rajkrrsingh.blogspot.com
Mobile Tel: +91 (0)9899821370

On Mon, Jul 7, 2014 at 1:30 PM, cho ju il <tjst...@kgrid.co.kr> wrote: The cluster is: 2 namenodes (HA cluster), 3 journalnodes, n datanodes. I regularly back up the metadata (fsimage) file via http://[namenode address]:50070/imagetransfer?getimage=1&txid=latest . How do I recover the namenode using the metadata (fsimage)?
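For an HA (Hadoop 2.x) cluster there is no secondary NameNode to promote; one possible recovery path, sketched here under the assumption that the JournalNodes are intact (directory paths are hypothetical), is to restore the fsimage on one NameNode and re-sync the other from it:

```shell
# Place the backed-up fsimage under the NameNode metadata directory
# (whatever dfs.namenode.name.dir points to), then start that NameNode.
# It loads the fsimage and replays any later edits from the JournalNodes.
hadoop-daemon.sh start namenode

# On the other NameNode, rebuild its metadata from the recovered one,
# then start it as standby.
hdfs namenode -bootstrapStandby
hadoop-daemon.sh start namenode
```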
Copy hdfs block from one data node to another
Hi All, How can I copy a certain HDFS block (given the file name and the start and end bytes) from one node to another? Thanks, Yehia
Re: Copy hdfs block from one data node to another
Can you outline why one would want to do that? Blocks are disposable, so it is strange to manipulate them directly. On Jul 7, 2014 8:16 PM, Yehia Elshater y.z.elsha...@gmail.com wrote: Hi All, How can I copy a certain HDFS block (given the file name and the start and end bytes) from one node to another? Thanks, Yehia
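If the underlying goal is simply to get another copy of the data onto a different node, the supported route is to let HDFS place the replicas itself, for example by raising the file's replication factor (the path below is hypothetical):

```shell
# Ask HDFS for an additional replica of the file and wait (-w) until it
# is placed; the NameNode, not the user, chooses the target DataNode.
hdfs dfs -setrep -w 4 /path/to/file
```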
can i monitor all hadoop component from one box?
hi, maillist: I want to check from one machine whether every Hadoop cluster component process is alive or dead, much like checking ZooKeeper nodes remotely. Can this be done? Thanks
Re: can i monitor all hadoop component from one box?
Look at Nagios or Ganglia for monitoring. On Tue, Jul 8, 2014 at 8:16 AM, ch huang justlo...@gmail.com wrote: hi, maillist: I want to check from one machine whether every Hadoop cluster component process is alive or dead, much like checking ZooKeeper nodes remotely. Can this be done? Thanks -- Nitin Pawar
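Short of a full Nagios/Ganglia setup, a minimal one-box liveness check can be sketched like this (the daemon names are examples; wrapping the call in an ssh loop over your hosts would cover the whole cluster):

```shell
# Report whether a process whose command line matches $1 is running
# on this box.
check_proc() {
  if pgrep -f "$1" >/dev/null; then
    echo "$1: alive"
  else
    echo "$1: dead"
  fi
}

check_proc "NameNode"
check_proc "DataNode"
```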
Re: Huge text file for Hadoop Mapreduce
Configuration conf = getConf();
conf.setLong("mapreduce.input.fileinputformat.split.maxsize", 1000);
// you can set this to some small value (in bytes) to ensure your file splits
// across multiple mappers, provided the format is not an unsplittable one
// like .snappy.

On Tue, Jul 8, 2014 at 7:32 AM, Adaryl Bob Wakefield, MBA adaryl.wakefi...@hotmail.com wrote: http://www.cs.cmu.edu/~./enron/ Not sure of the uncompressed size, but pretty sure it's over a gig. B. *From:* navaz navaz@gmail.com *Sent:* Monday, July 07, 2014 6:22 PM *To:* user@hadoop.apache.org *Subject:* Huge text file for Hadoop Mapreduce Hi, I am running the basic word count MapReduce code. I have downloaded a file, Gettysburg.txt, which is 1486 bytes. I have 3 datanodes and the replication factor is set to 3. The data is copied onto all 3 datanodes, but only one map task runs; all the other nodes are idle. I think this is because I have only one block of data, so a single task is running. I would like to download a bigger file, say 1GB, and test the network shuffling performance. Could you please suggest where I can download a huge text file? Thanks & Regards, Abdul Navaz
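As an alternative to downloading anything, a big input file can simply be synthesized by repeated doubling (file names here are hypothetical; increase the loop count to reach the size you want):

```shell
seed=/tmp/seed.txt
printf 'four score and seven years ago\n' > "$seed"   # 31 bytes

# Each pass doubles the file: 10 passes turn 31 bytes into ~31 KB;
# 25 passes would give roughly 1 GB (31 * 2^25 bytes).
for i in $(seq 1 10); do
  cat "$seed" "$seed" > "$seed.next" && mv "$seed.next" "$seed"
done
wc -c < "$seed"   # 31744

# Then upload it for the shuffle test, e.g.:
# hadoop fs -put /tmp/seed.txt big.txt
```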