Re: hadoop cluster with non-uniform disk spec

2015-02-11 Thread Manoj Venkatesh
I had a similar question recently. Please check out the balancer (http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html#Balancer); it will balance the data across the nodes. - Manoj From: Chen Song chen.song...@gmail.com Reply-To:

hadoop cluster with non-uniform disk spec

2015-02-11 Thread Chen Song
We have a Hadoop cluster consisting of 500 nodes, but the nodes are not uniform in terms of disk space. Half of the racks are newer, with 11 volumes of 1.1T on each node, while the other half have 5 volumes of 900GB on each node. dfs.datanode.fsdataset.volume.choosing.policy is set to
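
To put the imbalance in numbers, the per-node raw capacity of the two node types can be worked out from the figures in this message. This is only a rough sketch: it assumes "1.1T" means 1100 GB in decimal units, ignores reserved space and replication, and uses variable names made up for illustration.

```python
# Rough per-node raw capacity, using the volume counts and sizes quoted above.
# Assumption: "1.1T" is treated as 1100 GB (decimal units).
NEW_NODE_VOLUMES = 11
NEW_VOLUME_GB = 1100
OLD_NODE_VOLUMES = 5
OLD_VOLUME_GB = 900

new_node_gb = NEW_NODE_VOLUMES * NEW_VOLUME_GB
old_node_gb = OLD_NODE_VOLUMES * OLD_VOLUME_GB
capacity_ratio = new_node_gb / old_node_gb

print(f"newer node: {new_node_gb} GB")
print(f"older node: {old_node_gb} GB")
print(f"ratio: {capacity_ratio:.2f}x")
```

Under these assumptions a newer node has roughly 2.7 times the raw capacity of an older one, so if blocks are spread evenly across nodes, the older nodes would be expected to fill up first.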

Re: missing data blocks after active name node crashes

2015-02-11 Thread Chen Song
Thanks guys. On Wed, Feb 11, 2015 at 8:03 AM, dlmarion dlmar...@comcast.net wrote: https://issues.apache.org/jira/browse/HDFS-7097 Original message From: Chen Song chen.song...@gmail.com Date:02/11/2015 7:48 AM (GMT-05:00) To: user@hadoop.apache.org Cc: Subject: Re:

Re: hadoop cluster with non-uniform disk spec

2015-02-11 Thread Chen Song
Hey Ravi, here are my settings: dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold = 21474836480 (20G), dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction = 0.85f. Chen On Wed, Feb 11, 2015 at 4:36 PM, Ravi Prakash ravi...@ymail.com
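
As a quick sanity check on the values quoted in this message, the byte threshold can be converted back to a human-readable size. This is a minimal sketch: it assumes the "20G" annotation means binary gibibytes, and the variable names are illustrative only.

```python
# Values quoted in the message above.
balanced_space_threshold_bytes = 21474836480
balanced_space_preference_fraction = 0.85

# Assumption: "20G" is meant in binary units (1 GiB = 1024**3 bytes).
GIB = 1024 ** 3
threshold_gib = balanced_space_threshold_bytes / GIB

print(f"threshold: {threshold_gib:g} GiB")
print(f"preference fraction: {balanced_space_preference_fraction:.0%}")
```

The division comes out exact, so the byte value matches the "20G" note when read as 20 GiB.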

Re: Building for Windows

2015-02-11 Thread Alexander Pivovarov
Try: mvn package -Pdist -Dtar -DskipTests On Wed, Feb 11, 2015 at 2:02 PM, Lucio Crusca lu...@sulweb.org wrote: Hello everybody, I'm absolutely new to Hadoop and a customer asked me to build version 2.6 for Windows Server 2012 R2. I'm a Java programmer myself, among other things, but I've

Building for Windows

2015-02-11 Thread Lucio Crusca
Hello everybody, I'm absolutely new to Hadoop and a customer asked me to build version 2.6 for Windows Server 2012 R2. I'm a Java programmer myself, among other things, but I've never used Hadoop before. I've downloaded and installed JDK7, Maven, Cygwin (for sh, mv, gzip, ...) and other toys

Re: Building for Windows

2015-02-11 Thread Alexander Pivovarov
In addition to skipTests, you want to add the native-win profile: mvn clean package -Pdist,native-win -DskipTests -Dtar (this command must be run from a Windows SDK command prompt, not Cygwin, as documented in BUILDING.txt). A successful build generates a binary hadoop .tar.gz package in

Re: hadoop cluster with non-uniform disk spec

2015-02-11 Thread Ravi Prakash
Hi Chen! Are you running the balancer? What are you setting dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold and dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction to? On Wednesday, February 11, 2015 7:44 AM, Chen Song

Re: Building for Windows

2015-02-11 Thread Lucio Crusca
On Wednesday, 11 February 2015 15:17:23, Alexander Pivovarov wrote: in addition to skipTests you want to add native-win profile: mvn clean package -Pdist,native-win -DskipTests -Dtar Ok, thanks, but... what's the point of having tests in place if you have to skip them in order to

Re: Building for Windows

2015-02-11 Thread Alexander Pivovarov
There are about 3000 tests. The box needs a particular configuration to run all of them successfully, and you should have lots of memory. It takes at least 1 hour to run all tests. Look at the Hadoop pre-commit builds on Jenkins: https://builds.apache.org/job/PreCommit-HADOOP-Build/ On Wed, Feb 11, 2015 at 3:55

Re: Building for Windows

2015-02-11 Thread Lucio Crusca
Though I've gone one step further, the build now stops with the following error: [ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.2:exec (compile-ms-winutils) on project hadoop-common: Command execution failed. Cannot run program msbuild... I'm running the build from

RE: Time out after 600 for YARN mapreduce application

2015-02-11 Thread Alexandru Pacurar
Hello, Regarding the AttemptID:attempt_1423062241884_9970_m_09_0 Timed out after 600 secs error, I managed to get an extended status for it. The other message that I get is java.lang.Exception: Container is not yet running. Current state is LOCALIZING. So the container spends 10 minutes in

Re: FileSystem Vs ZKStateStore for RM recovery

2015-02-11 Thread Suma Shivaprasad
We have set yarn.resourcemanager.max-completed-applications=1. I assume this is the number of entries kept in RMStateStore, since I see this in the logs: 2015-02-11 00:00:00,579 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAppManager: Max number of completed apps kept in state store met:

Re: FileSystem Vs ZKStateStore for RM recovery

2015-02-11 Thread Karthik Kambatla
We recommend ZK-store, particularly if you plan to deploy multiple ResourceManagers with failover. ZK-store ensures a single RM has write access and thus is better protected against split-brain cases where both RMs think they are active. On Tue, Feb 10, 2015 at 9:59 PM, Suma Shivaprasad

Time out after 600 for YARN mapreduce application

2015-02-11 Thread Alexandru Pacurar
Hello, I keep encountering an error when running Nutch on Hadoop YARN: AttemptID:attempt_1423062241884_9970_m_09_0 Timed out after 600 secs. Some info on my setup: I'm running a 64-node cluster with Hadoop 2.4.1. Each node has 4 cores, 1 disk and 24Gb of RAM, and the

RE: Time out after 600 for YARN mapreduce application

2015-02-11 Thread Rohith Sharma K S
Looking at the attempt ID, this is a mapper task timing out in a MapReduce job. The configuration property that can be used to increase the value is 'mapreduce.task.timeout'. The task timed out because there was no heartbeat from the MapperTask (YarnChild) to the MRAppMaster for 10 mins. Does MR job is
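
The "600 secs" in the error corresponds to a mapreduce.task.timeout of 600000, since that property is expressed in milliseconds. A small sketch of building an override for it follows; the helper name task_timeout_args is hypothetical, and it assumes the job is launched through a tool that accepts Hadoop's generic -D options.

```python
from typing import List


def task_timeout_args(timeout_seconds: int) -> List[str]:
    """Build a generic -D option setting mapreduce.task.timeout.

    Hypothetical helper for illustration; the property value is in milliseconds.
    """
    if timeout_seconds < 0:
        raise ValueError("timeout must be non-negative")
    timeout_ms = timeout_seconds * 1000
    return ["-D", f"mapreduce.task.timeout={timeout_ms}"]


# The 600-second timeout reported in the error message.
print(" ".join(task_timeout_args(600)))
```

Passing a smaller or larger number of seconds yields the matching millisecond value, which can then be placed among the job's generic options.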

Re: FileSystem Vs ZKStateStore for RM recovery

2015-02-11 Thread Suma Shivaprasad
Can ZKStateStore scale for large clusters? Any idea of the number of concurrent jobs that can be supported on top of it? Thanks Suma On Wed, Feb 11, 2015 at 1:45 PM, Karthik Kambatla ka...@cloudera.com wrote: We recommend ZK-store, particularly if you plan to deploy multiple

RE: Time out after 600 for YARN mapreduce application

2015-02-11 Thread Alexandru Pacurar
Thank you for the quick reply. I will modify the value to check if this is the threshold I'm hitting, but I was thinking of decreasing it because my jobs take too long if they get this timeout. I would rather fail fast than keep the cluster busy with jobs stuck in timeouts. Ideally I would

Re: missing data blocks after active name node crashes

2015-02-11 Thread Chen Song
Thanks David. Do you have the relevant Jira ticket number handy? Chen On Tue, Feb 10, 2015 at 5:54 PM, david marion dlmar...@hotmail.com wrote: I believe there was an issue fixed in 2.5 or 2.6 where the standby NN would not process block reports from the DNs when it was dealing with the

Re: missing data blocks after active name node crashes

2015-02-11 Thread dlmarion
https://issues.apache.org/jira/browse/HDFS-7097 Original message From: Chen Song chen.song...@gmail.com Date:02/11/2015 7:48 AM (GMT-05:00) To: user@hadoop.apache.org Cc: Subject: Re: missing data blocks after active name node

RE: Building for Windows

2015-02-11 Thread Naveen Kumar Pokala
Hi Lucio, The following steps helped me build Hadoop on Windows 7. Hadoop-3.0.0 build instructions for Windows 8.1 -- I compiled Hadoop-3.0.0 from Git Trunk. I have not used Cygwin. I have Windows 8.1 64bit. Pre-requisites: --- 1.

RE: Building for Windows

2015-02-11 Thread Kiran Kumar.M.R
Hi Naveen, Did you get the compilation steps from the comments on this blog post? http://zutai.blogspot.com/2014/06/build-install-and-run-hadoop-24-240-on.html?showComment=1422091525887#c2264594416650430988 I had posted them there. Good to know it helped you. Regards, Kiran

Re:Re: Re: Stopping ntpd signals SIGTERM, then causes namenode exit

2015-02-11 Thread David chen
Hi Or Sher, Thanks for sharing your experience; it did indeed help me.

Re: hadoop cluster with non-uniform disk spec

2015-02-11 Thread daemeon reiydelle
What have you set dfs.datanode.fsdataset.volume.choosing.policy to (assuming you are on a current version of Hadoop)? Is the policy set to org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy?

RE: Error with winutils.sln

2015-02-11 Thread Kiran Kumar.M.R
I made the following changes to get compilation to succeed: 1. hadoop-hdfs-project\hadoop-hdfs\src\main\native\libhdfs\os\windows\thread.c Line 31: Add WINAPI to the declaration: static DWORD WINAPI runThread(LPVOID toRun) { 2.