Re: Synchronization among Mappers in map-reduce task

2014-08-12 Thread Wangda Tan
Hi Saurabh, it's an interesting topic. So here is the question: is it possible to make sure that when one of the mapper tasks is writing to a file, the others wait until the first one is finished? I read that the mapper tasks don't interact with each other. A simple way to do this is
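The truncated suggestion above presumably relies on an atomic create as a lock: in HDFS, FileSystem.create(path, false) fails if the file already exists, so the first mapper to create a marker file "wins" and the others wait and retry. A minimal sketch of that pattern, illustrated here against the local filesystem with java.nio (whose Files.createFile has the same fail-if-exists semantics); the file name is made up for the example:

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class LockFileDemo {
    // Try to take an exclusive "lock" by atomically creating a marker file.
    // Files.createFile throws FileAlreadyExistsException if the file exists,
    // mirroring HDFS FileSystem.create(path, /* overwrite = */ false).
    static boolean tryLock(Path lock) {
        try {
            Files.createFile(lock);
            return true;              // we own the lock
        } catch (FileAlreadyExistsException e) {
            return false;             // another task holds it; wait and retry
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path lock = Paths.get(System.getProperty("java.io.tmpdir"), "mapper-output.lock");
        Files.deleteIfExists(lock);
        System.out.println(tryLock(lock));  // first caller succeeds
        System.out.println(tryLock(lock));  // second caller is told to back off
        Files.deleteIfExists(lock);
    }
}
```

A waiting mapper would loop with a sleep between tryLock attempts, and the owner must delete the marker when done; note this gives mutual exclusion only, not ordering.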

Re: 100% CPU consumption by Resource Manager process

2014-08-12 Thread Wangda Tan
Hi Krishna, to get a better understanding of the problem, could you please share the following information: 1) Number of nodes and running apps in the cluster 2) What's the version of your Hadoop? 3) Have you set yarn.scheduler.capacity.schedule-asynchronously.enable=true? 4) What's the

Re: Negative value given by getVirtualCores() or getAvailableResources()

2014-08-12 Thread Wangda Tan
By default, vcore = 1 for each resource request. If you don't like this behavior, you can set yarn.scheduler.minimum-allocation-vcores=0 Hope this helps, Wangda Tan On Thu, Aug 7, 2014 at 7:13 PM, Krishna Kishore Bonagiri write2kish...@gmail.com wrote: Hi, I am calling
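The setting Wangda mentions lives in yarn-site.xml. A hedged sketch of what the change would look like (a value of 0 lets the scheduler grant requests below one full vcore, which avoids the negative-availability arithmetic described in the subject):

```xml
<!-- yarn-site.xml -->
<property>
  <name>yarn.scheduler.minimum-allocation-vcores</name>
  <value>0</value>
</property>
```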

How to use docker in Hadoop, with patch of YARN-1964?

2014-08-12 Thread sam liu
Hi Experts, I am very interested in making Hadoop work with Docker and am doing some trials with the patch for YARN-1964. I applied the patch yarn-1964-branch-2.2.0-docker.patch from JIRA YARN-1964 on branch 2.2 and am going to install a Hadoop cluster using the newly generated tarball including the patch.

org.apache.hadoop.security.AccessControlException: Permission denied: user=yarn, access=EXECUTE

2014-08-12 Thread Ana Gillan
Hi, I ran a job in Hive and it got to this stage: Stage-1 map = 100%, reduce = 29%. It seemed to start cleaning up the containers successfully, and then I got this series of errors: 2014-08-12 03:58:55,718 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException

Why 2 different approach for deleting localized resources and aggregated logs?

2014-08-12 Thread Rohith Sharma K S
Hi, I see two different approaches for deleting localized resources and aggregated logs. 1. Localized resources are deleted based on the size of the localizer cache, per local directory. 2. Aggregated logs are deleted based on time (if enabled). Are there any specific thoughts
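For reference, the two policies described above are driven by separate yarn-site.xml knobs in Hadoop 2.x. A hedged sketch (property names from the 2.x defaults; the values here are only illustrative):

```xml
<!-- yarn-site.xml -->
<!-- 1. Localized resources: cleaned up when the localizer cache
        exceeds this size (per NodeManager) -->
<property>
  <name>yarn.nodemanager.localizer.cache.target-size-mb</name>
  <value>10240</value>
</property>
<!-- 2. Aggregated logs: deleted after this many seconds; -1 disables deletion -->
<property>
  <name>yarn.log-aggregation.retain-seconds</name>
  <value>604800</value>
</property>
```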

Pseudo -distributed mode

2014-08-12 Thread sindhu hosamane
Can setting up 2 datanodes on the same machine be considered pseudo-distributed mode Hadoop? Thanks, Sindhu

Re: Pseudo -distributed mode

2014-08-12 Thread Sergey Murylev
Yes :) Pseudo-distributed mode is a configuration in which a complete Hadoop environment runs on a single computer. On 12/08/14 18:25, sindhu hosamane wrote: Can setting up 2 datanodes on the same machine be considered pseudo-distributed mode Hadoop? Thanks, Sindhu

Re: Pseudo -distributed mode

2014-08-12 Thread sindhu hosamane
I have read "By default, Hadoop is configured to run in a non-distributed mode, as a single Java process." But if my Hadoop is in pseudo-distributed mode, why does it still run as a single Java process and utilize only 1 CPU core even if there are many more? On Tue, Aug 12, 2014 at 4:32 PM,
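The behavior described above usually means the configuration is still in the default local (standalone) mode. Whether Hadoop runs as one in-process job runner or as separate pseudo-distributed daemons is purely a matter of configuration; a hedged sketch of the two settings that switch away from local mode on a single node (the port is the conventional example value):

```xml
<!-- core-site.xml: point the default filesystem at a local HDFS daemon -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://localhost:9000</value>
</property>
<!-- mapred-site.xml: run MapReduce on YARN instead of the local runner -->
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
```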

Re: ulimit for Hive

2014-08-12 Thread Zhijie Shen
+ Hive user mailing list. It should be a better place for your questions. On Mon, Aug 11, 2014 at 3:17 PM, Ana Gillan ana.gil...@gmail.com wrote: Hi, I've been reading a lot of posts about needing to set a high ulimit for file descriptors in Hadoop and I think it's probably the cause of a

Making All datanode down

2014-08-12 Thread Satyam Singh
Hi Users, In my cluster setup I am running a test case where all datanodes are brought down while the namenode keeps running. In this case my application gets an error with a RemoteException: could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no

Re: ulimit for Hive

2014-08-12 Thread Chris MacKenzie
Hi Zhijie, ulimit comes in two flavors, a hard limit and a soft limit. The hard limit can only be set by a sysadmin; it exists to contain things like a fork-bomb DoS attack. The sysadmin can set the hard ulimit per user, e.g. for hadoop_user. A user can add a line to their .profile setting a soft ulimit up to the hard
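The soft/hard distinction above can be inspected directly from a shell; a small sketch (the 65536 value is only an example of what a .profile line might set):

```shell
# Show the current file-descriptor limits for this shell session.
ulimit -Sn   # soft limit: what processes actually get
ulimit -Hn   # hard limit: ceiling a non-root user may raise the soft limit to

# A user may raise their own soft limit up to (but not past) the hard limit,
# e.g. with a line like this in ~/.profile:
# ulimit -n 65536
```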

hadoop/yarn and task parallelization on non-hdfs filesystems

2014-08-12 Thread Calvin
Hi all, I've instantiated a Hadoop 2.4.1 cluster and I've found that running MapReduce applications will parallelize differently depending on what kind of filesystem the input data is on. Using HDFS, a MapReduce job will spawn enough containers to maximize use of all available memory. For
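One common cause of the difference described above is split computation: the number of map containers follows the number of input splits, and filesystems that don't report an HDFS-style block size can end up yielding far fewer (or far larger) splits. A hedged sketch of the knob that caps split size so non-HDFS inputs still fan out (the 128 MB value is illustrative):

```xml
<!-- mapred-site.xml -->
<property>
  <name>mapreduce.input.fileinputformat.split.maxsize</name>
  <value>134217728</value> <!-- 128 MB -->
</property>
```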

MR AppMaster unable to load native libs

2014-08-12 Thread Subroto Sanyal
Hi, I am running a single node hadoop cluster 2.4.1. When I submit a MR job it logs a warning: 2014-08-12 21:38:22,173 WARN [main] org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable The problem doesn’t

Started learning Hadoop. Which distribution is best for native install in pseudo distributed mode?

2014-08-12 Thread mani kandan
Which distribution are you people using? Cloudera vs Hortonworks vs Biginsights?

I was wondering what could make these 2 variables different: HADOOP_CONF_DIR vs YARN_CONF_DIR

2014-08-12 Thread REYANE OUKPEDJO
Can someone explain what makes the above variables different? Most of the time they are set pointing to the same directory. Thanks, Reyane OUKPEDJO

Re: Started learning Hadoop. Which distribution is best for native install in pseudo distributed mode?

2014-08-12 Thread Kai Voigt
3. seems a biased and incomplete statement. Cloudera's distribution CDH is fully open source. The proprietary "stuff" you refer to is most likely Cloudera Manager, an additional tool to make deployment, configuration and monitoring easy. Nobody is required to use it to run a Hadoop cluster.

Re: Started learning Hadoop. Which distribution is best for native install in pseudo distributed mode?

2014-08-12 Thread Adaryl Bob Wakefield, MBA
You fell into my trap sir. I was hoping someone would clear that up. :) Adaryl Bob Wakefield, MBA Principal Mass Street Analytics 913.938.6685 www.linkedin.com/in/bobwakefieldmba Twitter: @BobLovesData From: Kai Voigt Sent: Tuesday, August 12, 2014 4:10 PM To: user@hadoop.apache.org Subject:

Re: Started learning Hadoop. Which distribution is best for native install in pseudo distributed mode?

2014-08-12 Thread Aaron Eng
On that note, 2 is also misleading/incomplete. You might want to explain which specific features you are referencing so the original poster can figure out if those features are relevant. The inverse of 2 is also true: things like consistent snapshots and full random read/write over NFS are in

Re: Started learning Hadoop. Which distribution is best for native install in pseudo distributed mode?

2014-08-12 Thread Jay Vyas
Also, consider Apache Bigtop. That is the Apache upstream Hadoop initiative, and it comes with smoke tests + Puppet recipes for setting up your own Hadoop distro from scratch. IMHO, if you're learning or building your own tooling around Hadoop, Bigtop is ideal. If you're interested in purchasing support

Re: Started learning Hadoop. Which distribution is best for native install in pseudo distributed mode?

2014-08-12 Thread Adaryl Bob Wakefield, MBA
Is this up to date? http://www.mapr.com/products/product-overview/overview Adaryl Bob Wakefield, MBA Principal Mass Street Analytics 913.938.6685 www.linkedin.com/in/bobwakefieldmba Twitter: @BobLovesData From: Aaron Eng Sent: Tuesday, August 12, 2014 4:31 PM To: user@hadoop.apache.org

Re: Hadoop 2.4.1 Verifying Automatic Failover Failed: ResourceManager

2014-08-12 Thread Xuan Gong
Hey Arthur: Could you show me the error message for rm2, please? Thanks, Xuan Gong. On Mon, Aug 11, 2014 at 10:17 PM, arthur.hk.c...@gmail.com wrote: Hi, Thank you very much! At the moment if I run ./sbin/start-yarn.sh in rm1, the standby STANDBY

Hadoop 2.4 failed to launch job on aws s3n

2014-08-12 Thread Yue Cheng
Hi, I deployed Hadoop 2.4 on AWS EC2 using the S3 native filesystem as a replacement for HDFS. I tried several example apps, and all gave me the following stack trace messages (an older thread on Jul 24 hung there without being resolved, so I attach the DEBUG info here): hadoop jar

fair scheduler not working as intended

2014-08-12 Thread Henry Hung
Hi Everyone, I'm using Hadoop-2.2.0 with the fair scheduler in my YARN cluster, but something is wrong with the fair scheduler. Here is what my fair-scheduler.xml looks like:
  <allocations>
    <queue name="longrun">
      <maxResources>15360 mb, 5 vcores</maxResources>
      <weight>0.5</weight>
      <minMaps>2</minMaps>

Re: Synchronization among Mappers in map-reduce task Please advise

2014-08-12 Thread saurabh jain
Hi Wangda, I am not sure that making overwrite=false will solve the problem. As per the javadoc, with overwrite=false it will throw an exception if the file already exists, so all the remaining mappers will get an exception. Also, I am very new to ZK and have only very basic knowledge of it

Re: Making All datanode down

2014-08-12 Thread Gordon Wang
Did you try to close the file and reopen it for writing after the datanodes restart? I think if you close the file and reopen it, the exception might disappear. On Wed, Aug 13, 2014 at 2:21 AM, Satyam Singh satyam.si...@ericsson.com wrote: Hi Users, In my cluster setup I am doing one test

Re: Synchronization among Mappers in map-reduce task Please advise

2014-08-12 Thread Wangda Tan
Hi Saurabh, "I am not sure making overwrite=false will solve the problem. As per the javadoc, with overwrite=false it will throw an exception if the file already exists. So, for all the remaining mappers it will throw an exception." You can catch the exception and wait. Can you please refer

Re: fair scheduler not working as intended

2014-08-12 Thread Yehia Elshater
Hi Henry, Are there any applications (on queues other than the longrun queue) running at the same time? I think the FairScheduler is going to assign more resources to your longrun queue as long as no other applications are running in the other queues. Thanks, Yehia. On 12 August 2014

Re: MR AppMaster unable to load native libs

2014-08-12 Thread Susheel Kumar Gadalay
I have also got this message when running 2.4.1. I found that the native libraries in $HADOOP_HOME/lib/native are 32-bit, not 64-bit. Recompile once again and build 64-bit shared objects, but it is a lengthy exercise. On 8/13/14, Subroto Sanyal ssan...@datameer.com wrote: Hi, I am running a
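The 32-bit vs 64-bit mismatch described above can be confirmed before rebuilding anything. A hedged sketch (paths assume a standard tarball layout; adjust HADOOP_HOME as needed):

```shell
# Check whether the bundled native library matches your JVM's word size.
HADOOP_HOME=${HADOOP_HOME:-/usr/local/hadoop}
for lib in "$HADOOP_HOME"/lib/native/libhadoop.so*; do
  if [ -e "$lib" ]; then
    file "$lib"   # look for "32-bit" vs "64-bit" in the output
  fi
done
# Hadoop can also report which native codecs/libraries it actually loaded:
# hadoop checknative -a
```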