Re: Data Locality Importance

2014-03-22 Thread Vinod Kumar Vavilapalli
Like you said, it depends on both the kind of network you have and the type of your workload. Given your point about S3, I'd guess your input files/blocks are not large enough for moving code to the data to trump moving the data to the code. When that balance tilts a lot, especially when moving

Re: are the job and task tracker monitor webpages gone now in hadoop v2.3.0

2014-03-06 Thread Vinod Kumar Vavilapalli
Yes. JobTracker and TaskTracker are gone from all the 2.x release lines. MapReduce is now an application on top of YARN: each job gets its own instance, which launches, runs, and finishes once its work is done. After it finishes, you can look the job up in the MapReduce-specific JobHistoryServer. +Vinod On
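For reference, a minimal mapred-site.xml sketch for pointing clients at a JobHistoryServer; the hostname below is hypothetical and the ports are the usual 2.x defaults, so adjust both for your cluster:

    <property>
      <name>mapreduce.jobhistory.address</name>
      <!-- hypothetical host; 10020 is the default RPC port -->
      <value>historyserver.example.com:10020</value>
    </property>
    <property>
      <name>mapreduce.jobhistory.webapp.address</name>
      <!-- web UI where finished jobs can be browsed -->
      <value>historyserver.example.com:19888</value>
    </property>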

Re: 33%, 66%, and 100% *reducer* optimization

2012-10-15 Thread Vinod Kumar Vavilapalli
Reduce has three phases: shuffle, sort, and reduce. So 33% marks the end of the shuffle phase, and 66% marks the end of the sort phase; a reducer showing 50%, for example, is roughly halfway through its sort. Thanks, +Vinod On Oct 15, 2012, at 2:32 PM, Jay Vyas wrote: Hi guys! We all know that there are major milestones in reducers (33%, 66%) In

Re: DN cannot talk to NN using Kerberos on secured hdfs

2012-09-12 Thread Vinod Kumar Vavilapalli
in caps. If that is the case, you should try changing your hostnames to all lower-case. Thanks, +Vinod Kumar Vavilapalli Hortonworks Inc. http://hortonworks.com/ On Sep 12, 2012, at 9:47 AM, Shumin Wu wrote: Hi, I am setting up a secured HDFS using Kerberos. I got NN and 2NN working just fine

Re: Error starting MRAppMaster

2012-06-21 Thread Vinod Kumar Vavilapalli
This has nothing to do with the scheduler; I believe it is a compilation issue. How did you build Hadoop? Also, I've found that the repo at GitHub (which is a mirror of the Apache git repo) doesn't always pick up all the commits immediately. You are better off checking out

Re: Error: Too Many Fetch Failures

2012-06-19 Thread Vinod Kumar Vavilapalli
Replies/more questions inline. I'm using Hadoop 0.23 on 50 machines, each connected with gigabit ethernet and each having a single hard disk. I am getting the following error repeatedly for the TeraSort benchmark. TeraGen runs without error, but TeraSort runs predictably until

Re: Issue with loading the Snappy Codec

2012-04-14 Thread Vinod Kumar Vavilapalli
Hadoop integrates Snappy via installed native libraries instead of snappy-java.jar (ref https://issues.apache.org/jira/browse/HADOOP-7206) - You need to have the snappy system libraries (snappy and snappy-devel) installed before you compile Hadoop. (RPMs are available on the web,
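Once the native libraries are in place, registering the codec is just configuration. A minimal core-site.xml sketch; the codec list below is illustrative, so keep whatever codecs you already use:

    <property>
      <name>io.compression.codecs</name>
      <!-- SnappyCodec loads only if the native snappy libraries are present -->
      <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.SnappyCodec</value>
    </property>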

Re: Issues during setting up hadoop security cluster

2012-01-20 Thread Vinod Kumar Vavilapalli
through jsvc, I don't know if the Java setting stops working when executed through jsvc. But anyway, it still complains that AES-256 is not supported. Any ideas? Thanks Emma -Original Message- From: Vinod Kumar Vavilapalli [mailto:vino...@hortonworks.com] Sent: January 20, 2012 13

Re: Issues during setting up hadoop security cluster

2012-01-19 Thread Vinod Kumar Vavilapalli
Hi, Just this evening, I happened to run into someone who had the same issue. After some debugging, I traced it to the hostnames having upper-case characters. Somehow, when the DataNode or NodeManager tries to get a service ticket for its corresponding service (NameNode and ResourceManager

Re: Yarn Container Limit

2012-01-10 Thread Vinod Kumar Vavilapalli
You can use yarn.nodemanager.resource.memory-mb to set the memory limit on each NodeManager. You should have a good look at http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/ClusterSetup.html . It has enough information to get you a good distance. HTH. +Vinod On Tue, Jan 10,
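As a concrete sketch, this goes in yarn-site.xml on each node; the value is illustrative, so size it to the node's actual RAM:

    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <!-- total memory, in MB, this NodeManager offers to containers -->
      <value>8192</value>
    </property>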

Re: Container launch from appmaster

2012-01-10 Thread Vinod Kumar Vavilapalli
Yes, you can. http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html#Writing_an_ApplicationMaster should give you a very good idea and example code for this. But the requirements are not hard constraints: if the scheduler cannot find free resources on

Re: pointing mapred.local.dir to a ramdisk

2011-10-03 Thread Vinod Kumar Vavilapalli
This must be related to some kind of permissions problem. It would help if you could paste the corresponding source code for FileUtil.copy(); it is hard to track down across different versions otherwise. Thanks, +Vinod On Mon, Oct 3, 2011 at 9:28 PM, Raj V rajv...@yahoo.com wrote: Eric Yes. The owner is hdfs and

Re: Java programmatic authentication of Hadoop Kerberos

2011-09-22 Thread Vinod Kumar Vavilapalli
You may be missing the Kerberos principal for the NameNode in the configuration you use to connect to it. Check your configuration for dfs.namenode.kerberos.principal and set it to the same value as on the NN. HTH +Vinod On Thu, Sep 22, 2011 at 4:06 AM, Sivva svijaysand...@gmail.com wrote: Hi
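For instance, on the client side in hdfs-site.xml; the principal below is a hypothetical example and must match whatever the NameNode itself is configured with:

    <property>
      <name>dfs.namenode.kerberos.principal</name>
      <!-- example only; _HOST expands to the NameNode's hostname at runtime -->
      <value>nn/_HOST@EXAMPLE.COM</value>
    </property>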