Re: What is the class that launches the reducers?

2016-08-26 Thread Hitesh Shah
Have you considered trying to use Tez with a 3-vertex DAG instead of trying to change the MR framework? i.e. A->B, A->C, B->C where A is the original map, C is the reducer and B being the verification stage I assume and C is configured to not start doing any work until B’s verification
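The dependency shape described in this message can be sketched in plain Python. This is not the Tez API. The vertex names A, B and C and the three edges come from the message; the dictionary layout and the function name are illustrative only.

```python
from graphlib import TopologicalSorter

# Each vertex maps to the set of vertices it depends on (its inputs).
# Edges from the message: A->B, A->C, B->C.
dag = {
    "A": set(),        # original map stage
    "B": {"A"},        # verification stage
    "C": {"A", "B"},   # reducer: needs A's output and B's result
}


def execution_order(graph):
    """Return one valid order in which the vertices can run."""
    return list(TopologicalSorter(graph).static_order())


order = execution_order(dag)
print(order)
```

Because C lists both A and B as inputs, any valid ordering places C after B, which is the "do not start until verification is done" property the message is after.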

Re: how to use Yarn API to find task/attempt status

2016-03-10 Thread Hitesh Shah
You would use YARN APIs as mentioned by David. Look for “PendingMB” from “RM:8088/jmx” to see allocated/reserved/pending stats on a per queue basis. There is probably a WS that exposes similar data. At the app level, something like
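A rough sketch of pulling per-queue memory numbers out of a JMX dump follows. Only "PendingMB" and the "RM:8088/jmx" endpoint come from the message. The bean naming scheme (q0=..., q1=...), the "AllocatedMB" and "ReservedMB" field names, and the sample payload are assumptions and should be checked against the output of your own ResourceManager.

```python
import json

# Hand-written sample in the shape the /jmx endpoint is assumed to return.
SAMPLE_JMX = json.loads("""
{"beans": [
  {"name": "Hadoop:service=ResourceManager,name=QueueMetrics,q0=root",
   "AllocatedMB": 4096, "ReservedMB": 0, "PendingMB": 2048},
  {"name": "Hadoop:service=ResourceManager,name=QueueMetrics,q0=root,q1=default",
   "AllocatedMB": 4096, "ReservedMB": 0, "PendingMB": 2048},
  {"name": "Hadoop:service=ResourceManager,name=ClusterMetrics",
   "NumActiveNMs": 3}
]}
""")


def queue_memory_stats(jmx):
    """Map queue path (e.g. 'root.default') to its MB counters."""
    stats = {}
    for bean in jmx.get("beans", []):
        name = bean.get("name", "")
        if "name=QueueMetrics" not in name or "PendingMB" not in bean:
            continue
        parts = dict(p.split("=", 1) for p in name.split(",") if "=" in p)
        if "user" in parts:
            continue  # skip per-user beans, keep only queue-level ones
        # Queue levels are assumed to be encoded as q0, q1, ... in the bean name.
        levels = sorted(
            (k for k in parts if k[0] == "q" and k[1:].isdigit()),
            key=lambda k: int(k[1:]),
        )
        queue = ".".join(parts[k] for k in levels)
        stats[queue] = {
            key: bean.get(key) for key in ("AllocatedMB", "ReservedMB", "PendingMB")
        }
    return stats


stats = queue_memory_stats(SAMPLE_JMX)
print(stats)
```

Against a live cluster, the parsed JSON from the endpoint named in the message would be passed in place of SAMPLE_JMX.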

Re: Can't run hadoop examples with YARN Single node cluster

2016-03-07 Thread Hitesh Shah
+common-user On Mar 7, 2016, at 3:42 PM, Hitesh Shah <hit...@apache.org> wrote: > > On Mar 7, 2016, at 1:50 PM, José Luis Larroque <larroques...@gmail.com> wrote: > >> Hi again guys, i could, finally, find what the issue was!!!

Re: Yarn app: Cannot run "java -jar" container

2016-01-22 Thread Hitesh Shah
Ideally, the “yarn logs -application” command should give you the logs for the container in question and the stdout/stderr there usually gives you a good indication on what is going wrong. Second more complex option: - Set yarn.nodemanager.delete.debug-delay-sec to say 1200 or a large

Re: Should AMRMClientAsync#CallbackHandler add method onAMCommand ?

2015-08-13 Thread Hitesh Shah
Please look at CallbackHandler::onShutdownRequest() thanks — Hitesh On Aug 13, 2015, at 6:55 AM, Jeff Zhang zjf...@gmail.com wrote: I see that AllocateResponse has AMCommand which may request AM to resync or shutdown, but I don't see AMRMClientAsync#CallbackHandler has any method to

Re: node location of mapreduce task from CLI

2015-07-13 Thread Hitesh Shah
Maybe try the web services for the MR AM: https://hadoop.apache.org/docs/r2.7.0/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapredAppMasterRest.html ? — Hitesh On Jul 13, 2015, at 3:17 PM, Tomas Delvechio tomasdelvechi...@gmail.com wrote: Hi for all, I'm trying to get the

Re: timelineclient in 2.4.0.2.1.11.0-891

2015-07-10 Thread Hitesh Shah
The error seems to clearly indicate that you are submitting to an invalid queue: “java.io.IOException: Failed to run job : Application application_1435937105729_0100 submitted by user to unknown queue: default” You may want to address the queue name issue first before looking into

Re: run yarn container as specific user

2014-12-11 Thread Hitesh Shah
Is your app code running within the container also being run within a UGI.doAs() ? You can use the following in your code to create a UGI for the “actual” user and run all the logic within that: code actualUserUGI = UserGroupInformation.createRemoteUser(System

Re: building Apache Hadoop (main, Tez)

2014-11-22 Thread Hitesh Shah
Hi Alexey, Would you mind sharing details on the issues that you are facing? For both hadoop and tez, refer to the respective BUILDING.txt as it contains some basic information on required tools to build the project ( maven, protoc, etc ). For hadoop, you should just need to run “mvn

Re: in MR on YARN, can I change my applicationMaster code?

2014-11-07 Thread Hitesh Shah
Have you considered https://issues.apache.org/jira/browse/MAPREDUCE-4421 ? — Hitesh On Nov 6, 2014, at 4:09 PM, Yang tedd...@gmail.com wrote: we are hit with this bug https://issues.apache.org/jira/browse/YARN-2175 I could either change the NodeManager, or ApplicationMaster, but NM

Re: Hadoop 2.0 job simulation using MiniCluster

2014-10-21 Thread Hitesh Shah
Maybe check TestMRJobs.java ( hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/TestMRJobs.java ) ? — Hitesh On Oct 21, 2014, at 10:01 AM, Yehia Elshater y.z.elsha...@gmail.com wrote: Hi All, I am wondering

Re: ResourceManager shutting down

2014-03-14 Thread Hitesh Shah
Hi John Would you mind filing a jira with more details. The RM going down just because a host was not resolvable or DNS timed out is something that should be addressed. thanks -- Hitesh On Mar 13, 2014, at 2:29 PM, John Lilley wrote: Never mind… we figured out its DNS entry was going

Re: ResourceManager shutting down

2014-03-13 Thread Hitesh Shah
Hi John Would you mind filing a jira with more details. The RM going down just because a host was not resolvable or DNS timed out is something that should be addressed. thanks -- Hitesh On Mar 13, 2014, at 2:29 PM, John Lilley wrote: Never mind… we figured out its DNS entry was going

Re: Passing data from Client to AM

2014-01-30 Thread Hitesh Shah
Adding values to a Configuration object does not really work unless you serialize the config into a file and send it over to the AM and containers as a local resource. The application code would then need to load in this file using Configuration::addResource(). MapReduce does this by taking in
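The round trip described here (serialize the settings to a file, ship the file, load it on the other side) can be sketched in Python using the `<configuration>`/`<property>` XML layout that Hadoop configuration files use. The property names and the file name are hypothetical. In a real application the Java side would write the file from the Configuration object and load it with Configuration::addResource(), as the message says; this sketch only illustrates the file format and the round trip.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def write_conf(props, path):
    """Write key/value pairs as a Hadoop-style configuration XML file."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)


def read_conf(path):
    """Load the key/value pairs back from the XML file."""
    root = ET.parse(path).getroot()
    return {p.findtext("name"): p.findtext("value") for p in root.findall("property")}


# Hypothetical application settings the client wants the AM to see.
props = {"myapp.input.dir": "/data/in", "myapp.num.workers": "4"}

with tempfile.TemporaryDirectory() as tmp:
    conf_path = os.path.join(tmp, "app-conf.xml")
    write_conf(props, conf_path)   # client side: serialize
    loaded = read_conf(conf_path)  # AM/container side: load the shipped file

print(loaded)
```

The step this sketch leaves out is the one the message stresses: the file has to be registered as a local resource so that YARN places it in the working directory of the AM and containers.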

Re: How to make AM terminate if client crashes?

2014-01-15 Thread Hitesh Shah
You would probably need to bake this into your own application. By default, a client never should need to keep an open active connection with the RM. It could keep an active connection with the AM ( application-specific code required ) but it would then also have to handle failover to a

Re: Unmanaged AMs

2013-11-23 Thread Hitesh Shah
Hello Kishore, An unmanaged AM has no relation to the language being used. An unmanaged AM is an AM that is launched outside of the YARN cluster i.e. manually launched elsewhere and not by the RM ( using the application submission context provided by a client). It was built to be a dev-tool

Re: HDP 2.0 Install fails on repo unavailability

2013-10-24 Thread Hitesh Shah
BCC'ing user@hadoop. This is a question for the ambari mailing list. -- Hitesh On Oct 24, 2013, at 3:36 PM, Jain, Prem wrote: Folks, Trying to install the newly released Hadoop 2.0 using Ambari. I am able to install Ambari, but when I try to install Hadoop 2.0 on rest of the cluster,

Re: simple word count program remains un assigned...

2013-10-19 Thread Hitesh Shah
Hello Gunjan, This mailing list is for Apache Hadoop related questions. Please post questions for other distributions to the appropriate vendor's mailing list. thanks -- Hitesh On Oct 19, 2013, at 11:27 AM, gunjan mishra wrote: Hi I am trying to run a simple word count program , like this ,

Re: Conflicting dependency versions

2013-10-10 Thread Hitesh Shah
Hi Albert, If you are using distributed cache to push the newer version of the guava jars, you can try setting mapreduce.job.user.classpath.first to true. If not, you can try overriding the value of mapreduce.application.classpath to ensure that the dir where the newer guava jars are present

Re: Hadoop Yarn

2013-08-29 Thread Hitesh Shah
Hi Rajesh, Have you looked at re-using the profiling options to inject the jvm options to a defined range of tasks? http://hadoop.apache.org/docs/stable/mapred_tutorial.html#Profiling -- Hitesh On Aug 29, 2013, at 3:51 PM, Rajesh Jain wrote: Hi Vinod These are jvm parameters to inject

Re: setLocalResources() on ContainerLaunchContext

2013-08-06 Thread Hitesh Shah
Hi Krishna, YARN downloads a specified local resource on the container's node from the url specified. In all situations, the remote url needs to be a fully qualified path. To verify that the file at the remote url is still valid, YARN expects you to provide the length and last modified

Re: setLocalResources() on ContainerLaunchContext

2013-08-06 Thread Hitesh Shah
here w.r.t. handling HDFS paths. On Wed, Aug 7, 2013 at 12:35 AM, Hitesh Shah hit...@apache.org wrote: Hi Krishna, YARN downloads a specified local resource on the container's node from the url specified. In all situations, the remote url needs to be a fully qualified path. To verify

Re: yarn Failed to bind to: 0.0.0.0/0.0.0.0:8080

2013-07-10 Thread Hitesh Shah
You are probably hitting a clash with the shuffle port. Take a look at https://issues.apache.org/jira/browse/MAPREDUCE-5036 -- Hitesh On Jul 10, 2013, at 8:19 PM, Harsh J wrote: Please see yarn-default.xml for the list of options you can tweak:

Re: Compile Just a Subproject

2013-06-21 Thread Hitesh Shah
Hello Curtis Try the following: hadoop jar ./target/distributed-shell.jar.rebuilt-one org.apache.hadoop.yarn.applications.distributedshell.Client -jar ... If you are running hadoop without the jar command, it will find the first instance of Client.class in its classpath which I am guessing

Re: jobtracker webservice in 1.2.0

2013-06-20 Thread Hitesh Shah
The webservices were introduced only in the 2.x branch. I don't believe the feature has been ported back to the 1.x line. If it helps, for the 1.x line, you can try appending ?format=json to the urls used in the jobtracker UI to get a dump of the data in json format. thanks -- Hitesh On Jun
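A small helper for building such URLs might look like the following. The host name, port and page names are hypothetical placeholders, and whether a given jobtracker page honors the parameter is not verified here. Only the ?format=json suffix comes from the message.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_json_format(url):
    """Return the URL with format=json set, keeping any other query parameters."""
    parts = urlsplit(url)
    query = [
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k != "format"
    ]
    query.append(("format", "json"))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical jobtracker UI addresses.
plain = with_json_format("http://jobtracker.example.com:50030/jobtracker.jsp")
with_args = with_json_format(
    "http://jobtracker.example.com:50030/jobdetails.jsp?jobid=job_1"
)
print(plain)
print(with_args)
```

Going through the query-string parser rather than concatenating strings avoids producing a second "?" when the UI URL already carries parameters.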

Re: YARN Container's App ID

2013-06-11 Thread Hitesh Shah
Hello Brian, org.apache.hadoop.yarn.api.ApplicationConstants.Environment should have a list of all the information set in the environment. One of these is the container ID. ApplicationAttemptID can be obtained from a container ID object which in turn can be used to get the App Id. -- Hitesh
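The chain described above (container ID, then application attempt ID, then application ID) can be illustrated by parsing the string form of a container ID. The string layout assumed below, including the optional "e<epoch>" segment and the zero padding, is an assumption based on commonly seen IDs and may differ between Hadoop versions. In Java the same information would come from the container ID object itself, as the message describes.

```python
import re

# Assumed layout: container_[e<epoch>_]<clusterTimestamp>_<appSeq>_<attempt>_<containerSeq>
_CONTAINER_ID = re.compile(
    r"^container_(?:e(?P<epoch>\d+)_)?"
    r"(?P<ts>\d+)_(?P<app>\d+)_(?P<attempt>\d+)_(?P<container>\d+)$"
)


def ids_from_container_id(container_id):
    """Derive the application attempt ID and application ID strings."""
    m = _CONTAINER_ID.match(container_id)
    if m is None:
        raise ValueError(f"unrecognized container ID: {container_id!r}")
    ts, app, attempt = m["ts"], int(m["app"]), int(m["attempt"])
    return {
        "attempt_id": f"appattempt_{ts}_{app:04d}_{attempt:06d}",
        "application_id": f"application_{ts}_{app:04d}",
    }


# Illustrative container ID, not taken from a real cluster.
ids = ids_from_container_id("container_1435937105729_0100_01_000002")
print(ids)
```

Inside a running container the ID string would be read from the environment. The variable name to read is the one listed in ApplicationConstants.Environment for your Hadoop version, so it is deliberately not hard-coded in this sketch.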

Re: YARN Container's App ID

2013-06-11 Thread Hitesh Shah
, 2013, at 12:14 PM, Brian C. Huffman wrote: Hitesh, Is this only in trunk? I'm currently running 2.0.3-alpha and I don't see it there. I also don't see it in the latest 2.0.5. Thanks, Brian On 06/11/2013 02:54 PM, Hitesh Shah wrote: Hello Brian

Re: Apache Flume Properties File

2013-05-24 Thread Hitesh Shah
Hello Raj BCC-ing user@hadoop and user@hive Could you please not cross-post questions to multiple mailing lists? For questions on hadoop, go to user@hadoop. For questions on hive, please send them to the hive mailing list and not the user@hadoop mailing list. Likewise for flume. thanks --

Re: getAllocatedContainers() is not returning when ran against 2.0.3-alpha

2013-04-03 Thread Hitesh Shah
If I understand your question, you are expecting all the containers to be allocated in one go? Or are you seeing your application hang because it asked for 10 containers but it only received a total of 9 even after repeated calls to the RM? There is no guarantee that you will be allocated

Re: The most newbie question ever

2013-03-21 Thread Hitesh Shah
Also, BUILDING.txt can be found at the top level directory of the checked out code. -- Hitesh On Mar 21, 2013, at 5:39 PM, Hitesh Shah wrote: Assuming you have checked out the hadoop source code into /home/keithomas/hadoop-common/ , you need to run the maven command in that directory

Re: YARN Features

2013-03-12 Thread Hitesh Shah
Answers regarding DistributedShell. https://issues.apache.org/jira/secure/attachment/12486023/MapReduce_NextGen_Architecture.pdf has some details on YARN's architecture. -- Hitesh On Mar 12, 2013, at 7:31 AM, Ioan Zeng wrote: Another point I would like to evaluate is the Distributed Shell

Re: YARN Features

2013-03-12 Thread Hitesh Shah
. ( http://riccomini.name/posts/hadoop/2012-10-12-hortonworks-yarn-meetup/ might be of help ) Thanks, Ioan On Tue, Mar 12, 2013 at 8:47 PM, Hitesh Shah hit...@hortonworks.com wrote: Answers regarding DistributedShell. https://issues.apache.org/jira/secure/attachment/12486023

Re: Installing Hadoop on RHEL 6.2

2013-02-13 Thread Hitesh Shah
You could try using Ambari. http://incubator.apache.org/ambari/ http://incubator.apache.org/ambari/1.2.0/installing-hadoop-using-ambari/content/index.html -- Hitesh On Feb 13, 2013, at 11:00 AM, Shah, Rahul1 wrote: Hi, Can someone help me with installation of Hadoop on cluster with RHEL

Re: “hadoop namenode -format” formats wrong directory

2013-02-06 Thread Hitesh Shah
Try running the command using hadoop --config /etc/hadoop/conf to make sure it is looking at the right conf dir. It would help to understand how you installed hadoop - local build/rpm, etc .. to figure out which config dir is being looked at by default. -- Hitesh On Feb 6, 2013, at 7:25 AM,

Re: hortonworks install fail

2013-01-10 Thread Hitesh Shah
Hi ambari-user@ is probably the better list for this. It seems like your puppet command is timing out. Could you reply back with the contents of the /var/log/puppet_apply.log from the node in question? Also, it might be worth waiting a few days for the next release of ambari which should

Re: Hive error when loading csv data.

2012-06-26 Thread Hitesh Shah
Michael's suggestion was to change your data to: c|zxy|xyz d|abc,def|abcd and then use | as the delimiter. -- Hitesh On Jun 26, 2012, at 2:30 PM, Sandeep Reddy P wrote: Thanks for the reply. I didnt get that Michael. My f2 should be abc,def On Tue, Jun 26, 2012 at 4:00 PM, Michael Segel
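To see why the change of delimiter helps, here is a minimal Python sketch that splits the two rows from the message on "|". The rows are taken from the message; the variable names are illustrative. On the Hive side the table would need to be declared with "|" as its field terminator, which is not shown in this thread.

```python
import csv
import io

# The two rows from the message, one record per line.
raw = "c|zxy|xyz\nd|abc,def|abcd\n"

rows = list(csv.reader(io.StringIO(raw), delimiter="|"))
for row in rows:
    print(row)
```

With "|" as the separator, the comma inside "abc,def" is ordinary data, so the second field of the second row survives as a single value.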

Re: Running Distributed shell in hadoop0.23

2011-12-15 Thread Hitesh Shah
The shell script is invoked within the context of a container launched by the NodeManager. If you are creating a directory using a relative path, it will be created relative to the container's working directory and cleaned up when the container completes. If you really want to see some

Re: Running Distributed shell in hadoop0.23

2011-12-14 Thread Hitesh Shah
Assuming you have a non-secure cluster setup ( the code does not handle security properly yet ), the following command would run the ls command on 5 allocated containers. $HADOOP_COMMON_HOME/bin/hadoop jar path to hadoop-yarn-applications-distributedshell-0.24.0-SNAPSHOT.jar

Re: Running Distributed shell in hadoop0.23

2011-12-14 Thread Hitesh Shah
On Thu, Dec 15, 2011 at 12:09 AM, Hitesh Shah hit...@hortonworks.com wrote: Assuming you have a non-secure cluster setup ( the code does not handle security properly yet ), the following command would run the ls command on 5 allocated containers. $HADOOP_COMMON_HOME/bin/hadoop jar path