Have you considered trying to use Tez with a 3-vertex DAG instead of trying to
change the MR framework? i.e. A->B, A->C, B->C, where A is the original map, C
is the reducer, and B, I assume, is your verification stage; C is configured
to not start doing any work until B’s verification completes.
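A pseudocode-level sketch of such a DAG using the Tez DAG API (processor class names, parallelism, and the edge properties are placeholder assumptions, not from this thread):

```java
DAG dag = DAG.create("map-verify-reduce");
Vertex a = Vertex.create("A",
    ProcessorDescriptor.create(MapStageProcessor.class.getName()), numMapTasks);
Vertex b = Vertex.create("B",
    ProcessorDescriptor.create(VerifyProcessor.class.getName()), numVerifyTasks);
Vertex c = Vertex.create("C",
    ProcessorDescriptor.create(ReduceStageProcessor.class.getName()), numReduceTasks);
dag.addVertex(a).addVertex(b).addVertex(c);
// A feeds both B and C; C also depends on B, so Tez will not start C's
// work until B's verification output is available.
dag.addEdge(Edge.create(a, b, aToBEdgeProperty));
dag.addEdge(Edge.create(a, c, aToCEdgeProperty));
dag.addEdge(Edge.create(b, c, bToCEdgeProperty));  // EdgeProperty setup elided
```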
You would use the YARN APIs, as mentioned by David. Look for “PendingMB” from
“RM:8088/jmx” to see allocated/reserved/pending stats on a per queue basis.
There is probably a WS that exposes similar data. At the app level, something
like
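For the queue-level stats, a sketch (host and endpoint shapes assumed from a 2.x ResourceManager; `<rm-host>` is a placeholder):

```shell
# JMX view: per-queue QueueMetrics beans carry PendingMB/AllocatedMB/ReservedMB
curl -s 'http://<rm-host>:8088/jmx?qry=Hadoop:service=ResourceManager,name=QueueMetrics,*'

# The scheduler REST API exposes similar per-queue data as JSON/XML
curl -s 'http://<rm-host>:8088/ws/v1/cluster/scheduler'
```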
On Mar 7, 2016, at 3:42 PM, Hitesh Shah <hit...@apache.org> wrote:
>
> On Mar 7, 2016, at 1:50 PM, José Luis Larroque <larroques...@gmail.com> wrote:
>
>> Hi again guys, I could finally find what the issue was!!!
>>
>>
Ideally, the “yarn logs -applicationId” command should give you the logs for the
container in question and the stdout/stderr there usually gives you a good
indication on what is going wrong.
Second more complex option:
- Set yarn.nodemanager.delete.debug-delay-sec to, say, 1200 or a larger value
Please look at CallbackHandler::onShutdownRequest()
thanks
— Hitesh
On Aug 13, 2015, at 6:55 AM, Jeff Zhang zjf...@gmail.com wrote:
I see that AllocateResponse has AMCommand which may request AM to resync or
shutdown, but I don't see that AMRMClientAsync#CallbackHandler has any method to
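A minimal sketch of wiring that callback with the 2.x CallbackHandler interface (the handler body is illustrative, not from the thread):

```java
AMRMClientAsync.CallbackHandler handler = new AMRMClientAsync.CallbackHandler() {
  @Override
  public void onShutdownRequest() {
    // The RM asked this AM to shut down (AMCommand in the allocate
    // response): stop work, try to unregister, and exit.
  }
  // onContainersAllocated, onContainersCompleted, onNodesUpdated,
  // getProgress, and onError also need (possibly no-op) implementations.
};
AMRMClientAsync<ContainerRequest> amRmClient =
    AMRMClientAsync.createAMRMClientAsync(1000, handler);
```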
Maybe try the web services for the MR AM:
https://hadoop.apache.org/docs/r2.7.0/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapredAppMasterRest.html
?
— Hitesh
On Jul 13, 2015, at 3:17 PM, Tomas Delvechio tomasdelvechi...@gmail.com wrote:
Hi all,
I'm trying to get the
The error seems to clearly indicate that you are submitting to an invalid queue:
java.io.IOException: Failed to run job : Application
application_1435937105729_0100 submitted by user to unknown queue:
default
You may want to address the queue name issue first before looking into
Is your app code running within the container also being run within a UGI.doAs()?
You can use the following in your code to create a UGI for the “actual” user
and run all the logic within that:
UserGroupInformation actualUserUGI =
    UserGroupInformation.createRemoteUser(System.getenv(...));
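A sketch of running the logic under that UGI (how the user name reaches the container is an assumption; pass it however your app propagates it):

```java
UserGroupInformation actualUserUGI =
    UserGroupInformation.createRemoteUser(actualUserName);  // e.g. from app config
actualUserUGI.doAs(new PrivilegedExceptionAction<Void>() {
  @Override
  public Void run() throws Exception {
    // All FileSystem/YARN calls made here execute as the "actual" user.
    return null;
  }
});
```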
Hi Alexey,
Would you mind sharing details on the issues that you are facing?
For both hadoop and tez, refer to the respective BUILDING.txt as it contains
some basic information on required tools to build the project ( maven, protoc,
etc ).
For hadoop, you should just need to run “mvn
Have you considered https://issues.apache.org/jira/browse/MAPREDUCE-4421 ?
— Hitesh
On Nov 6, 2014, at 4:09 PM, Yang tedd...@gmail.com wrote:
we are hit with this bug https://issues.apache.org/jira/browse/YARN-2175
I could either change the NodeManager, or ApplicationMaster, but NM
Maybe check TestMRJobs.java (
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/TestMRJobs.java
) ?
— Hitesh
On Oct 21, 2014, at 10:01 AM, Yehia Elshater y.z.elsha...@gmail.com wrote:
Hi All,
I am wondering
Hi John
Would you mind filing a jira with more details. The RM going down just because
a host was not resolvable or DNS timed out is something that should be
addressed.
thanks
-- Hitesh
On Mar 13, 2014, at 2:29 PM, John Lilley wrote:
Never mind… we figured out its DNS entry was going
Adding values to a Configuration object does not really work unless you
serialize the config into a file and send it over to the AM and containers as a
local resource. The application code would then need to load in this file using
Configuration::addResource(). MapReduce does this by taking in
You would probably need to bake this into your own application. By default, a
client never should need to keep an open active connection with the RM. It
could keep an active connection with the AM ( application-specific code
required ) but it would then also have to handle failover to a
Hello Kishore,
An unmanaged AM has no relation to the language being used. An unmanaged AM is
an AM that is launched outside of the YARN cluster i.e. manually launched
elsewhere and not by the RM ( using the application submission context provided
by a client). It was built to be a dev-tool
BCC'ing user@hadoop.
This is a question for the ambari mailing list.
-- Hitesh
On Oct 24, 2013, at 3:36 PM, Jain, Prem wrote:
Folks,
Trying to install the newly released Hadoop 2.0 using Ambari. I am able to
install Ambari, but when I try to install Hadoop 2.0 on rest of the cluster,
Hello Gunjan,
This mailing list is for Apache Hadoop related questions. Please post questions
for other distributions to the appropriate vendor's mailing list.
thanks
-- Hitesh
On Oct 19, 2013, at 11:27 AM, gunjan mishra wrote:
Hi, I am trying to run a simple word count program, like this:
Hi Albert,
If you are using distributed cache to push the newer version of the guava jars,
you can try setting mapreduce.job.user.classpath.first to true. If not, you
can try overriding the value of mapreduce.application.classpath to ensure that
the dir where the newer guava jars are present
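For the first option, a sketch of the job configuration:

```xml
<!-- Prefer jars shipped by the user (e.g. the newer guava) over the
     cluster's copies on the task classpath -->
<property>
  <name>mapreduce.job.user.classpath.first</name>
  <value>true</value>
</property>
```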
Hi Rajesh,
Have you looked at re-using the profiling options to inject the jvm options to
a defined range of tasks?
http://hadoop.apache.org/docs/stable/mapred_tutorial.html#Profiling
-- Hitesh
On Aug 29, 2013, at 3:51 PM, Rajesh Jain wrote:
Hi Vinod
These are jvm parameters to inject
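Following the suggestion above, a sketch using the 1.x property names from that tutorial (the task range and the injected JVM option are placeholders):

```
mapred.task.profile=true
mapred.task.profile.maps=0-2
mapred.task.profile.params=-agentlib:myagent   # any JVM args to inject
```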
Hi Krishna,
YARN downloads a specified local resource on the container's node from the url
specified. In all situations, the remote url needs to be a fully qualified
path. To verify that the file at the remote url is still valid, YARN expects
you to provide the length and last modified
here w.r.t. handling HDFS
paths.
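A sketch of registering such a file (the resource name and path variables are placeholders; the length/timestamp must match the remote file exactly or localization fails):

```java
FileStatus stat = fs.getFileStatus(jarPath);     // fully-qualified HDFS path
LocalResource rsrc = LocalResource.newInstance(
    ConverterUtils.getYarnUrlFromPath(jarPath),
    LocalResourceType.FILE,
    LocalResourceVisibility.APPLICATION,
    stat.getLen(),                               // expected length
    stat.getModificationTime());                 // expected last-modified time
localResources.put("my-app.jar", rsrc);
```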
You are probably hitting a clash with the shuffle port. Take a look at
https://issues.apache.org/jira/browse/MAPREDUCE-5036
-- Hitesh
On Jul 10, 2013, at 8:19 PM, Harsh J wrote:
Please see yarn-default.xml for the list of options you can tweak:
Hello Curtis
Try the following:
hadoop jar ./target/distributed-shell.jar.rebuilt-one
org.apache.hadoop.yarn.applications.distributedshell.Client -jar ...
If you are running hadoop without the jar command, it will find the first
instance of Client.class in its classpath which I am guessing
The webservices were introduced only in the 2.x branch. I don't believe the
feature has been ported back to the 1.x line. If it helps, for the 1.x line,
you can try appending ?format=json to the urls used in the jobtracker UI to
get a dump of the data in json format.
thanks
-- Hitesh
On Jun
Hello Brian,
org.apache.hadoop.yarn.api.ApplicationConstants.Environment should have a list
of all the information set in the environment.
One of these is the container ID. ApplicationAttemptID can be obtained from a
container ID object which in turn can be used to get the App Id.
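Illustrative only: the textual form of a container ID encodes the application ID, so the relationship above can be seen by splitting the string. Real code should use the YARN API objects (ContainerId → ApplicationAttemptId → ApplicationId) rather than string handling; the ID value below is made up.

```java
public class ContainerIdDemo {
    // container_<clusterTimestamp>_<appId>_<attemptId>_<containerSeq>
    static String appIdOf(String containerId) {
        String[] p = containerId.split("_");
        return "application_" + p[1] + "_" + p[2];
    }

    public static void main(String[] args) {
        String cid = "container_1381789324000_0005_01_000002";
        System.out.println(appIdOf(cid));  // application_1381789324000_0005
    }
}
```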
-- Hitesh
, 2013, at 12:14 PM, Brian C. Huffman wrote:
Hitesh,
Is this only in trunk? I'm currently running 2.0.3-alpha and I don't see it
there. I also don't see it in the latest 2.0.5.
Thanks,
Brian
On 06/11/2013 02:54 PM, Hitesh Shah wrote:
Hello Brian
Hello Raj
BCC-ing user@hadoop and user@hive
Could you please not cross-post questions to multiple mailing lists?
For questions on hadoop, go to user@hadoop. For questions on hive, please send
them to the hive mailing list and not the user@hadoop mailing list. Likewise
for flume.
thanks
--
If I understand your question, you are expecting all the containers to be
allocated in one go? Or are you seeing your application hang because it asked
for 10 containers but it only received a total of 9 even after repeated calls
to the RM?
There is no guarantee that you will be allocated
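A sketch of the expected client pattern (synchronous AMRMClient; variable names and the heartbeat pacing are assumptions):

```java
// The RM hands out containers incrementally across heartbeats; keep
// calling allocate() and accumulate results until the request is satisfied.
List<Container> acquired = new ArrayList<>();
while (acquired.size() < numRequested) {
  AllocateResponse resp = amRmClient.allocate(0.1f);
  acquired.addAll(resp.getAllocatedContainers());
  Thread.sleep(1000);  // heartbeat pacing
}
```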
Also, BUILDING.txt can be found at the top level directory of the checked out
code.
-- Hitesh
On Mar 21, 2013, at 5:39 PM, Hitesh Shah wrote:
Assuming you have checked out the hadoop source code into
/home/keithomas/hadoop-common/ , you need to run the maven command in that
directory
Answers regarding DistributedShell.
https://issues.apache.org/jira/secure/attachment/12486023/MapReduce_NextGen_Architecture.pdf
has some details on YARN's architecture.
-- Hitesh
On Mar 12, 2013, at 7:31 AM, Ioan Zeng wrote:
Another point I would like to evaluate is the Distributed Shell.
( http://riccomini.name/posts/hadoop/2012-10-12-hortonworks-yarn-meetup/ might
be of help )
Thanks,
Ioan
On Tue, Mar 12, 2013 at 8:47 PM, Hitesh Shah hit...@hortonworks.com wrote:
Answers regarding DistributedShell.
https://issues.apache.org/jira/secure/attachment/12486023
You could try using Ambari.
http://incubator.apache.org/ambari/
http://incubator.apache.org/ambari/1.2.0/installing-hadoop-using-ambari/content/index.html
-- Hitesh
On Feb 13, 2013, at 11:00 AM, Shah, Rahul1 wrote:
Hi,
Can someone help me with installation of Hadoop on cluster with RHEL
Try running the command using hadoop --config /etc/hadoop/conf to make sure
it is looking at the right conf dir.
It would help to understand how you installed hadoop - local build/rpm, etc ..
to figure out which config dir is being looked at by default.
-- Hitesh
On Feb 6, 2013, at 7:25 AM,
Hi
ambari-user@ is probably the better list for this.
It seems like your puppet command is timing out. Could you reply back with the
contents of the /var/log/puppet_apply.log from the node in question?
Also, it might be worth waiting a few days for the next release of ambari which
should
Michael's suggestion was to change your data to:
c|zxy|xyz
d|abc,def|abcd
and then use | as the delimiter.
-- Hitesh
On Jun 26, 2012, at 2:30 PM, Sandeep Reddy P wrote:
Thanks for the reply.
I didn't get that, Michael. My f2 should be abc,def
On Tue, Jun 26, 2012 at 4:00 PM, Michael Segel
The shell script is invoked within the context of a container launched by the
NodeManager. If you are creating a directory using a relative path, it will be
created relative of the container's working directory and cleaned up when the
container completes.
If you really want to see some
Assuming you have a non-secure cluster setup ( the code does not handle
security properly yet ), the following command would run the ls command on 5
allocated containers.
$HADOOP_COMMON_HOME/bin/hadoop jar <path to
hadoop-yarn-applications-distributedshell-0.24.0-SNAPSHOT.jar>