Re: how to use Yarn API to find task/attempt status

2016-03-10 Thread David Morel

On 10 Mar 2016, at 18:21, Frank Luo wrote:


Thanks David/Jeff.

To avoid further confusion, let me make sure I am clear on what I am 
trying to do: I would like to know how many hours a day my cluster 
is running at full capacity and, when that happens, how long my 
waiting queue is. I found similar information in Ambari, as below, but 
I’d like to dive deeper, hence the question.


From what I see, per-job container information, especially pending 
containers, is only available from an application’s trackingUrl, but 
that only applies to M/R jobs. I am not able to get the same 
information from a Tez application’s trackingUrl (the Tez URL 
doesn’t do anything on HDP 2.2). So how does Ambari find this 
information?


Using the REST API, you'd query the resource manager's "apps" method, 
then the application masters through the RM proxy with the "jobs" method 
(sequentially, using each of the app ids found in step 1 in turn). This 
works for MR; there used to be an issue with Spark jobs, which I haven't 
looked into. This only covers running jobs; you'd probably want to query 
the history server too, which may return more complete info with less 
indirection. Also, have a look at the "scheduler" method on the RM, which 
you may find useful.
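
The two-step lookup described above could be sketched roughly as follows. 
This is a minimal illustration, not a tested client: the RM address is a 
placeholder, the function names are made up for this example, and it 
assumes the JSON layouts described in the ResourceManager and MapReduce 
Application Master REST docs linked below (an "apps"/"app" list with 
"id" and "applicationType", and a "jobs"/"job" list with counters such 
as "mapsPending" and "reducesRunning").

```python
import json
from urllib.request import urlopen

RM = "http://rm.example.com:8088"  # hypothetical ResourceManager address


def fetch_json(url):
    """GET a URL and decode the JSON body."""
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def running_mr_app_ids(apps_doc):
    """Extract the ids of MapReduce apps from an RM 'apps' response."""
    # The RM may return {"apps": null} when nothing matches, hence the 'or'.
    apps = (apps_doc.get("apps") or {}).get("app") or []
    return [a["id"] for a in apps if a.get("applicationType") == "MAPREDUCE"]


def task_counts(jobs_doc):
    """Sum pending/running/completed tasks over an AM 'jobs' response."""
    totals = {"pending": 0, "running": 0, "completed": 0}
    jobs = (jobs_doc.get("jobs") or {}).get("job") or []
    for job in jobs:
        totals["pending"] += job.get("mapsPending", 0) + job.get("reducesPending", 0)
        totals["running"] += job.get("mapsRunning", 0) + job.get("reducesRunning", 0)
        totals["completed"] += job.get("mapsCompleted", 0) + job.get("reducesCompleted", 0)
    return totals


def cluster_task_counts(rm=RM):
    """Step 1: list running apps on the RM. Step 2: ask each AM via the RM proxy."""
    apps_doc = fetch_json(rm + "/ws/v1/cluster/apps?states=RUNNING")
    grand = {"pending": 0, "running": 0, "completed": 0}
    for app_id in running_mr_app_ids(apps_doc):
        jobs_doc = fetch_json(rm + "/proxy/" + app_id + "/ws/v1/mapreduce/jobs")
        for key, value in task_counts(jobs_doc).items():
            grand[key] += value
    return grand
```

Calling cluster_task_counts() against a live RM would return one dict 
of totals across all running MR applications; non-MR applications (Tez, 
Spark) are skipped here because their application masters do not 
necessarily expose the same "jobs" method.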


The docs are here:
https://hadoop.apache.org/docs/r2.7.1/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html
https://hadoop.apache.org/docs/r2.7.1/hadoop-yarn/hadoop-yarn-site/NodeManagerRest.html

For MR stuff:
https://hadoop.apache.org/docs/r2.7.1/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapredAppMasterRest.html
https://hadoop.apache.org/docs/r2.7.1/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/HistoryServerRest.html

But most useful is probably the timeline server, which I haven't had a 
chance to use and which possibly provides what you need:

https://hadoop.apache.org/docs/r2.7.1/hadoop-yarn/hadoop-yarn-site/TimelineServer.html#Timeline_Server_REST_API_v1

All this from memory since I haven't touched a cluster lately, and 
hoping it's not completely missing the point ;-)


David




[inline image: image001.png]

From: David Morel [mailto:dmo...@amakuru.net]
Sent: Thursday, March 10, 2016 1:03 AM
To: Jeff Zhang
Cc: user@hadoop.apache.org; Frank Luo
Subject: Re: how to use Yarn API to find task/attempt status


The REST API should help. A working implementation (in Perl, not Java, 
sorry) is available here: http://search.cpan.org/dist/Net-Hadoop-YARN/

Read the comments, they matter :-)
On 10 March 2016 at 7:28 AM, "Jeff Zhang" <zjf...@gmail.com> wrote:

If it is for M/R, then maybe this is what you want
https://hadoop.apache.org/docs/r2.6.0/api/org/apache/hadoop/mapreduce/JobStatus.html



On Thu, Mar 10, 2016 at 1:58 PM, Frank Luo <j...@merkleinc.com> wrote:
Let’s say there are 10 standard M/R jobs running. How do I find out how 
many tasks are done/running/pending?


From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Wednesday, March 09, 2016 9:33 PM
To: Frank Luo
Cc: user@hadoop.apache.org
Subject: Re: how to use Yarn API to find task/attempt status

I don't think this is related to YARN. YARN doesn't know about tasks or 
task attempts; it only knows about containers. So it is up to your 
application to provide such a function.


On Thu, Mar 10, 2016 at 11:29 AM, Frank Luo <j...@merkleinc.com> wrote:

Has anyone had a similar issue and knows the answer?

From: Frank Luo
Sent: Wednesday, March 09, 2016 1:59 PM
To: 'user@hadoop.apache.org'
Subject: how to use Yarn API to find task/attempt status

I have a need to programmatically find out how many tasks are pending 
in Yarn. Is there a way to do it through a Java API?


I looked at YarnClient, but was not able to find what I need.

Thx in advance.

Frank Luo

This email and any attachments transmitted with it are intended for 
use by the intended recipient(s) only. If you have received this email 
in error, please notify the sender immediately and then delete it. If 
you are not the intended recipient, you must not keep, use, disclose, 
copy or distribute this email without the author’s prior permission. 
We take precautions to minimize the risk of transmitting software 
viruses, but we advise you to perform your own virus checks on any 
attachment to this message. We cannot accept liability for any loss or 
damage caused by software viruses. The information contained in this 
communication may be confidential and may be subject to the 
attorney-client privilege.




--
Best Regards

Jeff Zhang


Re: how to use Yarn API to find task/attempt status

2016-03-10 Thread Hitesh Shah
You would use the YARN APIs as mentioned by David. Look for “PendingMB” in 
“RM:8088/jmx” to see allocated/reserved/pending stats on a per-queue basis. 
There is probably a web service (WS) that exposes similar data. At the app 
level, something like 
“http://RM:8088/ws/v1/cluster/apps/application_1457573549805_0001” will 
give you details (only for a running app) on allocated MB, running containers, 
pending resourceRequests, clusterUsagePercentage, etc. 
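
As a rough sketch of the JMX route, something like the following could 
pull the per-queue numbers out of the JSON returned by RM:8088/jmx. The 
function names are made up for this example. It assumes that queue beans 
carry names of the form 
"Hadoop:service=ResourceManager,name=QueueMetrics,q0=root,q1=default" and 
that per-user beans add a "user=..." part; check the actual output of 
your RM before relying on either assumption.

```python
import json
from urllib.request import urlopen


def fetch_jmx(rm):
    """GET the RM's /jmx endpoint, e.g. rm = "http://RM:8088"."""
    with urlopen(rm + "/jmx", timeout=10) as resp:
        return json.load(resp)


def pending_by_queue(jmx_doc):
    """Map queue path -> pending/allocated/reserved MB from a /jmx document."""
    result = {}
    for bean in jmx_doc.get("beans", []):
        name = bean.get("name", "")
        # Assumption: queue beans are tagged name=QueueMetrics and expose PendingMB.
        if "name=QueueMetrics" not in name or "PendingMB" not in bean:
            continue
        parts = dict(p.split("=", 1) for p in name.split(",") if "=" in p)
        # Assumption: per-user breakdowns carry a user=... part; skip them.
        if "user" in parts:
            continue
        # Rebuild the queue path from the q0=..., q1=... parts of the bean name.
        levels = sorted(
            (k for k in parts if k.startswith("q") and k[1:].isdigit()),
            key=lambda k: int(k[1:]),
        )
        queue = ".".join(parts[k] for k in levels)
        result[queue] = {
            "pendingMB": bean["PendingMB"],
            "allocatedMB": bean.get("AllocatedMB"),
            "reservedMB": bean.get("ReservedMB"),
        }
    return result
```

Polling pending_by_queue(fetch_jmx("http://RM:8088")) at a fixed interval 
and recording the results would give a time series from which "hours at 
full capacity" and "length of the waiting queue" could be estimated.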

thanks
— Hitesh


On Mar 10, 2016, at 9:21 AM, Frank Luo <j...@merkleinc.com> wrote:

> Thanks David/Jeff.
>  
> To avoid further confusion, let me make sure I am clear on what I am trying 
> to do: I would like to know how many hours a day my cluster is running at 
> full capacity and, when that happens, how long my waiting queue is. I 
> found similar information in Ambari, as below, but I’d like to dive deeper, 
> hence the question.
>  
> From what I see, per-job container information, especially pending 
> containers, is only available from an application’s trackingUrl, but that 
> only applies to M/R jobs. I am not able to get the same information from a 
> Tez application’s trackingUrl (the Tez URL doesn’t do anything on HDP 2.2). So 
> how does Ambari find this information?
>  
>  
> 
>  
> From: David Morel [mailto:dmo...@amakuru.net] 
> Sent: Thursday, March 10, 2016 1:03 AM
> To: Jeff Zhang
> Cc: user@hadoop.apache.org; Frank Luo
> Subject: Re: how to use Yarn API to find task/attempt status
>  
> The REST API should help. A working implementation (in Perl, not Java, sorry) 
> is available here: http://search.cpan.org/dist/Net-Hadoop-YARN/
> Read the comments, they matter :-)
> 
> On 10 March 2016 at 7:28 AM, "Jeff Zhang" <zjf...@gmail.com> wrote:
> If it is for M/R, then maybe this is what you want 
> https://hadoop.apache.org/docs/r2.6.0/api/org/apache/hadoop/mapreduce/JobStatus.html
>  
>  
>  
> On Thu, Mar 10, 2016 at 1:58 PM, Frank Luo <j...@merkleinc.com> wrote:
> Let’s say there are 10 standard M/R jobs running. How do I find out how many 
> tasks are done/running/pending?
>  
> From: Jeff Zhang [mailto:zjf...@gmail.com] 
> Sent: Wednesday, March 09, 2016 9:33 PM
> To: Frank Luo
> Cc: user@hadoop.apache.org
> Subject: Re: how to use Yarn API to find task/attempt status
>  
> I don't think this is related to YARN. YARN doesn't know about tasks or task 
> attempts; it only knows about containers. So it is up to your application to 
> provide such a function. 
>  
> On Thu, Mar 10, 2016 at 11:29 AM, Frank Luo <j...@merkleinc.com> wrote:
> Has anyone had a similar issue and knows the answer?
>  
> From: Frank Luo 
> Sent: Wednesday, March 09, 2016 1:59 PM
> To: 'user@hadoop.apache.org'
> Subject: how to use Yarn API to find task/attempt status
>  
> I have a need to programmatically find out how many tasks are pending in 
> Yarn. Is there a way to do it through a Java API?
>  
> I looked at YarnClient, but was not able to find what I need.
>  
> Thx in advance.
>  
> Frank Luo
> 
> 
> 
>  
> --
> Best Regards
> 
> Jeff Zhang

Re: how to use Yarn API to find task/attempt status

2016-03-09 Thread Jeff Zhang
If it is for M/R, then maybe this is what you want
https://hadoop.apache.org/docs/r2.6.0/api/org/apache/hadoop/mapreduce/JobStatus.html



On Thu, Mar 10, 2016 at 1:58 PM, Frank Luo <j...@merkleinc.com> wrote:

> Let’s say there are 10 standard M/R jobs running. How do I find out how many
> tasks are done/running/pending?
>
>
>
> *From:* Jeff Zhang [mailto:zjf...@gmail.com]
> *Sent:* Wednesday, March 09, 2016 9:33 PM
> *To:* Frank Luo
> *Cc:* user@hadoop.apache.org
> *Subject:* Re: how to use Yarn API to find task/attempt status
>
>
>
> I don't think this is related to YARN. YARN doesn't know about tasks or task
> attempts; it only knows about containers. So it is up to your application to
> provide such a function.
>
>
>
> On Thu, Mar 10, 2016 at 11:29 AM, Frank Luo <j...@merkleinc.com> wrote:
>
> Has anyone had a similar issue and knows the answer?
>
>
>
> *From:* Frank Luo
> *Sent:* Wednesday, March 09, 2016 1:59 PM
> *To:* 'user@hadoop.apache.org'
> *Subject:* how to use Yarn API to find task/attempt status
>
>
>
> I have a need to programmatically find out how many tasks are pending in
> Yarn. Is there a way to do it through a Java API?
>
>
>
> I looked at YarnClient, but was not able to find what I need.
>
>
>
> Thx in advance.
>
>
>
> Frank Luo
>
>
>
>
>
>
> --
>
> Best Regards
>
> Jeff Zhang
>
>



-- 
Best Regards

Jeff Zhang


Re: how to use Yarn API to find task/attempt status

2016-03-09 Thread Sultan Alamro

You can still see the task status through the web interfaces.

Look at the end of this page:
https://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-common/ClusterSetup.html

> On Mar 10, 2016, at 12:58 AM, Frank Luo <j...@merkleinc.com> wrote:
> 
> Let’s say there are 10 standard M/R jobs running. How do I find out how many 
> tasks are done/running/pending?
>  
> From: Jeff Zhang [mailto:zjf...@gmail.com] 
> Sent: Wednesday, March 09, 2016 9:33 PM
> To: Frank Luo
> Cc: user@hadoop.apache.org
> Subject: Re: how to use Yarn API to find task/attempt status
>  
> I don't think this is related to YARN. YARN doesn't know about tasks or task 
> attempts; it only knows about containers. So it is up to your application to 
> provide such a function. 
>  
> On Thu, Mar 10, 2016 at 11:29 AM, Frank Luo <j...@merkleinc.com> wrote:
> Has anyone had a similar issue and knows the answer?
>  
> From: Frank Luo 
> Sent: Wednesday, March 09, 2016 1:59 PM
> To: 'user@hadoop.apache.org'
> Subject: how to use Yarn API to find task/attempt status
>  
> I have a need to programmatically find out how many tasks are pending in 
> Yarn. Is there a way to do it through a Java API?
>  
> I looked at YarnClient, but was not able to find what I need.
>  
> Thx in advance.
>  
> Frank Luo
> 
> 
> 
>  
> --
> Best Regards
> 
> Jeff Zhang


RE: how to use Yarn API to find task/attempt status

2016-03-09 Thread Frank Luo
Let’s say there are 10 standard M/R jobs running. How do I find out how many 
tasks are done/running/pending?

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Wednesday, March 09, 2016 9:33 PM
To: Frank Luo
Cc: user@hadoop.apache.org
Subject: Re: how to use Yarn API to find task/attempt status

I don't think this is related to YARN. YARN doesn't know about tasks or task 
attempts; it only knows about containers. So it is up to your application to 
provide such a function.

On Thu, Mar 10, 2016 at 11:29 AM, Frank Luo <j...@merkleinc.com> wrote:
Has anyone had a similar issue and knows the answer?

From: Frank Luo
Sent: Wednesday, March 09, 2016 1:59 PM
To: 'user@hadoop.apache.org'
Subject: how to use Yarn API to find task/attempt status

I have a need to programmatically find out how many tasks are pending in Yarn. 
Is there a way to do it through a Java API?

I looked at YarnClient, but was not able to find what I need.

Thx in advance.

Frank Luo




--
Best Regards

Jeff Zhang
