Re: how to use Yarn API to find task/attempt status
On 10 Mar 2016, at 18:21, Frank Luo wrote:

> Thanks David/Jeff.
>
> To avoid further confusion, let me make sure I am clear on what I am trying to do: I would like to know how many hours in a day my cluster is running at its full capacity, and when that happens, how long is my waiting queue. I found similar information on Ambari as below, but I’d like to dive deeper, hence asking.
>
> From what I see, container per job information, especially pending containers, is only available from an application’s trackingUrl, but that just applies to M/R jobs. I am not able to get the same information from a Tez application’s trackingUrl (Tez’s url doesn’t do anything for hdp2.2). So how does Ambari find the information out?

Using the REST API, you'd first query the resource manager's "apps" method, then query the appmasters through the RM proxy with the "jobs" method (one request per app id found in step 1). This works for MR; there used to be an issue with Spark jobs, but I haven't looked at that. This only covers running jobs; you'd probably want to query the history server too, which may return more complete info with less indirection. Also, have a look at the "scheduler" method on the RM, which you may find useful.

The docs are here:
https://hadoop.apache.org/docs/r2.7.1/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html
https://hadoop.apache.org/docs/r2.7.1/hadoop-yarn/hadoop-yarn-site/NodeManagerRest.html

For MR stuff:
https://hadoop.apache.org/docs/r2.7.1/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapredAppMasterRest.html
https://hadoop.apache.org/docs/r2.7.1/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/HistoryServerRest.html

But most useful is probably the timeline server, which I haven't had a chance to use and which possibly provides what you need:
https://hadoop.apache.org/docs/r2.7.1/hadoop-yarn/hadoop-yarn-site/TimelineServer.html#Timeline_Server_REST_API_v1

All this from memory since I haven't touched a cluster lately, and hoping it's not completely missing the point ;-)

David

> From: David Morel [mailto:dmo...@amakuru.net]
> Sent: Thursday, March 10, 2016 1:03 AM
> To: Jeff Zhang
> Cc: user@hadoop.apache.org; Frank Luo
> Subject: Re: how to use Yarn API to find task/attempt status
>
> The REST API should help. A working implementation (in Perl, not Java, sorry) is visible here: http://search.cpan.org/dist/Net-Hadoop-YARN/
> Read the comments, they matter :-)
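The two-step lookup David describes (the resource manager's "apps" method, then each appmaster's "jobs" method through the RM proxy) might look roughly like the sketch below. It is only an illustration under some assumptions: the endpoint paths and field names (`applicationType`, `mapsPending`, `reducesRunning`, and so on) are taken from the Hadoop 2.7.1 REST docs linked above, the cluster is assumed to accept plain unauthenticated HTTP, and `count_mr_tasks`, `fetch_json` and the `rm_base` address are hypothetical names chosen for this example.

```python
import json
from urllib.request import urlopen


def fetch_json(url):
    """Fetch a URL and decode its JSON body (plain HTTP, no Kerberos/SPNEGO)."""
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def count_mr_tasks(rm_base, fetch=fetch_json):
    """Sum pending/running/completed task counts over all running MR apps.

    rm_base is a placeholder ResourceManager address such as "http://RM:8088".
    fetch is injectable so the logic can be exercised without a cluster.
    """
    totals = {"pending": 0, "running": 0, "completed": 0}

    # Step 1: list running applications from the ResourceManager.
    apps = fetch(f"{rm_base}/ws/v1/cluster/apps?states=RUNNING")
    # The RM returns {"apps": null} when nothing is running.
    app_list = (apps.get("apps") or {}).get("app") or []

    for app in app_list:
        if app.get("applicationType") != "MAPREDUCE":
            continue  # the "jobs" method only exists on MR appmasters

        # Step 2: ask each appmaster, through the RM proxy, for its jobs.
        jobs = fetch(f"{rm_base}/proxy/{app['id']}/ws/v1/mapreduce/jobs")
        for job in (jobs.get("jobs") or {}).get("job") or []:
            totals["pending"] += job.get("mapsPending", 0) + job.get("reducesPending", 0)
            totals["running"] += job.get("mapsRunning", 0) + job.get("reducesRunning", 0)
            totals["completed"] += job.get("mapsCompleted", 0) + job.get("reducesCompleted", 0)
    return totals
```

Calling `count_mr_tasks("http://RM:8088")` against a live cluster would issue one request to the RM plus one per running MR application. As noted in the message, finished jobs would have to come from the history server instead.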
Re: how to use Yarn API to find task/attempt status
You would use the YARN APIs as mentioned by David. Look for “PendingMB” in “RM:8088/jmx” to see allocated/reserved/pending stats on a per-queue basis. There is probably a web service (WS) that exposes similar data.

At the app level, something like “http://RM:8088/ws/v1/cluster/apps/application_1457573549805_0001” will give you details (only for a running app) on allocated MB, running containers, pending resourceRequests, clusterUsagePercentage, etc.

thanks
-- Hitesh

On Mar 10, 2016, at 9:21 AM, Frank Luo wrote:

> To avoid further confusion, let me make sure I am clear on what I am trying to do: I would like to know how many hours in a day my cluster is running at its full capacity, and when that happens, how long is my waiting queue.
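As a rough illustration of the per-queue JMX lookup Hitesh suggests, the sketch below pulls the allocated/reserved/pending memory figures out of a parsed "/jmx" response. Several things here are assumptions rather than facts from the thread: that queue beans are named like `Hadoop:service=ResourceManager,name=QueueMetrics,q0=root,q1=default`, that per-user beans carry a `user=` key, and that the fields are called `AllocatedMB`, `ReservedMB` and `PendingMB`. The helper names `queue_stats` and `fetch_queue_stats` are made up for this example; check the bean names on your own RM before relying on it.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumed bean pattern for queue metrics; verify against your RM's /jmx output.
QUEUE_QUERY = "Hadoop:service=ResourceManager,name=QueueMetrics,*"


def queue_stats(jmx, fields=("AllocatedMB", "ReservedMB", "PendingMB")):
    """Map queue path (e.g. "root.default") to selected metrics from a /jmx reply."""
    stats = {}
    for bean in jmx.get("beans", []):
        name = bean.get("name", "")
        if "name=QueueMetrics" not in name or ",user=" in name:
            continue  # skip unrelated beans and per-user breakdowns
        parts = dict(p.split("=", 1) for p in name.split(",") if "=" in p)
        # q0, q1, ... give the queue hierarchy from the root down.
        levels = sorted(
            (k for k in parts if k[:1] == "q" and k[1:].isdigit()),
            key=lambda k: int(k[1:]),
        )
        queue = ".".join(parts[k] for k in levels)
        stats[queue] = {f: bean.get(f) for f in fields}
    return stats


def fetch_queue_stats(rm_base):
    """Fetch and parse queue metrics from a ResourceManager (plain HTTP only)."""
    url = f"{rm_base}/jmx?qry={quote(QUEUE_QUERY)}"
    with urlopen(url, timeout=10) as resp:
        return queue_stats(json.load(resp))
```

Keeping the parsing separate from the HTTP call means a saved copy of the "/jmx" output can be fed to `queue_stats` directly.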
RE: how to use Yarn API to find task/attempt status
Thanks David/Jeff.

To avoid further confusion, let me make sure I am clear on what I am trying to do: I would like to know how many hours in a day my cluster is running at its full capacity, and when that happens, how long is my waiting queue. I found similar information on Ambari as below, but I’d like to dive deeper, hence asking.

From what I see, container per job information, especially pending containers, is only available from an application’s trackingUrl, but that just applies to M/R jobs. I am not able to get the same information from a Tez application’s trackingUrl (Tez’s url doesn’t do anything for hdp2.2). So how does Ambari find the information out?

[inline image: image001.png]
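One way to approach the goal stated in this message (hours per day at full capacity, and how long the waiting queue is at those times) would be to sample the ResourceManager's cluster metrics at regular intervals and post-process the samples. The sketch below assumes the `/ws/v1/cluster/metrics` endpoint and its `totalMB`, `allocatedMB` and `containersPending` fields as described in the ResourceManager REST docs; nobody in the thread proposed this exact approach. The 99% "full" threshold and the names `sample_metrics` and `capacity_report` are arbitrary choices for illustration.

```python
import json
import time
from urllib.request import urlopen


def sample_metrics(rm_base):
    """Return (unix_seconds, clusterMetrics dict) from a ResourceManager."""
    with urlopen(f"{rm_base}/ws/v1/cluster/metrics", timeout=10) as resp:
        return time.time(), json.load(resp)["clusterMetrics"]


def capacity_report(samples, full_ratio=0.99):
    """Summarise time spent at full memory capacity.

    samples is a time-ordered list of (unix_seconds, clusterMetrics) pairs.
    Each sample is assumed to describe the cluster until the next sample.
    """
    full_seconds = 0.0
    pending_weighted = 0.0
    for (t0, metrics), (t1, _) in zip(samples, samples[1:]):
        total = metrics.get("totalMB", 0)
        if total and metrics.get("allocatedMB", 0) / total >= full_ratio:
            dt = t1 - t0
            full_seconds += dt
            # Weight the pending-container count by how long it was observed.
            pending_weighted += dt * metrics.get("containersPending", 0)
    avg_pending = pending_weighted / full_seconds if full_seconds else 0.0
    return {
        "hours_full": full_seconds / 3600.0,
        "avg_pending_containers": avg_pending,
    }
```

A cron job or small loop calling `sample_metrics` every minute and appending to a file would give `capacity_report` a day's worth of input. The sampling interval bounds the accuracy of the result.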
Re: how to use Yarn API to find task/attempt status
If it is for M/R, then maybe this is what you want:
https://hadoop.apache.org/docs/r2.6.0/api/org/apache/hadoop/mapreduce/JobStatus.html

On Thu, Mar 10, 2016 at 1:58 PM, Frank Luo wrote:

> Let’s say there are 10 standard M/R jobs running. How can I find out how many tasks are done/running/pending?

--
Best Regards

Jeff Zhang
Re: how to use Yarn API to find task/attempt status
You can still see the task status through the web interfaces. Look at the end of this page:
https://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-common/ClusterSetup.html

> On Mar 10, 2016, at 12:58 AM, Frank Luo wrote:
>
> Let’s say there are 10 standard M/R jobs running. How can I find out how many tasks are done/running/pending?
RE: how to use Yarn API to find task/attempt status
Let’s say there are 10 standard M/R jobs running. How can I find out how many tasks are done/running/pending?

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Wednesday, March 09, 2016 9:33 PM
To: Frank Luo
Cc: user@hadoop.apache.org
Subject: Re: how to use Yarn API to find task/attempt status

> I don't think it is related to YARN. YARN doesn't know about tasks or task attempts; it only knows about containers. So it should be your application that provides such a function.
Re: how to use Yarn API to find task/attempt status
I don't think it is related to YARN. YARN doesn't know about tasks or task attempts; it only knows about containers. So it should be your application that provides such a function.

On Thu, Mar 10, 2016 at 11:29 AM, Frank Luo wrote:

> Anyone had a similar issue and knows the answer?
>
> From: Frank Luo
> Sent: Wednesday, March 09, 2016 1:59 PM
> To: 'user@hadoop.apache.org'
> Subject: how to use Yarn API to find task/attempt status
>
> I have a need to programmatically find out how many tasks are pending in Yarn. Is there a way to do it through a Java API?
>
> I looked at YarnClient, but was not able to find what I need.

--
Best Regards

Jeff Zhang
RE: how to use Yarn API to find task/attempt status
Anyone had a similar issue and knows the answer?

From: Frank Luo
Sent: Wednesday, March 09, 2016 1:59 PM
To: 'user@hadoop.apache.org'
Subject: how to use Yarn API to find task/attempt status

> I have a need to programmatically find out how many tasks are pending in Yarn. Is there a way to do it through a Java API?
how to use Yarn API to find task/attempt status
I have a need to programmatically find out how many tasks are pending in Yarn. Is there a way to do it through a Java API?

I looked at YarnClient, but was not able to find what I need.

Thx in advance.

Frank Luo

This email and any attachments transmitted with it are intended for use by the intended recipient(s) only. If you have received this email in error, please notify the sender immediately and then delete it. If you are not the intended recipient, you must not keep, use, disclose, copy or distribute this email without the author’s prior permission. We take precautions to minimize the risk of transmitting software viruses, but we advise you to perform your own virus checks on any attachment to this message. We cannot accept liability for any loss or damage caused by software viruses. The information contained in this communication may be confidential and may be subject to the attorney-client privilege.