Thanks Arun!  It is sometimes hard for me to figure out what is built into YARN 
vs what is done by convention.
John


From: Arun C Murthy [mailto:a...@hortonworks.com]
Sent: Monday, September 23, 2013 5:41 PM
To: user@hadoop.apache.org
Subject: Re: Task status query

Yep, typically the AM should pass its host:port to the task, either on the 
task's command line or in its environment. That is what the MR AM does.
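
For illustration only, here is a rough Python sketch of that convention, not the MR AM's actual code. The "AM" binds a listen socket on an OS-chosen port, puts host:port into the task's environment, and waits for one report. The variable name APP_MASTER_ADDRESS, the JSON-lines message, and the function run_am_once are all made up for this example; in a real YARN application the environment would be set on the container's ContainerLaunchContext rather than through subprocess.

```python
import json
import os
import socket
import subprocess
import sys

# Hypothetical variable name; YARN does not define it. The task source below
# must read the same name.
AM_ADDRESS_ENV = "APP_MASTER_ADDRESS"

# Stand-in for the task executable: reads host:port from its environment,
# connects back to the AM, and sends one JSON status line.
TASK_SOURCE = r'''
import json, os, socket
host, port = os.environ["APP_MASTER_ADDRESS"].rsplit(":", 1)
with socket.create_connection((host, int(port))) as conn:
    report = {"task_id": "task_0", "progress": 0.5}
    conn.sendall(json.dumps(report).encode("utf-8") + b"\n")
'''


def run_am_once():
    """Listen on an ephemeral port, spawn one task, return its first report."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server.listen()
        server.settimeout(30)          # do not wait forever for the task
        host, port = server.getsockname()

        # Hand the address to the task through its environment.
        env = dict(os.environ)
        env[AM_ADDRESS_ENV] = f"{host}:{port}"

        with subprocess.Popen([sys.executable, "-c", TASK_SOURCE], env=env):
            conn, _ = server.accept()
            with conn, conn.makefile("r", encoding="utf-8") as reader:
                return json.loads(reader.readline())


if __name__ == "__main__":
    print(run_am_once())
```

Passing the address on the command line instead would only change how the task reads it (sys.argv rather than os.environ).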

hth,
Arun

On Sep 21, 2013, at 6:52 AM, John Lilley <john.lil...@redpoint.net> wrote:


Thanks Harsh!  The data-transport format is pretty easy, but how is the RPC 
typically set up?  Does the AM open a listen port to accept the RPC from the 
tasks, and then pass the port/URI to the tasks when they are spawned as 
command-line or environment?
john

-----Original Message-----
From: Harsh J [mailto:ha...@cloudera.com]
Sent: Friday, September 20, 2013 7:47 AM
To: <user@hadoop.apache.org>
Subject: Re: Task status query

Right now it's MR-specific (TaskUmbilicalProtocol). YARN doesn't have any 
reusable items here yet, but there are easy-to-use RPC libs such as Avro and 
Thrift out there that make it easy to do such things once you define what you 
want in a schema/spec form.
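
To make "define what you want in a schema/spec form" concrete, here is a small hypothetical sketch. It uses plain JSON lines from the Python standard library instead of Avro or Thrift, purely to keep the example self-contained; the TaskStatus fields and state names are assumptions, not anything YARN or MR defines.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TaskStatus:
    """Hypothetical status message a task would send to its AM."""
    task_id: str
    state: str       # assumed values: "SCHEDULED", "RUNNING", "COMPLETE"
    progress: float  # fraction complete, from 0.0 to 1.0

    def encode(self) -> bytes:
        """Serialize to one newline-terminated JSON line."""
        return json.dumps(asdict(self)).encode("utf-8") + b"\n"

    @staticmethod
    def decode(line: bytes) -> "TaskStatus":
        """Parse one JSON line and reject out-of-range progress values."""
        status = TaskStatus(**json.loads(line.decode("utf-8")))
        if not 0.0 <= status.progress <= 1.0:
            raise ValueError(f"progress out of range: {status.progress}")
        return status


if __name__ == "__main__":
    wire = TaskStatus("task_7", "RUNNING", 0.25).encode()
    print(TaskStatus.decode(wire))
```

With Avro or Thrift the same three fields would go into a schema or IDL file, and the library would generate the encode/decode code.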

On Fri, Sep 20, 2013 at 5:32 PM, John Lilley <john.lil...@redpoint.net> wrote:

Thanks Harsh.  Is this protocol something that is available to all AMs/tasks?  
Or is it up to each AM/task pair to develop their own protocol?
john

-----Original Message-----
From: Harsh J [mailto:ha...@cloudera.com]
Sent: Thursday, September 19, 2013 9:20 PM
To: <user@hadoop.apache.org>
Subject: Re: Task status query

Hi John,

YARN tasks can be more than simple executables. In the case of MR, for example, 
tasks talk to the AM and report their individual progress and counters back to 
it via a specific protocol (over the network), giving the AM more data to 
compute a near-accurate global progress.
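
As a rough sketch of the AM side once such reports arrive, the function below rolls per-task status up into a summary. It assumes a plain unweighted mean over tasks, which is not necessarily how the MR AM weights progress; the function name, state names, and dict layout are hypothetical.

```python
from collections import Counter


def summarize(statuses):
    """Summarize task reports.

    statuses maps task_id -> (state, progress), where state is one of the
    assumed names "SCHEDULED", "RUNNING", "COMPLETE" and progress is a
    fraction from 0.0 to 1.0.
    """
    counts = Counter(state for state, _ in statuses.values())
    running = sorted(p for state, p in statuses.values() if state == "RUNNING")

    # Unweighted mean: complete tasks count as 1.0, running tasks as their
    # reported fraction, everything else as 0.0.
    done = counts["COMPLETE"] + sum(running)
    overall = done / len(statuses) if statuses else 0.0

    return {
        "total": len(statuses),
        "complete": counts["COMPLETE"],
        "running": running,
        "overall": overall,
    }


if __name__ == "__main__":
    print(summarize({
        "t1": ("COMPLETE", 1.0),
        "t2": ("RUNNING", 0.5),
        "t3": ("SCHEDULED", 0.0),
    }))
```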

On Fri, Sep 20, 2013 at 12:18 AM, John Lilley <john.lil...@redpoint.net> wrote:

How does a YARN application master typically query ongoing status
(like percentage completion) of its tasks?

I would like to be able to ultimately relay information to the user like:

100 tasks are scheduled

10 tasks are complete

4 tasks are running and they are (4%, 10%, 50%, 70%) complete

But, given that YARN tasks are simply executables, how can the AM
even get at this information?  Can the AM get access to stdout/stderr?

Thanks

John




--
Harsh J



--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/

