[jira] [Commented] (YARN-4602) Simple and Scalable Message Service for YARN application

2016-08-03 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15405770#comment-15405770
 ] 

Arun Suresh commented on YARN-4602:
---

[~djp], wondering if you've taken a look at Apache REEF. It uses an 
event-driven framework called Wake (https://reef.apache.org/wake.html) that 
also supports messaging. It uses Netty under the hood.

> Simple and Scalable Message Service for YARN application
> 
>
> Key: YARN-4602
> URL: https://issues.apache.org/jira/browse/YARN-4602
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: applications, resourcemanager
> Reporter: Junping Du
> Assignee: Junping Du
>
> In MAPREDUCE-6608 (https://issues.apache.org/jira/browse/MAPREDUCE-6608) we 
> are proposing work-preserving MR AM restart: when the AM fails for some 
> reason, the in-flight tasks keep running/pending until a new AM attempt 
> comes back to continue. One prerequisite is that tasks know where the new 
> AM attempt is launched, so that TaskUmbilicalProtocol can retry between the 
> clients and the new server.
> Other applications running on YARN could have the same requirement. Some 
> applications handle message delivery themselves, e.g. long-running services 
> can leverage the Slider agent to pass messages back and forth. However, 
> this is hard for vanilla YARN applications to achieve because the Hadoop 
> RPC mechanism is essentially one-way communication. Although a two-way 
> mechanism such as heartbeats (between NM-RM or AM-RM) can be built on top 
> of it, it makes less sense to build the same mechanism between the AM and 
> its application containers: the AM would need to handle a massive number of 
> client connections, which could become a new scalability bottleneck and 
> would make state maintenance very complicated. Instead, we need a new 
> message mechanism that is simple and scalable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4602) Simple and Scalable Message Service for YARN application

2016-03-04 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15179660#comment-15179660
 ] 

Junping Du commented on YARN-4602:
--

Some requirements could be: 
1. Message sharing among containers within a YARN application; in particular, 
the AM can send simple messages to each container.
2. Simple message format: a fixed-size string.
3. Scale up to 10,000 - 100,000 message receivers.
4. Latency insensitive: a message may take between 10 seconds and 1 minute to 
be received.
5. Messages won’t get lost (they can be recovered) across NM, RM, or AM 
restarts.
6. Messages are limited to a single application; messages between 
applications are not included.
7. Only one message sender (single topic) as a first step.
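
To make the list above concrete, here is a minimal in-memory sketch in Java of 
what such a service might look like from the application's side. It is not an 
existing YARN API: the class name SingleTopicMailbox, its methods, and the byte 
limit are all hypothetical. It assumes that "fixed size" can be read as a 
maximum size, that receivers poll on their own schedule (which the 10 seconds 
to 1 minute latency budget would allow), and it does not model recovery across 
NM, RM, or AM restarts.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of a single-sender, single-topic mailbox; not a YARN API. */
final class SingleTopicMailbox {
    // Assumption: "fixed size" is treated as an upper bound on the encoded message.
    private final int maxBytes;
    // Messages are retained in order so a slow receiver can catch up later.
    private final List<String> log = new ArrayList<>();

    SingleTopicMailbox(int maxBytes) {
        this.maxBytes = maxBytes;
    }

    /** Called by the single sender (e.g. the AM); returns the message's sequence number. */
    synchronized long publish(String message) {
        if (message.getBytes(StandardCharsets.UTF_8).length > maxBytes) {
            throw new IllegalArgumentException("message exceeds " + maxBytes + " bytes");
        }
        log.add(message);
        return log.size() - 1;
    }

    /** Called by receivers; returns all messages with sequence number >= fromSeq. */
    synchronized List<String> poll(long fromSeq) {
        int start = (int) Math.max(0, Math.min(fromSeq, log.size()));
        return new ArrayList<>(log.subList(start, log.size()));
    }
}
```

A receiver would remember the last sequence number it saw and pass the next 
one to poll(), so a missed polling interval only delays delivery rather than 
dropping a message.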



