[
https://issues.apache.org/jira/browse/APEXCORE-602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15882623#comment-15882623
]
ASF GitHub Bot commented on APEXCORE-602:
-----------------------------------------
GitHub user DT-Priyanka opened a pull request:
https://github.com/apache/apex-core/pull/479
[Review only] APEXCORE-602: group events by cause
This feature groups events that are raised due to a common cause. For example,
if an operator fails, all downstream operators are redeployed, and this
action raises a bunch of events. These events should share a common groupId for
reference.
The code changes follow this path:
**When an operator throws an exception**
1. The StreamingContainer containing the operator generates an event groupId and
raises an OperatorErrorEvent with the generated groupId.
2. The StreamingContainer then sends the groupId to StrAM in the heartbeat.
3. StrAM saves this groupId for future use.
4. When StreamingAppMasterService detects that a container was killed with a
non-zero exit code, it schedules redeployment for all downstream operators.
5. When scheduling redeployment for the downstream operators, StrAM maps the
groupId to all scheduled operators.
6. StrAM then sends undeploy signals to the operators, along with the groupId, in
the heartbeat response. StrAM also raises an OperatorStop event that refers to
the same groupId.
7. The StreamingContainer remembers the groupId and sends it back in the
heartbeat when it starts the operator again.
8. StrAM then uses this groupId to raise the OperatorStart event.
StrAM also tracks container stop and start in order to raise ContainerStop and
ContainerStart events.
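The steps above can be sketched as a small in-memory model. This is only an illustration of how one groupId could travel through the flow, not the code in the pull request: the class `EventGroupingSketch`, its `Event` type, the method names, and the use of a UUID as the groupId are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Hypothetical, simplified model of the flow; none of these types come from apex-core.
class EventGroupingSketch {

  /** Minimal event record: event type, operator id, and the shared groupId. */
  static final class Event {
    final String type;
    final int operatorId;
    final String groupId;

    Event(String type, int operatorId, String groupId) {
      this.type = type;
      this.operatorId = operatorId;
      this.groupId = groupId;
    }
  }

  final List<Event> eventLog = new ArrayList<>();
  // Steps 3 and 5: the master remembers which groupId applies to which operator.
  final Map<Integer, String> groupIdByOperator = new HashMap<>();

  /** Step 1: the container generates a groupId and raises the error event. */
  String onOperatorFailure(int operatorId) {
    String groupId = UUID.randomUUID().toString(); // assumption: any unique id would do
    eventLog.add(new Event("OperatorError", operatorId, groupId));
    return groupId; // Step 2: this value would travel to the master in a heartbeat
  }

  /** Steps 4-6: map the groupId to every affected operator and raise stop events. */
  void scheduleRedeploy(String groupId, List<Integer> affectedOperators) {
    for (Integer op : affectedOperators) {
      groupIdByOperator.put(op, groupId);
      eventLog.add(new Event("OperatorStop", op, groupId));
    }
  }

  /** Steps 7-8: a restarted operator reports back and gets a start event in the same group. */
  void onOperatorRestarted(int operatorId) {
    String groupId = groupIdByOperator.remove(Integer.valueOf(operatorId));
    eventLog.add(new Event("OperatorStart", operatorId, groupId));
  }

  public static void main(String[] args) {
    EventGroupingSketch sketch = new EventGroupingSketch();
    String groupId = sketch.onOperatorFailure(1);
    sketch.scheduleRedeploy(groupId, Arrays.asList(1, 2, 3));
    sketch.onOperatorRestarted(1);
    for (Event e : sketch.eventLog) {
      System.out.println(e.type + " operator=" + e.operatorId + " group=" + e.groupId);
    }
  }
}
```

Every event printed by `main` carries the same groupId, which is the property a consumer of the events would rely on to tie the stop and start events back to the original failure.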
**When StrAM kills a container**
1. StreamingAppMasterService detects that a container is killed, removes the
container agent, and creates RedeploymentInformation with a groupId.
2. The flow then follows steps 4-8 above.
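In this second path no container is left to generate the groupId, so the master has to create it. A rough sketch of that idea follows; `RedeployInfo` is a hypothetical stand-in whose fields are assumptions, not the actual RedeploymentInformation class from the pull request, and the event strings are illustrative only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.UUID;

// Hypothetical stand-in for the master-side bookkeeping; not the PR's actual classes.
class ContainerKillSketch {

  /** Assumed shape: the killed container, the operators it hosted, and one shared groupId. */
  static final class RedeployInfo {
    final String containerId;
    final String groupId;
    final List<Integer> operatorIds;

    RedeployInfo(String containerId, String groupId, List<Integer> operatorIds) {
      this.containerId = containerId;
      this.groupId = groupId;
      this.operatorIds = operatorIds;
    }
  }

  /** The master creates the groupId itself because the container is already gone. */
  static RedeployInfo onContainerKilled(String containerId, List<Integer> hostedOperators) {
    String groupId = UUID.randomUUID().toString(); // assumption: any unique id would do
    return new RedeployInfo(containerId, groupId, new ArrayList<>(hostedOperators));
  }

  /** Events raised for this kill, all tagged with the same groupId. */
  static List<String> eventsFor(RedeployInfo info) {
    List<String> events = new ArrayList<>();
    events.add("ContainerStop " + info.containerId + " group=" + info.groupId);
    for (Integer op : info.operatorIds) {
      events.add("OperatorStop " + op + " group=" + info.groupId);
    }
    return events;
  }

  public static void main(String[] args) {
    RedeployInfo info = onContainerKilled("container_01", Arrays.asList(4, 5));
    for (String event : eventsFor(info)) {
      System.out.println(event);
    }
  }
}
```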
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/DT-Priyanka/incubator-apex-core APEXCORE-602-events-grouping
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/apex-core/pull/479.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #479
----
commit ae8a6ab3ad3107627eb6bea886b4e167fa669d3b
Author: priya <[email protected]>
Date: 2017-02-23T11:55:56Z
APEXCORE-602: group events by cause
----
> Provide a "group-id" in the event object so that events are grouped together
> by a "root cause".
> -----------------------------------------------------------------------------------------------
>
> Key: APEXCORE-602
> URL: https://issues.apache.org/jira/browse/APEXCORE-602
> Project: Apache Apex Core
> Issue Type: Improvement
> Reporter: Sanjay M Pujare
> Assignee: Priyanka Gugale
>
> Provide a "group-id" in the event object so that events are grouped together
> by a "root cause". For example, a bunch of container restarts may be related to
> a single failure in the application, but the current sequence of Stram events
> doesn't make that obvious. With the group-id, the consumer of events can better
> read/analyze the events and focus on the root cause.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)