[
https://issues.apache.org/jira/browse/EAGLE-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15683060#comment-15683060
]
Garrett Li edited comment on EAGLE-794 at 11/21/16 11:14 AM:
-------------------------------------------------------------
We cannot simply use field grouping on AlertConstants.FIELD_0 to partition
tuples, because a single policy (say policy1) may have two publishers,
publisher1 and publisher2, and these two publishers may have different group-by
fields. The current hierarchy cannot easily handle this case.
So we need to use the same strategy as the router bolt to partition the stream.
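
As a rough illustration of partitioning per publisher by hashing group-by field values, here is a minimal sketch in plain Java. It does not use the Storm or Eagle APIs; the class name PublishPartitioner, the method partition, and the idea of including the publisher id in the hash key are all assumptions made for this example, not the actual router bolt implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Eagle: picks a publish bolt task for an event.
final class PublishPartitioner {
    private PublishPartitioner() {}

    /**
     * Returns a task index in [0, numTasks) computed from the publisher id and
     * the values of that publisher's own group-by fields.
     */
    static int partition(String publisherId, List<String> groupByFields,
                         Map<String, Object> event, int numTasks) {
        if (numTasks <= 0) {
            throw new IllegalArgumentException("numTasks must be positive");
        }
        // Build the hash key: publisher id first (an assumption of this sketch),
        // then the event's value for each group-by field, in declared order.
        List<Object> key = new ArrayList<>();
        key.add(publisherId);
        for (String field : groupByFields) {
            key.add(event.get(field)); // a missing field contributes null
        }
        // floorMod keeps the index non-negative even if hashCode() is negative.
        return Math.floorMod(key.hashCode(), numTasks);
    }
}
```

With this shape, two events that agree on a publisher's group-by fields always map to the same task for that publisher, so a task-local cache stays consistent. A publisher with an empty group-by list sends all of its events to a single task.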
was (Author: garrettlish):
We cannot simply use AlertConstants.FIELD_0 field grouping to partition tuples,
because we may have policy1 which has publisher1 and publisher2. If we
send 2 events with the default group-by fields, this may cause duplicated tuples.
To avoid this kind of duplicated tuple, we need to use the same strategy
as the router bolt to partition the stream.
> Enable publish bolt parallelism
> -------------------------------
>
> Key: EAGLE-794
> URL: https://issues.apache.org/jira/browse/EAGLE-794
> Project: Eagle
> Issue Type: Improvement
> Affects Versions: v0.5.0
> Reporter: Garrett Li
> Assignee: Garrett Li
> Fix For: v0.5.0
>
>
> Currently the publish bolt uses shuffle grouping, so we cannot enable
> parallelism for it: a publish bolt may keep a local cache, which is not
> available across the Storm cluster.
> We are going to use the same strategy that router bolts use to send to alert
> bolts: field grouping (with an empty field list defined), dispatching each
> tuple according to a hash of its group-by fields.