[
https://issues.apache.org/jira/browse/STORM-1167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14989407#comment-14989407
]
ASF GitHub Bot commented on STORM-1167:
---------------------------------------
GitHub user arunmahadevan opened a pull request:
https://github.com/apache/storm/pull/855
STORM-1167: Add windowing support for storm core
Currently, topologies that needs windowing support requires writing custom
logic inside bolts making it tedious to handle the windowing and acking logic
with custom logic.
The PR proposes to add framework level support to core storm bolts to
process tuples in a time or a count based window. Sliding and tumbling windows
based on tuple count or time duration are supported.
A new bolt interface is added for bolts that needs windowing support.
```
public interface IWindowedBolt extends IComponent {
void prepare(Map stormConf, TopologyContext context, OutputCollector
collector);
/**
* Process tuples falling within the window and optionally emit
* new tuples based on the tuples in the input window.
*/
void execute(TupleWindow inputWindow);
void cleanup();
}
```
`TupleWindow` gives access to the current tuples in the window, the tuples
that expired and the new tuples that are added since last window was computed
which will be useful for efficient windowing computations.
The `TopologyBuilder` wraps the IWindowedBolt implementation in an internal
bolt `WindowedBoltExecutor` (similar to `BasicBoltExecutor`), which receives
the tuples and hands it off to IWindowedBolt when the window conditions are
met. The tuples are automatically acked when they fall out of the window and
the tuples in a window are automatically anchored with the tuple emitted from
an IWindowedBolt to provide atleast once guarentee. An example topology
`SlidingWindowTopology` shows how to use the apis to compute a sliding window
sum.
The tuples are tracked with the system timestamp when they are received by
the bolt. Later we can add support for tuple based timestamps (e.g the ts when
the tuple was generated). There is also scope for minimizing duplicates by
tracking the evaluated tuples (based on some id) in a persistent store.
Similiar windowing constructs needs to be added to storm trident apis as
well. That can be addressed once the basic support to store core is added and
the common logic can be reused.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/arunmahadevan/storm windowing
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/storm/pull/855.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #855
----
commit 77429d07d7c080962b8c414cf02ee93839b25ca2
Author: Arun Mahadevan <[email protected]>
Date: 2015-11-04T11:29:19Z
STORM-1167: Add windowing support for storm core
1. Added new interface IWindowedBolt and wrapper classes for bolts that
need windowing support
2. New constants for specifying the window length and sliding interval
3. WindowManager and related classes that handles the windowing logic
----
> Add sliding & tumbling window support for core storm
> ----------------------------------------------------
>
> Key: STORM-1167
> URL: https://issues.apache.org/jira/browse/STORM-1167
> Project: Apache Storm
> Issue Type: Improvement
> Reporter: Arun Mahadevan
> Assignee: Arun Mahadevan
>
> Currently, topologies that needs windowing support requires writing custom
> logic inside bolts making it tedious to handle the windowing and acking logic
> with custom logic.
> We can add framework level support to core storm bolts to process tuples in a
> time or a count based window. Sliding and tumbling windows can be supported.
> Later this can be extended to trident apis as well.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)