Hello Kevin,

I'm not very familiar with the stream API, but I think you can achieve what
you want by mapping over your elements to turn the
strings into one-item lists, so that you get a key-value that is (K:
String, V: (List[String], Int))  and then apply the window reduce function,
which produces a data stream out of
a windowed stream, you combine your lists there and sum the value. Again,
it's not a great way to use reduce, since you are growing the list with
each reduction.

Regards,
Theodore

On Thu, Aug 4, 2016 at 1:36 AM, Kevin Jacobs <kevin.jac...@cern.ch> wrote:

> Hi,
>
> I have the following use case:
>
>     1. Group by a specific field.
>
>     2. Get a list of all messages belonging to the group.
>
>     3. Count the number of records in the group.
>
> With the use of DataSets, it is fairly easy to do this (see
> http://stackoverflow.com/questions/38745446/apache-flink-
> sum-and-keep-grouped/38747685#38747685):
>
> |fromElements(("a-b", "data1", 1), ("a-c", "data2", 1), ("a-b", "data3",
> 1)). groupBy(0). reduceGroup { (it: Iterator[(String, String, Int)], out:
> Collector[(String, List[String], Int)]) => { val group = it.toList if
> (group.length > 0) out.collect((group(0)._1, group.map(_._2),
> group.map(_._3).sum)) } |
>
> So, now I am moving to DataStreams (since the input is really a
> DataStream). From my perspective, a Window should provide the same
> functionality as a DataSet. This would easify the process a lot:
>
>     1. Window the elements.
>
>     2. Apply the same operations as before.
>
> Is there a way in Flink to do so? Otherwise, I would like to think of a
> solution to this problem.
>
> Regards,
> Kevin
>

Reply via email to