Re: What is output from DataSet.print()?

2016-08-03 Thread Stephan Ewen
Hi!

The print() output is usually partitioned in the same way as the preceding
operation.
Because your preceding operation is the keyBy/window operator, the output
should be partitioned by the key that the key selector returns.

The reduce() function is only called if a window has at least two elements
for a given key. If the window holds only one element for that key, that
single element is the result of the window and gets printed unchanged.
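
To make that concrete, here is a minimal sketch in plain Java. It does not use
Flink at all; it only imitates the per-key reduce semantics described above for
the contents of a single window. The names WindowReduceSketch, reduceWindow and
Result are made up for illustration, and the parity and identity key selectors
are assumed stand-ins for a 'loose' and a 'stringent' KeySelector.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;

// Plain-Java stand-in for keyBy(...).timeWindow(...).reduce(...); not Flink code.
public class WindowReduceSketch {

    // What one key emits for a window, plus how often the reduce function ran.
    public static final class Result<T> {
        public final T value;
        public final int reduceCalls;

        Result(T value, int reduceCalls) {
            this.value = value;
            this.reduceCalls = reduceCalls;
        }
    }

    // Groups the elements of one window by key and folds each group pairwise.
    public static <T, K> Map<K, Result<T>> reduceWindow(
            List<T> window, Function<T, K> keySelector, BinaryOperator<T> reducer) {
        Map<K, Result<T>> perKey = new LinkedHashMap<>();
        for (T element : window) {
            K key = keySelector.apply(element);
            Result<T> current = perKey.get(key);
            if (current == null) {
                // First element for this key: it is kept as-is, no reduce call.
                perKey.put(key, new Result<>(element, 0));
            } else {
                // Second and later elements: combine with the running value.
                T combined = reducer.apply(current.value, element);
                perKey.put(key, new Result<>(combined, current.reduceCalls + 1));
            }
        }
        return perKey;
    }

    private static void show(String label, Map<Integer, Result<Integer>> results) {
        System.out.println(label);
        results.forEach((key, r) -> System.out.println(
                "  key=" + key + " value=" + r.value + " reduceCalls=" + r.reduceCalls));
    }

    public static void main(String[] args) {
        List<Integer> window = Arrays.asList(1, 2, 3, 4, 5, 6);
        // Loose key (parity): many elements share a key, so the reducer runs.
        show("loose key:", reduceWindow(window, n -> n % 2, Integer::sum));
        // Stringent key (identity): every element is alone under its key.
        show("stringent key:", reduceWindow(window, n -> n, Integer::sum));
    }
}
```

With the parity key, two results come out and the reducer runs twice per key.
With the identity key, the reducer never runs, yet six results are still
emitted, one per key, which matches the behaviour described in the question.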

Greetings,
Stephan


On Wed, Aug 3, 2016 at 2:30 AM, Jon Yeargers wrote:

> Topology snip:
>
> datastream = some_stream
>     .keyBy(keySelector)
>     .timeWindow(Time.seconds(60))
>     .reduce(new some_KeyReduce());
>
>
> If I have a KeySelector that's pretty 'loose' (i.e. lots of matches) the
> 'some_KeyReduce' function gets hit frequently and some set of values is
> printed out via 'datastream.print()'.
>
> If I have a more stringent KeySelector the 'some_KeyReduce' function never
> gets called but 'datastream.print()' still outputs numerous values.
>
> So how are the KeySelector and the output of the datastream.print()
> related? Or are they?
>
>


What is output from DataSet.print()?

2016-08-02 Thread Jon Yeargers
Topology snip:

datastream = some_stream
    .keyBy(keySelector)
    .timeWindow(Time.seconds(60))
    .reduce(new some_KeyReduce());


If I have a KeySelector that's pretty 'loose' (i.e. lots of matches) the
'some_KeyReduce' function gets hit frequently and some set of values is
printed out via 'datastream.print()'.

If I have a more stringent KeySelector the 'some_KeyReduce' function never
gets called but 'datastream.print()' still outputs numerous values.

So how are the KeySelector and the output of the datastream.print()
related? Or are they?