Re: Strange behavior on filter, group and reduce DataSets

Fabian Hueske Mon, 19 Mar 2018 03:38:21 -0700

Hi,

Union is actually a very simple operator (not even an operator in Flink
terms). It just merges to inputs. There is no additional logic involved.
Therefore, it should also not emit records before either of both
ReduceFunctions sorted its data.
Once the data has been sorted for the ReduceFunction, the data is reduced
and emitted in a pipelined fashion, i.e., once the first record is reduced,
it is forwarded into the MapFunction (passing the unioned inputs).
So it is not unexpected that Map starts processing before the
ReduceFunction terminated.


Did you enable object reuse [1]?
If yes, try to disable it. If you want to reuse objects, you have to be
careful in how you implement your functions.
If no, can you share the plan (ExecutionEnvironment.getExecutionPlan())
that was generated for the program?

Thanks,
Fabian

[1] https://ci.apache.org/projects/flink/flink-docs-
release-1.3/dev/batch/index.html#operating-on-data-objects-in-functions



2018-03-19 9:51 GMT+01:00 Flavio Pompermaier <pomperma...@okkam.it>:

> Any help on this? This thing is very strange..the "manual" union of the
> output of the 2 datasets is different than the flink-union of them..
> Could it be a problem of the flink optimizer?
>
> Best,
> Flavio
>
> On Fri, Mar 16, 2018 at 4:01 PM, simone <simone.povosca...@gmail.com>
> wrote:
>
>> Sorry, I translated the code into pseudocode too fast. That is indeed an
>> equals.
>>
>> On 16/03/2018 15:58, Kien Truong wrote:
>>
>> Hi,
>>
>> Just a guest, but string compare in Java should be using equals method,
>> not == operator.
>>
>> Regards,
>>
>> Kien
>>
>>
>> On 3/16/2018 9:47 PM, simone wrote:
>>
>> *subject.getField("field1") == "";*
>>
>>
>>
>
>

Re: Strange behavior on filter, group and reduce DataSets

Reply via email to