Hi Ebru,

The count() operator is a very simple utility function that calls execute() internally, so every count() call runs as its own job. If you want a more complex pipeline, take a look at how our WordCount [0] example works. The general concept is to emit a 1 for every record and sum the ones in parallel. If you need an overall count, you need to set the parallelism of the last operator to 1 (operator(xxx).setParallelism(1)), but this means that this last operator is not executed in parallel anymore.
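
To illustrate the "emit a 1 and sum the ones" idea, here is a rough sketch in plain Java. Note that it uses java.util.stream instead of Flink's DataSet API, and the names CountSketch and countByOnes are made up for this example, so please treat it as an illustration of the concept and not as Flink code.

```java
import java.util.Arrays;
import java.util.List;

class CountSketch {

    // Hypothetical helper: counts records by mapping each one to 1
    // and summing the ones. The parallel stream computes partial sums
    // on several threads and combines them into one total at the end.
    public static long countByOnes(List<String> records) {
        return records.parallelStream()
                .mapToLong(record -> 1L) // emit a 1 for every record
                .sum();                  // sum the ones
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("to be", "or not", "to be");
        System.out.println(countByOnes(lines));
    }
}
```

In a Flink job the same shape would be a map followed by a sum or reduce. If each result is written to a sink instead of being fetched with count(), a single env.execute() at the end can run all of them in one job.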

It might also make sense to take a look at Flink's Table & SQL API [1] which makes such operations easier.

Hope that helps.

Regards,
Timo



[0] https://github.com/apache/flink/blob/master/flink-examples/flink-examples-batch/src/main/java/org/apache/flink/examples/java/wordcount/WordCount.java
[1] https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/table/index.html


On 11/29/17 at 3:26 PM, ebru wrote:
Hi all,

We are trying to use more than one count operator on a dataset, but it executes
the first count and skips the other operations. We also call env.execute().
How can we solve this problem?

-Ebru

