Hi Ebru,
the count() operator is a very simple utility functions that calls
execute() internally. If you want to have a more complex pipeline you
can take a look at how our WordCount [0] example works. The general
concept is to emit a 1 for every record and sum the ones in parallel. If
you need an overall count, you need to set the parallelism of the last
operator to 1 (operator(xxx).setParallelism(1)), but this means that
your job is not executed in parallel anymore.
It might also make sense to take a look at Flink's Table & SQL API [1]
which makes such operations easier.
Hope that helps.
Regards,
Timo
[0]
https://github.com/apache/flink/blob/master/flink-examples/flink-examples-batch/src/main/java/org/apache/flink/examples/java/wordcount/WordCount.java
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/table/index.html
Am 11/29/17 um 3:26 PM schrieb ebru:
Hi all,
We are trying to use more than one count operator for dataset, but it executes
first count and skips other operations. Also we call env.execute().
How can we solve this problem?
-Ebru