Hi Chang,
Partitioning steps such as keyBy() are not operators. In general, you can
let Flink's fluent-style API tell you the answer: if you can call .uid()
at some point in the chain and it compiles, then the thing just before it
is an operator ;)
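
To make the rule concrete, here is a minimal, dependency-free sketch. The
classes below (Stream, Operator, Sink, UidSketch) are hypothetical stand-ins
that I made up for illustration; they are not Flink's real classes. They only
mimic the relevant typing: as far as I know, in the DataStream API shuffle()
returns a DataStream and keyBy() a KeyedStream, neither of which declares
uid(), while map() returns a SingleOutputStreamOperator and print() a
DataStreamSink, both of which do.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (NOT Flink's API) of how return types decide where uid() compiles. */
public class UidSketch {

    /** Stand-in for a plain stream: partitioning steps return this type, which has no uid(). */
    static class Stream {
        final List<String> uids; // shared record of the IDs assigned so far

        Stream(List<String> uids) { this.uids = uids; }

        // Partitioning only changes how records are routed; the result is still a plain Stream.
        Stream shuffle() { return new Stream(uids); }
        Stream keyBy() { return new Stream(uids); }

        // A transformation creates an operator, so the returned type offers uid().
        Operator map() { return new Operator(uids); }

        // A sink is also something an ID can be assigned to.
        Sink print() { return new Sink(uids); }
    }

    /** Stand-in for the type returned by sources and transformations. */
    static class Operator extends Stream {
        Operator(List<String> uids) { super(uids); }

        Operator uid(String id) { uids.add(id); return this; }
    }

    /** Stand-in for the type returned by print(). */
    static class Sink {
        final List<String> uids;

        Sink(List<String> uids) { this.uids = uids; }

        Sink uid(String id) { uids.add(id); return this; }
    }

    /** Hypothetical source factory; sources count as operators in this model. */
    static Operator addSource(List<String> uids) { return new Operator(uids); }

    /** Builds the pipeline from the question and returns the IDs that could be assigned. */
    static List<String> buildPipeline() {
        List<String> uids = new ArrayList<>();
        addSource(uids).uid("source-id")
            .shuffle()                // calling .uid(...) here would not compile: Stream has no uid()
            .map().uid("mapper-id")
            .print().uid("sink-id");
        return uids;
    }

    public static void main(String[] args) {
        System.out.println(buildPipeline()); // prints [source-id, mapper-id, sink-id]
    }
}
```

Adding .uid("x") directly after .shuffle() or .keyBy() in this sketch is a
compile error, which is exactly the signal I mean: the compiler tells you
that step is not an operator.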
-Jamie
On Wed, Nov 21, 2018 at 5:59 AM Chang Liu wrote:
> Dear All,
>
> As stated here (
> https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/savepoints.html),
> it is highly recommended to assign IDs to Operators, especially for the
> stateful ones.
>
> My question is: what is the granularity of a so-called Operator?
>
> To be more specific, in the following example we have Operators like
> addSource and map. I am wondering whether shuffle and print are also some
> kind of Operator?
>
> DataStream stream = env
> // Stateful source (e.g. Kafka) with ID
> .addSource(new StatefulSource())
> .uid("source-id") // ID for the source operator
> .shuffle()
> // Stateful mapper with ID
> .map(new StatefulMapper())
> .uid("mapper-id") // ID for the mapper
> // Stateless printing sink
> .print(); // Auto-generated ID
>
>
> Or, in the following example, how many Operators do we have (that we can
> assign IDs to)? 3? keyBy, window and aggregate?
>
>
> input
> .keyBy()
> .window()
> .aggregate(new AverageAggregate())
>
>
> Then, how many Operators (and which ones) do we have in the following
> example?
>
> stream.join(otherStream)
> .where()
> .equalTo()
> .window()
> .apply()
>
>
> Many Thanks.
>
> Best regards,
>
> Chang Liu 刘畅