Re: Assign IDs to Operators

2018-11-21 Thread Jamie Grier
Hi Chang,

The partitioning steps, like keyBy() are not operators.  In general you can
let Flink's fluent-style API tell you the answer.  If you can call .uid()
in the API and it compiles then the thing just before that is an operator ;)

-Jamie


On Wed, Nov 21, 2018 at 5:59 AM Chang Liu  wrote:

> Dear All,
>
> As stated here (
> https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/savepoints.html),
> it is highly recommended to assign IDs to Operators, especially for the
> stateful ones.
>
> My question is: what is the gradually of a so-called Operator.
>
> To be more specific, in the following example, we have the Operators like,
> addSource and map. I am wondering is shuffle and print also a some kind of
> Operator?
>
> DataStream stream = env.
>   // Stateful source (e.g. Kafka) with ID
>   .addSource(new StatefulSource())
>   .uid("source-id") // ID for the source operator
>   .shuffle()
>   // Stateful mapper with ID
>   .map(new StatefulMapper())
>   .uid("mapper-id") // ID for the mapper
>   // Stateless printing sink
>   .print(); // Auto-generated ID
>
>
> Or, in the following example, how many Operator we have (that we can
> assign IDs to)? 3? KeyBy, window and aggregate?
>
>
> input
> .keyBy()
> .window()
> .aggregate(new AverageAggregate)
>
>
> Then, how many Operators (and which are they) do we have in the following
> example?
>
> stream.join(otherStream)
> .where()
> .equalTo()
> .window()
> .apply()
>
>
> Many Thanks.
>
> Best regards/祝好,
>
> Chang Liu 刘畅
>
>
>


Assign IDs to Operators

2018-11-21 Thread Chang Liu
Dear All,

As stated here 
(https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/savepoints.html
 
<https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/savepoints.html>),
 it is highly recommended to assign IDs to Operators, especially for the 
stateful ones.

My question is: what is the gradually of a so-called Operator.

To be more specific, in the following example, we have the Operators like, 
addSource and map. I am wondering is shuffle and print also a some kind of 
Operator?

DataStream stream = env.
  // Stateful source (e.g. Kafka) with ID
  .addSource(new StatefulSource())
  .uid("source-id") // ID for the source operator
  .shuffle()
  // Stateful mapper with ID
  .map(new StatefulMapper())
  .uid("mapper-id") // ID for the mapper
  // Stateless printing sink
  .print(); // Auto-generated ID

Or, in the following example, how many Operator we have (that we can assign IDs 
to)? 3? KeyBy, window and aggregate?


input
.keyBy()
.window()
.aggregate(new AverageAggregate)

Then, how many Operators (and which are they) do we have in the following 
example?

stream.join(otherStream)
.where()
.equalTo()
.window()
.apply()

Many Thanks.

Best regards/祝好,

Chang Liu 刘畅