Re: Some thoughts about the lower-level Flink APIs

Gyula Fóra Sun, 14 Aug 2016 22:46:39 -0700

Hi Jamie,

I agree that it is often much easier to work on the lower level APIs if you
know what you are doing.


I think it would be nice to have very clean abstractions on that level so
we could teach this to the users first, but currently I think it's not easy
enough to be a good starting point.

The user needs to understand a lot about the system if they don't want to
hurt other parts of the pipeline. For instance, working with the
StreamRecords, propagating watermarks, and working with state internals.
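
As a rough illustration of those responsibilities, here is a minimal sketch
in plain Java. It deliberately does not use Flink's real operator classes:
Collector, CountingOperator and ListCollector are hypothetical names made up
for this example, and the state here is a plain field rather than anything
checkpointed. It only shows that a hand-written operator has to carry
timestamps along and forward watermarks itself.

```java
import java.util.ArrayList;
import java.util.List;

public class LowLevelOperatorSketch {

    /** Hypothetical downstream sink; a stand-in, not Flink's actual output type. */
    interface Collector {
        void emitRecord(String value, long timestamp);
        void emitWatermark(long timestamp);
    }

    /** A hand-written operator that tags each record with a running count. */
    static class CountingOperator {
        private final Collector out;
        // Operator state. Assumption: a real system would have to checkpoint this.
        private long count = 0;

        CountingOperator(Collector out) {
            this.out = out;
        }

        void processElement(String value, long timestamp) {
            count++;
            // Keep the original timestamp so downstream event-time logic still works.
            out.emitRecord(value + "#" + count, timestamp);
        }

        void processWatermark(long timestamp) {
            // The operator itself must forward the watermark; dropping it here
            // would stall event-time progress for everything downstream.
            out.emitWatermark(timestamp);
        }
    }

    /** Collects emitted events in order so the behaviour can be inspected. */
    static class ListCollector implements Collector {
        final List<String> events = new ArrayList<>();

        @Override
        public void emitRecord(String value, long timestamp) {
            events.add("R(" + value + "@" + timestamp + ")");
        }

        @Override
        public void emitWatermark(long timestamp) {
            events.add("W(" + timestamp + ")");
        }
    }

    /** Feeds two records and one watermark through the operator. */
    static List<String> run() {
        ListCollector sink = new ListCollector();
        CountingOperator op = new CountingOperator(sink);
        op.processElement("a", 1L);
        op.processElement("b", 2L);
        op.processWatermark(2L);
        return sink.events;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

If processWatermark were left empty in this sketch, the records would still
flow but the sink would never see a watermark, which is the kind of silent
damage to the rest of the pipeline I mean.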

This might all be overwhelming at first glance. But maybe we can slim
some abstractions down to the point where this becomes a kind of
extension of the RichFunctions.

Cheers,
Gyula

On Sat, Aug 13, 2016, 17:48 Jamie Grier wrote:

> Hey all,
>
> I've noticed a few times now when trying to help users implement particular
> things in the Flink API that it can be complicated to map what they know
> they are trying to do onto higher-level Flink concepts such as windowing or
> Connect/CoFlatMap/ValueState, etc.
>
> At some point it just becomes easier to think about writing a Flink
> operator yourself that is integrated into the pipeline with a transform()
> call.
>
> It can just be easier to think at a more basic level.  For example I can
> write an operator that can consume one or two input streams (should
> probably be N), update state which is managed for me fault tolerantly, and
> output elements or set up timers/triggers that give me callbacks from which
> I can also update state or emit elements.
>
> When you think at this level you realize you can program just about
> anything you want.  You can create whatever fault-tolerant data structures
> you want, and easily execute robust stateful computation over data streams
> at scale.  This is the real technology and power of Flink IMO.
>
> Also, at this level I don't have to think about the complexities of
> windowing semantics, learn as much API, etc.  I can easily have some inputs
> that are broadcast, others that are keyed, manage my own state in whatever
> data structure makes sense, etc.  If I know exactly what I actually want to
> do I can just do it with the full power of my chosen language, data
> structures, etc.  I'm not "restricted" to trying to map everything onto
> higher-level Flink constructs which is sometimes actually more complicated.
>
> Programming at this level is actually fairly easy to do but people seem a
> bit afraid of this level of the API.  They think of it as low-level or
> custom hacking..
>
> Anyway, I guess my thought is this..  Should we explain Flink to people at
> this level *first*?  Show that you have nearly unlimited power and
> flexibility to build what you want *and only then* from there explain the
> higher level APIs they can use *if* those match their use cases well.
>
> Would this better demonstrate to people the power of Flink and maybe
> *liberate* them a bit from feeling they have to map their problem onto a
> more complex set of higher level primitives?  I see people trying to
> shoe-horn what they are really trying to do, which is simple to explain in
> English, onto windows, triggers, CoFlatMaps, etc, and this gets
> complicated sometimes.  It's like an impedance mismatch.  You could just
> solve the problem very easily programmed in straight Java/Scala.
>
> Anyway, it's very easy to drop down a level in the API and program whatever
> you want but users don't seem to *perceive* it that way.
>
> Just some thoughts...  Any feedback?  Have any of you had similar
> experiences when working with newer Flink users or as a newer Flink user
> yourself?  Can/should we do anything to make the *lower* level API more
> accessible/visible to users?
>
> -Jamie
>
