I think my favorite line is "Haven’t had this much fun since I started
Kettle" :-)

I was browsing https://github.com/mattcasters/kettle-beam/ to see if I
could comment on how to take advantage of bundling (where to move expensive
logic to @FinishBundle - you will notice that Beam's IO connectors use this)

I noticed that the core DoFns are defined in org.kettle.beam.core which is
in the separate https://github.com/mattcasters/kettle-beam-core/
repository. JFYI for the sake of users / code lurkers.

The only place that looked like it did the sort of work where bundling
would matter is the StepTransform. There's already a separate @FinishBundle
there - does more of the logic need to be moved there?
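
Roughly, the pattern looks like this (a plain-Java sketch of the bundle
lifecycle, not an actual Beam DoFn - the method names just mirror the
@StartBundle / @ProcessElement / @FinishBundle annotations from
org.apache.beam.sdk.transforms.DoFn, and the "batch write" is a stand-in):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the buffer-and-flush pattern that bundle lifecycle methods
// enable: cheap accumulation per element, expensive work once per bundle.
class BatchingSketch {
  private final List<String> buffer = new ArrayList<>();
  private final List<List<String>> flushed = new ArrayList<>();

  // Would be @StartBundle in a real DoFn: reset per-bundle state.
  public void startBundle() {
    buffer.clear();
  }

  // Would be @ProcessElement: cheap work only - just accumulate.
  public void processElement(String element) {
    buffer.add(element);
  }

  // Would be @FinishBundle: do the expensive work (e.g. a batch insert)
  // once for the whole bundle instead of once per element.
  public void finishBundle() {
    if (!buffer.isEmpty()) {
      flushed.add(new ArrayList<>(buffer)); // stand-in for a batch write
      buffer.clear();
    }
  }

  public List<List<String>> getFlushed() {
    return flushed;
  }
}
```

That's the shape Beam's IO connectors follow: the runner calls finishBundle
once per bundle, so the per-call overhead is amortized over many elements.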

Kenn

On Tue, Feb 12, 2019 at 8:01 AM Maximilian Michels <m...@apache.org> wrote:

> Yes, you can use Flink's local execution mode, which is the default if
> you don't provide any settings. A cluster should not be necessary to
> complete the integration. Ideally, it should work out of the box :)
>
> > However, I'm first trying to solve the complicated issue of grouping
> > records together in Beam in a safe way so that they can be batched up
>
> I'm not sure what your use case is but Beam does batching by default.
> The batches are called bundles. The Flink Runner supports setting the
> bundle size.
>
> Cheers,
> Max
>
> On 12.02.19 12:20, Matt Casters wrote:
> > Yes, Flink is obviously the next target.  I'm not expecting too many
> > issues there beyond getting a cluster set up to test on.  I read you can
> > run the Flink Runner locally so that will help a lot in testing.
> >
> > However, I'm first trying to solve the complicated issue of grouping
> > records together in Beam in a safe way so that they can be batched up.
> > Batching up is really important for fast loading into a lot of output
> > targets.  I'll probably use some group-by behind the scenes or something
> > like that; I still need to think it through.
> > Having the ability to re-use the existing Kettle steps without having to
> > write new code is really key.
> >
> > Once that is done (in a few weeks) I'll give Flink a shot.
> >
> > Cheers,
> >
> > Matt
> >
> > On Tue, Feb 12, 2019 at 12:02, Maximilian Michels <m...@apache.org>
> > wrote:
> >
> >     @Dan: Thanks for sharing the presentation. Kettle is a great way to
> >     make Beam more accessible.
> >
> >     @Matt: Thanks for the plug. It's good to hear you enjoyed it. I think
> >     the link to your slides got messed up: http://beam.kettle.be
> >
> >     Are you planning to add execution via the Flink Runner to Kettle? I
> >     saw in the presentation that you already support Direct, Spark, and
> >     Dataflow.
> >
> >     On 11.02.19 20:50, Matt Casters wrote:
> >      > By the way, Maximilian, I linked and plugged your wonderful FOSDEM
> >      > presentation in my slides (http://beam.kettle.be, slide 19). If you
> >      > mind, let me know and I'll get it out of the slides. In any case,
> >      > great content worth promoting, I thought.
> >      >
> >      > On Wed, Feb 6, 2019 at 18:03, Maximilian Michels <m...@apache.org>
> >      > wrote:
> >      >
> >      >     Hi Dan,
> >      >
> >      >     Thanks for the info. Would be great to share a video of the
> >      >     presentation.
> >      >
> >      >     Cheers,
> >      >     Max
> >      >
> >      >     On 30.01.19 10:00, Dan wrote:
> >      >      > Hi, in just over a week you're all welcome to come and see
> >      >      > the very first public reveal of Kettle running on Beam!
> >      >      > (Including Spark, Dataflow, and Flink support)
> >      >      >
> >      >      >
> >     https://www.meetup.com/Pentaho-London-User-Group/events/256773962/
> >      >      >
> >      >      > So this ingenious integration combines the power of visual
> >      >      > development with the platform-agnostic benefits of Beam -
> >      >      > impressive stuff. No vendor lock-in here!
> >      >      >
> >      >      >
> >      >      > See you there!
> >      >      > Dan
> >      >
> >
>
