Amir - @Setup is a regular Java annotation <http://docs.oracle.com/javase/1.5.0/docs/guide/language/annotations.html>; class names in Java (including names of annotation classes), like all other names, are case-sensitive.
On Fri, Nov 18, 2016 at 12:54 PM amir bahmanyari <[email protected]> wrote: > Thanks Alexey.I just fired up the whole thing. With @Setup. BTW, does it > matter if lowercase @setup or @Setup?I hope not. :-))Will update you when > its done and share my observations.Cheers+have a great weekend.Amir- > > From: Alexey Demin <[email protected]> > To: [email protected]; amir bahmanyari <[email protected]> > Sent: Friday, November 18, 2016 12:38 PM > Subject: Re: Flink runner. Wrapper for DoFn > > In my case it's: > 1) i don't rebuild index by filters every time, only one time on start > processing > 2) connection for remote db does not open hundreds times in second > > as result all pipeline work more stable and faster > > 2016-11-19 0:06 GMT+04:00 amir bahmanyari <[email protected]>: > > > Hi Alexey,What improvements do you expect by replacing @StartBundle > > with @Setup?I am going to give it a try & see what diff it > > makes.Interesting & thanks for bringing it up... > > Cheers > > > > From: Demin Alexey <[email protected]> > > To: [email protected] > > Sent: Friday, November 18, 2016 11:12 AM > > Subject: Re: Flink runner. Wrapper for DoFn > > > > Oh, this is my mistake > > > > Yes correct way its use @Setup. > > > > Thank you Eugene. > > > > > > 2016-11-18 22:54 GMT+04:00 Eugene Kirpichov <[email protected] > > > > : > > > > > Hi Alexey, > > > > > > In general, things like establishing connections and initializing > caches > > > are better done in @Setup and @TearDown methods, rather than > @StartBundle > > > and @FinishBundle, because DoFn's can be reused between bundles and > this > > > way you get more benefit from reuse. > > > > > > Bundles can be pretty small, especially in streaming pipelines. That > > said, > > > they normally shouldn't be 1-element-small. Hopefully someone working > on > > > the Flink runner can comment. > > > > > > On Fri, Nov 18, 2016 at 10:47 AM amir bahmanyari > > > <[email protected]> wrote: > > > > > > > Hmmm...Thanks...This could very well be my bottleneck since I see > tons > > of > > > > threads get on WAIT state after sometime& stay like that relatively > > > > forever.I have a 100 G worth of elements to process...........Is > there > > a > > > > way to bypass this "startBundle" & get a fairly optimized > > > > behavior?Anyone? Thanks+regardsAmir- > > > > > > > > From: Demin Alexey <[email protected]> > > > > To: [email protected]; amir bahmanyari < > > [email protected] > > > > > > > > Sent: Friday, November 18, 2016 10:40 AM > > > > Subject: Re: Flink runner. Wrapper for DoFn > > > > > > > > Very simple example: > > > > > > > > My DoFn on startBundle load filters from remote db and build > optimized > > > > index, on processElement apply filters on every element for decision > > > about > > > > push element to next operation or drop his. > > > > > > > > In current implementation it's like matching regexp on string, you > > have 2 > > > > way > > > > 1) compile regexp every time for every element > > > > 2) compile regexp one time and apply on all element > > > > > > > > now flink work by 1 way and this way not optimal > > > > > > > > > > > > 2016-11-18 22:26 GMT+04:00 amir bahmanyari > <[email protected] > > > >: > > > > > > > > > Hi Alexey," startBundle can be expensive"...Could you elaborate on > > > > > "expensive" as per each element pls? > > > > > Thanks > > > > > > > > > > From: Demin Alexey <[email protected]> > > > > > To: [email protected] > > > > > Sent: Friday, November 18, 2016 7:40 AM > > > > > Subject: Flink runner. Wrapper for DoFn > > > > > > > > > > Hi > > > > > > > > > > In flink runner we have this code: > > > > > > > > > > https://github.com/apache/incubator-beam/blob/master/ > > > > > runners/flink/runner/src/main/java/org/apache/beam/runners/ > > > > > flink/translation/wrappers/streaming/DoFnOperator.java#L262 > > > > > > > > > > but in mostly cases method startBundle can be expensive for making > > for > > > > > every element (for example connection for db/build cache/ etc) > > > > > > > > > > Why so important invoke startBundle/finishBundle on every > > > > > incoming streamRecord ? > > > > > > > > > > Thanks > > > > > Alexey Diomin > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
