If you're talking about reactive programming, at a certain level beam is already reactive. Are you referring to a specific way of writing the code?
On Fri, Mar 9, 2018 at 1:59 PM Reuven Lax <re...@google.com> wrote: > What do you mean by reactive? > > On Fri, Mar 9, 2018, 6:58 PM Romain Manni-Bucau <rmannibu...@gmail.com> > wrote: > >> @Kenn: why not preferring to make beam reactive? Would alow to scale way >> more without having to hardly synchronize multithreading. Elegant and >> efficient :). Beam 3? >> >> >> Le 9 mars 2018 22:49, "Kenneth Knowles" <k...@google.com> a écrit : >> >>> I will start with the "exciting futuristic" answer, which is that we >>> envision the new DoFn to be able to provide an automatic ExecutorService >>> parameters that you can use as you wish. >>> >>> new DoFn<>() { >>> @ProcessElement >>> public void process(ProcessContext ctx, ExecutorService >>> executorService) { >>> ... launch some futures, put them in instance vars ... >>> } >>> >>> @FinishBundle >>> public void finish(...) { >>> ... block on futures, output results if appropriate ... >>> } >>> } >>> >>> This way, the Java SDK harness can use its overarching knowledge of what >>> is going on in a computation to, for example, share a thread pool between >>> different bits. This was one reason to delete IntraBundleParallelization - >>> it didn't allow the runner and user code to properly manage how many things >>> were going on concurrently. And mostly the runner should own parallelizing >>> to max out cores and what user code needs is asynchrony hooks that can >>> interact with that. However, this feature is not thoroughly considered. TBD >>> how much the harness itself manages blocking on outstanding requests versus >>> it being your responsibility in FinishBundle, etc. >>> >>> I haven't explored rolling your own here, if you are willing to do the >>> knob tuning to get the threading acceptable for your particular use case. >>> Perhaps someone else can weigh in. >>> >>> Kenn >>> >>> On Fri, Mar 9, 2018 at 1:38 PM Josh Ferge <josh.fe...@bounceexchange.com> >>> wrote: >>> >>>> Hello all: >>>> >>>> Our team has a pipeline that make external network calls. These >>>> pipelines are currently super slow, and the hypothesis is that they are >>>> slow because we are not threading for our network calls. The github issue >>>> below provides some discussion around this: >>>> >>>> https://github.com/apache/beam/pull/957 >>>> >>>> In beam 1.0, there was IntraBundleParallelization, which helped with >>>> this. However, this was removed because it didn't comply with a few BEAM >>>> paradigms. >>>> >>>> Questions going forward: >>>> >>>> What is advised for jobs that make blocking network calls? It seems >>>> bundling the elements into groups of size X prior to passing to the DoFn, >>>> and managing the threading within the function might work. thoughts? >>>> Are these types of jobs even suitable for beam? >>>> Are there any plans to develop features that help with this? >>>> >>>> Thanks >>>> >>>