If you're talking about reactive programming, at a certain level Beam is
already reactive. Are you referring to a specific way of writing the code?


On Fri, Mar 9, 2018 at 1:59 PM Reuven Lax <re...@google.com> wrote:

> What do you mean by reactive?
>
> On Fri, Mar 9, 2018, 6:58 PM Romain Manni-Bucau <rmannibu...@gmail.com>
> wrote:
>
>> @Kenn: why not make Beam reactive instead? That would allow it to scale
>> much further without heavy multithreading synchronization. Elegant and
>> efficient :). Beam 3?
>>
>>
>> On Mar 9, 2018 at 22:49, "Kenneth Knowles" <k...@google.com> wrote:
>>
>>> I will start with the "exciting futuristic" answer, which is that we
>>> envision the new DoFn being able to provide an automatic ExecutorService
>>> parameter that you can use as you wish.
>>>
>>>     new DoFn<>() {
>>>       @ProcessElement
>>>       public void process(ProcessContext ctx, ExecutorService executorService) {
>>>           ... launch some futures, put them in instance vars ...
>>>       }
>>>
>>>       @FinishBundle
>>>       public void finish(...) {
>>>          ... block on futures, output results if appropriate ...
>>>       }
>>>     }
>>>
>>> This way, the Java SDK harness can use its overarching knowledge of what
>>> is going on in a computation to, for example, share a thread pool between
>>> different bits. This was one reason to delete IntraBundleParallelization -
>>> it didn't allow the runner and user code to properly manage how many things
>>> were going on concurrently. For the most part, the runner should own
>>> parallelizing to max out cores; what user code needs is asynchrony hooks
>>> that can interact with that. However, this feature is not thoroughly considered. TBD
>>> how much the harness itself manages blocking on outstanding requests versus
>>> it being your responsibility in FinishBundle, etc.
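
To make the shape of that pattern concrete, here is a minimal sketch written against plain java.util.concurrent rather than Beam, so it runs on its own. Everything in it is an assumption for illustration: the class name AsyncBundleSketch is made up, process() stands in for the @ProcessElement method, finish() stands in for @FinishBundle, and the executor is passed to the constructor because the harness-supplied ExecutorService parameter described above does not exist yet.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical, Beam-free stand-in for a DoFn that makes blocking calls.
class AsyncBundleSketch<InT, OutT> {
  private final ExecutorService executor;          // assumed to be owned by the caller
  private final Function<InT, OutT> blockingCall;  // placeholder for the network call
  private final List<Future<OutT>> pending = new ArrayList<>();

  AsyncBundleSketch(ExecutorService executor, Function<InT, OutT> blockingCall) {
    this.executor = executor;
    this.blockingCall = blockingCall;
  }

  // Role of @ProcessElement: launch the call and return without waiting.
  void process(InT element) {
    Callable<OutT> task = () -> blockingCall.apply(element);
    pending.add(executor.submit(task));
  }

  // Role of @FinishBundle: block on every outstanding future and emit
  // the results in submission order.
  void finish(Consumer<OutT> output) {
    try {
      for (Future<OutT> future : pending) {
        output.accept(future.get());
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new RuntimeException(e);
    } catch (ExecutionException e) {
      throw new RuntimeException(e.getCause());
    } finally {
      pending.clear();  // never carry futures over into the next bundle
    }
  }
}
```

In this sketch the number of calls in flight is bounded only by the size of the pool you pass in, which is exactly the knob that the runner cannot see when you roll your own.
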
>>>
>>> I haven't explored rolling your own here, if you are willing to do the
>>> knob tuning to get the threading acceptable for your particular use case.
>>> Perhaps someone else can weigh in.
>>>
>>> Kenn
>>>
>>> On Fri, Mar 9, 2018 at 1:38 PM Josh Ferge <josh.fe...@bounceexchange.com>
>>> wrote:
>>>
>>>> Hello all:
>>>>
>>>> Our team has pipelines that make external network calls. These
>>>> pipelines are currently super slow, and the hypothesis is that they are
>>>> slow because we are not threading our network calls. The GitHub pull
>>>> request below provides some discussion around this:
>>>>
>>>> https://github.com/apache/beam/pull/957
>>>>
>>>> In Beam 1.0, there was IntraBundleParallelization, which helped with
>>>> this. However, it was removed because it didn't comply with a few Beam
>>>> paradigms.
>>>>
>>>> Questions going forward:
>>>>
>>>> What is advised for jobs that make blocking network calls? It seems
>>>> bundling the elements into groups of size X prior to passing to the DoFn,
>>>> and managing the threading within the function might work. Thoughts?
>>>> Are these types of jobs even suitable for beam?
>>>> Are there any plans to develop features that help with this?
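
One way to picture the "groups of size X" idea from the first question is the sketch below. It is plain Java with no Beam dependency and is only an illustration under stated assumptions: BatchingCallerSketch, process() and finish() are hypothetical names, the in-memory buffer stands in for whatever upstream step would form the groups, and the blocking call is a placeholder function. Each full group is fanned out over a thread pool with ExecutorService.invokeAll, which returns once every call in the group has completed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical helper: buffer elements into groups of batchSize and run the
// blocking call for each group concurrently.
class BatchingCallerSketch<InT, OutT> {
  private final int batchSize;                     // the "X" from the question
  private final ExecutorService executor;          // assumed to be owned by the caller
  private final Function<InT, OutT> blockingCall;  // placeholder for the network call
  private final Consumer<OutT> output;
  private final List<InT> buffer = new ArrayList<>();

  BatchingCallerSketch(int batchSize, ExecutorService executor,
      Function<InT, OutT> blockingCall, Consumer<OutT> output) {
    this.batchSize = batchSize;
    this.executor = executor;
    this.blockingCall = blockingCall;
    this.output = output;
  }

  // Per element: buffer it, and flush as soon as a full group is available.
  void process(InT element) {
    buffer.add(element);
    if (buffer.size() >= batchSize) {
      flush();
    }
  }

  // End of input: flush the last, possibly partial, group.
  void finish() {
    flush();
  }

  private void flush() {
    if (buffer.isEmpty()) {
      return;
    }
    List<Callable<OutT>> tasks = new ArrayList<>();
    for (InT element : buffer) {
      tasks.add(() -> blockingCall.apply(element));
    }
    buffer.clear();
    try {
      // invokeAll blocks until all tasks finish and keeps the task order.
      for (Future<OutT> future : executor.invokeAll(tasks)) {
        output.accept(future.get());
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new RuntimeException(e);
    } catch (ExecutionException e) {
      throw new RuntimeException(e.getCause());
    }
  }
}
```

The trade-off in this shape is that each group waits for its slowest call before any result is emitted, so X and the pool size have to be tuned together.
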
>>>>
>>>> Thanks
>>>>
>>>
