Hi, guys,

As more users try out Beam on the SamzaRunner, we have received many requests
for an asynchronous processing API. There are a few reasons for these requests:

   - The users here are experienced in asynchronous programming. With async
   frameworks such as Netty and ParSeq, and libraries such as the async Jersey
   client, they can make remote calls efficiently while the libraries manage
   the execution threads underneath. Async remote calls are very common in
   most of our streaming applications today.
   - Many jobs run on a multi-tenant cluster. Async processing reduces
   resource usage and speeds up computation (fewer context switches).

I asked about the async support in a previous email thread. The following
API was mentioned in the reply:

  new DoFn<InputT, OutputT>() {
    @ProcessElement
    public void process(@Element CompletionStage<InputT> element, ...) {
      element.thenApply(...)
    }
  }

We are wondering whether there has been any discussion of this API, and
whether there are related docs. It is awesome that you guys have already
considered letting a DoFn process elements asynchronously. Out of curiosity:
this API seems to create a CompletionStage out of the input element (probably
using the framework's executor) and then lets the user chain on it. To us, it
seems more convenient if the DoFn returned a CompletionStage<OutputT>, or
were passed a CompletionStage<OutputT> to invoke upon completion.
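
To make the first alternative concrete, here is a rough sketch of what we
have in mind. It is plain Java with no Beam dependency: AsyncFn,
remoteLookup and runAndWait are hypothetical names made up for illustration,
not existing Beam or Samza APIs, and a real runner would register a callback
on the returned stage rather than block on it.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;

public class AsyncDoFnSketch {

  /** Hypothetical interface (not part of Beam): the user function returns a CompletionStage. */
  interface AsyncFn<InputT, OutputT> {
    CompletionStage<OutputT> processAsync(InputT element);
  }

  /** Stand-in for a non-blocking remote call, e.g. through an async HTTP client. */
  static CompletionStage<String> remoteLookup(String key) {
    return CompletableFuture.supplyAsync(() -> "value-for-" + key);
  }

  /**
   * Stand-in for the runner side: invoke the function and obtain the output once
   * the stage completes. Blocking with join() is only to keep the sketch short.
   */
  static <I, O> O runAndWait(AsyncFn<I, O> fn, I element) {
    return fn.processAsync(element).toCompletableFuture().join();
  }

  public static void main(String[] args) {
    // The user chains on the stage returned by their own async client.
    AsyncFn<String, String> fn = key -> remoteLookup(key).thenApply(String::toUpperCase);
    System.out.println(runAndWait(fn, "k1"));
  }
}
```

With this shape, the user's own async client decides which threads run the
remote call, and the framework only needs to react when the returned stage
completes.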

We would like to discuss the async API further, and hopefully we will have
great support for it in Beam. We really appreciate the feedback!

Thanks,
Xinyu
