I'd like to revive this thread, as we're looking for a similar feature in SQL.
Please point me to anything that already exists; I couldn't find a JIRA task.

Currently the streaming+batch and batch+batch joins are implemented with a
side input. As Jingsong mentioned, that is not a one-size-fits-all
solution: the batch data may be too large, and it may change periodically.
A userland PTransform sounds like the more straightforward option, as it
doesn't require support at the runner level.
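
To make the userland option concrete, here is a minimal sketch in plain
Python rather than the Beam API. Everything in it is hypothetical and only
for illustration: KVLookup, InMemoryLookup, lookup_join, and the batch_size
parameter are names I made up, not anything that exists in Beam. The idea
is a small lookup interface (simpler than FileSystem) plus a join that
batches keys so each batch costs one round trip to the external store.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, Hashable, Iterable, Iterator, List, Optional, Tuple


class KVLookup(ABC):
    """Hypothetical common interface over an external KV store."""

    @abstractmethod
    def get_all(self, keys: Iterable[Hashable]) -> Dict[Hashable, object]:
        """Return the values for the keys that exist; missing keys are omitted."""


class InMemoryLookup(KVLookup):
    """Stand-in for a real store; counts round trips so batching is visible."""

    def __init__(self, data: Dict[Hashable, object]) -> None:
        self._data = dict(data)
        self.calls = 0

    def get_all(self, keys: Iterable[Hashable]) -> Dict[Hashable, object]:
        self.calls += 1
        return {k: self._data[k] for k in keys if k in self._data}


def _join_batch(
    batch: List[object],
    key_fn: Callable[[object], Hashable],
    store: KVLookup,
) -> List[Tuple[object, Optional[object]]]:
    # One store round trip per batch, with duplicate keys collapsed.
    found = store.get_all({key_fn(e) for e in batch})
    return [(e, found.get(key_fn(e))) for e in batch]


def lookup_join(
    elements: Iterable[object],
    key_fn: Callable[[object], Hashable],
    store: KVLookup,
    batch_size: int = 100,
) -> Iterator[Tuple[object, Optional[object]]]:
    """Left-join each element with the store's current value for its key.

    Elements whose key is absent from the store are paired with None.
    """
    batch: List[object] = []
    for element in elements:
        batch.append(element)
        if len(batch) >= batch_size:
            yield from _join_batch(batch, key_fn, store)
            batch = []
    if batch:  # flush the final partial batch
        yield from _join_batch(batch, key_fn, store)
```

In a real pipeline the body of lookup_join would presumably live inside a
DoFn wrapped by the PTransform, with the store client opened once per
worker. Note that this reads whatever the store holds at lookup time, so
results are not repeatable if the store changes between retries.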

Mingmin

On Mon, Jul 17, 2017 at 8:56 PM, JingsongLee <lzljs3620...@aliyun.com>
wrote:

> Sorry for taking so long to reply.
> Hi Aljoscha, I think the Async I/O operator and Batch are the same, and
> Async is the better interface. All IO-related operations may be better
> suited to asynchronous use. As you said, in the beginning this needs no
> special support from the runners.
> I really like Luke's idea: let the user see a SeekableRead + side input
> interface, and let the runner layer optimize it into direct access to the
> external store. This requires a suitable SeekableRead interface and more
> efficient compiler optimization.
> Kenn's idea is exciting. If we can have an interface similar to
> FileSystem (maybe like SeekableRead) that abstracts and unifies access to
> multiple KV stores, users would see only the Beam concept rather than a
> specific KV store.
> Best, Jingsong Lee
> ------------------------------------------------------------------
> From: Kenneth Knowles <k...@google.com.INVALID>
> Time: 2017 Jul 7 (Fri) 11:43
> To: dev <dev@beam.apache.org>
> Cc: JingsongLee <lzljs3620...@aliyun.com>
> Subject: Re: [PROPOSAL] External Join with KV Stores
>
> In the streams/tables way of talking, side inputs are tables. External KV
> stores are basically also [globally windowed] tables. Both
> are time-varying.
>
> I think it makes perfect sense to access an external KV store in userland
> directly rather than listen to its changelog and reproduce the same table
> as a multimap side input. I'm sure many users are already doing this. I'm
> sure users will always do this. Providing a common interface (simpler than
> Filesystem) and helpful transform(s) in an extension module seems nice.
> Does it require any support in the core SDK?
>
> If I understand, Luke & Robert, you favor adding metadata to Read/SDF so
> that a user _does_ write it as a changelog listener that is observed as a
> multimap side input, and each runner optimizes it if they can to just
> directly access the KV store? A runner is free to use any kind of storage
> they like to materialize a side input anyhow, so this is surely possible,
> but it is a "sufficiently smart compiler" issue. As for semantics, I'm not
> worried about availability - it is globally windowed and always available.
> But I think this requires retractions to be correctly equivalent to direct
> access.
>
> I think we can have a userland PTransform in much less time than a model
> concept, so I favor it.
>
> Kenn
>
>


-- 
----
Mingmin
