On Thu, Nov 19, 2020 at 10:38 AM Boyuan Zhang <[email protected]> wrote:

> Hi Vincent,
>
> Thanks for your contribution! I'm happy to work with you on this when you
> contribute the code into Beam.
>

Should I write up a JIRA to start?  I have access, I've already been in the
process of contributing some big changes to the CassandraIO connector.


>
> Another thing is that it would be preferable to use Splittable DoFn
> <https://beam.apache.org/documentation/programming-guide/#splittable-dofns> 
> instead
> of using UnboundedSource to write a new IO.
>

I would prefer to use the UnboundedSource connector, I've already written
most of it, but also, I see some challenges using Splittable DoFn for Redis
streams.

Unlike Kafka and Kinesis, Redis Streams offsets are *not* simply
monotonically increasing counters, so there is not a way  to just claim a
chunk of work and know that the chunk has any actual data in it.

Since UnboundedSource is not yet deprecated, could I contribute that after
finishing up some test aspects, and then perhaps we can implement a
Splittable DoFn version?



>
> On Thu, Nov 19, 2020 at 10:18 AM Vincent Marquez <
> [email protected]> wrote:
>
>> Currently, Redis offers a streaming queue functionality similar to
>> Kafka/Kinesis/Pubsub, but Beam does not currently have a connector for it.
>>
>> I've written an UnboundedSource connector that makes use of Redis Streams
>> as a POC and it seems to work well.
>>
>> If someone is willing to work with me, I could write up a JIRA and/or
>> open up a WIP pull request if there is interest in getting this as an
>> official connector.  I would mostly need guidance on naming/testing aspects.
>>
>> https://redis.io/topics/streams-intro
>>
>> *~Vincent*
>>
>
~Vincent

Reply via email to