On Thu, Nov 19, 2020 at 10:38 AM Boyuan Zhang <[email protected]> wrote:
> Hi Vincent,
>
> Thanks for your contribution! I'm happy to work with you on this when you
> contribute the code into Beam.
>

Should I write up a JIRA to start? I have access; I've already been
contributing some big changes to the CassandraIO connector.

>
> Another thing is that it would be preferable to use Splittable DoFn
> <https://beam.apache.org/documentation/programming-guide/#splittable-dofns>
> instead of using UnboundedSource to write a new IO.
>

I would prefer to contribute the UnboundedSource connector. I've already
written most of it, and I also see some challenges in using Splittable DoFn
for Redis Streams. Unlike Kafka and Kinesis, Redis Streams offsets are *not*
simply monotonically increasing counters, so there is no way to claim a chunk
of work and know that the chunk has any actual data in it. Since
UnboundedSource is not yet deprecated, could I contribute that after finishing
up some test aspects, and then perhaps we can implement a Splittable DoFn
version?

>
> On Thu, Nov 19, 2020 at 10:18 AM Vincent Marquez <
> [email protected]> wrote:
>
>> Currently, Redis offers a streaming queue functionality similar to
>> Kafka/Kinesis/Pubsub, but Beam does not currently have a connector for it.
>>
>> I've written an UnboundedSource connector that makes use of Redis Streams
>> as a POC and it seems to work well.
>>
>> If someone is willing to work with me, I could write up a JIRA and/or
>> open up a WIP pull request if there is interest in getting this as an
>> official connector. I would mostly need guidance on naming/testing aspects.
>>
>> https://redis.io/topics/streams-intro
>>
>> *~Vincent*
>>
>

~Vincent
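
To illustrate the point above about claiming a chunk of work, here is a minimal sketch in plain Python with no Redis client involved. It assumes Redis stream entry IDs of the form `<millisecondsTime>-<sequenceNumber>`, which are ordered but sparse, and contrasts them with dense counter-style offsets. The stream contents and the helper names (`parse_id`, `count_in_range`) are hypothetical and exist only for this illustration.

```python
from bisect import bisect_left


def parse_id(entry_id):
    """Split a Redis-style stream ID '<ms>-<seq>' into a comparable (ms, seq) tuple."""
    ms, seq = entry_id.split("-")
    return int(ms), int(seq)


def count_in_range(sorted_keys, start, end):
    """Count keys in the half-open range [start, end) of a sorted list."""
    return bisect_left(sorted_keys, end) - bisect_left(sorted_keys, start)


# Hypothetical stream contents: IDs are ordered, but the ID space is sparse.
stream_ids = sorted(
    parse_id(i)
    for i in [
        "1605810000000-0",
        "1605810000000-1",
        "1605810005000-0",
        "1605810060000-0",
    ]
)

# Dense counter offsets: every offset in [0, 4) maps to exactly one record.
dense_offsets = [0, 1, 2, 3]

# A claimed ID range that happens to contain entries.
populated = count_in_range(stream_ids, (1605810000000, 0), (1605810001000, 0))

# A claimed ID range of 40 seconds that contains no entries at all.
empty = count_in_range(stream_ids, (1605810010000, 0), (1605810050000, 0))

# With dense offsets, the size of the claimed range equals the record count.
dense = count_in_range(dense_offsets, 1, 3)

print(populated, empty, dense)
```

With dense offsets, the width of a claimed range tells you how many records it holds. With timestamp-based IDs, a claimed range can be arbitrarily wide and still be empty, which is the difficulty described above for splitting work by offset range.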
