I've certainly had use cases in the past where I relied on the Kafka
consumer library's ability to subscribe to a set of topics matched by a
regular expression.

Specifically, I worked with a microservices architecture where each service
had its own Postgres DB with a logical decoding client doing change data
capture to Kafka, so separate Kafka topics were populated per DB. We then
had a service that consumed from all of those topics via regex and sank the
data to S3. It was particularly nice that the regex-based subscription
automatically picked up new topics matching the pattern without restarting
the application.
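
For illustration, here's a minimal sketch of that regex subscription using
the plain Kafka consumer API (topic names and group id are hypothetical,
not from our actual setup). New topics matching the pattern are picked up
on the consumer's next metadata refresh, bounded by metadata.max.age.ms:

    import java.time.Duration;
    import java.util.Properties;
    import java.util.regex.Pattern;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    Properties props = new Properties();
    props.put("bootstrap.servers", "broker:9092");
    props.put("group.id", "s3-sink");
    props.put("key.deserializer",
        "org.apache.kafka.common.serialization.StringDeserializer");
    props.put("value.deserializer",
        "org.apache.kafka.common.serialization.StringDeserializer");
    // How often the consumer refreshes metadata, and therefore how quickly
    // it notices newly created topics that match the pattern.
    props.put("metadata.max.age.ms", "30000");

    KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
    // One CDC topic per service DB, e.g. cdc.orders, cdc.users, ...
    consumer.subscribe(Pattern.compile("cdc\\..*"));
    while (true) {
      for (ConsumerRecord<String, String> record :
          consumer.poll(Duration.ofSeconds(1))) {
        // ... batch the records and sink them to S3
      }
    }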

I do find it limiting that PubsubIO needs to know the set of topics
explicitly at startup.
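
For contrast, a minimal PubsubIO read looks like this sketch (subscription
path hypothetical); the subscription has to be named when the pipeline is
constructed, and there is no pattern-based equivalent:

    import org.apache.beam.sdk.Pipeline;
    import org.apache.beam.sdk.io.gcp.pubsub.PubsubIO;
    import org.apache.beam.sdk.io.gcp.pubsub.PubsubMessage;
    import org.apache.beam.sdk.values.PCollection;

    Pipeline p = Pipeline.create();
    // The full subscription name must be known here, at construction time.
    PCollection<PubsubMessage> messages =
        p.apply(PubsubIO.readMessages()
            .fromSubscription("projects/my-project/subscriptions/my-sub"));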

On Tue, Dec 8, 2020 at 8:23 PM Vincent Marquez <vincent.marq...@gmail.com>
wrote:

> KafkaIO has a readAll method that returns a
> PTransform<PCollection<KafkaSourceDescription>, PCollection<V>>, so it
> could read from a 'dynamic' number of topics generated somewhere else. Is
> that what you mean?
>
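> For concreteness, a rough sketch of that shape (assuming the
> KafkaIO.readSourceDescriptors() form; exact class and method names vary
> across Beam versions, and the descriptor PCollection here is hypothetical):
>
>     // Descriptors produced upstream, e.g. by another transform that
>     // discovers topics/partitions at runtime.
>     PCollection<KafkaSourceDescriptor> descriptors = ...;
>     PCollection<KafkaRecord<byte[], byte[]>> records =
>         descriptors.apply(
>             KafkaIO.<byte[], byte[]>readSourceDescriptors()
>                 .withBootstrapServers("broker:9092")
>                 .withKeyDeserializer(ByteArrayDeserializer.class)
>                 .withValueDeserializer(ByteArrayDeserializer.class));
>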
> *~Vincent*
>
>
> On Tue, Dec 8, 2020 at 5:15 PM Daniel Collins <dpcoll...@google.com>
> wrote:
>
>> s/Combine/Flatten/
>>
>> On Tue, Dec 8, 2020 at 8:06 PM Daniel Collins <dpcoll...@google.com>
>> wrote:
>>
>>> Hi all,
>>>
>>> I'm trying to figure out whether there's any real use for reading from a
>>> dynamic set of Pub/Sub [Lite] subscriptions in a Beam pipeline; the same
>>> logic would apply to Kafka topics. Does anyone know of a use case where
>>> you'd want to apply the same processing logic to all messages on a set of
>>> topics, but you wouldn't know that set of topics when the pipeline is
>>> started? (Otherwise you could just use Combine.)
>>>
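>>> For example, if the topic set were known up front, a sketch of that
>>> static alternative (topic names hypothetical) would be one read per
>>> topic plus a Flatten:
>>>
>>>     PCollection<PubsubMessage> all =
>>>         PCollectionList.of(
>>>                 p.apply("ReadA",
>>>                     PubsubIO.readMessages().fromTopic("projects/p/topics/a")))
>>>             .and(p.apply("ReadB",
>>>                 PubsubIO.readMessages().fromTopic("projects/p/topics/b")))
>>>             .apply(Flatten.pCollections());
>>>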
>>> -Dan
>>>
>>
