Hi!

Thanks for the quick reply!

I actually need a shared subscription, so that multiple instances of a
consumer can consume from the same topic.

I think I didn't explain my issue well, so let me try again. The flow is
like this:

   1. Producer - publishes events from anywhere in the system (consumers can
   publish events too; producers can publish directly to pulsar or via
   pulsar-proxy) to a topic or topics (which one is the question).
   2. Service (multiple consumers that scale out/scale down) - creates a
   shared subscription that needs to listen to the events of multiple tenants
   (the list of tenants can change dynamically) OR to all events.

Now, I am not sure how to implement the event routing, and I don't want to
waste traffic. Let me elaborate on that.
Given that all producers together publish events at 30mb/s, I don't want a
service that listens to only two tenants (let's say 10% of the traffic) to
consume the full 30mb/s and filter on the client side.

It looks like my solution comes down to a function that does the routing, so
the implementation will be something like this:

   1. Producer - publishes all events to a topic named "events"
   2. Pulsar function - processes all those events and routes them to the
   service topics
   3. Service - creates a shared subscription on its own topic

Producer -> topic "events" -> Pulsar function routes events to "service-a"
-> Service A listens to the "service-a" topic.
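
To make the routing step concrete, here is a minimal sketch in Python of what
I have in mind. Everything in it is an assumption for illustration: the
"tenant_id" field, the ALL_TENANTS marker, the shape of the subscriptions map,
and the publish callback, which stands in for whatever the function runtime
offers for writing to another topic. It is not tied to the real Pulsar
Functions API.

```python
from typing import Callable, Dict, List, Set

# Hypothetical application-level subscriptions: service topic -> tenant ids.
# ALL_TENANTS is an assumed marker meaning "route every event to this service".
ALL_TENANTS = "*"

Subscriptions = Dict[str, Set[str]]


def target_topics(tenant_id: str, subscriptions: Subscriptions) -> List[str]:
    """Return the service topics that should receive an event of this tenant."""
    return sorted(
        topic
        for topic, tenants in subscriptions.items()
        if ALL_TENANTS in tenants or tenant_id in tenants
    )


def route(event: dict, subscriptions: Subscriptions,
          publish: Callable[[str, dict], None]) -> int:
    """Publish the event to every matching service topic; return the count."""
    topics = target_topics(event["tenant_id"], subscriptions)
    for topic in topics:
        publish(topic, event)
    return len(topics)


if __name__ == "__main__":
    subs = {"service-a": {"1", "2"}, "audit-service": {ALL_TENANTS}}
    sent: List[str] = []
    route({"tenant_id": "2", "payload": "x"}, subs, lambda t, e: sent.append(t))
    print(sent)
```

With this shape, a service that cares about two tenants only receives events
for those two tenants on its own topic, and a service that needs everything
uses the ALL_TENANTS marker instead of attaching to 10k+ topics.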

Is that something that makes sense?

If so, a question about the function runtimes: does the "thread" runtime run
inside the pulsar broker, OR does it run in another process dedicated to
functions (a different pod in a k8s deployment)?




On Mon, 9 Mar 2020 at 9:02 Sijie Guo <[email protected]> wrote:

> Thank you, Yosi! The mailing list is a great place to ask questions since
> the emails are indexed and searchable.
>
> If, most of the time, a consumer only listens to a "tenant" topic, you can
> use a master topic and a key_shared subscription to distribute your list of
> tenants. Each consumer of the master topic will then receive a subset of
> the tenants and can subscribe to the corresponding "tenant" topics. So you
> don't need all consumers to subscribe to all topics.
>
> Other comments inline.
>
>
> On Sun, Mar 8, 2020 at 5:02 AM Yosi Attias <[email protected]> wrote:
>
>> Hi!
>>
>> *I posted this to google groups and then the message somehow disappeared,
>> I will send it again here. Sorry for the duplication.*
>>
>> I am checking out pulsar for using it as our events bus, and it's awesome!
>>
>> Our services (written in nodejs) need to listen to multiple tenants (or
>> all tenants - we have 10k tenants, and it's growing), and the list of
>> tenants can change dynamically at runtime (changes are not that frequent,
>> at most 200-300 changes a day).
>> Pulsar sounds like an excellent fit for this because I can create a topic
>> per tenant, like "tenant:XX:events" (XX = tenant id), and use a shared
>> subscription for consumer groups.
>>
>> As I said, the list of tenants to subscribe to changes at runtime: all
>> consumers in a group get a message about it (it's broadcast via Redis
>> pub/sub).
>>
>> I am not sure what the best way to implement this is. I see two
>> options:
>>
>>    - Client-side: a consumer receives a tenant it needs to listen to and
>>    adds the topic to the shared subscription - sounds like the right
>>    solution, but:
>>
>>
>>    - Since all consumers will add the same topic at the same time - are
>>       there any issues with this? Or do I need to make sure it happens
>>       once, so only one consumer mutates the shared subscription?
>>
> It sounds like you need to use an exclusive subscription for this case.
>
>
>>
>>    - There are consumers (a small fraction, but important ones) that need
>>       to listen to all events - this makes the subscription consume all
>>       topics - does it make sense in terms of performance? Attaching a
>>       subscription to 10k+ topics?
>>
>>
> It is okay to subscribe to 10k+ topics. However, you need to pay
> attention to allocating enough memory for your client.
>
> But I would recommend thinking of architecting your service in a different
> way to avoid this if possible.
>
>
>>
>>    - Functions: I thought about creating a function that will have a
>>    list of application subscriptions (not pulsar subscriptions) and will
>>    listen to the main topic called "events" (or to all tenant topics? not
>>    sure how to implement this with a function) and will route the events
>>    based on subscriptions to service topics. For example, a service named
>>    "users" will have a "users-service" topic and the function will route
>>    all events to the "users-service" topic. This sounds like a good
>>    solution as well, but:
>>       - I am not sure where functions run. If they run as a separate
>>       container we will have massive traffic waste - I see there is a
>>       threaded option to run the function - does the function run inside
>>       pulsar? So I don't have traffic waste?
>>
>>
> Functions have different runtimes - thread, process, and Kubernetes. It is
> pretty flexible.
>
> > So I don't have traffic waste?
>
> I am not sure what "traffic waste" means. If you are referring to
> messages being read and written multiple times, that's true. If your
> "service" topics (like users-service) will be used by different
> subscriptions, I would recommend going with the function approach.
>
>
>
>
>>
>>    - Is this overkill for functions?
>>       - Storing of application subscriptions - I can save them inside
>>       our database, and I see I can store them inside pulsar state tables -
>>       which is preferred here?
>>       - Once I want to listen to more topics - should I notify the
>>       function somehow to reload the list of subscriptions (since I will
>>       cache it), OR do I need to implement some refresh timer?
>>
>>
>> Hopefully, this makes sense! If you have any questions and want me to
>> elaborate, please let me know!
>>
>> If you want me to ask in other places (like Slack) or somewhere else, let
>> me know and I will ask there instead.
>>
>
