Re: Kafka Spout rate

Jakes John Wed, 02 Sep 2015 08:23:01 -0700

So you are suggesting that I hard-code a value of, say, 1000. That is still a
fixed value, right? How does it automatically adjust to the database flush
rate? If the backend systems get slow, how does the topology automatically
adjust and throttle the rate? I once saw a case where the actual error was in
the database writes, but the Storm topology stalled because of an OOM
exception. I feel that this should be a common problem for everyone.


Also, how can I view the number of pending spout messages at any given time?

Thanks,
Johnu

On Wed, Sep 2, 2015 at 12:05 AM, Ziemer, Tom <[email protected]>
wrote:

> Hi,
>
>
>
> Have a look at Config.TOPOLOGY_MAX_SPOUT_PENDING. If set to a prudent
> value, this should take care of the OOM, since it determines “The maximum
> number of tuples that can be pending on a spout task at any given time”. (
> https://nathanmarz.github.io/storm/doc/backtype/storm/Config.html)
>
>
>
> Regards,
>
> Tom
>
>
>
> *From:* Jakes John [mailto:[email protected]]
> *Sent:* Wednesday, 2 September 2015 07:57
> *To:* [email protected]
> *Subject:* Kafka Spout rate
>
>
>
> Hey,
>
> I have a 5-node Storm cluster. I have deployed a Storm topology with a
> Kafka Spout that reads from a Kafka cluster and a bolt that writes to a
> database. When I tested a Java Kafka consumer independently, I got a
> throughput of around 1M messages per second. When I tested my database
> independently, I got a maximum throughput of around 100k messages per
> second. Since my database is very slow at consuming messages, I need to
> reduce the intake of messages by the Kafka Spout. Adding more parallelism
> to the DB bolt doesn't help, as I have reached the maximum throughput of
> the database. Periodically I am seeing "Out of memory exception" in the
> Kafka Spout, and processing stops.
>
> 1. How can I reduce the rate at which the Kafka Spout takes in messages? I
> assume the reason for the OOM exceptions is that the Kafka Spout reads
> messages from Kafka quickly, but the DB bolt is not able to flush them to
> the database at the same rate. Is that the reason? I tried reducing
> fetchsize, but it didn't help.
>
> 2. Suppose my DB bolt is able to flush all messages to the database at the
> same rate as the Kafka Spout reads them. If the database gets slow in the
> future, will the message intake rate be reduced dynamically to ensure that
> an OOM exception doesn't happen? How can I proactively take measures?
>
> 3. What is the best way to tune my system parameters? Also, how do I test
> the performance (throughput) of my Storm topology?
>
>
> I would like to see how the current Storm community deals with this problem.
>
> Thanks for your time
>
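
On the question of how a fixed cap can track a slowing database: as far as I
understand it, Storm stops calling nextTuple() on a spout task once that task
has TOPOLOGY_MAX_SPOUT_PENDING un-acked tuples outstanding. The cap itself
stays fixed, but the intake rate falls to whatever rate the bolts ack at. The
plain-Java toy model below sketches that effect. It does not use Storm at all:
the class PendingCapSimulation, its simulate method, and the per-tick rates
are made up for illustration, and it assumes every tuple is emitted with a
message id and acked by the DB bolt only after the write succeeds.

```java
/**
 * Toy model of one spout task with a fixed pending cap feeding a slower sink.
 * Plain Java with no Storm dependency; every name here is hypothetical.
 */
class PendingCapSimulation {

    /** Totals collected over one simulated run. */
    static final class Result {
        final long emitted;        // tuples the spout was allowed to emit
        final long acked;          // tuples the sink finished and acked
        final long maxPendingSeen; // highest in-flight count observed

        Result(long emitted, long acked, long maxPendingSeen) {
            this.emitted = emitted;
            this.acked = acked;
            this.maxPendingSeen = maxPendingSeen;
        }
    }

    /**
     * @param maxPending cap on un-acked tuples (stands in for max spout pending)
     * @param spoutRate  tuples the spout could emit per tick if unthrottled
     * @param sinkRate   tuples the sink (the database bolt) can ack per tick
     * @param ticks      number of simulated time steps
     */
    static Result simulate(int maxPending, int spoutRate, int sinkRate, int ticks) {
        long pending = 0, emitted = 0, acked = 0, maxSeen = 0;
        for (int t = 0; t < ticks; t++) {
            // The sink acks as many in-flight tuples as it can this tick.
            long done = Math.min(pending, sinkRate);
            pending -= done;
            acked += done;

            // The spout may only emit while it is below the pending cap.
            long room = maxPending - pending;
            long sent = Math.min(spoutRate, room);
            pending += sent;
            emitted += sent;

            maxSeen = Math.max(maxSeen, pending);
        }
        return new Result(emitted, acked, maxSeen);
    }

    public static void main(String[] args) {
        Result healthy = simulate(1000, 10000, 100, 100);
        Result degraded = simulate(1000, 10000, 10, 100);
        System.out.println("healthy:  emitted=" + healthy.emitted
                + " acked=" + healthy.acked + " maxPending=" + healthy.maxPendingSeen);
        System.out.println("degraded: emitted=" + degraded.emitted
                + " acked=" + degraded.acked + " maxPending=" + degraded.maxPendingSeen);
    }
}
```

In this model the spout fills the cap once and then emits only as fast as the
sink acks. When the sink drops from 100 to 10 tuples per tick, the emit rate
follows it down, and the in-flight count never exceeds the cap of 1000. Note
that the cap is per spout task, so the total in flight scales with the number
of spout tasks.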
