So you are suggesting that I hard-code it to a value of, say, 1000. That is still a fixed value, right? How does it adjust automatically to the database flush rate? If the backend systems get slow, how does the topology automatically throttle the intake rate? I once saw a case where the actual error was in the database writes, but the Storm topology stalled because of an OOM exception. I feel that this should be a common problem for everyone.
Also, how can I view the number of pending spout messages at any given time?

Thanks,
Johnu

On Wed, Sep 2, 2015 at 12:05 AM, Ziemer, Tom <[email protected]> wrote:

> Hi,
>
> have a look at Config.TOPOLOGY_MAX_SPOUT_PENDING. This should take care of
> the OOM if set to a prudent value, since it determines “The maximum number
> of tuples that can be pending on a spout task at any given time”
> (https://nathanmarz.github.io/storm/doc/backtype/storm/Config.html).
>
> Regards,
> Tom
>
> *From:* Jakes John [mailto:[email protected]]
> *Sent:* Wednesday, September 2, 2015 07:57
> *To:* [email protected]
> *Subject:* Kafka Spout rate
>
> Hey,
>
> I have a 5-node Storm cluster. I have deployed a Storm topology
> with a Kafka Spout that reads from a Kafka cluster and a bolt that writes to
> a database. When I tested a Java Kafka consumer independently, I got
> throughput of around 1M messages per second. When I tested my database
> independently, I got a maximum throughput of around 100k messages per second.
> Since my database is very slow at consuming messages, I need to reduce the
> intake of messages by the Kafka Spout. Adding more parallelism to the DB bolt
> doesn't help, as I have reached the maximum throughput of the database.
> Periodically I am seeing "Out of memory exception" in the Kafka Spout, and
> processing stops.
>
> 1. How can I reduce the rate at which the Kafka Spout takes in messages? I assume
> the reason for the OOM exceptions is that the Kafka Spout is fast at reading
> messages from Kafka, but the DB bolt is not able to flush the messages to the
> database at the same rate. Is that the reason? I tried playing around by
> reducing fetchsize, but it didn't help.
>
> 2. Suppose my DB bolt is somehow able to flush all messages to the
> database at the same rate as the Kafka Spout. If the database gets slow in the
> future, will the message intake rate be reduced dynamically to ensure that
> the OOM exception doesn't happen? How can I proactively take measures?
>
> 3. What is the best way to tune my system parameters? Also, how do I test the
> performance (throughput) of my Storm topology?
>
> I would like to see how the current Storm community deals with my problem.
>
> Thanks for your time
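
To picture how a fixed Config.TOPOLOGY_MAX_SPOUT_PENDING value could still follow the database's pace, here is a minimal sketch in plain Java. It does not use Storm at all: the class MaxPendingSketch, the method simulate, and the tick-based model are hypothetical names and assumptions made up for illustration, and the rates only mirror the 10:1 ratio from the thread (1M/s source versus 100k/s database). The idea it models is that a source may emit only while the number of un-acked tuples is below the cap, so after the first burst it can emit no faster than the sink acks.

```java
// Illustrative model only; none of these names come from Storm.
final class MaxPendingSketch {

    /** Totals collected over one simulated run. */
    static final class Result {
        final long emitted;     // tuples the source was allowed to emit
        final long acked;       // tuples the sink finished (acked)
        final int peakPending;  // highest number of un-acked tuples seen

        Result(long emitted, long acked, int peakPending) {
            this.emitted = emitted;
            this.acked = acked;
            this.peakPending = peakPending;
        }
    }

    /**
     * Runs a discrete-time simulation. In each tick the source may emit up to
     * sourcePerTick tuples, but only while pending stays at or below maxPending;
     * afterwards the sink acks up to sinkPerTick of the pending tuples.
     */
    static Result simulate(int ticks, int sourcePerTick, int sinkPerTick, int maxPending) {
        long emitted = 0;
        long acked = 0;
        int pending = 0;
        int peak = 0;
        for (int t = 0; t < ticks; t++) {
            // The cap: emit only into the room left under maxPending.
            int room = Math.max(maxPending - pending, 0);
            int emit = Math.min(sourcePerTick, room);
            pending += emit;
            emitted += emit;
            peak = Math.max(peak, pending);

            // The slow sink (the "database") drains what it can this tick.
            int done = Math.min(sinkPerTick, pending);
            pending -= done;
            acked += done;
        }
        return new Result(emitted, acked, peak);
    }

    public static void main(String[] args) {
        // Source is 10x faster than the sink; the cap is 1000 pending tuples.
        Result r = simulate(100, 1000, 100, 1000);
        // prints: emitted=10900 acked=10000 peakPending=1000
        System.out.println("emitted=" + r.emitted
                + " acked=" + r.acked
                + " peakPending=" + r.peakPending);
    }
}
```

In this model the source emits 1000 tuples in the first tick and then only 100 per tick, matching the sink, and the number of pending tuples never exceeds the cap. If the sink slowed down further, the emission rate would drop with it while the memory held by pending tuples stayed bounded. In an actual topology the cap is set with conf.setMaxSpoutPending(1000) on backtype.storm.Config, and it only has an effect when the spout emits tuples with message IDs so that they can be acked or failed.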
