Hi, we are continuing to test the profiler. Thank you for your help, but we have more questions.
Following your recommendations, we increased the memory for the profiler topology, reset "topology.max.spout.pending" to its null default, and reset "profiler.workers" to its default of 1. With that configuration, however, event processing was very slow, so we set "topology.max.spout.pending" to 20000 and "profiler.hbase.batch" to 1000. After that the processing pace was good and the Kafka lag for the profiler consumer group began to decrease. But during a traffic peak we hit this error again:

org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records.
    at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.sendOffsetCommitRequest(ConsumerCoordinator.java:798)
    at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.commitOffsetsSync(ConsumerCoordinator.java:681)
    at org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:1416)
    at org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:1377)
    at org.apache.storm.kafka.spout.KafkaSpout.commitOffsetsForAckedTuples(KafkaSpout.java:534)
    at org.apache.storm.kafka.spout.KafkaSpout.nextTuple(KafkaSpout.java:293)
    at org.apache.storm.daemon.executor$fn__10149$fn__10164$fn__10197.invoke(executor.clj:660)
    at org.apache.storm.util$async_loop$fn__1221.invoke(util.clj:484)
    at clojure.lang.AFn.run(AFn.java:22)
    at java.lang.Thread.run(Thread.java:745)

After this error (though perhaps it is not the actual cause), we no longer saw any events being processed, and the Kafka lag for the profiler began to increase.
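For reference, here is a minimal sketch of the settings involved. The first two blocks restate the values from our test as described above; the consumer properties at the end are only the tuning knobs the exception message itself suggests (increasing max.poll.interval.ms or reducing max.poll.records), and the example values there are assumptions, not settings we have verified:

```properties
# Profiler topology settings we changed during this test
topology.max.spout.pending=20000
profiler.hbase.batch=1000

# Defaults restored per the earlier recommendation
# topology.max.spout.pending=null (default)
profiler.workers=1

# Kafka consumer tuning suggested by the CommitFailedException message.
# Values below are illustrative assumptions only.
max.poll.interval.ms=600000
max.poll.records=200
```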
As we understood it, the worker was restarted after the error. So why did processing stop? Thanks.
