I have another very disturbing observation.

The errors go away if I run two instances of kafka-producer-perf-test.sh
with the same configs on different hosts.
If I cancel one of them, the errors below start reappearing after some
time.

org.apache.kafka.common.errors.TimeoutException: The request timed out.
org.apache.kafka.common.errors.NetworkException: The server disconnected
before a response was received.
org.apache.kafka.common.errors.TimeoutException: Expiring 148 record(s) for
benchmark-6-3r-2isr-none-0: 182806 ms has passed since last append
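
A quick check on the last message, as a rough sketch only: it pulls the
elapsed time out of the "Expiring ... record(s)" line and compares it with
the request.timeout.ms=180000 used in the test command. As far as I
understand, in the 1.0 producer client a batch waiting in the accumulator is
expired once roughly request.timeout.ms has passed since the last append, so
this is an assumption about the client, not something the logs confirm. The
helper name elapsed_ms is made up for illustration.

```shell
#!/usr/bin/env bash
# Error line copied from the producer output above.
msg='Expiring 148 record(s) for benchmark-6-3r-2isr-none-0: 182806 ms has passed since last append'

# Value passed via --producer-props in the perf test command.
request_timeout_ms=180000

# Hypothetical helper: print the number that precedes " ms has passed".
elapsed_ms() {
  printf '%s\n' "$1" | sed -n 's/.*: \([0-9][0-9]*\) ms has passed.*/\1/p'
}

elapsed=$(elapsed_ms "$msg")
over=$((elapsed - request_timeout_ms))

echo "elapsed=${elapsed} ms, request.timeout.ms=${request_timeout_ms}, over by ${over} ms"
```

If that reading is right, the batch sat unsent for just over the configured
3 minutes, which would mean raising request.timeout.ms only delays the
expiry instead of removing the cause.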


On Wed, May 30, 2018 at 1:19 AM Localhost shell <
universal.localh...@gmail.com> wrote:

> Hello All,
> I am trying to run a benchmark test in our kafka env. I have played with
> a few configurations such as request.timeout.ms, max.block.ms and
> throughput, but I am not able to avoid these errors:
> org.apache.kafka.common.errors.TimeoutException: The request timed out.
> org.apache.kafka.common.errors.NetworkException: The server disconnected
> before a response was received.
> org.apache.kafka.common.errors.TimeoutException: Expiring 148 record(s)
> for benchmark-6-3r-2isr-none-0: 182806 ms has passed since last append
> Produce Perf Test command:
> nohup sh ~/kafka/kafka_2.11-1.0.0/bin/kafka-producer-perf-test.sh --topic
> benchmark-6p-3r-2isr-none --num-records 10000000 --record-size 100
> --throughput -1 --print-metrics --producer-props acks=all
> bootstrap.servers=node1:9092,node2:9092,node3:9092 request.timeout.ms=180000
> max.block.ms=180000 buffer.memory=100000000 >
> ~/kafka/load_test/results/6p-3r-10M-100B-t-1-ackall-rto3m-block2m-bm100m-2
> 2>&1
> Cluster: 3 nodes, topic: 6 partitions, RF=3 and minISR=2
> I am monitoring the kafka metrics using a tsdb and grafana. I know that
> disk IO perf is bad [disk await(1.5 secs), IO queue size and disk
> utilization metrics are high(60-75%)] but I don't see any issue in kafka
> logs that can relate slow disk io to the above perf errors.
> I have even run the test with throughput=1000 (all other params the same)
> but still get timeout exceptions.
> I need suggestions to understand the issue and fix the above errors.
> --Unilocal
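
For scale, a back-of-the-envelope sketch of what the command above asks the
producer to do. It uses only the numbers from the command line and ignores
per-record and batch overhead, so the figures are approximate.

```shell
#!/usr/bin/env bash
# Values taken from the perf test command line.
num_records=10000000     # --num-records
record_size=100          # --record-size, in bytes
buffer_memory=100000000  # buffer.memory, in bytes
throttled_rate=1000      # --throughput used in the second run, records/sec

# Total payload the test tries to send.
total_bytes=$((num_records * record_size))

# Roughly how many records fit in the producer buffer (overhead ignored).
buffer_records=$((buffer_memory / record_size))

# Buffer size as a percentage of the total payload.
buffer_pct=$((buffer_memory * 100 / total_bytes))

# How long the throttled run needs to send everything.
throttled_secs=$((num_records / throttled_rate))

echo "total payload:        ${total_bytes} bytes"
echo "buffer holds about:   ${buffer_records} records (${buffer_pct}% of payload)"
echo "run at ${throttled_rate} rec/s takes: ${throttled_secs} s"
```

With --throughput -1 the producer is unthrottled, so it can fill the
100 MB buffer as fast as it can generate records; anything the brokers
cannot acknowledge in time then waits in that buffer.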

