[ https://issues.apache.org/jira/browse/KAFKA-8103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Humbarger updated KAFKA-8103:
----------------------------------
    Environment:
OS
Amazon Linux
Kernel
4.14.97-74.72.amzn1.x86_64 #1 SMP Tue Feb 5 20:59:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Java
openjdk version "1.8.0_191"
OpenJDK Runtime Environment (build 1.8.0_191-b12)
OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)
AWS Instance Type
c5.4xlarge

  was:
OS
{code}
Amazon Linux
{code}
Kernel
{code}
4.14.97-74.72.amzn1.x86_64 #1 SMP Tue Feb 5 20:59:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
{code}
Java
{code}
openjdk version "1.8.0_191"
OpenJDK Runtime Environment (build 1.8.0_191-b12)
OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)
{code}
AWS Instance Type
{code}
c5.4xlarge
{code}


> Kafka SIGSEGV on kafka-network-thread
> -------------------------------------
>
>                 Key: KAFKA-8103
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8103
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 1.1.1
>         Environment: OS
> Amazon Linux
> Kernel
> 4.14.97-74.72.amzn1.x86_64 #1 SMP Tue Feb 5 20:59:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
> Java
> openjdk version "1.8.0_191"
> OpenJDK Runtime Environment (build 1.8.0_191-b12)
> OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)
> AWS Instance Type
> c5.4xlarge
>            Reporter: Sean Humbarger
>            Priority: Major
>         Attachments: hs_err_pid4345.log
>
>
> We have a 4-node cluster (6 topics, 6 consumer groups) that is processing
> 65,000 messages per second, and we are seeing SIGSEGV crashes at least once
> a day (see attachment). Each broker has six disks attached to it to hold the
> Kafka logs. When a crash occurs, we simply restart Kafka and everything
> seems fine. We don't see anything out of the ordinary in /var/log/messages
> or dmesg when the crashes occur. Thus far, we have been unable to predict
> when during the day a crash will occur or which node it will occur on.
>
> The problematic frame is as follows:
> {code}
> # Problematic frame:
> # J 8628 C2 org.apache.kafka.common.metrics.stats.Max.update(Lorg/apache/kafka/common/metrics/stats/SampledStat$Sample;Lorg/apache/kafka/common/metrics/MetricConfig;DJ)V (13 bytes) @ 0x00007ff779f9fca0 [0x00007ff779f9fc80+0x20]
> {code}
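
For readers less familiar with HotSpot crash logs: in the frame above, "J" marks JIT-compiled Java code and "C2" names the server compiler, and the text in parentheses after Max.update is a JVM method descriptor. The short Python sketch below decodes that descriptor into readable Java types. The helper names (decode_descriptor, _read_type) are made up for this illustration and are not part of Kafka or the JDK; it only handles the descriptor grammar needed here (primitives, object types, arrays).

```python
# JVM primitive type codes used in method descriptors.
PRIMITIVES = {
    "B": "byte", "C": "char", "D": "double", "F": "float",
    "I": "int", "J": "long", "S": "short", "Z": "boolean", "V": "void",
}


def _read_type(desc, pos):
    """Decode one type starting at desc[pos]; return (java_name, next_pos)."""
    dims = 0
    while desc[pos] == "[":          # each '[' adds one array dimension
        dims += 1
        pos += 1
    code = desc[pos]
    if code == "L":                  # object type: L<internal/name>;
        end = desc.index(";", pos)
        name = desc[pos + 1:end].replace("/", ".")
        pos = end + 1
    else:                            # single-letter primitive (or void)
        name = PRIMITIVES[code]
        pos += 1
    return name + "[]" * dims, pos


def decode_descriptor(desc):
    """Return (parameter_types, return_type) for a JVM method descriptor."""
    if not desc.startswith("("):
        raise ValueError("method descriptor must start with '('")
    close = desc.index(")")
    params, pos = [], 1
    while pos < close:
        name, pos = _read_type(desc, pos)
        params.append(name)
    ret, _ = _read_type(desc, close + 1)
    return params, ret


# Descriptor copied from the problematic frame in this report.
frame_descriptor = (
    "(Lorg/apache/kafka/common/metrics/stats/SampledStat$Sample;"
    "Lorg/apache/kafka/common/metrics/MetricConfig;DJ)V"
)
params, ret = decode_descriptor(frame_descriptor)
print(ret, "update(" + ", ".join(params) + ")")
```

Read this way, the crashing frame is the C2-compiled version of a void method taking a SampledStat$Sample, a MetricConfig, a double and a long. This only identifies where the fault surfaced; it does not by itself say why the SIGSEGV happened.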