[jira] [Commented] (KAFKA-8103) Kafka SIGSEGV on kafka-network-thread

2019-03-28 Thread Sean Humbarger (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16803933#comment-16803933
 ] 

Sean Humbarger commented on KAFKA-8103:
---

We are still seeing random JVM crashes.  We've switched over from OpenJDK to 
Oracle JDK 1.8.0_202 and see the same crash:

 

{code}

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x7f6c9cd85100, pid=4550, tid=0x7f6a64792700
#
# JRE version: Java(TM) SE Runtime Environment (8.0_202-b08) (build 1.8.0_202-b08)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.202-b08 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# V  [libjvm.so+0x2c7100]  Handle::Handle(Thread*, oopDesc*)+0x0
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#
---  T H R E A D  ---
Current thread (0x7f6c99279000):  JavaThread "kafka-request-handler-3" daemon [_thread_in_vm, id=4984, stack(0x7f6a64692000,0x7f6a64793000)]
siginfo: si_signo: 11 (SIGSEGV), si_code: 2 (SEGV_ACCERR), si_addr: 0x7f6c9cd85100
Registers:
RAX=0x7f6c9da7b2ce, RBX=0x7f6c99279000, RCX=0x0005502d3250, RDX=0x0005502d3250
RSP=0x7f6a647913d8, RBP=0x7f6a64791410, RSI=0x7f6c99279000, RDI=0x7f6a647913e8
R8 =0xaa05a64a, R9 =0x000550a3d9b8, R10=0x7f6c9d488af0, R11=0x0002
R12=0x7f6a64791470, R13=0x0005502d3250, R14=0x7f6c9da85f8c, R15=0x7f6c99279000
RIP=0x7f6c9cd85100, EFLAGS=0x00010246, CSGSFS=0x002b0033, ERR=0x0015
  TRAPNO=0x000e
Top of Stack: (sp=0x7f6a647913d8)
0x7f6a647913d8:   7f6c9d488b38 00700070
0x7f6a647913e8:   7f6c8a4e702c 7f67d003adaa
0x7f6a647913f8:    f2e95d57
0x7f6a64791408:   a9e2f767 a9e05757
0x7f6a64791418:   7f6c88e78c88 a9e05757
0x7f6a64791428:   7f6c8a0d149c 0005aa05a64a
0x7f6a64791438:   0005502d3250 0007974aeab8
0x7f6a64791448:   00054f02bab8 00054f17bb60
0x7f6a64791458:   7f6c9dbbacdd 0007974adfe0
0x7f6a64791468:   7f6c9d3cc53f 0003
0x7f6a64791478:   0d70aa32 00054f037260
0x7f6a64791488:   7f6c8ab3cc88 aa147bbef4590578
0x7f6a64791498:   0007a2c82bc0 0007974ae068
0x7f6a647914a8:   0007974adcb0 0007974adfe0
0x7f6a647914b8:   000550a3ddf0 0007974adcf8
0x7f6a647914c8:   0007974adc48 0007a2c83420
0x7f6a647914d8:   7f6cf2e95b69 a9e269d8
0x7f6a647914e8:   7f6c8a2c0934 0007974adba0
0x7f6a647914f8:   7f6c8a000138 7f6a64791550
0x7f6a64791508:   7ffed59c2c60 7f6a64791580
0x7f6a64791518:   0002f4590578 00079d149f38
0x7f6a64791528:   000550a3ddf0 7f6a64791570
0x7f6a64791538:    f4590684
0x7f6a64791548:   00079d1b3490 7f6a64791590
0x7f6a64791558:   7f6c9dbbacdd 0007f3a366fd
0x7f6a64791568:   7f6c9dbbacdd 0007974adb60
0x7f6a64791578:   7f6c9d3cc53f 00011172
0x7f6a64791588:   0d7091ab f4590567
0x7f6a64791598:   7f6c8a887a24 0007a2c82bc0
0x7f6a647915a8:   0007a2c832d0 aa147bbe15bd
0x7f6a647915b8:   f2e95b56974adb48 0005f2e95b67
0x7f6a647915c8:   0007974adb38 0007974adab0
Instructions: (pc=0x7f6c9cd85100)
0x7f6c9cd850e0:   e8 0b 4c 77 00 48 83 c4 30 5b 41 5c 5d c3 66 90
0x7f6c9cd850f0:   f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00
0x7f6c9cd85100:   48 85 d2 74 63 55 48 89 e5 41 55 41 54 53 49 89
0x7f6c9cd85110:   fc 48 89 d3 48 83 ec 08 4c 8b ae 38 01 00 00 49
Register to memory mapping:
RAX=0x7f6c9da7b2ce:  in /usr/local/java/jdk1.8.0_202/jre/lib/amd64/server/libjvm.so at 0x7f6c9cabe000
RBX=0x7f6c99279000 is a thread
RCX=0x0005502d3250 is an oop
java.lang.Object
 - klass: 'java/lang/Object'
RDX=0x0005502d3250 is an oop
java.lang.Object
 - klass: 'java/lang/Object'
RSP=0x7f6a647913d8 is pointing into the stack for thread: 0x7f6c99279000
RBP=0x7f6a64791410 is pointing into the stack for thread: 0x7f6c99279000
RSI=0x7f6c99279000 is a thread
RDI=0x7f6a647913e8 is pointing into the stack for thread: 0x7f6c99279000
R8 =0xaa05a64a is an unknown value
R9 =0x000550a3d9b8 is an oop
org.apache.kafka.common.utils.KafkaThread
 - klass: 'org/apache/kafka/common/utils/KafkaThread'
R10=0x7f6c9d488af0:  in /usr/local/java/jdk1.8.0_202/jre/lib/amd64/server/libjvm.so at 0x7f6c9cabe000
R11=0x0002 is an unknown value
R12=0x7f6a64791470 is pointing into the s
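
A note for reading the register dump above: on x86-64 Linux, TRAPNO=0x000e is exception vector 14 (page fault), and ERR holds the page-fault error code set by the CPU. The sketch below decodes the ERR value printed in this log. The helper name `decode_page_fault_error` is hypothetical, and the bit meanings are the architectural x86 definitions rather than anything specific to Kafka or HotSpot.

```python
# Architectural x86 page-fault error code bits (low five bits only).
# Each entry: (mask, meaning when the bit is set, meaning when it is clear).
PF_BITS = [
    (0x01, "protection violation (page present)", "page not present"),
    (0x02, "write access", "not a write access"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set in page table", None),
    (0x10, "instruction fetch", None),
]


def decode_page_fault_error(err):
    """Return human-readable meanings for a page-fault error code."""
    meanings = []
    for mask, if_set, if_clear in PF_BITS:
        if err & mask:
            meanings.append(if_set)
        elif if_clear is not None:
            meanings.append(if_clear)
    return meanings


# ERR value taken from the register dump in this log.
for meaning in decode_page_fault_error(0x15):
    print(meaning)
```

For ERR=0x15 this reports a user-mode instruction fetch that hit a protection violation, which matches the si_code of SEGV_ACCERR and the fact that si_addr equals the faulting pc in the dump.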

[jira] [Updated] (KAFKA-8103) Kafka SIGSEGV on kafka-network-thread

2019-03-13 Thread Sean Humbarger (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Humbarger updated KAFKA-8103:
--
Description: 
We have a 4-node cluster (6 topics, 6 consumer groups) that is processing 
65,000 messages per second, and we are seeing SIGSEGV crashes at least once a 
day (see attachment).  Each broker has six disks attached to it to hold the 
Kafka logs.  When a crash occurs, we simply restart Kafka and everything 
seems fine.  We don't see anything out of the ordinary in /var/log/messages or 
dmesg when the crashes occur.  Thus far, we are unable to predict when during 
the day a crash will occur or which node it will occur on.

 

The problematic frame is as follows:
{code:java}
# Problematic frame:
# J 8628 C2 org.apache.kafka.common.metrics.stats.Max.update(Lorg/apache/kafka/common/metrics/stats/SampledStat$Sample;Lorg/apache/kafka/common/metrics/MetricConfig;DJ)V (13 bytes) @ 0x7ff779f9fca0 [0x7ff779f9fc80+0x20]
{code}

  was:
We have a 4 node cluster (6 topics, 6 consumer groups) that is processing 
65,000 messages per second and are seeing SIGSEGV crashes at least once a day 
(see attachment).  Each broker has six disks attached to it to support the 
kafka logs.  When the crash occurs, we simply restart kafka and everything 
seems fine.  We don't see any out of the ordinary in /var/log/messages or dmesg 
when the crashes occur.  Thus far, we are unable to predict during the day when 
the crash will occur or which node it will occur on. 

 

The problematic frame is as follows:
{code}

# Problematic frame:
# J 8628 C2 org.apache.kafka.common.metrics.stats.Max.update(Lorg/apache/kafka/common/metrics/stats/SampledStat$Sample;Lorg/apache/kafka/common/metrics/MetricConfig;DJ)V (13 bytes) @ 0x7ff779f9fca0 [0x7ff779f9fc80+0x20]
{code}


> Kafka SIGSEGV on kafka-network-thread
> -
>
> Key: KAFKA-8103
> URL: https://issues.apache.org/jira/browse/KAFKA-8103
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 1.1.1
> Environment: OS 
> Amazon Linux
> Kernel 
> 4.14.97-74.72.amzn1.x86_64 #1 SMP Tue Feb 5 20:59:30 UTC 2019 x86_64 x86_64 
> x86_64 GNU/Linux
> Java
> openjdk version "1.8.0_191"
> OpenJDK Runtime Environment (build 1.8.0_191-b12)
> OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)
> AWS Instance Type
> c5.4xlarge
> Reporter: Sean Humbarger
> Priority: Major
> Attachments: hs_err_pid4345.log
>
>
> We have a 4-node cluster (6 topics, 6 consumer groups) that is processing 
> 65,000 messages per second, and we are seeing SIGSEGV crashes at least once a 
> day (see attachment).  Each broker has six disks attached to it to hold the 
> Kafka logs.  When a crash occurs, we simply restart Kafka and everything 
> seems fine.  We don't see anything out of the ordinary in /var/log/messages 
> or dmesg when the crashes occur.  Thus far, we are unable to predict when 
> during the day a crash will occur or which node it will occur on.
>  
> The problematic frame is as follows:
> {code:java}
> # Problematic frame:
> # J 8628 C2 org.apache.kafka.common.metrics.stats.Max.update(Lorg/apache/kafka/common/metrics/stats/SampledStat$Sample;Lorg/apache/kafka/common/metrics/MetricConfig;DJ)V (13 bytes) @ 0x7ff779f9fca0 [0x7ff779f9fc80+0x20]
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-8103) Kafka SIGSEGV on kafka-network-thread

2019-03-13 Thread Sean Humbarger (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Humbarger updated KAFKA-8103:
--
Environment: 
OS 
Amazon Linux

Kernel 
4.14.97-74.72.amzn1.x86_64 #1 SMP Tue Feb 5 20:59:30 UTC 2019 x86_64 x86_64 
x86_64 GNU/Linux

Java
openjdk version "1.8.0_191"
OpenJDK Runtime Environment (build 1.8.0_191-b12)
OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)

AWS Instance Type
c5.4xlarge

  was:
OS 
{code}
Amazon Linux
{code}

Kernel 
{code}
4.14.97-74.72.amzn1.x86_64 #1 SMP Tue Feb 5 20:59:30 UTC 2019 x86_64 x86_64 
x86_64 GNU/Linux
{code}

Java
{code}
openjdk version "1.8.0_191"
OpenJDK Runtime Environment (build 1.8.0_191-b12)
OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)
{code}

AWS Instance Type
{code}
c5.4xlarge
{code}


> Kafka SIGSEGV on kafka-network-thread
> -
>
> Key: KAFKA-8103
> URL: https://issues.apache.org/jira/browse/KAFKA-8103
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 1.1.1
> Environment: OS 
> Amazon Linux
> Kernel 
> 4.14.97-74.72.amzn1.x86_64 #1 SMP Tue Feb 5 20:59:30 UTC 2019 x86_64 x86_64 
> x86_64 GNU/Linux
> Java
> openjdk version "1.8.0_191"
> OpenJDK Runtime Environment (build 1.8.0_191-b12)
> OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)
> AWS Instance Type
> c5.4xlarge
> Reporter: Sean Humbarger
> Priority: Major
> Attachments: hs_err_pid4345.log
>
>
> We have a 4-node cluster (6 topics, 6 consumer groups) that is processing 
> 65,000 messages per second, and we are seeing SIGSEGV crashes at least once a 
> day (see attachment).  Each broker has six disks attached to it to hold the 
> Kafka logs.  When a crash occurs, we simply restart Kafka and everything 
> seems fine.  We don't see anything out of the ordinary in /var/log/messages 
> or dmesg when the crashes occur.  Thus far, we are unable to predict when 
> during the day a crash will occur or which node it will occur on.
>  
> The problematic frame is as follows:
> {code}
> # Problematic frame:
> # J 8628 C2 org.apache.kafka.common.metrics.stats.Max.update(Lorg/apache/kafka/common/metrics/stats/SampledStat$Sample;Lorg/apache/kafka/common/metrics/MetricConfig;DJ)V (13 bytes) @ 0x7ff779f9fca0 [0x7ff779f9fc80+0x20]
> {code}





[jira] [Created] (KAFKA-8103) Kafka SIGSEGV on kafka-network-thread

2019-03-13 Thread Sean Humbarger (JIRA)
Sean Humbarger created KAFKA-8103:
-

 Summary: Kafka SIGSEGV on kafka-network-thread
 Key: KAFKA-8103
 URL: https://issues.apache.org/jira/browse/KAFKA-8103
 Project: Kafka
  Issue Type: Bug
Affects Versions: 1.1.1
 Environment: OS 
{code}
Amazon Linux
{code}

Kernel 
{code}
4.14.97-74.72.amzn1.x86_64 #1 SMP Tue Feb 5 20:59:30 UTC 2019 x86_64 x86_64 
x86_64 GNU/Linux
{code}

Java
{code}
openjdk version "1.8.0_191"
OpenJDK Runtime Environment (build 1.8.0_191-b12)
OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)
{code}

AWS Instance Type
{code}
c5.4xlarge
{code}
Reporter: Sean Humbarger
 Attachments: hs_err_pid4345.log

We have a 4-node cluster (6 topics, 6 consumer groups) that is processing 
65,000 messages per second, and we are seeing SIGSEGV crashes at least once a 
day (see attachment).  Each broker has six disks attached to it to hold the 
Kafka logs.  When a crash occurs, we simply restart Kafka and everything 
seems fine.  We don't see anything out of the ordinary in /var/log/messages or 
dmesg when the crashes occur.  Thus far, we are unable to predict when during 
the day a crash will occur or which node it will occur on.

 

The problematic frame is as follows:
{code}

# Problematic frame:
# J 8628 C2 org.apache.kafka.common.metrics.stats.Max.update(Lorg/apache/kafka/common/metrics/stats/SampledStat$Sample;Lorg/apache/kafka/common/metrics/MetricConfig;DJ)V (13 bytes) @ 0x7ff779f9fca0 [0x7ff779f9fc80+0x20]
{code}
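
The signature in the problematic frame is written as a JVM method descriptor, which can be hard to read at a glance. Below is a minimal sketch of a decoder for it. The helper names `decode_descriptor` and `_read_type` are hypothetical, and the sketch only handles the standard descriptor grammar (primitives, `L...;` class types, and `[` array prefixes).

```python
# Single-letter primitive codes used in JVM descriptors.
PRIMITIVES = {
    "B": "byte", "C": "char", "D": "double", "F": "float",
    "I": "int", "J": "long", "S": "short", "Z": "boolean", "V": "void",
}


def _read_type(desc, i):
    """Decode one type starting at desc[i]; return (type_name, next_index)."""
    dims = 0
    while desc[i] == "[":  # each '[' adds one array dimension
        dims += 1
        i += 1
    if desc[i] == "L":  # class type: L<internal/name>;
        end = desc.index(";", i)
        name = desc[i + 1:end].replace("/", ".")
        i = end + 1
    else:  # primitive type or void
        name = PRIMITIVES[desc[i]]
        i += 1
    return name + "[]" * dims, i


def decode_descriptor(desc):
    """Split a method descriptor into (parameter_types, return_type)."""
    if not desc.startswith("("):
        raise ValueError("method descriptor must start with '('")
    i = 1
    params = []
    while desc[i] != ")":
        name, i = _read_type(desc, i)
        params.append(name)
    return_type, _ = _read_type(desc, i + 1)
    return params, return_type


# Descriptor copied from the problematic frame above.
frame = ("(Lorg/apache/kafka/common/metrics/stats/SampledStat$Sample;"
         "Lorg/apache/kafka/common/metrics/MetricConfig;DJ)V")
params, return_type = decode_descriptor(frame)
print(return_type, "update(" + ", ".join(params) + ")")
```

Read this way, the crashing frame is the C2-compiled `Max.update` method, which takes a `SampledStat$Sample`, a `MetricConfig`, a `double`, and a `long`, and returns `void`.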


