[ 
https://issues.apache.org/jira/browse/KAFKA-8103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16803933#comment-16803933
 ] 

Sean Humbarger commented on KAFKA-8103:
---------------------------------------

We are still seeing random JVM crashes. We've switched from OpenJDK to 
Oracle JDK 1.8.0_202 and see the same thing:

 

{code}

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f6c9cd85100, pid=4550, tid=0x00007f6a64792700
#
# JRE version: Java(TM) SE Runtime Environment (8.0_202-b08) (build 1.8.0_202-b08)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.202-b08 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# V  [libjvm.so+0x2c7100]  Handle::Handle(Thread*, oopDesc*)+0x0
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#
---------------  T H R E A D  ---------------
Current thread (0x00007f6c99279000):  JavaThread "kafka-request-handler-3" daemon [_thread_in_vm, id=4984, stack(0x00007f6a64692000,0x00007f6a64793000)]
siginfo: si_signo: 11 (SIGSEGV), si_code: 2 (SEGV_ACCERR), si_addr: 0x00007f6c9cd85100
Registers:
RAX=0x00007f6c9da7b2ce, RBX=0x00007f6c99279000, RCX=0x00000005502d3250, RDX=0x00000005502d3250
RSP=0x00007f6a647913d8, RBP=0x00007f6a64791410, RSI=0x00007f6c99279000, RDI=0x00007f6a647913e8
R8 =0x00000000aa05a64a, R9 =0x0000000550a3d9b8, R10=0x00007f6c9d488af0, R11=0x0000000000000002
R12=0x00007f6a64791470, R13=0x00000005502d3250, R14=0x00007f6c9da85f8c, R15=0x00007f6c99279000
RIP=0x00007f6c9cd85100, EFLAGS=0x0000000000010246, CSGSFS=0x002b000000000033, ERR=0x0000000000000015
  TRAPNO=0x000000000000000e
Top of Stack: (sp=0x00007f6a647913d8)
0x00007f6a647913d8:   00007f6c9d488b38 0000007000000070
0x00007f6a647913e8:   00007f6c8a4e702c 00007f67d003adaa
0x00007f6a647913f8:   0000000000000000 00000000f2e95d57
0x00007f6a64791408:   00000000a9e2f767 00000000a9e05757
0x00007f6a64791418:   00007f6c88e78c88 00000000a9e05757
0x00007f6a64791428:   00007f6c8a0d149c 00000005aa05a64a
0x00007f6a64791438:   00000005502d3250 00000007974aeab8
0x00007f6a64791448:   000000054f02bab8 000000054f17bb60
0x00007f6a64791458:   00007f6c9dbbacdd 00000007974adfe0
0x00007f6a64791468:   00007f6c9d3cc53f 0000000000000003
0x00007f6a64791478:   000000000d70aa32 000000054f037260
0x00007f6a64791488:   00007f6c8ab3cc88 aa147bbef4590578
0x00007f6a64791498:   00000007a2c82bc0 00000007974ae068
0x00007f6a647914a8:   00000007974adcb0 00000007974adfe0
0x00007f6a647914b8:   0000000550a3ddf0 00000007974adcf8
0x00007f6a647914c8:   00000007974adc48 00000007a2c83420
0x00007f6a647914d8:   00007f6cf2e95b69 00000000a9e269d8
0x00007f6a647914e8:   00007f6c8a2c0934 00000007974adba0
0x00007f6a647914f8:   00007f6c8a000138 00007f6a64791550
0x00007f6a64791508:   00007ffed59c2c60 00007f6a64791580
0x00007f6a64791518:   00000002f4590578 000000079d149f38
0x00007f6a64791528:   0000000550a3ddf0 00007f6a64791570
0x00007f6a64791538:   0000000000000000 00000000f4590684
0x00007f6a64791548:   000000079d1b3490 00007f6a64791590
0x00007f6a64791558:   00007f6c9dbbacdd 00000007f3a366fd
0x00007f6a64791568:   00007f6c9dbbacdd 00000007974adb60
0x00007f6a64791578:   00007f6c9d3cc53f 0000000000011172
0x00007f6a64791588:   000000000d7091ab 00000000f4590567
0x00007f6a64791598:   00007f6c8a887a24 00000007a2c82bc0
0x00007f6a647915a8:   00000007a2c832d0 aa147bbe000015bd
0x00007f6a647915b8:   f2e95b56974adb48 00000005f2e95b67
0x00007f6a647915c8:   00000007974adb38 00000007974adab0
Instructions: (pc=0x00007f6c9cd85100)
0x00007f6c9cd850e0:   e8 0b 4c 77 00 48 83 c4 30 5b 41 5c 5d c3 66 90
0x00007f6c9cd850f0:   f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00
0x00007f6c9cd85100:   48 85 d2 74 63 55 48 89 e5 41 55 41 54 53 49 89
0x00007f6c9cd85110:   fc 48 89 d3 48 83 ec 08 4c 8b ae 38 01 00 00 49
Register to memory mapping:
RAX=0x00007f6c9da7b2ce: <offset 0xfbd2ce> in /usr/local/java/jdk1.8.0_202/jre/lib/amd64/server/libjvm.so at 0x00007f6c9cabe000
RBX=0x00007f6c99279000 is a thread
RCX=0x00000005502d3250 is an oop
java.lang.Object
 - klass: 'java/lang/Object'
RDX=0x00000005502d3250 is an oop
java.lang.Object
 - klass: 'java/lang/Object'
RSP=0x00007f6a647913d8 is pointing into the stack for thread: 0x00007f6c99279000
RBP=0x00007f6a64791410 is pointing into the stack for thread: 0x00007f6c99279000
RSI=0x00007f6c99279000 is a thread
RDI=0x00007f6a647913e8 is pointing into the stack for thread: 0x00007f6c99279000
R8 =0x00000000aa05a64a is an unknown value
R9 =0x0000000550a3d9b8 is an oop
org.apache.kafka.common.utils.KafkaThread
 - klass: 'org/apache/kafka/common/utils/KafkaThread'
R10=0x00007f6c9d488af0: <offset 0x9caaf0> in /usr/local/java/jdk1.8.0_202/jre/lib/amd64/server/libjvm.so at 0x00007f6c9cabe000
R11=0x0000000000000002 is an unknown value
R12=0x00007f6a64791470 is pointing into the stack for thread: 0x00007f6c99279000
R13=0x00000005502d3250 is an oop
java.lang.Object
 - klass: 'java/lang/Object'
R14=0x00007f6c9da85f8c: <offset 0xfc7f8c> in /usr/local/java/jdk1.8.0_202/jre/lib/amd64/server/libjvm.so at 0x00007f6c9cabe000

{code}
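
As a sanity check on the dump above, the following is a minimal sketch that recomputes the offset in the "Problematic frame" line from the faulting pc and the libjvm.so load address, and decodes the ERR register. It assumes ERR holds the standard x86-64 page-fault error code, which TRAPNO=0xe (page fault) suggests; the function names are illustrative and not part of any Kafka or JVM tooling.

```python
# Values copied verbatim from the hs_err dump above.
PC = 0x00007F6C9CD85100           # faulting pc (same value as si_addr and RIP)
LIBJVM_BASE = 0x00007F6C9CABE000  # libjvm.so load address from "Register to memory mapping"
ERR = 0x15                        # ERR register; assumed to be the x86-64 page-fault error code

# Meaning of the low bits of the x86-64 page-fault error code.
PF_BITS = {
    0: "protection violation (page was present)",
    1: "write access",
    2: "user-mode access",
    3: "reserved bit set in page table",
    4: "instruction fetch",
}


def libjvm_offset(pc, base):
    """Return the offset of pc inside the library loaded at base."""
    return pc - base


def decode_page_fault(err):
    """Return descriptions of the page-fault error-code bits set in err."""
    return [desc for bit, desc in sorted(PF_BITS.items()) if err & (1 << bit)]


if __name__ == "__main__":
    print("libjvm.so offset:", hex(libjvm_offset(PC, LIBJVM_BASE)))
    for desc in decode_page_fault(ERR):
        print("ERR bit:", desc)
```

Under that assumption, the offset comes out as 0x2c7100, matching the frame line, and the set bits describe a user-mode instruction fetch from a present page. That would be consistent with si_code SEGV_ACCERR and si_addr being equal to the pc, but it is only a reading of the dump, not a diagnosis.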

> Kafka SIGSEGV on kafka-network-thread
> -------------------------------------
>
>                 Key: KAFKA-8103
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8103
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 1.1.1
>         Environment: OS: Amazon Linux
> Kernel: 4.14.97-74.72.amzn1.x86_64 #1 SMP Tue Feb 5 20:59:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
> Java: openjdk version "1.8.0_191"
> OpenJDK Runtime Environment (build 1.8.0_191-b12)
> OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)
> AWS Instance Type: c5.4xlarge
>            Reporter: Sean Humbarger
>            Priority: Major
>         Attachments: hs_err_pid4345.log
>
>
> We have a 4-node cluster (6 topics, 6 consumer groups) that processes 
> 65,000 messages per second, and we are seeing SIGSEGV crashes at least once 
> a day (see attachment).  Each broker has six disks attached to hold the 
> Kafka logs.  When a crash occurs, we simply restart Kafka and everything 
> seems fine.  We don't see anything out of the ordinary in /var/log/messages 
> or dmesg when the crashes occur.  So far, we have been unable to predict 
> when during the day a crash will occur or which node it will hit. 
>  
> The problematic frame is as follows:
> {code:java}
> # Problematic frame:
> # J 8628 C2 org.apache.kafka.common.metrics.stats.Max.update(Lorg/apache/kafka/common/metrics/stats/SampledStat$Sample;Lorg/apache/kafka/common/metrics/MetricConfig;DJ)V (13 bytes) @ 0x00007ff779f9fca0 [0x00007ff779f9fc80+0x20]
> {code}



