[jira] [Updated] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector
[ https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismael Juma updated KAFKA-7304: --- Fix Version/s: (was: 2.0.2) (was: 2.2.0) (was: 1.1.2) > memory leakage in org.apache.kafka.common.network.Selector > -- > > Key: KAFKA-7304 > URL: https://issues.apache.org/jira/browse/KAFKA-7304 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 1.1.0, 1.1.1 >Reporter: Yu Yang >Priority: Major > Attachments: 7304.v4.txt, 7304.v7.txt, Screen Shot 2018-08-16 at > 11.04.16 PM.png, Screen Shot 2018-08-16 at 11.06.38 PM.png, Screen Shot > 2018-08-16 at 12.41.26 PM.png, Screen Shot 2018-08-16 at 4.26.19 PM.png, > Screen Shot 2018-08-17 at 1.03.35 AM.png, Screen Shot 2018-08-17 at 1.04.32 > AM.png, Screen Shot 2018-08-17 at 1.05.30 AM.png, Screen Shot 2018-08-28 at > 11.09.45 AM.png, Screen Shot 2018-08-29 at 10.49.03 AM.png, Screen Shot > 2018-08-29 at 10.50.47 AM.png, Screen Shot 2018-09-29 at 10.38.12 PM.png, > Screen Shot 2018-09-29 at 10.38.38 PM.png, Screen Shot 2018-09-29 at 8.34.50 > PM.png > > > We are testing secured writing to kafka through ssl. Testing at small scale, > ssl writing to kafka was fine. However, when we enabled ssl writing at a > larger scale (>40k clients write concurrently), the kafka brokers soon hit > OutOfMemory issue with 4G memory setting. We have tried with increasing the > heap size to 10Gb, but encountered the same issue. > We took a few heap dumps , and found that most of the heap memory is > referenced through org.apache.kafka.common.network.Selector objects. There > are two Channel maps field in Selector. It seems that somehow the objects is > not deleted from the map in a timely manner. > One observation is that the memory leak seems relate to kafka partition > leader changes. If there is broker restart etc. in the cluster that caused > partition leadership change, the brokers may hit the OOM issue faster. > {code} > private final Map channels; > private final Map closingChannels; > {code} > Please see the attached images and the following link for sample gc > analysis. > http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 > the command line for running kafka: > {code} > java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m > -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC > -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 > -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 > -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps > -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log > -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M > -Djava.awt.headless=true > -Dlog4j.configuration=file:/etc/kafka/log4j.properties > -Dcom.sun.management.jmxremote > -Dcom.sun.management.jmxremote.authenticate=false > -Dcom.sun.management.jmxremote.ssl=false > -Dcom.sun.management.jmxremote.port= > -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/* > kafka.Kafka /etc/kafka/server.properties > {code} > We use java 1.8.0_102, and has applied a TLS patch on reducing > X509Factory.certCache map size from 750 to 20. > {code} > java -version > java version "1.8.0_102" > Java(TM) SE Runtime Environment (build 1.8.0_102-b14) > Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector
[ https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Yang updated KAFKA-7304: --- Priority: Major (was: Critical) > memory leakage in org.apache.kafka.common.network.Selector > -- > > Key: KAFKA-7304 > URL: https://issues.apache.org/jira/browse/KAFKA-7304 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 1.1.0, 1.1.1 >Reporter: Yu Yang >Priority: Major > Fix For: 1.1.2, 2.2.0, 2.0.2 > > Attachments: 7304.v4.txt, 7304.v7.txt, Screen Shot 2018-08-16 at > 11.04.16 PM.png, Screen Shot 2018-08-16 at 11.06.38 PM.png, Screen Shot > 2018-08-16 at 12.41.26 PM.png, Screen Shot 2018-08-16 at 4.26.19 PM.png, > Screen Shot 2018-08-17 at 1.03.35 AM.png, Screen Shot 2018-08-17 at 1.04.32 > AM.png, Screen Shot 2018-08-17 at 1.05.30 AM.png, Screen Shot 2018-08-28 at > 11.09.45 AM.png, Screen Shot 2018-08-29 at 10.49.03 AM.png, Screen Shot > 2018-08-29 at 10.50.47 AM.png, Screen Shot 2018-09-29 at 10.38.12 PM.png, > Screen Shot 2018-09-29 at 10.38.38 PM.png, Screen Shot 2018-09-29 at 8.34.50 > PM.png > > > We are testing secured writing to kafka through ssl. Testing at small scale, > ssl writing to kafka was fine. However, when we enabled ssl writing at a > larger scale (>40k clients write concurrently), the kafka brokers soon hit > OutOfMemory issue with 4G memory setting. We have tried with increasing the > heap size to 10Gb, but encountered the same issue. > We took a few heap dumps , and found that most of the heap memory is > referenced through org.apache.kafka.common.network.Selector objects. There > are two Channel maps field in Selector. It seems that somehow the objects is > not deleted from the map in a timely manner. > One observation is that the memory leak seems relate to kafka partition > leader changes. If there is broker restart etc. in the cluster that caused > partition leadership change, the brokers may hit the OOM issue faster. > {code} > private final Map channels; > private final Map closingChannels; > {code} > Please see the attached images and the following link for sample gc > analysis. > http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 > the command line for running kafka: > {code} > java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m > -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC > -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 > -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 > -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps > -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log > -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M > -Djava.awt.headless=true > -Dlog4j.configuration=file:/etc/kafka/log4j.properties > -Dcom.sun.management.jmxremote > -Dcom.sun.management.jmxremote.authenticate=false > -Dcom.sun.management.jmxremote.ssl=false > -Dcom.sun.management.jmxremote.port= > -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/* > kafka.Kafka /etc/kafka/server.properties > {code} > We use java 1.8.0_102, and has applied a TLS patch on reducing > X509Factory.certCache map size from 750 to 20. > {code} > java -version > java version "1.8.0_102" > Java(TM) SE Runtime Environment (build 1.8.0_102-b14) > Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector
[ https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin P. McCabe updated KAFKA-7304: --- Fix Version/s: (was: 2.1.1) > memory leakage in org.apache.kafka.common.network.Selector > -- > > Key: KAFKA-7304 > URL: https://issues.apache.org/jira/browse/KAFKA-7304 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 1.1.0, 1.1.1 >Reporter: Yu Yang >Priority: Critical > Fix For: 1.1.2, 2.2.0, 2.0.2 > > Attachments: 7304.v4.txt, 7304.v7.txt, Screen Shot 2018-08-16 at > 11.04.16 PM.png, Screen Shot 2018-08-16 at 11.06.38 PM.png, Screen Shot > 2018-08-16 at 12.41.26 PM.png, Screen Shot 2018-08-16 at 4.26.19 PM.png, > Screen Shot 2018-08-17 at 1.03.35 AM.png, Screen Shot 2018-08-17 at 1.04.32 > AM.png, Screen Shot 2018-08-17 at 1.05.30 AM.png, Screen Shot 2018-08-28 at > 11.09.45 AM.png, Screen Shot 2018-08-29 at 10.49.03 AM.png, Screen Shot > 2018-08-29 at 10.50.47 AM.png, Screen Shot 2018-09-29 at 10.38.12 PM.png, > Screen Shot 2018-09-29 at 10.38.38 PM.png, Screen Shot 2018-09-29 at 8.34.50 > PM.png > > > We are testing secured writing to kafka through ssl. Testing at small scale, > ssl writing to kafka was fine. However, when we enabled ssl writing at a > larger scale (>40k clients write concurrently), the kafka brokers soon hit > OutOfMemory issue with 4G memory setting. We have tried with increasing the > heap size to 10Gb, but encountered the same issue. > We took a few heap dumps , and found that most of the heap memory is > referenced through org.apache.kafka.common.network.Selector objects. There > are two Channel maps field in Selector. It seems that somehow the objects is > not deleted from the map in a timely manner. > One observation is that the memory leak seems relate to kafka partition > leader changes. If there is broker restart etc. in the cluster that caused > partition leadership change, the brokers may hit the OOM issue faster. > {code} > private final Map channels; > private final Map closingChannels; > {code} > Please see the attached images and the following link for sample gc > analysis. > http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 > the command line for running kafka: > {code} > java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m > -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC > -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 > -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 > -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps > -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log > -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M > -Djava.awt.headless=true > -Dlog4j.configuration=file:/etc/kafka/log4j.properties > -Dcom.sun.management.jmxremote > -Dcom.sun.management.jmxremote.authenticate=false > -Dcom.sun.management.jmxremote.ssl=false > -Dcom.sun.management.jmxremote.port= > -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/* > kafka.Kafka /etc/kafka/server.properties > {code} > We use java 1.8.0_102, and has applied a TLS patch on reducing > X509Factory.certCache map size from 750 to 20. > {code} > java -version > java version "1.8.0_102" > Java(TM) SE Runtime Environment (build 1.8.0_102-b14) > Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector
[ https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismael Juma updated KAFKA-7304: --- Fix Version/s: (was: 2.0.1) 2.0.2 > memory leakage in org.apache.kafka.common.network.Selector > -- > > Key: KAFKA-7304 > URL: https://issues.apache.org/jira/browse/KAFKA-7304 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 1.1.0, 1.1.1 >Reporter: Yu Yang >Priority: Critical > Fix For: 1.1.2, 2.2.0, 2.1.1, 2.0.2 > > Attachments: 7304.v4.txt, 7304.v7.txt, Screen Shot 2018-08-16 at > 11.04.16 PM.png, Screen Shot 2018-08-16 at 11.06.38 PM.png, Screen Shot > 2018-08-16 at 12.41.26 PM.png, Screen Shot 2018-08-16 at 4.26.19 PM.png, > Screen Shot 2018-08-17 at 1.03.35 AM.png, Screen Shot 2018-08-17 at 1.04.32 > AM.png, Screen Shot 2018-08-17 at 1.05.30 AM.png, Screen Shot 2018-08-28 at > 11.09.45 AM.png, Screen Shot 2018-08-29 at 10.49.03 AM.png, Screen Shot > 2018-08-29 at 10.50.47 AM.png, Screen Shot 2018-09-29 at 10.38.12 PM.png, > Screen Shot 2018-09-29 at 10.38.38 PM.png, Screen Shot 2018-09-29 at 8.34.50 > PM.png > > > We are testing secured writing to kafka through ssl. Testing at small scale, > ssl writing to kafka was fine. However, when we enabled ssl writing at a > larger scale (>40k clients write concurrently), the kafka brokers soon hit > OutOfMemory issue with 4G memory setting. We have tried with increasing the > heap size to 10Gb, but encountered the same issue. > We took a few heap dumps , and found that most of the heap memory is > referenced through org.apache.kafka.common.network.Selector objects. There > are two Channel maps field in Selector. It seems that somehow the objects is > not deleted from the map in a timely manner. > One observation is that the memory leak seems relate to kafka partition > leader changes. If there is broker restart etc. in the cluster that caused > partition leadership change, the brokers may hit the OOM issue faster. > {code} > private final Map channels; > private final Map closingChannels; > {code} > Please see the attached images and the following link for sample gc > analysis. > http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 > the command line for running kafka: > {code} > java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m > -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC > -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 > -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 > -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps > -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log > -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M > -Djava.awt.headless=true > -Dlog4j.configuration=file:/etc/kafka/log4j.properties > -Dcom.sun.management.jmxremote > -Dcom.sun.management.jmxremote.authenticate=false > -Dcom.sun.management.jmxremote.ssl=false > -Dcom.sun.management.jmxremote.port= > -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/* > kafka.Kafka /etc/kafka/server.properties > {code} > We use java 1.8.0_102, and has applied a TLS patch on reducing > X509Factory.certCache map size from 750 to 20. > {code} > java -version > java version "1.8.0_102" > Java(TM) SE Runtime Environment (build 1.8.0_102-b14) > Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector
[ https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dong Lin updated KAFKA-7304: Fix Version/s: 2.1.1 > memory leakage in org.apache.kafka.common.network.Selector > -- > > Key: KAFKA-7304 > URL: https://issues.apache.org/jira/browse/KAFKA-7304 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 1.1.0, 1.1.1 >Reporter: Yu Yang >Priority: Critical > Fix For: 1.1.2, 2.0.1, 2.2.0, 2.1.1 > > Attachments: 7304.v4.txt, 7304.v7.txt, Screen Shot 2018-08-16 at > 11.04.16 PM.png, Screen Shot 2018-08-16 at 11.06.38 PM.png, Screen Shot > 2018-08-16 at 12.41.26 PM.png, Screen Shot 2018-08-16 at 4.26.19 PM.png, > Screen Shot 2018-08-17 at 1.03.35 AM.png, Screen Shot 2018-08-17 at 1.04.32 > AM.png, Screen Shot 2018-08-17 at 1.05.30 AM.png, Screen Shot 2018-08-28 at > 11.09.45 AM.png, Screen Shot 2018-08-29 at 10.49.03 AM.png, Screen Shot > 2018-08-29 at 10.50.47 AM.png, Screen Shot 2018-09-29 at 10.38.12 PM.png, > Screen Shot 2018-09-29 at 10.38.38 PM.png, Screen Shot 2018-09-29 at 8.34.50 > PM.png > > > We are testing secured writing to kafka through ssl. Testing at small scale, > ssl writing to kafka was fine. However, when we enabled ssl writing at a > larger scale (>40k clients write concurrently), the kafka brokers soon hit > OutOfMemory issue with 4G memory setting. We have tried with increasing the > heap size to 10Gb, but encountered the same issue. > We took a few heap dumps , and found that most of the heap memory is > referenced through org.apache.kafka.common.network.Selector objects. There > are two Channel maps field in Selector. It seems that somehow the objects is > not deleted from the map in a timely manner. > One observation is that the memory leak seems relate to kafka partition > leader changes. If there is broker restart etc. in the cluster that caused > partition leadership change, the brokers may hit the OOM issue faster. > {code} > private final Map channels; > private final Map closingChannels; > {code} > Please see the attached images and the following link for sample gc > analysis. > http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 > the command line for running kafka: > {code} > java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m > -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC > -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 > -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 > -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps > -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log > -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M > -Djava.awt.headless=true > -Dlog4j.configuration=file:/etc/kafka/log4j.properties > -Dcom.sun.management.jmxremote > -Dcom.sun.management.jmxremote.authenticate=false > -Dcom.sun.management.jmxremote.ssl=false > -Dcom.sun.management.jmxremote.port= > -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/* > kafka.Kafka /etc/kafka/server.properties > {code} > We use java 1.8.0_102, and has applied a TLS patch on reducing > X509Factory.certCache map size from 750 to 20. > {code} > java -version > java version "1.8.0_102" > Java(TM) SE Runtime Environment (build 1.8.0_102-b14) > Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector
[ https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dong Lin updated KAFKA-7304: Fix Version/s: (was: 2.1.0) 2.2.0 > memory leakage in org.apache.kafka.common.network.Selector > -- > > Key: KAFKA-7304 > URL: https://issues.apache.org/jira/browse/KAFKA-7304 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 1.1.0, 1.1.1 >Reporter: Yu Yang >Priority: Critical > Fix For: 1.1.2, 2.0.1, 2.2.0 > > Attachments: 7304.v4.txt, 7304.v7.txt, Screen Shot 2018-08-16 at > 11.04.16 PM.png, Screen Shot 2018-08-16 at 11.06.38 PM.png, Screen Shot > 2018-08-16 at 12.41.26 PM.png, Screen Shot 2018-08-16 at 4.26.19 PM.png, > Screen Shot 2018-08-17 at 1.03.35 AM.png, Screen Shot 2018-08-17 at 1.04.32 > AM.png, Screen Shot 2018-08-17 at 1.05.30 AM.png, Screen Shot 2018-08-28 at > 11.09.45 AM.png, Screen Shot 2018-08-29 at 10.49.03 AM.png, Screen Shot > 2018-08-29 at 10.50.47 AM.png, Screen Shot 2018-09-29 at 10.38.12 PM.png, > Screen Shot 2018-09-29 at 10.38.38 PM.png, Screen Shot 2018-09-29 at 8.34.50 > PM.png > > > We are testing secured writing to kafka through ssl. Testing at small scale, > ssl writing to kafka was fine. However, when we enabled ssl writing at a > larger scale (>40k clients write concurrently), the kafka brokers soon hit > OutOfMemory issue with 4G memory setting. We have tried with increasing the > heap size to 10Gb, but encountered the same issue. > We took a few heap dumps , and found that most of the heap memory is > referenced through org.apache.kafka.common.network.Selector objects. There > are two Channel maps field in Selector. It seems that somehow the objects is > not deleted from the map in a timely manner. > One observation is that the memory leak seems relate to kafka partition > leader changes. If there is broker restart etc. in the cluster that caused > partition leadership change, the brokers may hit the OOM issue faster. > {code} > private final Map channels; > private final Map closingChannels; > {code} > Please see the attached images and the following link for sample gc > analysis. > http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 > the command line for running kafka: > {code} > java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m > -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC > -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 > -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 > -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps > -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log > -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M > -Djava.awt.headless=true > -Dlog4j.configuration=file:/etc/kafka/log4j.properties > -Dcom.sun.management.jmxremote > -Dcom.sun.management.jmxremote.authenticate=false > -Dcom.sun.management.jmxremote.ssl=false > -Dcom.sun.management.jmxremote.port= > -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/* > kafka.Kafka /etc/kafka/server.properties > {code} > We use java 1.8.0_102, and has applied a TLS patch on reducing > X509Factory.certCache map size from 750 to 20. > {code} > java -version > java version "1.8.0_102" > Java(TM) SE Runtime Environment (build 1.8.0_102-b14) > Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector
[ https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Yang updated KAFKA-7304: --- Attachment: Screen Shot 2018-09-29 at 10.38.38 PM.png > memory leakage in org.apache.kafka.common.network.Selector > -- > > Key: KAFKA-7304 > URL: https://issues.apache.org/jira/browse/KAFKA-7304 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 1.1.0, 1.1.1 >Reporter: Yu Yang >Priority: Critical > Fix For: 1.1.2, 2.0.1, 2.1.0 > > Attachments: 7304.v4.txt, 7304.v7.txt, Screen Shot 2018-08-16 at > 11.04.16 PM.png, Screen Shot 2018-08-16 at 11.06.38 PM.png, Screen Shot > 2018-08-16 at 12.41.26 PM.png, Screen Shot 2018-08-16 at 4.26.19 PM.png, > Screen Shot 2018-08-17 at 1.03.35 AM.png, Screen Shot 2018-08-17 at 1.04.32 > AM.png, Screen Shot 2018-08-17 at 1.05.30 AM.png, Screen Shot 2018-08-28 at > 11.09.45 AM.png, Screen Shot 2018-08-29 at 10.49.03 AM.png, Screen Shot > 2018-08-29 at 10.50.47 AM.png, Screen Shot 2018-09-29 at 10.38.12 PM.png, > Screen Shot 2018-09-29 at 10.38.38 PM.png, Screen Shot 2018-09-29 at 8.34.50 > PM.png > > > We are testing secured writing to kafka through ssl. Testing at small scale, > ssl writing to kafka was fine. However, when we enabled ssl writing at a > larger scale (>40k clients write concurrently), the kafka brokers soon hit > OutOfMemory issue with 4G memory setting. We have tried with increasing the > heap size to 10Gb, but encountered the same issue. > We took a few heap dumps , and found that most of the heap memory is > referenced through org.apache.kafka.common.network.Selector objects. There > are two Channel maps field in Selector. It seems that somehow the objects is > not deleted from the map in a timely manner. > One observation is that the memory leak seems relate to kafka partition > leader changes. If there is broker restart etc. in the cluster that caused > partition leadership change, the brokers may hit the OOM issue faster. > {code} > private final Map channels; > private final Map closingChannels; > {code} > Please see the attached images and the following link for sample gc > analysis. > http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 > the command line for running kafka: > {code} > java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m > -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC > -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 > -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 > -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps > -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log > -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M > -Djava.awt.headless=true > -Dlog4j.configuration=file:/etc/kafka/log4j.properties > -Dcom.sun.management.jmxremote > -Dcom.sun.management.jmxremote.authenticate=false > -Dcom.sun.management.jmxremote.ssl=false > -Dcom.sun.management.jmxremote.port= > -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/* > kafka.Kafka /etc/kafka/server.properties > {code} > We use java 1.8.0_102, and has applied a TLS patch on reducing > X509Factory.certCache map size from 750 to 20. > {code} > java -version > java version "1.8.0_102" > Java(TM) SE Runtime Environment (build 1.8.0_102-b14) > Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector
[ https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Yang updated KAFKA-7304: --- Attachment: Screen Shot 2018-09-29 at 10.38.12 PM.png > memory leakage in org.apache.kafka.common.network.Selector > -- > > Key: KAFKA-7304 > URL: https://issues.apache.org/jira/browse/KAFKA-7304 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 1.1.0, 1.1.1 >Reporter: Yu Yang >Priority: Critical > Fix For: 1.1.2, 2.0.1, 2.1.0 > > Attachments: 7304.v4.txt, 7304.v7.txt, Screen Shot 2018-08-16 at > 11.04.16 PM.png, Screen Shot 2018-08-16 at 11.06.38 PM.png, Screen Shot > 2018-08-16 at 12.41.26 PM.png, Screen Shot 2018-08-16 at 4.26.19 PM.png, > Screen Shot 2018-08-17 at 1.03.35 AM.png, Screen Shot 2018-08-17 at 1.04.32 > AM.png, Screen Shot 2018-08-17 at 1.05.30 AM.png, Screen Shot 2018-08-28 at > 11.09.45 AM.png, Screen Shot 2018-08-29 at 10.49.03 AM.png, Screen Shot > 2018-08-29 at 10.50.47 AM.png, Screen Shot 2018-09-29 at 10.38.12 PM.png, > Screen Shot 2018-09-29 at 8.34.50 PM.png > > > We are testing secured writing to kafka through ssl. Testing at small scale, > ssl writing to kafka was fine. However, when we enabled ssl writing at a > larger scale (>40k clients write concurrently), the kafka brokers soon hit > OutOfMemory issue with 4G memory setting. We have tried with increasing the > heap size to 10Gb, but encountered the same issue. > We took a few heap dumps , and found that most of the heap memory is > referenced through org.apache.kafka.common.network.Selector objects. There > are two Channel maps field in Selector. It seems that somehow the objects is > not deleted from the map in a timely manner. > One observation is that the memory leak seems relate to kafka partition > leader changes. If there is broker restart etc. in the cluster that caused > partition leadership change, the brokers may hit the OOM issue faster. > {code} > private final Map channels; > private final Map closingChannels; > {code} > Please see the attached images and the following link for sample gc > analysis. > http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 > the command line for running kafka: > {code} > java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m > -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC > -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 > -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 > -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps > -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log > -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M > -Djava.awt.headless=true > -Dlog4j.configuration=file:/etc/kafka/log4j.properties > -Dcom.sun.management.jmxremote > -Dcom.sun.management.jmxremote.authenticate=false > -Dcom.sun.management.jmxremote.ssl=false > -Dcom.sun.management.jmxremote.port= > -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/* > kafka.Kafka /etc/kafka/server.properties > {code} > We use java 1.8.0_102, and has applied a TLS patch on reducing > X509Factory.certCache map size from 750 to 20. > {code} > java -version > java version "1.8.0_102" > Java(TM) SE Runtime Environment (build 1.8.0_102-b14) > Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector
[ https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Yang updated KAFKA-7304: --- Attachment: Screen Shot 2018-09-29 at 8.34.50 PM.png > memory leakage in org.apache.kafka.common.network.Selector > -- > > Key: KAFKA-7304 > URL: https://issues.apache.org/jira/browse/KAFKA-7304 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 1.1.0, 1.1.1 >Reporter: Yu Yang >Priority: Critical > Fix For: 1.1.2, 2.0.1, 2.1.0 > > Attachments: 7304.v4.txt, 7304.v7.txt, Screen Shot 2018-08-16 at > 11.04.16 PM.png, Screen Shot 2018-08-16 at 11.06.38 PM.png, Screen Shot > 2018-08-16 at 12.41.26 PM.png, Screen Shot 2018-08-16 at 4.26.19 PM.png, > Screen Shot 2018-08-17 at 1.03.35 AM.png, Screen Shot 2018-08-17 at 1.04.32 > AM.png, Screen Shot 2018-08-17 at 1.05.30 AM.png, Screen Shot 2018-08-28 at > 11.09.45 AM.png, Screen Shot 2018-08-29 at 10.49.03 AM.png, Screen Shot > 2018-08-29 at 10.50.47 AM.png, Screen Shot 2018-09-29 at 8.34.50 PM.png > > > We are testing secured writing to kafka through ssl. Testing at small scale, > ssl writing to kafka was fine. However, when we enabled ssl writing at a > larger scale (>40k clients write concurrently), the kafka brokers soon hit > OutOfMemory issue with 4G memory setting. We have tried with increasing the > heap size to 10Gb, but encountered the same issue. > We took a few heap dumps , and found that most of the heap memory is > referenced through org.apache.kafka.common.network.Selector objects. There > are two Channel maps field in Selector. It seems that somehow the objects is > not deleted from the map in a timely manner. > One observation is that the memory leak seems relate to kafka partition > leader changes. If there is broker restart etc. in the cluster that caused > partition leadership change, the brokers may hit the OOM issue faster. > {code} > private final Map channels; > private final Map closingChannels; > {code} > Please see the attached images and the following link for sample gc > analysis. > http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 > the command line for running kafka: > {code} > java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m > -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC > -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 > -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 > -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps > -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log > -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M > -Djava.awt.headless=true > -Dlog4j.configuration=file:/etc/kafka/log4j.properties > -Dcom.sun.management.jmxremote > -Dcom.sun.management.jmxremote.authenticate=false > -Dcom.sun.management.jmxremote.ssl=false > -Dcom.sun.management.jmxremote.port= > -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/* > kafka.Kafka /etc/kafka/server.properties > {code} > We use java 1.8.0_102, and has applied a TLS patch on reducing > X509Factory.certCache map size from 750 to 20. > {code} > java -version > java version "1.8.0_102" > Java(TM) SE Runtime Environment (build 1.8.0_102-b14) > Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector
[ https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Yang updated KAFKA-7304: --- Attachment: Screen Shot 2018-08-29 at 10.49.03 AM.png > memory leakage in org.apache.kafka.common.network.Selector > -- > > Key: KAFKA-7304 > URL: https://issues.apache.org/jira/browse/KAFKA-7304 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 1.1.0, 1.1.1 >Reporter: Yu Yang >Priority: Critical > Fix For: 1.1.2, 2.0.1, 2.1.0 > > Attachments: 7304.v4.txt, 7304.v7.txt, Screen Shot 2018-08-16 at > 11.04.16 PM.png, Screen Shot 2018-08-16 at 11.06.38 PM.png, Screen Shot > 2018-08-16 at 12.41.26 PM.png, Screen Shot 2018-08-16 at 4.26.19 PM.png, > Screen Shot 2018-08-17 at 1.03.35 AM.png, Screen Shot 2018-08-17 at 1.04.32 > AM.png, Screen Shot 2018-08-17 at 1.05.30 AM.png, Screen Shot 2018-08-28 at > 11.09.45 AM.png, Screen Shot 2018-08-29 at 10.49.03 AM.png, Screen Shot > 2018-08-29 at 10.50.47 AM.png > > > We are testing secured writing to kafka through ssl. Testing at small scale, > ssl writing to kafka was fine. However, when we enabled ssl writing at a > larger scale (>40k clients write concurrently), the kafka brokers soon hit > OutOfMemory issue with 4G memory setting. We have tried with increasing the > heap size to 10Gb, but encountered the same issue. > We took a few heap dumps , and found that most of the heap memory is > referenced through org.apache.kafka.common.network.Selector objects. There > are two Channel maps field in Selector. It seems that somehow the objects is > not deleted from the map in a timely manner. > One observation is that the memory leak seems relate to kafka partition > leader changes. If there is broker restart etc. in the cluster that caused > partition leadership change, the brokers may hit the OOM issue faster. > {code} > private final Map channels; > private final Map closingChannels; > {code} > Please see the attached images and the following link for sample gc > analysis. > http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 > the command line for running kafka: > {code} > java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m > -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC > -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 > -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 > -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps > -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log > -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M > -Djava.awt.headless=true > -Dlog4j.configuration=file:/etc/kafka/log4j.properties > -Dcom.sun.management.jmxremote > -Dcom.sun.management.jmxremote.authenticate=false > -Dcom.sun.management.jmxremote.ssl=false > -Dcom.sun.management.jmxremote.port= > -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/* > kafka.Kafka /etc/kafka/server.properties > {code} > We use java 1.8.0_102, and has applied a TLS patch on reducing > X509Factory.certCache map size from 750 to 20. > {code} > java -version > java version "1.8.0_102" > Java(TM) SE Runtime Environment (build 1.8.0_102-b14) > Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector
[ https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Yang updated KAFKA-7304: --- Attachment: Screen Shot 2018-08-29 at 10.50.47 AM.png > memory leakage in org.apache.kafka.common.network.Selector > -- > > Key: KAFKA-7304 > URL: https://issues.apache.org/jira/browse/KAFKA-7304 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 1.1.0, 1.1.1 >Reporter: Yu Yang >Priority: Critical > Fix For: 1.1.2, 2.0.1, 2.1.0 > > Attachments: 7304.v4.txt, 7304.v7.txt, Screen Shot 2018-08-16 at > 11.04.16 PM.png, Screen Shot 2018-08-16 at 11.06.38 PM.png, Screen Shot > 2018-08-16 at 12.41.26 PM.png, Screen Shot 2018-08-16 at 4.26.19 PM.png, > Screen Shot 2018-08-17 at 1.03.35 AM.png, Screen Shot 2018-08-17 at 1.04.32 > AM.png, Screen Shot 2018-08-17 at 1.05.30 AM.png, Screen Shot 2018-08-28 at > 11.09.45 AM.png, Screen Shot 2018-08-29 at 10.49.03 AM.png, Screen Shot > 2018-08-29 at 10.50.47 AM.png > > > We are testing secured writing to kafka through ssl. Testing at small scale, > ssl writing to kafka was fine. However, when we enabled ssl writing at a > larger scale (>40k clients write concurrently), the kafka brokers soon hit > OutOfMemory issue with 4G memory setting. We have tried with increasing the > heap size to 10Gb, but encountered the same issue. > We took a few heap dumps , and found that most of the heap memory is > referenced through org.apache.kafka.common.network.Selector objects. There > are two Channel maps field in Selector. It seems that somehow the objects is > not deleted from the map in a timely manner. > One observation is that the memory leak seems relate to kafka partition > leader changes. If there is broker restart etc. in the cluster that caused > partition leadership change, the brokers may hit the OOM issue faster. > {code} > private final Map channels; > private final Map closingChannels; > {code} > Please see the attached images and the following link for sample gc > analysis. > http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 > the command line for running kafka: > {code} > java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m > -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC > -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 > -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 > -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps > -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log > -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M > -Djava.awt.headless=true > -Dlog4j.configuration=file:/etc/kafka/log4j.properties > -Dcom.sun.management.jmxremote > -Dcom.sun.management.jmxremote.authenticate=false > -Dcom.sun.management.jmxremote.ssl=false > -Dcom.sun.management.jmxremote.port= > -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/* > kafka.Kafka /etc/kafka/server.properties > {code} > We use java 1.8.0_102, and has applied a TLS patch on reducing > X509Factory.certCache map size from 750 to 20. > {code} > java -version > java version "1.8.0_102" > Java(TM) SE Runtime Environment (build 1.8.0_102-b14) > Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector
[ https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Yang updated KAFKA-7304: --- Attachment: Screen Shot 2018-08-28 at 11.09.45 AM.png > memory leakage in org.apache.kafka.common.network.Selector > -- > > Key: KAFKA-7304 > URL: https://issues.apache.org/jira/browse/KAFKA-7304 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 1.1.0, 1.1.1 >Reporter: Yu Yang >Priority: Critical > Fix For: 1.1.2, 2.0.1, 2.1.0 > > Attachments: 7304.v4.txt, 7304.v7.txt, Screen Shot 2018-08-16 at > 11.04.16 PM.png, Screen Shot 2018-08-16 at 11.06.38 PM.png, Screen Shot > 2018-08-16 at 12.41.26 PM.png, Screen Shot 2018-08-16 at 4.26.19 PM.png, > Screen Shot 2018-08-17 at 1.03.35 AM.png, Screen Shot 2018-08-17 at 1.04.32 > AM.png, Screen Shot 2018-08-17 at 1.05.30 AM.png, Screen Shot 2018-08-28 at > 11.09.45 AM.png > > > We are testing secured writing to kafka through ssl. Testing at small scale, > ssl writing to kafka was fine. However, when we enabled ssl writing at a > larger scale (>40k clients write concurrently), the kafka brokers soon hit > OutOfMemory issue with 4G memory setting. We have tried with increasing the > heap size to 10Gb, but encountered the same issue. > We took a few heap dumps , and found that most of the heap memory is > referenced through org.apache.kafka.common.network.Selector objects. There > are two Channel maps field in Selector. It seems that somehow the objects is > not deleted from the map in a timely manner. > One observation is that the memory leak seems relate to kafka partition > leader changes. If there is broker restart etc. in the cluster that caused > partition leadership change, the brokers may hit the OOM issue faster. > {code} > private final Map channels; > private final Map closingChannels; > {code} > Please see the attached images and the following link for sample gc > analysis. > http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 > the command line for running kafka: > {code} > java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m > -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC > -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 > -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 > -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps > -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log > -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M > -Djava.awt.headless=true > -Dlog4j.configuration=file:/etc/kafka/log4j.properties > -Dcom.sun.management.jmxremote > -Dcom.sun.management.jmxremote.authenticate=false > -Dcom.sun.management.jmxremote.ssl=false > -Dcom.sun.management.jmxremote.port= > -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/* > kafka.Kafka /etc/kafka/server.properties > {code} > We use java 1.8.0_102, and has applied a TLS patch on reducing > X509Factory.certCache map size from 750 to 20. > {code} > java -version > java version "1.8.0_102" > Java(TM) SE Runtime Environment (build 1.8.0_102-b14) > Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector
[ https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated KAFKA-7304: -- Attachment: 7304.v7.txt > memory leakage in org.apache.kafka.common.network.Selector > -- > > Key: KAFKA-7304 > URL: https://issues.apache.org/jira/browse/KAFKA-7304 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 1.1.0, 1.1.1 >Reporter: Yu Yang >Priority: Critical > Fix For: 1.1.2, 2.0.1, 2.1.0 > > Attachments: 7304.v4.txt, 7304.v7.txt, Screen Shot 2018-08-16 at > 11.04.16 PM.png, Screen Shot 2018-08-16 at 11.06.38 PM.png, Screen Shot > 2018-08-16 at 12.41.26 PM.png, Screen Shot 2018-08-16 at 4.26.19 PM.png, > Screen Shot 2018-08-17 at 1.03.35 AM.png, Screen Shot 2018-08-17 at 1.04.32 > AM.png, Screen Shot 2018-08-17 at 1.05.30 AM.png > > > We are testing secured writing to kafka through ssl. Testing at small scale, > ssl writing to kafka was fine. However, when we enabled ssl writing at a > larger scale (>40k clients write concurrently), the kafka brokers soon hit > OutOfMemory issue with 4G memory setting. We have tried with increasing the > heap size to 10Gb, but encountered the same issue. > We took a few heap dumps , and found that most of the heap memory is > referenced through org.apache.kafka.common.network.Selector objects. There > are two Channel maps field in Selector. It seems that somehow the objects is > not deleted from the map in a timely manner. > One observation is that the memory leak seems relate to kafka partition > leader changes. If there is broker restart etc. in the cluster that caused > partition leadership change, the brokers may hit the OOM issue faster. > {code} > private final Map channels; > private final Map closingChannels; > {code} > Please see the attached images and the following link for sample gc > analysis. > http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 > the command line for running kafka: > {code} > java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m > -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC > -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 > -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 > -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps > -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log > -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M > -Djava.awt.headless=true > -Dlog4j.configuration=file:/etc/kafka/log4j.properties > -Dcom.sun.management.jmxremote > -Dcom.sun.management.jmxremote.authenticate=false > -Dcom.sun.management.jmxremote.ssl=false > -Dcom.sun.management.jmxremote.port= > -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/* > kafka.Kafka /etc/kafka/server.properties > {code} > We use java 1.8.0_102, and has applied a TLS patch on reducing > X509Factory.certCache map size from 750 to 20. > {code} > java -version > java version "1.8.0_102" > Java(TM) SE Runtime Environment (build 1.8.0_102-b14) > Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector
[ https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Yang updated KAFKA-7304: --- Affects Version/s: 1.1.1 > memory leakage in org.apache.kafka.common.network.Selector > -- > > Key: KAFKA-7304 > URL: https://issues.apache.org/jira/browse/KAFKA-7304 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 1.1.0, 1.1.1 >Reporter: Yu Yang >Priority: Critical > Fix For: 1.1.2, 2.0.1, 2.1.0 > > Attachments: 7304.v4.txt, Screen Shot 2018-08-16 at 11.04.16 PM.png, > Screen Shot 2018-08-16 at 11.06.38 PM.png, Screen Shot 2018-08-16 at 12.41.26 > PM.png, Screen Shot 2018-08-16 at 4.26.19 PM.png, Screen Shot 2018-08-17 at > 1.03.35 AM.png, Screen Shot 2018-08-17 at 1.04.32 AM.png, Screen Shot > 2018-08-17 at 1.05.30 AM.png > > > We are testing secured writing to kafka through ssl. Testing at small scale, > ssl writing to kafka was fine. However, when we enabled ssl writing at a > larger scale (>40k clients write concurrently), the kafka brokers soon hit > OutOfMemory issue with 4G memory setting. We have tried with increasing the > heap size to 10Gb, but encountered the same issue. > We took a few heap dumps , and found that most of the heap memory is > referenced through org.apache.kafka.common.network.Selector objects. There > are two Channel maps field in Selector. It seems that somehow the objects is > not deleted from the map in a timely manner. > One observation is that the memory leak seems relate to kafka partition > leader changes. If there is broker restart etc. in the cluster that caused > partition leadership change, the brokers may hit the OOM issue faster. > {code} > private final Map channels; > private final Map closingChannels; > {code} > Please see the attached images and the following link for sample gc > analysis. > http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 > the command line for running kafka: > {code} > java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m > -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC > -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 > -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 > -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps > -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log > -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M > -Djava.awt.headless=true > -Dlog4j.configuration=file:/etc/kafka/log4j.properties > -Dcom.sun.management.jmxremote > -Dcom.sun.management.jmxremote.authenticate=false > -Dcom.sun.management.jmxremote.ssl=false > -Dcom.sun.management.jmxremote.port= > -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/* > kafka.Kafka /etc/kafka/server.properties > {code} > We use java 1.8.0_102, and has applied a TLS patch on reducing > X509Factory.certCache map size from 750 to 20. > {code} > java -version > java version "1.8.0_102" > Java(TM) SE Runtime Environment (build 1.8.0_102-b14) > Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector
[ https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated KAFKA-7304: -- Attachment: 7304.v4.txt > memory leakage in org.apache.kafka.common.network.Selector > -- > > Key: KAFKA-7304 > URL: https://issues.apache.org/jira/browse/KAFKA-7304 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 1.1.0 >Reporter: Yu Yang >Priority: Critical > Fix For: 1.1.2, 2.0.1, 2.1.0 > > Attachments: 7304.v4.txt, Screen Shot 2018-08-16 at 11.04.16 PM.png, > Screen Shot 2018-08-16 at 11.06.38 PM.png, Screen Shot 2018-08-16 at 12.41.26 > PM.png, Screen Shot 2018-08-16 at 4.26.19 PM.png, Screen Shot 2018-08-17 at > 1.03.35 AM.png, Screen Shot 2018-08-17 at 1.04.32 AM.png, Screen Shot > 2018-08-17 at 1.05.30 AM.png > > > We are testing secured writing to kafka through ssl. Testing at small scale, > ssl writing to kafka was fine. However, when we enabled ssl writing at a > larger scale (>40k clients write concurrently), the kafka brokers soon hit > OutOfMemory issue with 4G memory setting. We have tried with increasing the > heap size to 10Gb, but encountered the same issue. > We took a few heap dumps , and found that most of the heap memory is > referenced through org.apache.kafka.common.network.Selector objects. There > are two Channel maps field in Selector. It seems that somehow the objects is > not deleted from the map in a timely manner. > One observation is that the memory leak seems relate to kafka partition > leader changes. If there is broker restart etc. in the cluster that caused > partition leadership change, the brokers may hit the OOM issue faster. > {code} > private final Map channels; > private final Map closingChannels; > {code} > Please see the attached images and the following link for sample gc > analysis. > http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 > the command line for running kafka: > {code} > java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m > -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC > -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 > -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 > -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps > -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log > -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M > -Djava.awt.headless=true > -Dlog4j.configuration=file:/etc/kafka/log4j.properties > -Dcom.sun.management.jmxremote > -Dcom.sun.management.jmxremote.authenticate=false > -Dcom.sun.management.jmxremote.ssl=false > -Dcom.sun.management.jmxremote.port= > -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/* > kafka.Kafka /etc/kafka/server.properties > {code} > We use java 1.8.0_102, and has applied a TLS patch on reducing > X509Factory.certCache map size from 750 to 20. > {code} > java -version > java version "1.8.0_102" > Java(TM) SE Runtime Environment (build 1.8.0_102-b14) > Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector
[ https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismael Juma updated KAFKA-7304: --- Priority: Critical (was: Major) > memory leakage in org.apache.kafka.common.network.Selector > -- > > Key: KAFKA-7304 > URL: https://issues.apache.org/jira/browse/KAFKA-7304 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 1.1.0 >Reporter: Yu Yang >Priority: Critical > Fix For: 1.1.2, 2.0.1, 2.1.0 > > Attachments: Screen Shot 2018-08-16 at 11.04.16 PM.png, Screen Shot > 2018-08-16 at 11.06.38 PM.png, Screen Shot 2018-08-16 at 12.41.26 PM.png, > Screen Shot 2018-08-16 at 4.26.19 PM.png, Screen Shot 2018-08-17 at 1.03.35 > AM.png, Screen Shot 2018-08-17 at 1.04.32 AM.png, Screen Shot 2018-08-17 at > 1.05.30 AM.png > > > We are testing secured writing to kafka through ssl. Testing at small scale, > ssl writing to kafka was fine. However, when we enabled ssl writing at a > larger scale (>40k clients write concurrently), the kafka brokers soon hit > OutOfMemory issue with 4G memory setting. We have tried with increasing the > heap size to 10Gb, but encountered the same issue. > We took a few heap dumps , and found that most of the heap memory is > referenced through org.apache.kafka.common.network.Selector objects. There > are two Channel maps field in Selector. It seems that somehow the objects is > not deleted from the map in a timely manner. > One observation is that the memory leak seems relate to kafka partition > leader changes. If there is broker restart etc. in the cluster that caused > partition leadership change, the brokers may hit the OOM issue faster. > {code} > private final Map channels; > private final Map closingChannels; > {code} > Please see the attached images and the following link for sample gc > analysis. > http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 > the command line for running kafka: > {code} > java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m > -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC > -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 > -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 > -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps > -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log > -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M > -Djava.awt.headless=true > -Dlog4j.configuration=file:/etc/kafka/log4j.properties > -Dcom.sun.management.jmxremote > -Dcom.sun.management.jmxremote.authenticate=false > -Dcom.sun.management.jmxremote.ssl=false > -Dcom.sun.management.jmxremote.port= > -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/* > kafka.Kafka /etc/kafka/server.properties > {code} > We use java 1.8.0_102, and has applied a TLS patch on reducing > X509Factory.certCache map size from 750 to 20. > {code} > java -version > java version "1.8.0_102" > Java(TM) SE Runtime Environment (build 1.8.0_102-b14) > Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector
[ https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismael Juma updated KAFKA-7304: --- Fix Version/s: 2.1.0 2.0.1 1.1.2 > memory leakage in org.apache.kafka.common.network.Selector > -- > > Key: KAFKA-7304 > URL: https://issues.apache.org/jira/browse/KAFKA-7304 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 1.1.0 >Reporter: Yu Yang >Priority: Critical > Fix For: 1.1.2, 2.0.1, 2.1.0 > > Attachments: Screen Shot 2018-08-16 at 11.04.16 PM.png, Screen Shot > 2018-08-16 at 11.06.38 PM.png, Screen Shot 2018-08-16 at 12.41.26 PM.png, > Screen Shot 2018-08-16 at 4.26.19 PM.png, Screen Shot 2018-08-17 at 1.03.35 > AM.png, Screen Shot 2018-08-17 at 1.04.32 AM.png, Screen Shot 2018-08-17 at > 1.05.30 AM.png > > > We are testing secured writing to kafka through ssl. Testing at small scale, > ssl writing to kafka was fine. However, when we enabled ssl writing at a > larger scale (>40k clients write concurrently), the kafka brokers soon hit > OutOfMemory issue with 4G memory setting. We have tried with increasing the > heap size to 10Gb, but encountered the same issue. > We took a few heap dumps , and found that most of the heap memory is > referenced through org.apache.kafka.common.network.Selector objects. There > are two Channel maps field in Selector. It seems that somehow the objects is > not deleted from the map in a timely manner. > One observation is that the memory leak seems relate to kafka partition > leader changes. If there is broker restart etc. in the cluster that caused > partition leadership change, the brokers may hit the OOM issue faster. > {code} > private final Map channels; > private final Map closingChannels; > {code} > Please see the attached images and the following link for sample gc > analysis. > http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 > the command line for running kafka: > {code} > java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m > -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC > -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 > -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 > -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps > -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log > -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M > -Djava.awt.headless=true > -Dlog4j.configuration=file:/etc/kafka/log4j.properties > -Dcom.sun.management.jmxremote > -Dcom.sun.management.jmxremote.authenticate=false > -Dcom.sun.management.jmxremote.ssl=false > -Dcom.sun.management.jmxremote.port= > -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/* > kafka.Kafka /etc/kafka/server.properties > {code} > We use java 1.8.0_102, and has applied a TLS patch on reducing > X509Factory.certCache map size from 750 to 20. > {code} > java -version > java version "1.8.0_102" > Java(TM) SE Runtime Environment (build 1.8.0_102-b14) > Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector
[ https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Yang updated KAFKA-7304: --- Description: We are testing secured writing to kafka through ssl. Testing at small scale, ssl writing to kafka was fine. However, when we enabled ssl writing at a larger scale (>40k clients write concurrently), the kafka brokers soon hit OutOfMemory issue with 4G memory setting. We have tried with increasing the heap size to 10Gb, but encountered the same issue. We took a few heap dumps , and found that most of the heap memory is referenced through org.apache.kafka.common.network.Selector objects. There are two Channel maps field in Selector. It seems that somehow the objects is not deleted from the map in a timely manner. One observation is that the memory leak seems relate to kafka partition leader changes. If there is broker restart etc. in the cluster that caused partition leadership change, the brokers may hit the OOM issue faster. {code} private final Map channels; private final Map closingChannels; {code} Please see the attached images and the following link for sample gc analysis. http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 the command line for running kafka: {code} java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M -Djava.awt.headless=true -Dlog4j.configuration=file:/etc/kafka/log4j.properties -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port= -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/* kafka.Kafka /etc/kafka/server.properties {code} We use java 1.8.0_102, and has applied a TLS patch on reducing X509Factory.certCache map size from 750 to 20. {code} java -version java version "1.8.0_102" Java(TM) SE Runtime Environment (build 1.8.0_102-b14) Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode) {code} was: We are testing secured writing to kafka through ssl. Testing at small scale, ssl writing to kafka was fine. However, when we enabled ssl writing at a larger scale (>40k clients write concurrently), the kafka brokers soon hit OutOfMemory issue with 4G memory setting. We have tried with increasing the heap size to 10Gb, but encountered the same issue. We took a few heap dumps , and found that most of the heap memory is referenced through org.apache.kafka.common.network.Selector object. There are two Channel maps field in Selector. It seems that somehow the objects is not deleted from the map in a timely manner. One observation is that the memory leak seems relate to kafka partition leader changes. If there is broker restart etc. in the cluster that caused partition leadership change, the brokers may hit the OOM issue faster. {code} private final Map channels; private final Map closingChannels; {code} Please see the attached images and the following link for sample gc analysis. http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 the command line for running kafka: {code} java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M -Djava.awt.headless=true -Dlog4j.configuration=file:/etc/kafka/log4j.properties -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port= -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/* kafka.Kafka /etc/kafka/server.properties {code} We use java 1.8.0_102, and has applied a TLS patch on reducing X509Factory.certCache map size from 750 to 20. {code} java -version java version "1.8.0_102" Java(TM) SE Runtime Environment (build 1.8.0_102-b14) Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode) {code} > memory leakage in org.apache.kafka.common.network.Selector > -- > > Key: KAFKA-7304 > URL: https://issues.apache.org/jira/browse/KAFKA-7304
[jira] [Updated] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector
[ https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Yang updated KAFKA-7304: --- Description: We are testing secured writing to kafka through ssl. Testing at small scale, ssl writing to kafka was fine. However, when we enabled ssl writing at a larger scale (>40k clients write concurrently), the kafka brokers soon hit OutOfMemory issue with 4G memory setting. We have tried with increasing the heap size to 10Gb, but encountered the same issue. We took a few heap dumps , and found that most of the heap memory is referenced through org.apache.kafka.common.network.Selector object. There are two Channel maps field in Selector. It seems that somehow the objects is not deleted from the map in a timely manner. One observation is that the memory leak seems relate to kafka partition leader changes. If there is broker restart etc. in the cluster that caused partition leadership change, the brokers may hit the OOM issue faster. {code} private final Map channels; private final Map closingChannels; {code} Please see the attached images and the following link for sample gc analysis. http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 the command line for running kafka: {code} java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M -Djava.awt.headless=true -Dlog4j.configuration=file:/etc/kafka/log4j.properties -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port= -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/* kafka.Kafka /etc/kafka/server.properties {code} We use java 1.8.0_102, and has applied a TLS patch on reducing X509Factory.certCache map size from 750 to 20. {code} java -version java version "1.8.0_102" Java(TM) SE Runtime Environment (build 1.8.0_102-b14) Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode) {code} was: We are testing secured writing to kafka through ssl. Testing at small scale, ssl writing to kafka was fine. However, when we enabled ssl writing at a larger scale (>40k clients writes concurrently), the kafka brokers soon hit OutOfMemory issue with 4G memory setting. We have tried with increasing the heap size to 10Gb, but encountered the same issue. We took a few heap dumps , and found that most of the heap memory is referenced through org.apache.kafka.common.network.Selector object. There are two Channel maps field in Selector. It seems that somehow the objects is not deleted from the map in a timely manner. One observation is that the memory leak seems relate to kafka partition leader changes. If there is broker restart etc. in the cluster that caused partition leadership change, the brokers may hit the OOM issue faster. {code} private final Map channels; private final Map closingChannels; {code} Please see the attached images and the following link for sample gc analysis. http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 the command line for running kafka: {code} java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M -Djava.awt.headless=true -Dlog4j.configuration=file:/etc/kafka/log4j.properties -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port= -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/* kafka.Kafka /etc/kafka/server.properties {code} We use java 1.8.0_102, and has applied a TLS patch on reducing X509Factory.certCache map size from 750 to 20. {code} java -version java version "1.8.0_102" Java(TM) SE Runtime Environment (build 1.8.0_102-b14) Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode) {code} > memory leakage in org.apache.kafka.common.network.Selector > -- > > Key: KAFKA-7304 > URL: https://issues.apache.org/jira/browse/KAFKA-7304
[jira] [Updated] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector
[ https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Yang updated KAFKA-7304: --- Description: We are testing secured writing to kafka through ssl. Testing at small scale, ssl writing to kafka was fine. However, when we enabled ssl writing at a larger scale (>40k clients writes concurrently), the kafka brokers soon hit OutOfMemory issue with 4G memory setting. We have tried with increasing the heap size to 10Gb, but encountered the same issue. We took a few heap dumps , and found that most of the heap memory is referenced through org.apache.kafka.common.network.Selector object. There are two Channel maps field in Selector. It seems that somehow the objects is not deleted from the map in a timely manner. One observation is that the memory leak seems relate to kafka partition leader changes. If there is broker restart etc. in the cluster that caused partition leadership change, the brokers may hit the OOM issue faster. {code} private final Map channels; private final Map closingChannels; {code} Please see the attached images and the following link for sample gc analysis. http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 the command line for running kafka: {code} java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M -Djava.awt.headless=true -Dlog4j.configuration=file:/etc/kafka/log4j.properties -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port= -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/* kafka.Kafka /etc/kafka/server.properties {code} We use java 1.8.0_102, and has applied a TLS patch on reducing X509Factory.certCache map size from 750 to 20. {code} java -version java version "1.8.0_102" Java(TM) SE Runtime Environment (build 1.8.0_102-b14) Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode) {code} was: We are testing secured writing to kafka through ssl. Testing at small scale, ssl writing to kafka was fine. However, when we enabled ssl writing at a larger scale (>40k clients writes concurrently), the kafka brokers soon hit OutOfMemory issue with 4G memory setting. We have tried with increasing the heap size to 10Gb, but encountered the same issue. We took a few heap dump , and found that most of the heap memory is referenced through org.apache.kafka.common.network.Selector object. There are two Channel maps field in Selector. It seems that somehow the objects is not deleted from the map in a timely manner. One observation is that the memory leak seems relate to kafka partition leader changes. If there is broker restart etc. in the cluster that caused partition leadership change, the brokers may hit the OOM issue faster. {code} private final Map channels; private final Map closingChannels; {code} Please see the attached images and the following link for sample gc analysis. http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 the command line for running kafka: {code} java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M -Djava.awt.headless=true -Dlog4j.configuration=file:/etc/kafka/log4j.properties -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port= -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/* kafka.Kafka /etc/kafka/server.properties {code} We use java 1.8.0_102, and has applied a TLS patch on reducing X509Factory.certCache map size from 750 to 20. {code} java -version java version "1.8.0_102" Java(TM) SE Runtime Environment (build 1.8.0_102-b14) Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode) {code} > memory leakage in org.apache.kafka.common.network.Selector > -- > > Key: KAFKA-7304 > URL: https://issues.apache.org/jira/browse/KAFKA-7304
[jira] [Updated] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector
[ https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Yang updated KAFKA-7304: --- Affects Version/s: (was: 1.1.1) > memory leakage in org.apache.kafka.common.network.Selector > -- > > Key: KAFKA-7304 > URL: https://issues.apache.org/jira/browse/KAFKA-7304 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 1.1.0 >Reporter: Yu Yang >Priority: Major > Attachments: Screen Shot 2018-08-16 at 11.04.16 PM.png, Screen Shot > 2018-08-16 at 11.06.38 PM.png, Screen Shot 2018-08-16 at 12.41.26 PM.png, > Screen Shot 2018-08-16 at 4.26.19 PM.png, Screen Shot 2018-08-17 at 1.03.35 > AM.png, Screen Shot 2018-08-17 at 1.04.32 AM.png, Screen Shot 2018-08-17 at > 1.05.30 AM.png > > > We are testing secured writing to kafka through ssl. Testing at small scale, > ssl writing to kafka was fine. However, when we enabled ssl writing at a > larger scale (>40k clients writes concurrently), the kafka brokers soon hit > OutOfMemory issue with 4G memory setting. We have tried with increasing the > heap size to 10Gb, but encountered the same issue. > We took a few heap dump , and found that most of the heap memory is > referenced through org.apache.kafka.common.network.Selector object. There > are two Channel maps field in Selector. It seems that somehow the objects is > not deleted from the map in a timely manner. > One observation is that the memory leak seems relate to kafka partition > leader changes. If there is broker restart etc. in the cluster that caused > partition leadership change, the brokers may hit the OOM issue faster. > {code} > private final Map channels; > private final Map closingChannels; > {code} > Please see the attached images and the following link for sample gc > analysis. > http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 > the command line for running kafka: > {code} > java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m > -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC > -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 > -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 > -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps > -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log > -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M > -Djava.awt.headless=true > -Dlog4j.configuration=file:/etc/kafka/log4j.properties > -Dcom.sun.management.jmxremote > -Dcom.sun.management.jmxremote.authenticate=false > -Dcom.sun.management.jmxremote.ssl=false > -Dcom.sun.management.jmxremote.port= > -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/* > kafka.Kafka /etc/kafka/server.properties > {code} > We use java 1.8.0_102, and has applied a TLS patch on reducing > X509Factory.certCache map size from 750 to 20. > {code} > java -version > java version "1.8.0_102" > Java(TM) SE Runtime Environment (build 1.8.0_102-b14) > Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector
[ https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Yang updated KAFKA-7304: --- Description: We are testing secured writing to kafka through ssl. Testing at small scale, ssl writing to kafka was fine. However, when we enabled ssl writing at a larger scale (>40k clients writes concurrently), the kafka brokers soon hit OutOfMemory issue with 4G memory setting. We have tried with increasing the heap size to 10Gb, but encountered the same issue. We took a few heap dump , and found that most of the heap memory is referenced through org.apache.kafka.common.network.Selector object. There are two Channel maps field in Selector. It seems that somehow the objects is not deleted from the map in a timely manner. One observation is that the memory leak seems related to kafka partition leader changes. If there is broker restart etc. in the cluster that caused partition leadership change, the brokers may hit the OOM issue faster. {code} private final Map channels; private final Map closingChannels; {code} Please see the attached images and the following link for sample gc analysis. http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 the command line for running kafka: {code} java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M -Djava.awt.headless=true -Dlog4j.configuration=file:/etc/kafka/log4j.properties -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port= -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/* kafka.Kafka /etc/kafka/server.properties {code} We use java 1.8.0_102, and has applied a TLS patch on reducing X509Factory.certCache map size from 750 to 20. {code} java -version java version "1.8.0_102" Java(TM) SE Runtime Environment (build 1.8.0_102-b14) Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode) {code} was: We are testing secured writing to kafka through ssl. Testing at small scale, ssl writing to kafka was fine. However, when we enabled ssl writing at a larger scale (>40k clients writes concurrently), the kafka brokers soon hit OutOfMemory issue with 4G memory setting. We have tried with increasing the heap size to 10Gb, but encountered the same issue. We took a few heap dump , and found that most of the heap memory is referenced through org.apache.kafka.common.network.Selector object. There are two Channel maps field in Selector. It seems that somehow the objects is not deleted from the map in a timely manner. {code} private final Map channels; private final Map closingChannels; {code} Please see the attached images and the following link for sample gc analysis. http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 the command line for running kafka: {code} java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M -Djava.awt.headless=true -Dlog4j.configuration=file:/etc/kafka/log4j.properties -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port= -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/* kafka.Kafka /etc/kafka/server.properties {code} We use java 1.8.0_102, and has applied a TLS patch on reducing X509Factory.certCache map size from 750 to 20. {code} java -version java version "1.8.0_102" Java(TM) SE Runtime Environment (build 1.8.0_102-b14) Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode) {code} > memory leakage in org.apache.kafka.common.network.Selector > -- > > Key: KAFKA-7304 > URL: https://issues.apache.org/jira/browse/KAFKA-7304 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 1.1.0, 1.1.1 >Reporter: Yu Yang >Priority: Major > Attachments: Screen Shot 2018-0
[jira] [Updated] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector
[ https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Yang updated KAFKA-7304: --- Description: We are testing secured writing to kafka through ssl. Testing at small scale, ssl writing to kafka was fine. However, when we enabled ssl writing at a larger scale (>40k clients writes concurrently), the kafka brokers soon hit OutOfMemory issue with 4G memory setting. We have tried with increasing the heap size to 10Gb, but encountered the same issue. We took a few heap dump , and found that most of the heap memory is referenced through org.apache.kafka.common.network.Selector object. There are two Channel maps field in Selector. It seems that somehow the objects is not deleted from the map in a timely manner. One observation is that the memory leak seems relate to kafka partition leader changes. If there is broker restart etc. in the cluster that caused partition leadership change, the brokers may hit the OOM issue faster. {code} private final Map channels; private final Map closingChannels; {code} Please see the attached images and the following link for sample gc analysis. http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 the command line for running kafka: {code} java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M -Djava.awt.headless=true -Dlog4j.configuration=file:/etc/kafka/log4j.properties -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port= -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/* kafka.Kafka /etc/kafka/server.properties {code} We use java 1.8.0_102, and has applied a TLS patch on reducing X509Factory.certCache map size from 750 to 20. {code} java -version java version "1.8.0_102" Java(TM) SE Runtime Environment (build 1.8.0_102-b14) Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode) {code} was: We are testing secured writing to kafka through ssl. Testing at small scale, ssl writing to kafka was fine. However, when we enabled ssl writing at a larger scale (>40k clients writes concurrently), the kafka brokers soon hit OutOfMemory issue with 4G memory setting. We have tried with increasing the heap size to 10Gb, but encountered the same issue. We took a few heap dump , and found that most of the heap memory is referenced through org.apache.kafka.common.network.Selector object. There are two Channel maps field in Selector. It seems that somehow the objects is not deleted from the map in a timely manner. One observation is that the memory leak seems related to kafka partition leader changes. If there is broker restart etc. in the cluster that caused partition leadership change, the brokers may hit the OOM issue faster. {code} private final Map channels; private final Map closingChannels; {code} Please see the attached images and the following link for sample gc analysis. http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 the command line for running kafka: {code} java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M -Djava.awt.headless=true -Dlog4j.configuration=file:/etc/kafka/log4j.properties -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port= -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/* kafka.Kafka /etc/kafka/server.properties {code} We use java 1.8.0_102, and has applied a TLS patch on reducing X509Factory.certCache map size from 750 to 20. {code} java -version java version "1.8.0_102" Java(TM) SE Runtime Environment (build 1.8.0_102-b14) Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode) {code} > memory leakage in org.apache.kafka.common.network.Selector > -- > > Key: KAFKA-7304 > URL: https://issues.apache.org/jira/browse/KAFKA-7304
[jira] [Updated] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector
[ https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Yang updated KAFKA-7304: --- Attachment: Screen Shot 2018-08-17 at 1.05.30 AM.png > memory leakage in org.apache.kafka.common.network.Selector > -- > > Key: KAFKA-7304 > URL: https://issues.apache.org/jira/browse/KAFKA-7304 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 1.1.0, 1.1.1 >Reporter: Yu Yang >Priority: Major > Attachments: Screen Shot 2018-08-16 at 11.04.16 PM.png, Screen Shot > 2018-08-16 at 11.06.38 PM.png, Screen Shot 2018-08-16 at 12.41.26 PM.png, > Screen Shot 2018-08-16 at 4.26.19 PM.png, Screen Shot 2018-08-17 at 1.03.35 > AM.png, Screen Shot 2018-08-17 at 1.04.32 AM.png, Screen Shot 2018-08-17 at > 1.05.30 AM.png > > > We are testing secured writing to kafka through ssl. Testing at small scale, > ssl writing to kafka was fine. However, when we enabled ssl writing at a > larger scale (>40k clients writes concurrently), the kafka brokers soon hit > OutOfMemory issue with 4G memory setting. We have tried with increasing the > heap size to 10Gb, but encountered the same issue. > We took a few heap dump , and found that most of the heap memory is > referenced through org.apache.kafka.common.network.Selector object. There > are two Channel maps field in Selector. It seems that somehow the objects is > not deleted from the map in a timely manner. > {code} > private final Map channels; > private final Map closingChannels; > {code} > Please see the attached images and the following link for sample gc > analysis. > http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 > the command line for running kafka: > {code} > java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m > -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC > -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 > -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 > -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps > -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log > -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M > -Djava.awt.headless=true > -Dlog4j.configuration=file:/etc/kafka/log4j.properties > -Dcom.sun.management.jmxremote > -Dcom.sun.management.jmxremote.authenticate=false > -Dcom.sun.management.jmxremote.ssl=false > -Dcom.sun.management.jmxremote.port= > -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/* > kafka.Kafka /etc/kafka/server.properties > {code} > We use java 1.8.0_102, and has applied a TLS patch on reducing > X509Factory.certCache map size from 750 to 20. > {code} > java -version > java version "1.8.0_102" > Java(TM) SE Runtime Environment (build 1.8.0_102-b14) > Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector
[ https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Yang updated KAFKA-7304: --- Attachment: Screen Shot 2018-08-17 at 1.04.32 AM.png > memory leakage in org.apache.kafka.common.network.Selector > -- > > Key: KAFKA-7304 > URL: https://issues.apache.org/jira/browse/KAFKA-7304 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 1.1.0, 1.1.1 >Reporter: Yu Yang >Priority: Major > Attachments: Screen Shot 2018-08-16 at 11.04.16 PM.png, Screen Shot > 2018-08-16 at 11.06.38 PM.png, Screen Shot 2018-08-16 at 12.41.26 PM.png, > Screen Shot 2018-08-16 at 4.26.19 PM.png, Screen Shot 2018-08-17 at 1.03.35 > AM.png, Screen Shot 2018-08-17 at 1.04.32 AM.png > > > We are testing secured writing to kafka through ssl. Testing at small scale, > ssl writing to kafka was fine. However, when we enabled ssl writing at a > larger scale (>40k clients writes concurrently), the kafka brokers soon hit > OutOfMemory issue with 4G memory setting. We have tried with increasing the > heap size to 10Gb, but encountered the same issue. > We took a few heap dump , and found that most of the heap memory is > referenced through org.apache.kafka.common.network.Selector object. There > are two Channel maps field in Selector. It seems that somehow the objects is > not deleted from the map in a timely manner. > {code} > private final Map channels; > private final Map closingChannels; > {code} > Please see the attached images and the following link for sample gc > analysis. > http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 > the command line for running kafka: > {code} > java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m > -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC > -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 > -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 > -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps > -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log > -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M > -Djava.awt.headless=true > -Dlog4j.configuration=file:/etc/kafka/log4j.properties > -Dcom.sun.management.jmxremote > -Dcom.sun.management.jmxremote.authenticate=false > -Dcom.sun.management.jmxremote.ssl=false > -Dcom.sun.management.jmxremote.port= > -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/* > kafka.Kafka /etc/kafka/server.properties > {code} > We use java 1.8.0_102, and has applied a TLS patch on reducing > X509Factory.certCache map size from 750 to 20. > {code} > java -version > java version "1.8.0_102" > Java(TM) SE Runtime Environment (build 1.8.0_102-b14) > Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector
[ https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Yang updated KAFKA-7304: --- Attachment: Screen Shot 2018-08-17 at 1.03.35 AM.png > memory leakage in org.apache.kafka.common.network.Selector > -- > > Key: KAFKA-7304 > URL: https://issues.apache.org/jira/browse/KAFKA-7304 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 1.1.0, 1.1.1 >Reporter: Yu Yang >Priority: Major > Attachments: Screen Shot 2018-08-16 at 11.04.16 PM.png, Screen Shot > 2018-08-16 at 11.06.38 PM.png, Screen Shot 2018-08-16 at 12.41.26 PM.png, > Screen Shot 2018-08-16 at 4.26.19 PM.png, Screen Shot 2018-08-17 at 1.03.35 > AM.png > > > We are testing secured writing to kafka through ssl. Testing at small scale, > ssl writing to kafka was fine. However, when we enabled ssl writing at a > larger scale (>40k clients writes concurrently), the kafka brokers soon hit > OutOfMemory issue with 4G memory setting. We have tried with increasing the > heap size to 10Gb, but encountered the same issue. > We took a few heap dump , and found that most of the heap memory is > referenced through org.apache.kafka.common.network.Selector object. There > are two Channel maps field in Selector. It seems that somehow the objects is > not deleted from the map in a timely manner. > {code} > private final Map channels; > private final Map closingChannels; > {code} > Please see the attached images and the following link for sample gc > analysis. > http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 > the command line for running kafka: > {code} > java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m > -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC > -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 > -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 > -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps > -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log > -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M > -Djava.awt.headless=true > -Dlog4j.configuration=file:/etc/kafka/log4j.properties > -Dcom.sun.management.jmxremote > -Dcom.sun.management.jmxremote.authenticate=false > -Dcom.sun.management.jmxremote.ssl=false > -Dcom.sun.management.jmxremote.port= > -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/* > kafka.Kafka /etc/kafka/server.properties > {code} > We use java 1.8.0_102, and has applied a TLS patch on reducing > X509Factory.certCache map size from 750 to 20. > {code} > java -version > java version "1.8.0_102" > Java(TM) SE Runtime Environment (build 1.8.0_102-b14) > Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector
[ https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Yang updated KAFKA-7304: --- Description: We are testing secured writing to kafka through ssl. Testing at small scale, ssl writing to kafka was fine. However, when we enabled ssl writing at a larger scale (>40k clients writes concurrently), the kafka brokers soon hit OutOfMemory issue with 4G memory setting. We have tried with increasing the heap size to 10Gb, but encountered the same issue. We took a few heap dump , and found that most of the heap memory is referenced through org.apache.kafka.common.network.Selector object. There are two Channel maps field in Selector. It seems that somehow the objects is not deleted from the map in a timely manner. {code} private final Map channels; private final Map closingChannels; {code} Please see the attached images and the following link for sample gc analysis. http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 the command line for running kafka: {code} java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M -Djava.awt.headless=true -Dlog4j.configuration=file:/etc/kafka/log4j.properties -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port= -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/* kafka.Kafka /etc/kafka/server.properties {code} We use java 1.8.0_102, and has applied a TLS patch on reducing X509Factory.certCache map size from 750 to 20. {code} java -version java version "1.8.0_102" Java(TM) SE Runtime Environment (build 1.8.0_102-b14) Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode) {code} was: We are testing secured writing to kafka through ssl. Testing at small scale, ssl writing to kafka was fine. However, when we enabled ssl writing at a larger scale (>40k clients writes concurrently), the kafka brokers soon hit OutOfMemory issue with 4G memory setting. We have tried with increasing the heap size to 10Gb, but encountered the same issue. We took a few heap dump , and found that most of the heap memory is referenced through org.apache.kafka.common.network.Selector object. There are two Channel maps field in Selector. It seems that somehow the objects is not deleted from the map in a timely manner. {code} private final Map channels; private final Map closingChannels; {code} Please see the attached images and the following link for sample gc analysis. http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 the command line for running kafka: {code} java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M -Djava.awt.headless=true -Dlog4j.configuration=file:/etc/kafka/log4j.properties -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port= -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/* kafka.Kafka /etc/kafka/server.properties {code} > memory leakage in org.apache.kafka.common.network.Selector > -- > > Key: KAFKA-7304 > URL: https://issues.apache.org/jira/browse/KAFKA-7304 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 1.1.0, 1.1.1 >Reporter: Yu Yang >Priority: Major > Attachments: Screen Shot 2018-08-16 at 11.04.16 PM.png, Screen Shot > 2018-08-16 at 11.06.38 PM.png, Screen Shot 2018-08-16 at 12.41.26 PM.png, > Screen Shot 2018-08-16 at 4.26.19 PM.png > > > We are testing secured writing to kafka through ssl. Testing at small scale, > ssl writing to kafka was fine. However, when we enabled ssl writing at a > larger scale (>40k clients writes concurrently), the kafka brokers soon hit > OutOfMemory issue with 4G memory setting. We have tried with increasing the > heap size to 10Gb, but encounte
[jira] [Updated] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector
[ https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Yang updated KAFKA-7304: --- Description: We are testing secured writing to kafka through ssl. Testing at small scale, ssl writing to kafka was fine. However, when we enabled ssl writing at a larger scale (>40k clients writes concurrently), the kafka brokers soon hit OutOfMemory issue with 4G memory setting. We have tried with increasing the heap size to 10Gb, but encountered the same issue. We took a few heap dump , and found that most of the heap memory is referenced through org.apache.kafka.common.network.Selector object. There are two Channel maps field in Selector. It seems that somehow the objects is not deleted from the map in a timely manner. {code} private final Map channels; private final Map closingChannels; {code} Please see the attached images and the following link for sample gc analysis. http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 the command line for running kafka: {code} java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M -Djava.awt.headless=true -Dlog4j.configuration=file:/etc/kafka/log4j.properties -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port= -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/* kafka.Kafka /etc/kafka/server.properties {code} was: We are testing secured writing to kafka through ssl. Testing at small scale, ssl writing to kafka was fine. However, when we enabled ssl writing at a larger scale (>40k clients writes concurrently), the kafka brokers soon hit OutOfMemory issue with 4G memory setting. We have tried with increasing the heap size to 10Gb, but encountered the same issue. We took a few heap dump , and found that most of the heap memory is referenced through org.apache.kafka.common.network.Selector object. There are two Channel maps field in Selector. It seems that somehow the objects is not deleted from the map in a timely manner. {code} private final Map channels; private final Map closingChannels; {code} Please see the attached images and the following link for sample gc analysis. http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 > memory leakage in org.apache.kafka.common.network.Selector > -- > > Key: KAFKA-7304 > URL: https://issues.apache.org/jira/browse/KAFKA-7304 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 1.1.0, 1.1.1 >Reporter: Yu Yang >Priority: Major > Attachments: Screen Shot 2018-08-16 at 11.04.16 PM.png, Screen Shot > 2018-08-16 at 11.06.38 PM.png, Screen Shot 2018-08-16 at 12.41.26 PM.png, > Screen Shot 2018-08-16 at 4.26.19 PM.png > > > We are testing secured writing to kafka through ssl. Testing at small scale, > ssl writing to kafka was fine. However, when we enabled ssl writing at a > larger scale (>40k clients writes concurrently), the kafka brokers soon hit > OutOfMemory issue with 4G memory setting. We have tried with increasing the > heap size to 10Gb, but encountered the same issue. > We took a few heap dump , and found that most of the heap memory is > referenced through org.apache.kafka.common.network.Selector object. There > are two Channel maps field in Selector. It seems that somehow the objects is > not deleted from the map in a timely manner. > {code} > private final Map channels; > private final Map closingChannels; > {code} > Please see the attached images and the following link for sample gc > analysis. > http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 > the command line for running kafka: > {code} > java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m > -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC > -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 > -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 > -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps > -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log > -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M > -Djava.awt.headless=true > -Dlog4j.configuration=file
[jira] [Updated] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector
[ https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Yang updated KAFKA-7304: --- Description: We are testing secured writing to kafka through ssl. Testing at small scale, ssl writing to kafka was fine. However, when we enabled ssl writing at a larger scale (>40k clients writes concurrently), the kafka brokers soon hit OutOfMemory issue with 4G memory setting. We have tried with increasing the heap size to 10Gb, but encountered the same issue. We took a few heap dump , and found that most of the heap memory is referenced through org.apache.kafka.common.network.Selector object. There are two Channel maps field in Selector. It seems that somehow the objects is not deleted from the map in a timely manner. {code} private final Map channels; private final Map closingChannels; {code} Please see the attached images and the following link for sample gc analysis. http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 was: We are testing secured writing to kafka through ssl. Testing at small scale, ssl writing to kafka was fine. However, when we enabled ssl writing at a larger scale (>40k clients writes concurrently), the kafka brokers soon hit OutOfMemory issue with 4G memory setting. We have tried with increasing the heap size to 10Gb, and hit the same issue. We took a few heap dump , and found that most of the heap memory is referenced through org.apache.kafka.common.network.Selector object. There are two Channel maps field in Selector. It seems that somehow the objects is not deleted from the map in a timely manner. {code} private final Map channels; private final Map closingChannels; {code} Please see the attached images and the following link for sample gc analysis. http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 > memory leakage in org.apache.kafka.common.network.Selector > -- > > Key: KAFKA-7304 > URL: https://issues.apache.org/jira/browse/KAFKA-7304 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 1.1.0, 1.1.1 >Reporter: Yu Yang >Priority: Major > Attachments: Screen Shot 2018-08-16 at 11.04.16 PM.png, Screen Shot > 2018-08-16 at 11.06.38 PM.png, Screen Shot 2018-08-16 at 12.41.26 PM.png, > Screen Shot 2018-08-16 at 4.26.19 PM.png > > > We are testing secured writing to kafka through ssl. Testing at small scale, > ssl writing to kafka was fine. However, when we enabled ssl writing at a > larger scale (>40k clients writes concurrently), the kafka brokers soon hit > OutOfMemory issue with 4G memory setting. We have tried with increasing the > heap size to 10Gb, but encountered the same issue. > We took a few heap dump , and found that most of the heap memory is > referenced through org.apache.kafka.common.network.Selector object. There > are two Channel maps field in Selector. It seems that somehow the objects is > not deleted from the map in a timely manner. > {code} > private final Map channels; > private final Map closingChannels; > {code} > Please see the attached images and the following link for sample gc > analysis. > http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector
[ https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Yang updated KAFKA-7304: --- Description: We are testing secured writing to kafka through ssl. Testing at small scale, ssl writing to kafka was fine. However, when we enabled ssl writing at a larger scale (>40k clients writes concurrently), the kafka brokers soon hit OutOfMemory issue with 4G memory setting. We have tried with increasing the heap size to 10Gb, and hit the same issue. We took a few heap dump , and found that most of the heap memory is referenced through org.apache.kafka.common.network.Selector object. There are two Channel maps field in Selector. It seems that somehow the objects is not deleted from the map in a timely manner. {code} private final Map channels; private final Map closingChannels; {code} Please see the attached images and the following link for sample gc analysis. http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 was: We are testing secured writing to kafka through ssl. Testing at small scale, ssl writing to kafka was fine. However, when we enable ssl writing at scale (>40k clients writes concurrently), the kafka brokers soon hit OutOfMemory issue with 4G memory setting. We have tried with increasing the heap size to 10Gb, and hit the same issue. We took a few heap dump , and found that most of the heap memory is referenced through org.apache.kafka.common.network.Selector object. There are two Channel maps field in Selector. It seems that somehow the objects is not deleted from the map in a timely manner. {code} private final Map channels; private final Map closingChannels; {code} Please see the attached images and the following link for sample gc analysis. http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 > memory leakage in org.apache.kafka.common.network.Selector > -- > > Key: KAFKA-7304 > URL: https://issues.apache.org/jira/browse/KAFKA-7304 > Project: Kafka > Issue Type: Bug > Components: core >Affects Versions: 1.1.0, 1.1.1 >Reporter: Yu Yang >Priority: Major > Attachments: Screen Shot 2018-08-16 at 11.04.16 PM.png, Screen Shot > 2018-08-16 at 11.06.38 PM.png, Screen Shot 2018-08-16 at 12.41.26 PM.png, > Screen Shot 2018-08-16 at 4.26.19 PM.png > > > We are testing secured writing to kafka through ssl. Testing at small scale, > ssl writing to kafka was fine. However, when we enabled ssl writing at a > larger scale (>40k clients writes concurrently), the kafka brokers soon hit > OutOfMemory issue with 4G memory setting. We have tried with increasing the > heap size to 10Gb, and hit the same issue. > We took a few heap dump , and found that most of the heap memory is > referenced through org.apache.kafka.common.network.Selector object. There > are two Channel maps field in Selector. It seems that somehow the objects is > not deleted from the map in a timely manner. > {code} > private final Map channels; > private final Map closingChannels; > {code} > Please see the attached images and the following link for sample gc > analysis. > http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0 -- This message was sent by Atlassian JIRA (v7.6.3#76005)