[jira] [Updated] (KAFKA-7311) Sender should reset next batch expiry time between poll loops

2018-08-18 Thread Rohan Desai (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohan Desai updated KAFKA-7311:
---
Description: Sender/RecordAccumulator never resets the next batch expiry 
time. It's always computed as the min of the current value and the expiry time 
for all batches being processed. This means that it's always set to the expiry 
time of the first batch, and once that time has passed, Sender starts spinning 
on epoll with a timeout of 0, which consumes a lot of CPU. This patch updates 
Sender to reset the next batch expiry time on each poll loop so that a new 
value reflecting the expiry time for the current set of batches is computed.  
(was: Sender does not reset next batch expiry time between poll loops. This 
means that once it crosses the expiry time of the first batch, it starts 
spinning on epoll with a timeout of 0, which consumes a lot of CPU. We observed 
this running KSQL when investigating why throughput would drop after about 10 
minutes (the default delivery timeout).)
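
For reference, a minimal sketch of the idea described above (the field and 
method names are simplified assumptions for illustration, not the literal 
patch in PR #5531):
{code:java}
// Simplified sketch: the accumulator's next-batch-expiry tracker is reset at
// the top of every Sender poll loop, so it is recomputed from the batches
// currently in flight instead of staying pinned to the (already passed)
// expiry time of the first batch.
public class RecordAccumulator {
    private long nextBatchExpiryTimeMs = Long.MAX_VALUE;

    // Called once per poll loop, before batch expiry times are folded in.
    public void resetNextBatchExpiryTime() {
        nextBatchExpiryTimeMs = Long.MAX_VALUE;
    }

    // Called for each batch examined in the current loop.
    public void maybeUpdateNextBatchExpiryTime(long batchExpiryMs) {
        nextBatchExpiryTimeMs = Math.min(nextBatchExpiryTimeMs, batchExpiryMs);
    }

    // The Sender derives its poll timeout from this value. Without the reset,
    // Math.max(0, nextBatchExpiryTimeMs - now) stays 0 forever once the first
    // expiry passes, which produces the epoll busy-spin described above.
    public long nextExpiryTimeMs() {
        return nextBatchExpiryTimeMs;
    }
}
{code}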

> Sender should reset next batch expiry time between poll loops
> -
>
> Key: KAFKA-7311
> URL: https://issues.apache.org/jira/browse/KAFKA-7311
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 2.1.0
>Reporter: Rohan Desai
>Priority: Major
> Fix For: 2.1.0
>
>
> Sender/RecordAccumulator never resets the next batch expiry time. It's 
> always computed as the min of the current value and the expiry time for all 
> batches being processed. This means that it's always set to the expiry time 
> of the first batch, and once that time has passed, Sender starts spinning on 
> epoll with a timeout of 0, which consumes a lot of CPU. This patch updates 
> Sender to reset the next batch expiry time on each poll loop so that a new 
> value reflecting the expiry time for the current set of batches is computed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7311) Sender should reset next batch expiry time between poll loops

2018-08-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584969#comment-16584969
 ] 

ASF GitHub Bot commented on KAFKA-7311:
---

rodesai opened a new pull request #5531: KAFKA-7311: Reset next batch expiry 
time on each poll loop
URL: https://github.com/apache/kafka/pull/5531
 
 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Sender should reset next batch expiry time between poll loops
> -
>
> Key: KAFKA-7311
> URL: https://issues.apache.org/jira/browse/KAFKA-7311
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 2.1.0
>Reporter: Rohan Desai
>Priority: Major
> Fix For: 2.1.0
>
>
> Sender does not reset the next batch expiry time between poll loops. This means 
> that once it crosses the expiry time of the first batch, it starts spinning 
> on epoll with a timeout of 0, which consumes a lot of CPU. We observed this 
> running KSQL when investigating why throughput would drop after about 10 
> minutes (the default delivery timeout).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7311) Sender should reset next batch expiry time between poll loops

2018-08-18 Thread Rohan Desai (JIRA)
Rohan Desai created KAFKA-7311:
--

 Summary: Sender should reset next batch expiry time between poll 
loops
 Key: KAFKA-7311
 URL: https://issues.apache.org/jira/browse/KAFKA-7311
 Project: Kafka
  Issue Type: Bug
  Components: clients
Affects Versions: 2.1.0
Reporter: Rohan Desai
 Fix For: 2.1.0


Sender does not reset the next batch expiry time between poll loops. This means 
that once it crosses the expiry time of the first batch, it starts spinning on 
epoll with a timeout of 0, which consumes a lot of CPU. We observed this 
running KSQL when investigating why throughput would drop after about 10 
minutes (the default delivery timeout).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector

2018-08-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584928#comment-16584928
 ] 

Ted Yu commented on KAFKA-7304:
---

Without the fix, the test would fail with:
{code}
org.apache.kafka.common.network.SslSelectorTest > testMuteOnOOM FAILED
java.lang.AssertionError: expected:<0> but was:<1>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:645)
at org.junit.Assert.assertEquals(Assert.java:631)
at 
org.apache.kafka.common.network.SslSelectorTest.tearDown(SslSelectorTest.java:79)
{code}

> memory leakage in org.apache.kafka.common.network.Selector
> --
>
> Key: KAFKA-7304
> URL: https://issues.apache.org/jira/browse/KAFKA-7304
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.1.0
>Reporter: Yu Yang
>Priority: Critical
> Fix For: 1.1.2, 2.0.1, 2.1.0
>
> Attachments: 7304.v4.txt, Screen Shot 2018-08-16 at 11.04.16 PM.png, 
> Screen Shot 2018-08-16 at 11.06.38 PM.png, Screen Shot 2018-08-16 at 12.41.26 
> PM.png, Screen Shot 2018-08-16 at 4.26.19 PM.png, Screen Shot 2018-08-17 at 
> 1.03.35 AM.png, Screen Shot 2018-08-17 at 1.04.32 AM.png, Screen Shot 
> 2018-08-17 at 1.05.30 AM.png
>
>
> We are testing secured writes to Kafka over SSL. At small scale, SSL writes 
> to Kafka were fine. However, when we enabled SSL writes at a larger scale 
> (>40k clients writing concurrently), the Kafka brokers soon hit OutOfMemory 
> errors with a 4 GB heap setting. We tried increasing the heap size to 10 GB, 
> but encountered the same issue. 
> We took a few heap dumps and found that most of the heap memory is referenced 
> through org.apache.kafka.common.network.Selector objects. There are two 
> channel map fields in Selector. It seems that somehow the objects are not 
> deleted from these maps in a timely manner. 
> One observation is that the memory leak seems related to Kafka partition 
> leader changes. If a broker restart or similar event in the cluster causes 
> partition leadership changes, the brokers may hit the OOM issue faster. 
> {code}
> private final Map<String, KafkaChannel> channels;
> private final Map<String, KafkaChannel> closingChannels;
> {code}
> Please see the attached images and the following link for a sample GC 
> analysis: 
> http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0
> The command line for running Kafka: 
> {code}
> java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m 
> -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC 
> -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 
> -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 
> -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps 
> -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log 
> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M 
> -Djava.awt.headless=true 
> -Dlog4j.configuration=file:/etc/kafka/log4j.properties 
> -Dcom.sun.management.jmxremote 
> -Dcom.sun.management.jmxremote.authenticate=false 
> -Dcom.sun.management.jmxremote.ssl=false 
> -Dcom.sun.management.jmxremote.port= 
> -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/*  
> kafka.Kafka /etc/kafka/server.properties
> {code}
> We use Java 1.8.0_102 and have applied a TLS patch that reduces the 
> X509Factory.certCache map size from 750 to 20. 
> {code}
> java -version
> java version "1.8.0_102"
> Java(TM) SE Runtime Environment (build 1.8.0_102-b14)
> Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector

2018-08-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584922#comment-16584922
 ] 

Ted Yu commented on KAFKA-7304:
---

Patch v4 adds an assertion on the number of channels left in closingChannels 
in SslSelectorTest#tearDown.
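
For reference, a minimal sketch of what such a tearDown assertion can look 
like ({{selector.channels()}} exists on Selector; the {{closingChannels()}} 
accessor is an assumption here for illustration, and the actual change is in 
the attached 7304.v4.txt):
{code:java}
import org.junit.After;
import static org.junit.Assert.assertEquals;

// Sketch: both channel maps must be drained by the end of each test;
// a non-zero count means channel objects (and their buffers) leaked.
@After
public void tearDown() throws Exception {
    assertEquals(0, selector.channels().size());
    assertEquals(0, selector.closingChannels().size()); // assumed accessor
    selector.close();
}
{code}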

> memory leakage in org.apache.kafka.common.network.Selector
> --
>
> Key: KAFKA-7304
> URL: https://issues.apache.org/jira/browse/KAFKA-7304
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.1.0
>Reporter: Yu Yang
>Priority: Critical
> Fix For: 1.1.2, 2.0.1, 2.1.0
>
> Attachments: 7304.v4.txt, Screen Shot 2018-08-16 at 11.04.16 PM.png, 
> Screen Shot 2018-08-16 at 11.06.38 PM.png, Screen Shot 2018-08-16 at 12.41.26 
> PM.png, Screen Shot 2018-08-16 at 4.26.19 PM.png, Screen Shot 2018-08-17 at 
> 1.03.35 AM.png, Screen Shot 2018-08-17 at 1.04.32 AM.png, Screen Shot 
> 2018-08-17 at 1.05.30 AM.png
>
>
> We are testing secured writes to Kafka over SSL. At small scale, SSL writes 
> to Kafka were fine. However, when we enabled SSL writes at a larger scale 
> (>40k clients writing concurrently), the Kafka brokers soon hit OutOfMemory 
> errors with a 4 GB heap setting. We tried increasing the heap size to 10 GB, 
> but encountered the same issue. 
> We took a few heap dumps and found that most of the heap memory is referenced 
> through org.apache.kafka.common.network.Selector objects. There are two 
> channel map fields in Selector. It seems that somehow the objects are not 
> deleted from these maps in a timely manner. 
> One observation is that the memory leak seems related to Kafka partition 
> leader changes. If a broker restart or similar event in the cluster causes 
> partition leadership changes, the brokers may hit the OOM issue faster. 
> {code}
> private final Map<String, KafkaChannel> channels;
> private final Map<String, KafkaChannel> closingChannels;
> {code}
> Please see the attached images and the following link for a sample GC 
> analysis: 
> http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0
> The command line for running Kafka: 
> {code}
> java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m 
> -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC 
> -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 
> -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 
> -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps 
> -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log 
> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M 
> -Djava.awt.headless=true 
> -Dlog4j.configuration=file:/etc/kafka/log4j.properties 
> -Dcom.sun.management.jmxremote 
> -Dcom.sun.management.jmxremote.authenticate=false 
> -Dcom.sun.management.jmxremote.ssl=false 
> -Dcom.sun.management.jmxremote.port= 
> -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/*  
> kafka.Kafka /etc/kafka/server.properties
> {code}
> We use Java 1.8.0_102 and have applied a TLS patch that reduces the 
> X509Factory.certCache map size from 750 to 20. 
> {code}
> java -version
> java version "1.8.0_102"
> Java(TM) SE Runtime Environment (build 1.8.0_102-b14)
> Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector

2018-08-18 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated KAFKA-7304:
--
Attachment: 7304.v4.txt

> memory leakage in org.apache.kafka.common.network.Selector
> --
>
> Key: KAFKA-7304
> URL: https://issues.apache.org/jira/browse/KAFKA-7304
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.1.0
>Reporter: Yu Yang
>Priority: Critical
> Fix For: 1.1.2, 2.0.1, 2.1.0
>
> Attachments: 7304.v4.txt, Screen Shot 2018-08-16 at 11.04.16 PM.png, 
> Screen Shot 2018-08-16 at 11.06.38 PM.png, Screen Shot 2018-08-16 at 12.41.26 
> PM.png, Screen Shot 2018-08-16 at 4.26.19 PM.png, Screen Shot 2018-08-17 at 
> 1.03.35 AM.png, Screen Shot 2018-08-17 at 1.04.32 AM.png, Screen Shot 
> 2018-08-17 at 1.05.30 AM.png
>
>
> We are testing secured writes to Kafka over SSL. At small scale, SSL writes 
> to Kafka were fine. However, when we enabled SSL writes at a larger scale 
> (>40k clients writing concurrently), the Kafka brokers soon hit OutOfMemory 
> errors with a 4 GB heap setting. We tried increasing the heap size to 10 GB, 
> but encountered the same issue. 
> We took a few heap dumps and found that most of the heap memory is referenced 
> through org.apache.kafka.common.network.Selector objects. There are two 
> channel map fields in Selector. It seems that somehow the objects are not 
> deleted from these maps in a timely manner. 
> One observation is that the memory leak seems related to Kafka partition 
> leader changes. If a broker restart or similar event in the cluster causes 
> partition leadership changes, the brokers may hit the OOM issue faster. 
> {code}
> private final Map<String, KafkaChannel> channels;
> private final Map<String, KafkaChannel> closingChannels;
> {code}
> Please see the attached images and the following link for a sample GC 
> analysis: 
> http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0
> The command line for running Kafka: 
> {code}
> java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m 
> -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC 
> -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 
> -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 
> -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps 
> -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log 
> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M 
> -Djava.awt.headless=true 
> -Dlog4j.configuration=file:/etc/kafka/log4j.properties 
> -Dcom.sun.management.jmxremote 
> -Dcom.sun.management.jmxremote.authenticate=false 
> -Dcom.sun.management.jmxremote.ssl=false 
> -Dcom.sun.management.jmxremote.port= 
> -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/*  
> kafka.Kafka /etc/kafka/server.properties
> {code}
> We use Java 1.8.0_102 and have applied a TLS patch that reduces the 
> X509Factory.certCache map size from 750 to 20. 
> {code}
> java -version
> java version "1.8.0_102"
> Java(TM) SE Runtime Environment (build 1.8.0_102-b14)
> Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector

2018-08-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584894#comment-16584894
 ] 

Ted Yu edited comment on KAFKA-7304 at 8/18/18 7:38 PM:


Through a bit of additional logging, here is what happens in testMuteOnOOM.
There are two channels added to closingChannels:
{code}
adding org.apache.kafka.common.network.KafkaChannel@334b860c for clientX
adding org.apache.kafka.common.network.KafkaChannel@334b860d for clientY
{code}
Later, when Selector.close() is called by tearDown, the channel for clientY is 
still in closingChannels:
{code}
There are 1 entries in closingChannels
org.apache.kafka.common.network.KafkaChannel@334b860d
{code}
My change above would close the channel left in closingChannels, preventing 
the memory leak.


was (Author: yuzhih...@gmail.com):
Through a bit of additional logging, here is what happens in testMuteOnOOM.
There are two channels registered at the beginning of the test:
{code}
adding org.apache.kafka.common.network.KafkaChannel@334b860c for clientX
adding org.apache.kafka.common.network.KafkaChannel@334b860d for clientY
{code}
Later, when Selector.close() is called by tearDown, the channel for clientY is 
in closingChannels:
{code}
There are 1 entries in closingChannels
org.apache.kafka.common.network.KafkaChannel@334b860d
{code}
My change above would close the channel left in closingChannels, preventing 
the memory leak.

> memory leakage in org.apache.kafka.common.network.Selector
> --
>
> Key: KAFKA-7304
> URL: https://issues.apache.org/jira/browse/KAFKA-7304
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.1.0
>Reporter: Yu Yang
>Priority: Critical
> Fix For: 1.1.2, 2.0.1, 2.1.0
>
> Attachments: Screen Shot 2018-08-16 at 11.04.16 PM.png, Screen Shot 
> 2018-08-16 at 11.06.38 PM.png, Screen Shot 2018-08-16 at 12.41.26 PM.png, 
> Screen Shot 2018-08-16 at 4.26.19 PM.png, Screen Shot 2018-08-17 at 1.03.35 
> AM.png, Screen Shot 2018-08-17 at 1.04.32 AM.png, Screen Shot 2018-08-17 at 
> 1.05.30 AM.png
>
>
> We are testing secured writes to Kafka over SSL. At small scale, SSL writes 
> to Kafka were fine. However, when we enabled SSL writes at a larger scale 
> (>40k clients writing concurrently), the Kafka brokers soon hit OutOfMemory 
> errors with a 4 GB heap setting. We tried increasing the heap size to 10 GB, 
> but encountered the same issue. 
> We took a few heap dumps and found that most of the heap memory is referenced 
> through org.apache.kafka.common.network.Selector objects. There are two 
> channel map fields in Selector. It seems that somehow the objects are not 
> deleted from these maps in a timely manner. 
> One observation is that the memory leak seems related to Kafka partition 
> leader changes. If a broker restart or similar event in the cluster causes 
> partition leadership changes, the brokers may hit the OOM issue faster. 
> {code}
> private final Map<String, KafkaChannel> channels;
> private final Map<String, KafkaChannel> closingChannels;
> {code}
> Please see the attached images and the following link for a sample GC 
> analysis: 
> http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0
> The command line for running Kafka: 
> {code}
> java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m 
> -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC 
> -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 
> -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 
> -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps 
> -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log 
> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M 
> -Djava.awt.headless=true 
> -Dlog4j.configuration=file:/etc/kafka/log4j.properties 
> -Dcom.sun.management.jmxremote 
> -Dcom.sun.management.jmxremote.authenticate=false 
> -Dcom.sun.management.jmxremote.ssl=false 
> -Dcom.sun.management.jmxremote.port= 
> -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/*  
> kafka.Kafka /etc/kafka/server.properties
> {code}
> We use Java 1.8.0_102 and have applied a TLS patch that reduces the 
> X509Factory.certCache map size from 750 to 20. 
> {code}
> java -version
> java version "1.8.0_102"
> Java(TM) SE Runtime Environment (build 1.8.0_102-b14)
> Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7228) DeadLetterQueue throws a NullPointerException

2018-08-18 Thread Arjun Satish (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584897#comment-16584897
 ] 

Arjun Satish commented on KAFKA-7228:
-

Installation of the hotfix jar:

# Navigate to the lib/ directory of the AK 2.0 installation.
# Replace the "connect-runtime-2.0.0.jar" file with the one provided in this 
ticket: [^connect-runtime-2.0.0-HOTFIX-7228.jar].

*Note*: this hotfix only adds the fix for the NPE described in this ticket on 
top of the 2.0.0 release. Other bugs in connect-runtime 2.0.0 (if any) will 
not be fixed by it.

> DeadLetterQueue throws a NullPointerException
> -
>
> Key: KAFKA-7228
> URL: https://issues.apache.org/jira/browse/KAFKA-7228
> Project: Kafka
>  Issue Type: Task
>  Components: KafkaConnect
>Reporter: Arjun Satish
>Assignee: Arjun Satish
>Priority: Major
> Fix For: 2.0.1, 2.1.0
>
> Attachments: connect-runtime-2.0.0-HOTFIX-7228.jar
>
>
> Using the dead letter queue results in an NPE: 
> {code:java}
> [2018-08-01 07:41:08,907] ERROR WorkerSinkTask{id=s3-sink-yanivr-2-4} Task 
> threw an uncaught and unrecoverable exception 
> (org.apache.kafka.connect.runtime.WorkerTask)
> java.lang.NullPointerException
> at 
> org.apache.kafka.connect.runtime.errors.DeadLetterQueueReporter.report(DeadLetterQueueReporter.java:124)
> at 
> org.apache.kafka.connect.runtime.errors.ProcessingContext.report(ProcessingContext.java:137)
> at 
> org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execute(RetryWithToleranceOperator.java:108)
> at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.convertAndTransformRecord(WorkerSinkTask.java:513)
> at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.convertMessages(WorkerSinkTask.java:490)
> at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:321)
> at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:225)
> at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:193)
> at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:175)
> at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:219)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> [2018-08-01 07:41:08,908] ERROR WorkerSinkTask{id=s3-sink-yanivr-2-4} Task is 
> being killed and will not recover until manually restarted 
> (org.apache.kafka.connect.runtime.WorkerTask)
> {code}
> The DLQ reporter does not get an {{errorHandlingMetrics}} object when it is 
> initialized through the WorkerSinkTask.
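
For illustration, the failure mode is roughly the following (a simplified 
sketch, not the actual class; the metrics call is shown only to mark where the 
null is dereferenced):
{code:java}
// Simplified sketch of the NPE described above: WorkerSinkTask builds the
// reporter without supplying an ErrorHandlingMetrics instance, so the first
// use of it inside report() dereferences null, matching the stack trace at
// DeadLetterQueueReporter.report(DeadLetterQueueReporter.java:124).
class DeadLetterQueueReporterSketch {
    private ErrorHandlingMetrics errorHandlingMetrics; // never injected -> null

    void report(ProcessingContext context) {
        errorHandlingMetrics.recordDeadLetterQueueProduceRequest(); // NPE here
        // ... produce the failed record to the DLQ topic ...
    }
}
{code}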



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7228) DeadLetterQueue throws a NullPointerException

2018-08-18 Thread Arjun Satish (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arjun Satish updated KAFKA-7228:

Attachment: connect-runtime-2.0.0-HOTFIX-7228.jar

> DeadLetterQueue throws a NullPointerException
> -
>
> Key: KAFKA-7228
> URL: https://issues.apache.org/jira/browse/KAFKA-7228
> Project: Kafka
>  Issue Type: Task
>  Components: KafkaConnect
>Reporter: Arjun Satish
>Assignee: Arjun Satish
>Priority: Major
> Fix For: 2.0.1, 2.1.0
>
> Attachments: connect-runtime-2.0.0-HOTFIX-7228.jar
>
>
> Using the dead letter queue results in an NPE: 
> {code:java}
> [2018-08-01 07:41:08,907] ERROR WorkerSinkTask{id=s3-sink-yanivr-2-4} Task 
> threw an uncaught and unrecoverable exception 
> (org.apache.kafka.connect.runtime.WorkerTask)
> java.lang.NullPointerException
> at 
> org.apache.kafka.connect.runtime.errors.DeadLetterQueueReporter.report(DeadLetterQueueReporter.java:124)
> at 
> org.apache.kafka.connect.runtime.errors.ProcessingContext.report(ProcessingContext.java:137)
> at 
> org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execute(RetryWithToleranceOperator.java:108)
> at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.convertAndTransformRecord(WorkerSinkTask.java:513)
> at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.convertMessages(WorkerSinkTask.java:490)
> at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:321)
> at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:225)
> at 
> org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:193)
> at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:175)
> at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:219)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> [2018-08-01 07:41:08,908] ERROR WorkerSinkTask{id=s3-sink-yanivr-2-4} Task is 
> being killed and will not recover until manually restarted 
> (org.apache.kafka.connect.runtime.WorkerTask)
> {code}
> The DLQ reporter does not get an {{errorHandlingMetrics}} object when it is 
> initialized through the WorkerSinkTask.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector

2018-08-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584894#comment-16584894
 ] 

Ted Yu commented on KAFKA-7304:
---

Through a bit of additional logging, here is what happens in testMuteOnOOM.
There are two channels registered at the beginning of the test:
{code}
adding org.apache.kafka.common.network.KafkaChannel@334b860c for clientX
adding org.apache.kafka.common.network.KafkaChannel@334b860d for clientY
{code}
Later, when Selector.close() is called by tearDown, the channel for clientY is 
in closingChannels:
{code}
There are 1 entries in closingChannels
org.apache.kafka.common.network.KafkaChannel@334b860d
{code}
My change above would close the channel left in closingChannels, preventing 
the memory leak.
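
In other words, Selector.close() needs to drain closingChannels as well as 
channels. A simplified sketch of that idea (doClose is illustrative shorthand; 
the actual change is in the patch attached to this issue):
{code:java}
// Sketch: close whatever is still parked in closingChannels when the selector
// itself shuts down, otherwise a channel that was mid-close (e.g. muted during
// out-of-memory handling) keeps its buffers reachable and leaks.
public void close() {
    for (KafkaChannel channel : new ArrayList<>(channels.values()))
        doClose(channel);
    channels.clear();
    for (KafkaChannel channel : new ArrayList<>(closingChannels.values()))
        doClose(channel);
    closingChannels.clear();
}
{code}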

> memory leakage in org.apache.kafka.common.network.Selector
> --
>
> Key: KAFKA-7304
> URL: https://issues.apache.org/jira/browse/KAFKA-7304
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.1.0
>Reporter: Yu Yang
>Priority: Critical
> Fix For: 1.1.2, 2.0.1, 2.1.0
>
> Attachments: Screen Shot 2018-08-16 at 11.04.16 PM.png, Screen Shot 
> 2018-08-16 at 11.06.38 PM.png, Screen Shot 2018-08-16 at 12.41.26 PM.png, 
> Screen Shot 2018-08-16 at 4.26.19 PM.png, Screen Shot 2018-08-17 at 1.03.35 
> AM.png, Screen Shot 2018-08-17 at 1.04.32 AM.png, Screen Shot 2018-08-17 at 
> 1.05.30 AM.png
>
>
> We are testing secured writes to Kafka over SSL. At small scale, SSL writes 
> to Kafka were fine. However, when we enabled SSL writes at a larger scale 
> (>40k clients writing concurrently), the Kafka brokers soon hit OutOfMemory 
> errors with a 4 GB heap setting. We tried increasing the heap size to 10 GB, 
> but encountered the same issue. 
> We took a few heap dumps and found that most of the heap memory is referenced 
> through org.apache.kafka.common.network.Selector objects. There are two 
> channel map fields in Selector. It seems that somehow the objects are not 
> deleted from these maps in a timely manner. 
> One observation is that the memory leak seems related to Kafka partition 
> leader changes. If a broker restart or similar event in the cluster causes 
> partition leadership changes, the brokers may hit the OOM issue faster. 
> {code}
> private final Map<String, KafkaChannel> channels;
> private final Map<String, KafkaChannel> closingChannels;
> {code}
> Please see the attached images and the following link for a sample GC 
> analysis: 
> http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0
> The command line for running Kafka: 
> {code}
> java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m 
> -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC 
> -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 
> -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 
> -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps 
> -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log 
> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M 
> -Djava.awt.headless=true 
> -Dlog4j.configuration=file:/etc/kafka/log4j.properties 
> -Dcom.sun.management.jmxremote 
> -Dcom.sun.management.jmxremote.authenticate=false 
> -Dcom.sun.management.jmxremote.ssl=false 
> -Dcom.sun.management.jmxremote.port= 
> -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/*  
> kafka.Kafka /etc/kafka/server.properties
> {code}
> We use Java 1.8.0_102 and have applied a TLS patch that reduces the 
> X509Factory.certCache map size from 750 to 20. 
> {code}
> java -version
> java version "1.8.0_102"
> Java(TM) SE Runtime Environment (build 1.8.0_102-b14)
> Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7044) kafka-consumer-groups.sh NullPointerException describing round robin or sticky assignors

2018-08-18 Thread Ismael Juma (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-7044:
---
Fix Version/s: 2.1.0

> kafka-consumer-groups.sh NullPointerException describing round robin or 
> sticky assignors
> 
>
> Key: KAFKA-7044
> URL: https://issues.apache.org/jira/browse/KAFKA-7044
> Project: Kafka
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 1.1.0
> Environment: CentOS 7.4, Oracle JDK 1.8
>Reporter: Jeff Field
>Assignee: Vahid Hashemian
>Priority: Minor
> Fix For: 2.1.0
>
>
> We've recently moved to using the round robin assignor for one of our 
> consumer groups, and started testing the sticky assignor. In both cases, 
> using Kafka 1.1.0 we get a null pointer exception *unless* the group being 
> described is rebalancing:
> {code:java}
> kafka-consumer-groups --bootstrap-server fqdn:9092 --describe --group 
> groupname-for-consumer
> Error: Executing consumer group command failed due to null
> [2018-06-12 01:32:34,179] DEBUG Exception in consumer group command 
> (kafka.admin.ConsumerGroupCommand$)
> java.lang.NullPointerException
> at scala.Predef$.Long2long(Predef.scala:363)
> at 
> kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService$$anonfun$getLogEndOffsets$2.apply(ConsumerGroupCommand.scala:612)
> at 
> kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService$$anonfun$getLogEndOffsets$2.apply(ConsumerGroupCommand.scala:610)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at scala.collection.immutable.List.foreach(List.scala:392)
> at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
> at scala.collection.immutable.List.map(List.scala:296)
> at 
> kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService.getLogEndOffsets(ConsumerGroupCommand.scala:610)
> at 
> kafka.admin.ConsumerGroupCommand$ConsumerGroupService$class.describePartitions(ConsumerGroupCommand.scala:328)
> at 
> kafka.admin.ConsumerGroupCommand$ConsumerGroupService$class.collectConsumerAssignment(ConsumerGroupCommand.scala:308)
> at 
> kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService.collectConsumerAssignment(ConsumerGroupCommand.scala:544)
> at 
> kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService$$anonfun$10$$anonfun$13.apply(ConsumerGroupCommand.scala:571)
> at 
> kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService$$anonfun$10$$anonfun$13.apply(ConsumerGroupCommand.scala:565)
> at scala.collection.immutable.List.flatMap(List.scala:338)
> at 
> kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService$$anonfun$10.apply(ConsumerGroupCommand.scala:565)
> at 
> kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService$$anonfun$10.apply(ConsumerGroupCommand.scala:558)
> at scala.Option.map(Option.scala:146)
> at 
> kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService.collectGroupOffsets(ConsumerGroupCommand.scala:558)
> at 
> kafka.admin.ConsumerGroupCommand$ConsumerGroupService$class.describeGroup(ConsumerGroupCommand.scala:271)
> at 
> kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService.describeGroup(ConsumerGroupCommand.scala:544)
> at kafka.admin.ConsumerGroupCommand$.main(ConsumerGroupCommand.scala:77)
> at kafka.admin.ConsumerGroupCommand.main(ConsumerGroupCommand.scala)
> [2018-06-12 01:32:34,255] DEBUG Removed sensor with name connections-closed: 
> (org.apache.kafka.common.metrics.Metrics){code}
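
The {{scala.Predef$.Long2long}} frame at the top of the trace indicates a null 
boxed {{Long}} being auto-unboxed. In illustrative Java terms (a sketch of the 
pattern, not the actual ConsumerGroupCommand code):
{code:java}
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.common.TopicPartition;

// Sketch of the NPE pattern implied by the stack trace above: looking up a
// partition with no recorded end offset yields null, and auto-unboxing that
// null into a primitive long throws NullPointerException.
class UnboxingNpeSketch {
    static long endOffsetOf(Map<TopicPartition, Long> logEndOffsets, TopicPartition tp) {
        return logEndOffsets.get(tp); // returns null for an unknown tp -> NPE here
    }

    public static void main(String[] args) {
        endOffsetOf(new HashMap<>(), new TopicPartition("some-topic", 0));
    }
}
{code}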



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7044) kafka-consumer-groups.sh NullPointerException describing round robin or sticky assignors

2018-08-18 Thread Ismael Juma (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-7044:
---
Priority: Major  (was: Minor)

> kafka-consumer-groups.sh NullPointerException describing round robin or 
> sticky assignors
> 
>
> Key: KAFKA-7044
> URL: https://issues.apache.org/jira/browse/KAFKA-7044
> Project: Kafka
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 1.1.0
> Environment: CentOS 7.4, Oracle JDK 1.8
>Reporter: Jeff Field
>Assignee: Vahid Hashemian
>Priority: Major
> Fix For: 2.1.0
>
>
> We've recently moved to using the round robin assignor for one of our 
> consumer groups, and started testing the sticky assignor. In both cases, 
> using Kafka 1.1.0 we get a null pointer exception *unless* the group being 
> described is rebalancing:
> {code:java}
> kafka-consumer-groups --bootstrap-server fqdn:9092 --describe --group 
> groupname-for-consumer
> Error: Executing consumer group command failed due to null
> [2018-06-12 01:32:34,179] DEBUG Exception in consumer group command 
> (kafka.admin.ConsumerGroupCommand$)
> java.lang.NullPointerException
> at scala.Predef$.Long2long(Predef.scala:363)
> at 
> kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService$$anonfun$getLogEndOffsets$2.apply(ConsumerGroupCommand.scala:612)
> at 
> kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService$$anonfun$getLogEndOffsets$2.apply(ConsumerGroupCommand.scala:610)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at scala.collection.immutable.List.foreach(List.scala:392)
> at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
> at scala.collection.immutable.List.map(List.scala:296)
> at 
> kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService.getLogEndOffsets(ConsumerGroupCommand.scala:610)
> at 
> kafka.admin.ConsumerGroupCommand$ConsumerGroupService$class.describePartitions(ConsumerGroupCommand.scala:328)
> at 
> kafka.admin.ConsumerGroupCommand$ConsumerGroupService$class.collectConsumerAssignment(ConsumerGroupCommand.scala:308)
> at 
> kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService.collectConsumerAssignment(ConsumerGroupCommand.scala:544)
> at 
> kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService$$anonfun$10$$anonfun$13.apply(ConsumerGroupCommand.scala:571)
> at 
> kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService$$anonfun$10$$anonfun$13.apply(ConsumerGroupCommand.scala:565)
> at scala.collection.immutable.List.flatMap(List.scala:338)
> at 
> kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService$$anonfun$10.apply(ConsumerGroupCommand.scala:565)
> at 
> kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService$$anonfun$10.apply(ConsumerGroupCommand.scala:558)
> at scala.Option.map(Option.scala:146)
> at 
> kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService.collectGroupOffsets(ConsumerGroupCommand.scala:558)
> at 
> kafka.admin.ConsumerGroupCommand$ConsumerGroupService$class.describeGroup(ConsumerGroupCommand.scala:271)
> at 
> kafka.admin.ConsumerGroupCommand$KafkaConsumerGroupService.describeGroup(ConsumerGroupCommand.scala:544)
> at kafka.admin.ConsumerGroupCommand$.main(ConsumerGroupCommand.scala:77)
> at kafka.admin.ConsumerGroupCommand.main(ConsumerGroupCommand.scala)
> [2018-06-12 01:32:34,255] DEBUG Removed sensor with name connections-closed: 
> (org.apache.kafka.common.metrics.Metrics){code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7152) replica should be in-sync if its LEO equals leader's LEO

2018-08-18 Thread Ismael Juma (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584849#comment-16584849
 ] 

Ismael Juma commented on KAFKA-7152:


[~junrao], are you happy with Dong's explanation?

> replica should be in-sync if its LEO equals leader's LEO
> 
>
> Key: KAFKA-7152
> URL: https://issues.apache.org/jira/browse/KAFKA-7152
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Dong Lin
>Assignee: Zhanxiang (Patrick) Huang
>Priority: Major
> Fix For: 2.1.0
>
>
> Currently a replica will be moved out of the ISR if the follower has not 
> fetched from the leader for 10 sec (the default replica.lag.time.max.ms). 
> This causes problems in the following scenario:
> Say the follower's ReplicaFetcherThread needs to fetch 2k partitions from the 
> leader broker. Only 100 of the 2k partitions are actively being produced to, 
> and therefore the total bytes-in rate for those 2k partitions is small. The 
> following will happen:
>  
> 1) The follower's ReplicaFetcherThread sends a FetchRequest for those 2k 
> partitions.
> 2) Because the total bytes-in rate for those 2k partitions is very small, the 
> follower is able to catch up, and the leader broker adds these 2k partitions 
> to the ISR. The follower's lastCaughtUpTimeMs for all partitions is updated 
> to the current time T0.
> 3) Since the follower has caught up on all 2k partitions, the leader updates 
> the 2k partition znodes to include the follower in the ISR. It may take 20 
> seconds to write 2k partition znodes if each znode write operation takes 
> 10 ms.
> 4) At T0 + 15, maybeShrinkIsr() is invoked on the leader broker. Since there 
> is no FetchRequest from the follower for more than 10 seconds after T0, all 
> those 2k partitions will be considered out of sync and the follower will be 
> removed from the ISR.
> 5) The follower receives the FetchResponse at least 20 seconds after T0. That 
> means the next FetchRequest from the follower to the leader will be sent 
> after T0 + 20.
> The sequence of events described above will loop over time. There will be 
> constant churn of URPs in the cluster even if the follower can keep up with 
> the leader's bytes-in rate. This reduces cluster availability.
>  
> In order to address this problem, one simple approach is to keep the follower 
> in the ISR as long as the follower's LEO equals the leader's LEO, regardless 
> of the follower's lastCaughtUpTimeMs. This is particularly useful if there 
> are a lot of inactive partitions in the cluster.
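
A sketch of the proposed check, in illustrative Java (the real logic lives in 
the broker's Scala replica-management code, and the names below are 
assumptions):
{code:java}
// Illustrative sketch of the proposal above: a follower whose log end offset
// (LEO) equals the leader's has nothing left to fetch, so it should not be
// treated as lagging even when its lastCaughtUpTimeMs is stale.
boolean isFollowerOutOfSync(long followerLeo, long leaderLeo,
                            long lastCaughtUpTimeMs, long nowMs, long maxLagMs) {
    if (followerLeo == leaderLeo)
        return false; // fully caught up; do not shrink the ISR on timing alone
    return nowMs - lastCaughtUpTimeMs > maxLagMs;
}
{code}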



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector

2018-08-18 Thread Ismael Juma (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584847#comment-16584847
 ] 

Ismael Juma commented on KAFKA-7304:


I raised the priority of this issue until we know what's going on. 
[~yuzhih...@gmail.com], that's interesting. The thing we need to figure out is 
why we have a connection in both `channels` and `closingChannels`. It should be 
in one or the other and we do the right thing if that is the case. [~yuyang08], 
do you see any errors in the broker logs?
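
That invariant can be checked directly; a hypothetical debugging aid (an 
assumption for illustration, not part of any patch in this thread):
{code:java}
// Hypothetical sanity check for the invariant stated above: a channel id must
// live in either channels or closingChannels, never both at once.
private void assertChannelMapsDisjoint() {
    for (String id : closingChannels.keySet()) {
        if (channels.containsKey(id))
            throw new IllegalStateException(
                "Channel " + id + " is in both channels and closingChannels");
    }
}
{code}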

> memory leakage in org.apache.kafka.common.network.Selector
> --
>
> Key: KAFKA-7304
> URL: https://issues.apache.org/jira/browse/KAFKA-7304
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.1.0
>Reporter: Yu Yang
>Priority: Critical
> Fix For: 1.1.2, 2.0.1, 2.1.0
>
> Attachments: Screen Shot 2018-08-16 at 11.04.16 PM.png, Screen Shot 
> 2018-08-16 at 11.06.38 PM.png, Screen Shot 2018-08-16 at 12.41.26 PM.png, 
> Screen Shot 2018-08-16 at 4.26.19 PM.png, Screen Shot 2018-08-17 at 1.03.35 
> AM.png, Screen Shot 2018-08-17 at 1.04.32 AM.png, Screen Shot 2018-08-17 at 
> 1.05.30 AM.png
>
>
> We are testing secured writes to Kafka over SSL. At small scale, SSL writes 
> to Kafka were fine. However, when we enabled SSL writes at a larger scale 
> (>40k clients writing concurrently), the Kafka brokers soon hit OutOfMemory 
> errors with a 4 GB heap setting. We tried increasing the heap size to 10 GB, 
> but encountered the same issue. 
> We took a few heap dumps and found that most of the heap memory is referenced 
> through org.apache.kafka.common.network.Selector objects. There are two 
> channel map fields in Selector. It seems that somehow the objects are not 
> deleted from these maps in a timely manner. 
> One observation is that the memory leak seems related to Kafka partition 
> leader changes. If a broker restart or similar event in the cluster causes 
> partition leadership changes, the brokers may hit the OOM issue faster. 
> {code}
> private final Map<String, KafkaChannel> channels;
> private final Map<String, KafkaChannel> closingChannels;
> {code}
> Please see the attached images and the following link for a sample GC 
> analysis: 
> http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0
> The command line for running Kafka: 
> {code}
> java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m 
> -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC 
> -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 
> -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 
> -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps 
> -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log 
> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M 
> -Djava.awt.headless=true 
> -Dlog4j.configuration=file:/etc/kafka/log4j.properties 
> -Dcom.sun.management.jmxremote 
> -Dcom.sun.management.jmxremote.authenticate=false 
> -Dcom.sun.management.jmxremote.ssl=false 
> -Dcom.sun.management.jmxremote.port= 
> -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/*  
> kafka.Kafka /etc/kafka/server.properties
> {code}
> We use Java 1.8.0_102 and have applied a TLS patch that reduces the 
> X509Factory.certCache map size from 750 to 20. 
> {code}
> java -version
> java version "1.8.0_102"
> Java(TM) SE Runtime Environment (build 1.8.0_102-b14)
> Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector

2018-08-18 Thread Ismael Juma (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-7304:
---
Priority: Critical  (was: Major)

> memory leakage in org.apache.kafka.common.network.Selector
> --
>
> Key: KAFKA-7304
> URL: https://issues.apache.org/jira/browse/KAFKA-7304
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.1.0
>Reporter: Yu Yang
>Priority: Critical
> Fix For: 1.1.2, 2.0.1, 2.1.0
>
> Attachments: Screen Shot 2018-08-16 at 11.04.16 PM.png, Screen Shot 
> 2018-08-16 at 11.06.38 PM.png, Screen Shot 2018-08-16 at 12.41.26 PM.png, 
> Screen Shot 2018-08-16 at 4.26.19 PM.png, Screen Shot 2018-08-17 at 1.03.35 
> AM.png, Screen Shot 2018-08-17 at 1.04.32 AM.png, Screen Shot 2018-08-17 at 
> 1.05.30 AM.png
>
>
> We are testing secured writes to Kafka over SSL. At small scale, SSL writes 
> to Kafka were fine. However, when we enabled SSL writes at a larger scale 
> (>40k clients writing concurrently), the Kafka brokers soon hit OutOfMemory 
> errors with a 4 GB heap setting. We tried increasing the heap size to 10 GB, 
> but encountered the same issue. 
> We took a few heap dumps and found that most of the heap memory is referenced 
> through org.apache.kafka.common.network.Selector objects. There are two 
> channel map fields in Selector. It seems that somehow the objects are not 
> deleted from these maps in a timely manner. 
> One observation is that the memory leak seems related to Kafka partition 
> leader changes. If a broker restart or similar event in the cluster causes 
> partition leadership changes, the brokers may hit the OOM issue faster. 
> {code}
> private final Map<String, KafkaChannel> channels;
> private final Map<String, KafkaChannel> closingChannels;
> {code}
> Please see the attached images and the following link for a sample GC 
> analysis: 
> http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0
> The command line for running Kafka: 
> {code}
> java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m 
> -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC 
> -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 
> -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 
> -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps 
> -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log 
> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M 
> -Djava.awt.headless=true 
> -Dlog4j.configuration=file:/etc/kafka/log4j.properties 
> -Dcom.sun.management.jmxremote 
> -Dcom.sun.management.jmxremote.authenticate=false 
> -Dcom.sun.management.jmxremote.ssl=false 
> -Dcom.sun.management.jmxremote.port= 
> -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/*  
> kafka.Kafka /etc/kafka/server.properties
> {code}
> We use Java 1.8.0_102 and have applied a TLS patch that reduces the 
> X509Factory.certCache map size from 750 to 20. 
> {code}
> java -version
> java version "1.8.0_102"
> Java(TM) SE Runtime Environment (build 1.8.0_102-b14)
> Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector

2018-08-18 Thread Ismael Juma (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-7304:
---
Fix Version/s: 2.1.0
   2.0.1
   1.1.2

> memory leakage in org.apache.kafka.common.network.Selector
> --
>
> Key: KAFKA-7304
> URL: https://issues.apache.org/jira/browse/KAFKA-7304
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.1.0
>Reporter: Yu Yang
>Priority: Critical
> Fix For: 1.1.2, 2.0.1, 2.1.0
>
> Attachments: Screen Shot 2018-08-16 at 11.04.16 PM.png, Screen Shot 
> 2018-08-16 at 11.06.38 PM.png, Screen Shot 2018-08-16 at 12.41.26 PM.png, 
> Screen Shot 2018-08-16 at 4.26.19 PM.png, Screen Shot 2018-08-17 at 1.03.35 
> AM.png, Screen Shot 2018-08-17 at 1.04.32 AM.png, Screen Shot 2018-08-17 at 
> 1.05.30 AM.png
>
>
> We are testing secured writes to Kafka over SSL. At small scale, SSL writes 
> to Kafka were fine. However, when we enabled SSL writes at a larger scale 
> (>40k clients writing concurrently), the Kafka brokers soon hit OutOfMemory 
> errors with a 4 GB heap setting. We tried increasing the heap size to 10 GB, 
> but encountered the same issue. 
> We took a few heap dumps and found that most of the heap memory is referenced 
> through org.apache.kafka.common.network.Selector objects. There are two 
> channel map fields in Selector. It seems that somehow the objects are not 
> deleted from these maps in a timely manner. 
> One observation is that the memory leak seems related to Kafka partition 
> leader changes. If a broker restart or similar event in the cluster causes 
> partition leadership changes, the brokers may hit the OOM issue faster. 
> {code}
> private final Map<String, KafkaChannel> channels;
> private final Map<String, KafkaChannel> closingChannels;
> {code}
> Please see the attached images and the following link for a sample GC 
> analysis: 
> http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0
> The command line for running Kafka: 
> {code}
> java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m 
> -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC 
> -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 
> -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 
> -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps 
> -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log 
> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M 
> -Djava.awt.headless=true 
> -Dlog4j.configuration=file:/etc/kafka/log4j.properties 
> -Dcom.sun.management.jmxremote 
> -Dcom.sun.management.jmxremote.authenticate=false 
> -Dcom.sun.management.jmxremote.ssl=false 
> -Dcom.sun.management.jmxremote.port= 
> -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/*  
> kafka.Kafka /etc/kafka/server.properties
> {code}
> We use Java 1.8.0_102 and have applied a TLS patch that reduces the 
> X509Factory.certCache map size from 750 to 20. 
> {code}
> java -version
> java version "1.8.0_102"
> Java(TM) SE Runtime Environment (build 1.8.0_102-b14)
> Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7304) memory leakage in org.apache.kafka.common.network.Selector

2018-08-18 Thread Ismael Juma (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584843#comment-16584843
 ] 

Ismael Juma commented on KAFKA-7304:


Yes, I was thinking of KAFKA-6529.

> memory leakage in org.apache.kafka.common.network.Selector
> --
>
> Key: KAFKA-7304
> URL: https://issues.apache.org/jira/browse/KAFKA-7304
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.1.0
>Reporter: Yu Yang
>Priority: Major
> Attachments: Screen Shot 2018-08-16 at 11.04.16 PM.png, Screen Shot 
> 2018-08-16 at 11.06.38 PM.png, Screen Shot 2018-08-16 at 12.41.26 PM.png, 
> Screen Shot 2018-08-16 at 4.26.19 PM.png, Screen Shot 2018-08-17 at 1.03.35 
> AM.png, Screen Shot 2018-08-17 at 1.04.32 AM.png, Screen Shot 2018-08-17 at 
> 1.05.30 AM.png
>
>
> We are testing secured writes to Kafka over SSL. At small scale, SSL writes 
> to Kafka were fine. However, when we enabled SSL writes at a larger scale 
> (>40k clients writing concurrently), the Kafka brokers soon hit OutOfMemory 
> errors with a 4 GB heap setting. We tried increasing the heap size to 10 GB, 
> but encountered the same issue. 
> We took a few heap dumps and found that most of the heap memory is referenced 
> through org.apache.kafka.common.network.Selector objects. There are two 
> channel map fields in Selector. It seems that somehow the objects are not 
> deleted from these maps in a timely manner. 
> One observation is that the memory leak seems related to Kafka partition 
> leader changes. If a broker restart or similar event in the cluster causes 
> partition leadership changes, the brokers may hit the OOM issue faster. 
> {code}
> private final Map<String, KafkaChannel> channels;
> private final Map<String, KafkaChannel> closingChannels;
> {code}
> Please see the attached images and the following link for a sample GC 
> analysis: 
> http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTgvMDgvMTcvLS1nYy5sb2cuMC5jdXJyZW50Lmd6LS0yLTM5LTM0
> The command line for running Kafka: 
> {code}
> java -Xms10g -Xmx10g -XX:NewSize=512m -XX:MaxNewSize=512m 
> -Xbootclasspath/p:/usr/local/libs/bcp -XX:MetaspaceSize=128m -XX:+UseG1GC 
> -XX:MaxGCPauseMillis=25 -XX:InitiatingHeapOccupancyPercent=35 
> -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=25 
> -XX:MaxMetaspaceFreeRatio=75 -XX:+PrintGCDetails -XX:+PrintGCDateStamps 
> -XX:+PrintTenuringDistribution -Xloggc:/var/log/kafka/gc.log 
> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=40 -XX:GCLogFileSize=50M 
> -Djava.awt.headless=true 
> -Dlog4j.configuration=file:/etc/kafka/log4j.properties 
> -Dcom.sun.management.jmxremote 
> -Dcom.sun.management.jmxremote.authenticate=false 
> -Dcom.sun.management.jmxremote.ssl=false 
> -Dcom.sun.management.jmxremote.port= 
> -Dcom.sun.management.jmxremote.rmi.port= -cp /usr/local/libs/*  
> kafka.Kafka /etc/kafka/server.properties
> {code}
> We use Java 1.8.0_102 and have applied a TLS patch that reduces the 
> X509Factory.certCache map size from 750 to 20. 
> {code}
> java -version
> java version "1.8.0_102"
> Java(TM) SE Runtime Environment (build 1.8.0_102-b14)
> Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-7308) Fix rat and checkstyle plugins configuration for Java 11 support

2018-08-18 Thread Ismael Juma (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-7308.

   Resolution: Fixed
Fix Version/s: 2.1.0

> Fix rat and checkstyle plugins configuration for Java 11 support
> 
>
> Key: KAFKA-7308
> URL: https://issues.apache.org/jira/browse/KAFKA-7308
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Ismael Juma
>Assignee: Ismael Juma
>Priority: Major
> Fix For: 2.1.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7308) Fix rat and checkstyle plugins configuration for Java 11 support

2018-08-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584814#comment-16584814
 ] 

ASF GitHub Bot commented on KAFKA-7308:
---

ijuma closed pull request #5529: KAFKA-7308: Fix rat and checkstyle config for 
Java 11 support
URL: https://github.com/apache/kafka/pull/5529
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/build.gradle b/build.gradle
index f8dfeb7fc72..0892ed19402 100644
--- a/build.gradle
+++ b/build.gradle
@@ -109,12 +109,12 @@ ext {
 
 apply from: file('wrapper.gradle')
 
-if (new File('.git').exists()) {
+if (file('.git').exists()) {
   apply from: file('gradle/rat.gradle')
   rat {
 // Exclude everything under the directory that git should be ignoring via 
.gitignore or that isn't checked in. These
 // restrict us only to files that are checked in or are staged.
-def repo = Grgit.open(project.file('.'))
+def repo = Grgit.open(project.getRootDir())
excludes = new ArrayList<String>(repo.clean(ignore: false, directories: 
true, dryRun: true))
 // And some of the files that we have checked in should also be excluded 
from this check
 excludes.addAll([
@@ -359,7 +359,7 @@ subprojects {
 
   checkstyle {
 configFile = new File(rootDir, "checkstyle/checkstyle.xml")
-configProperties = [importControlFile: 
"$rootDir/checkstyle/import-control.xml"]
+configProperties = checkstyleConfigProperties("import-control.xml")
 toolVersion = '8.10'
   }
   test.dependsOn('checkstyleMain', 'checkstyleTest')
@@ -444,6 +444,12 @@ def fineTuneEclipseClasspathFile(eclipse, project) {
   }
 }
 
+def checkstyleConfigProperties(configFileName) {
+  [importControlFile: "$rootDir/checkstyle/$configFileName",
+   suppressionsFile: "$rootDir/checkstyle/suppressions.xml",
+   headerFile: "$rootDir/checkstyle/java.header"]
+}
+
 // Aggregates all jacoco results into the root project directory
 task jacocoRootReport(type: org.gradle.testing.jacoco.tasks.JacocoReport) {
   def javaProjects = subprojects.findAll { it.path != ':core' }
@@ -775,7 +781,7 @@ project(':core') {
   systemTestLibs.dependsOn('jar', 'testJar', 'copyDependantTestLibs')
 
   checkstyle {
-configProperties = [importControlFile: 
"$rootDir/checkstyle/import-control-core.xml"]
+configProperties = checkstyleConfigProperties("import-control-core.xml")
   }
 }
 
@@ -791,7 +797,7 @@ project(':examples') {
   }
 
   checkstyle {
-configProperties = [importControlFile: 
"$rootDir/checkstyle/import-control-core.xml"]
+configProperties = checkstyleConfigProperties("import-control-core.xml")
   }
 }
 
diff --git a/checkstyle/checkstyle.xml b/checkstyle/checkstyle.xml
index 6eb1a82dde0..13cfdb82bd0 100644
--- a/checkstyle/checkstyle.xml
+++ b/checkstyle/checkstyle.xml
@@ -25,7 +25,7 @@
 
   <module name="FileTabCharacter"/>
   <module name="Header">
-    <property name="headerFile" value="checkstyle/java.header"/>
+    <property name="headerFile" value="${headerFile}"/>
   </module>
 
   <module name="TreeWalker">
@@ -137,6 +137,6 @@
   </module>
 
   <module name="SuppressionFilter">
-    <property name="file" value="checkstyle/suppressions.xml"/>
+    <property name="file" value="${suppressionsFile}"/>
   </module>
 </module>
diff --git a/gradle/rat.gradle b/gradle/rat.gradle
index a51876c23ea..513e9520ed6 100644
--- a/gradle/rat.gradle
+++ b/gradle/rat.gradle
@@ -17,9 +17,6 @@
  * under the License.
  */
 
-import org.gradle.api.Plugin
-import org.gradle.api.Project
-import org.gradle.api.Task
 import org.gradle.api.internal.project.IsolatedAntBuilder
 
 apply plugin: RatPlugin
@@ -28,18 +25,19 @@ class RatTask extends DefaultTask {
   @Input
   List excludes
 
-  def reportPath = 'build/rat'
-  def stylesheet = 'gradle/resources/rat-output-to-html.xsl'
-  def xmlReport = reportPath + '/rat-report.xml'
-  def htmlReport = reportPath + '/rat-report.html'
+  def reportDir = project.file('build/rat')
+  def stylesheet = project.file('gradle/resources/rat-output-to-html.xsl').getAbsolutePath()
+  def xmlReport = new File(reportDir, 'rat-report.xml')
+  def htmlReport = new File(reportDir, 'rat-report.html')
 
   def generateXmlReport(File reportDir) {
     def antBuilder = services.get(IsolatedAntBuilder)
     def ratClasspath = project.configurations.rat
+    def projectPath = project.getRootDir().getAbsolutePath()
     antBuilder.withClasspath(ratClasspath).execute {
       ant.taskdef(resource: 'org/apache/rat/anttasks/antlib.xml')
       ant.report(format: 'xml', reportFile: xmlReport) {
-        fileset(dir: ".") {
+        fileset(dir: projectPath) {
           patternset {
             excludes.each {
               exclude(name: it)
@@ -80,7 +78,6 @@ class RatTask extends DefaultTask {
 
   @TaskAction
   def rat() {
-    File reportDir = new File(reportPath)
     if (!reportDir.exists()) {
       reportDir.mkdirs()
     }


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
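The common thread in the diff above is replacing working-directory-relative lookups (new File('.git'), fileset(dir: ".")) with paths anchored at the project root. A minimal Java sketch of the underlying issue, assuming (as the parent issue KAFKA-7264 notes) that the old wiring relied on mutating user.dir, which Java 11 no longer honours for relative-path resolution:

{code:java}
import java.io.File;

public class RelativePathDemo {
    public static void main(String[] args) {
        // A relative java.io.File resolves against the process working
        // directory, i.e. the user.dir property captured at JVM startup.
        System.out.println(new File(".git").getAbsolutePath());

        // Build tooling sometimes "re-rooted" relative paths by mutating
        // user.dir. That worked incidentally on Java 8; on Java 11 file
        // resolution ignores the mutation, so lookups like the one above
        // must be anchored explicitly (e.g. Gradle's project.file(...)).
        System.setProperty("user.dir", "/tmp/elsewhere");
        System.out.println(new File(".git").getAbsolutePath());
    }
}
{code}

Resolving everything against project.getRootDir() removes the dependency on the working directory entirely, which is why the same pattern recurs in build.gradle and rat.gradle.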

[jira] [Updated] (KAFKA-7264) Support Java 11

2018-08-18 Thread Ismael Juma (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-7264:
---
Description: Java 11 is the next LTS release and it should be released by 
the end of September. Kafka should ideally support that in the 2.1.0 release. A 
few known issues/requirements have been captured via subtasks.  (was: Java 11 
is the next LTS release and it should be released by the end of September. 
Kafka should ideally support that in the 2.1.0 release. A few known 
issues/requirements.)

> Support Java 11
> ---
>
> Key: KAFKA-7264
> URL: https://issues.apache.org/jira/browse/KAFKA-7264
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Ismael Juma
>Priority: Major
> Fix For: 2.1.0
>
>
> Java 11 is the next LTS release and it should be released by the end of 
> September. Kafka should ideally support that in the 2.1.0 release. A few 
> known issues/requirements have been captured via subtasks.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7264) Support Java 11

2018-08-18 Thread Ismael Juma (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-7264:
---
Description: Java 11 is the next LTS release and it should be released by 
the end of September. Kafka should ideally support that in the 2.1.0 release. A 
few known issues/requirements.  (was: Java 11 is the next LTS release and it 
should be released by the end of September. Kafka should ideally support that 
in the 2.1.0 release. A few known issues/requirements:

* A new version of EasyMock with a new version of cglib and ASM7_EXPERIMENTAL 
(or ASM7 when Java 11 ships) enabled (both projects have the necessary code 
changes, but they are not included in released versions).
* The rat and checkstyle gradle plugin configuration rely on the ability to 
mutate user.dir, but that's not allowed in Java 11. So they fail.
* A new version of Jacoco is needed, the changes required are not in a released 
version yet.
* Many SSL tests in clients fail. Probably related to the TLS 1.3 changes in 
Java 11.
* Many SSL (and some SASL) tests in Core fail. Maybe same underlying reason as 
the clients failures, but I didn't investigate.)

> Support Java 11
> ---
>
> Key: KAFKA-7264
> URL: https://issues.apache.org/jira/browse/KAFKA-7264
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Ismael Juma
>Priority: Major
> Fix For: 2.1.0
>
>
> Java 11 is the next LTS release and it should be released by the end of 
> September. Kafka should ideally support that in the 2.1.0 release. A few 
> known issues/requirements.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7308) Fix rat and checkstyle plugins configuration for Java 11 support

2018-08-18 Thread Ismael Juma (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-7308:
---
Description: (was: A new version of EasyMock with a new version of 
cglib and ASM7_EXPERIMENTAL (or ASM7 when Java 11 ships) enabled (both projects 
have the necessary code changes, but they are not included in released 
versions).)

> Fix rat and checkstyle plugins configuration for Java 11 support
> 
>
> Key: KAFKA-7308
> URL: https://issues.apache.org/jira/browse/KAFKA-7308
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Ismael Juma
>Assignee: Ismael Juma
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7307) Upgrade EasyMock for Java 11 support

2018-08-18 Thread Ismael Juma (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-7307:
---
Description: A new version of EasyMock with a new version of cglib and 
ASM7_EXPERIMENTAL (or ASM7 when Java 11 ships) enabled (both projects have the 
necessary code changes, but they are not included in released versions).

> Upgrade EasyMock for Java 11 support
> 
>
> Key: KAFKA-7307
> URL: https://issues.apache.org/jira/browse/KAFKA-7307
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Ismael Juma
>Priority: Major
> Fix For: 2.1.0
>
>
> A new version of EasyMock with a new version of cglib and ASM7_EXPERIMENTAL 
> (or ASM7 when Java 11 ships) enabled (both projects have the necessary code 
> changes, but they are not included in released versions).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7310) Fix SSL tests when running with Java 11

2018-08-18 Thread Ismael Juma (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-7310:
---
Description: 
* Many SSL tests in clients fail. Probably related to the TLS 1.3 changes in 
Java 11.
* Many SSL (and some SASL) tests in Core fail. Maybe same underlying reason as 
the clients failures, but I didn't investigate.

> Fix SSL tests when running with Java 11
> ---
>
> Key: KAFKA-7310
> URL: https://issues.apache.org/jira/browse/KAFKA-7310
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Ismael Juma
>Priority: Critical
> Fix For: 2.1.0
>
>
> * Many SSL tests in clients fail. Probably related to the TLS 1.3 changes in 
> Java 11.
> * Many SSL (and some SASL) tests in Core fail. Maybe same underlying reason 
> as the clients failures, but I didn't investigate.
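While those failures are investigated, one possible stopgap for local runs is to pin clients to TLSv1.2 so a Java 11 runtime cannot negotiate TLS 1.3. This is a sketch only, assuming the breakage really does stem from TLS 1.3 negotiation; the truststore details are placeholders:

{code:java}
import java.util.Properties;

public class Tls12PinnedConfig {
    public static Properties sslProps() {
        Properties props = new Properties();
        props.put("security.protocol", "SSL");
        // Restrict negotiation so TLSv1.3 is never selected; both keys
        // are standard Kafka SSL client configs.
        props.put("ssl.protocol", "TLSv1.2");
        props.put("ssl.enabled.protocols", "TLSv1.2");
        // Placeholder truststore details, for illustration only.
        props.put("ssl.truststore.location", "/path/to/truststore.jks");
        props.put("ssl.truststore.password", "changeit");
        return props;
    }
}
{code}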



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7308) Fix rat and checkstyle plugins configuration for Java 11 support

2018-08-18 Thread Ismael Juma (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-7308:
---
Description: A new version of EasyMock with a new version of cglib and 
ASM7_EXPERIMENTAL (or ASM7 when Java 11 ships) enabled (both projects have the 
necessary code changes, but they are not included in released versions).

> Fix rat and checkstyle plugins configuration for Java 11 support
> 
>
> Key: KAFKA-7308
> URL: https://issues.apache.org/jira/browse/KAFKA-7308
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Ismael Juma
>Assignee: Ismael Juma
>Priority: Major
>
> A new version of EasyMock with a new version of cglib and ASM7_EXPERIMENTAL 
> (or ASM7 when Java 11 ships) enabled (both projects have the necessary code 
> changes, but they are not included in released versions).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7251) Add support for TLS 1.3

2018-08-18 Thread Ismael Juma (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-7251:
---
Description: 
Java 11 adds support for TLS 1.3. We should support this after we add support 
for Java 11.

Related issues:

[https://bugs.openjdk.java.net/browse/JDK-8206170]

[https://bugs.openjdk.java.net/browse/JDK-8206178]

[https://bugs.openjdk.java.net/browse/JDK-8208538]

[https://bugs.openjdk.java.net/browse/JDK-8207009]

 

 

  was:
Java 11 adds support for TLS 1.3. We should support this after add support for 
Java 11.

Related issues:

[https://bugs.openjdk.java.net/browse/JDK-8206170]

[https://bugs.openjdk.java.net/browse/JDK-8206178]

[https://bugs.openjdk.java.net/browse/JDK-8208538]

[https://bugs.openjdk.java.net/browse/JDK-8207009]

 

 


> Add support for TLS 1.3
> ---
>
> Key: KAFKA-7251
> URL: https://issues.apache.org/jira/browse/KAFKA-7251
> Project: Kafka
>  Issue Type: New Feature
>  Components: security
>Reporter: Rajini Sivaram
>Priority: Major
> Fix For: 2.1.0
>
>
> Java 11 adds support for TLS 1.3. We should support this after we add support 
> for Java 11.
> Related issues:
> [https://bugs.openjdk.java.net/browse/JDK-8206170]
> [https://bugs.openjdk.java.net/browse/JDK-8206178]
> [https://bugs.openjdk.java.net/browse/JDK-8208538]
> [https://bugs.openjdk.java.net/browse/JDK-8207009]
>  
>  
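A quick way to check what a given runtime offers, using only standard JSSE APIs (a sketch, no Kafka code involved): the default context's supported-protocols list should include TLSv1.3 on Java 11 but not on Java 8.

{code:java}
import javax.net.ssl.SSLContext;

public class Tls13Probe {
    public static void main(String[] args) throws Exception {
        // Protocols the default SSLContext supports on this runtime.
        for (String p : SSLContext.getDefault().getSupportedSSLParameters().getProtocols()) {
            System.out.println("supported: " + p);
        }
        // Requesting a TLSv1.3 context throws NoSuchAlgorithmException
        // where TLS 1.3 is unavailable.
        System.out.println(SSLContext.getInstance("TLSv1.3").getProtocol());
    }
}
{code}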



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-7251) Add support for TLS 1.3

2018-08-18 Thread Ismael Juma (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-7251:
---
Description: 
Java 11 adds support for TLS 1.3. We should support this after add support for 
Java 11.

Related issues:

[https://bugs.openjdk.java.net/browse/JDK-8206170]

[https://bugs.openjdk.java.net/browse/JDK-8206178]

[https://bugs.openjdk.java.net/browse/JDK-8208538]

[https://bugs.openjdk.java.net/browse/JDK-8207009]

 

 

  was:
Java 11 adds support for TLS 1.3. We should support this when we add support 
for Java 11.

Related issues:

[https://bugs.openjdk.java.net/browse/JDK-8206170]

[https://bugs.openjdk.java.net/browse/JDK-8206178]

[https://bugs.openjdk.java.net/browse/JDK-8208538]

[https://bugs.openjdk.java.net/browse/JDK-8207009]

 

 


> Add support for TLS 1.3
> ---
>
> Key: KAFKA-7251
> URL: https://issues.apache.org/jira/browse/KAFKA-7251
> Project: Kafka
>  Issue Type: New Feature
>  Components: security
>Reporter: Rajini Sivaram
>Priority: Major
> Fix For: 2.1.0
>
>
> Java 11 adds support for TLS 1.3. We should support this after add support 
> for Java 11.
> Related issues:
> [https://bugs.openjdk.java.net/browse/JDK-8206170]
> [https://bugs.openjdk.java.net/browse/JDK-8206178]
> [https://bugs.openjdk.java.net/browse/JDK-8208538]
> [https://bugs.openjdk.java.net/browse/JDK-8207009]
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7310) Fix SSL tests when running with Java 11

2018-08-18 Thread Ismael Juma (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584784#comment-16584784
 ] 

Ismael Juma commented on KAFKA-7310:


Possible underlying reason:

https://bugs.openjdk.java.net/browse/JDK-8207317

We should test again once that's fixed.
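When retesting, the negotiated protocol is easy to observe from a client connection (a sketch; host and port are placeholders for whatever TLS endpoint the test brings up):

{code:java}
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

public class NegotiatedProtocolCheck {
    public static void main(String[] args) throws Exception {
        String host = args.length > 0 ? args[0] : "localhost";          // placeholder
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 9093;  // placeholder SSL port
        try (SSLSocket socket =
                 (SSLSocket) SSLSocketFactory.getDefault().createSocket(host, port)) {
            socket.startHandshake();
            // Prints e.g. TLSv1.2 or TLSv1.3, depending on what was negotiated.
            System.out.println(socket.getSession().getProtocol());
        }
    }
}
{code}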

> Fix SSL tests when running with Java 11
> ---
>
> Key: KAFKA-7310
> URL: https://issues.apache.org/jira/browse/KAFKA-7310
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Ismael Juma
>Priority: Critical
> Fix For: 2.1.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)