[jira] [Commented] (KAFKA-15407) Not able to connect to kafka from the Private NLB from outside the VPC account

2023-08-28 Thread Shivakumar (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-15407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17759487#comment-17759487
 ] 

Shivakumar commented on KAFKA-15407:


[~viktorsomogyi], could you please help us with this issue?

 

> Not able to connect to kafka from the Private NLB from outside the VPC 
> account 
> ---
>
> Key: KAFKA-15407
> URL: https://issues.apache.org/jira/browse/KAFKA-15407
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, connect, consumer, producer, protocol
> Environment: Staging, PROD
>Reporter: Shivakumar
>Priority: Blocker
> Attachments: image-2023-08-28-12-37-33-100.png
>
>
> !image-2023-08-28-12-37-33-100.png|width=768,height=223!
> Problem statement:
> We are trying to connect to Kafka from another account (VPC).
> Our Kafka runs in an EKS cluster, with a Service pointing to the broker pods 
> for connections.
> We created a PrivateLink endpoint from Account B to connect to our NLB, which 
> in turn connects to our Kafka in Account A.
> We see connection resets from both the client and the target (Kafka) in the 
> AWS NLB monitoring tab.
> We tried various combinations of listeners and advertised listeners, none of 
> which helped.
> We assume we are missing some combination of listener and network-level 
> configs with which this connection can be made.
> Could you please guide us, as we are blocked on a major migration?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-15407) Not able to connect to kafka from the Private NLB from outside the VPC account

2023-08-28 Thread Shivakumar (Jira)
Shivakumar created KAFKA-15407:
--

 Summary: Not able to connect to kafka from the Private NLB from 
outside the VPC account 
 Key: KAFKA-15407
 URL: https://issues.apache.org/jira/browse/KAFKA-15407
 Project: Kafka
  Issue Type: Bug
  Components: clients, connect, consumer, producer, protocol
 Environment: Staging, PROD
Reporter: Shivakumar
 Attachments: image-2023-08-28-12-37-33-100.png

!image-2023-08-28-12-37-33-100.png|width=768,height=223!

Problem statement:
We are trying to connect to Kafka from another account (VPC).
Our Kafka runs in an EKS cluster, with a Service pointing to the broker pods 
for connections.

We created a PrivateLink endpoint from Account B to connect to our NLB, which 
in turn connects to our Kafka in Account A.
We see connection resets from both the client and the target (Kafka) in the 
AWS NLB monitoring tab.
We tried various combinations of listeners and advertised listeners, none of 
which helped.

We assume we are missing some combination of listener and network-level 
configs with which this connection can be made.
Could you please guide us, as we are blocked on a major migration?
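For readers in the same situation: after the bootstrap connection, Kafka clients reconnect to whatever address the broker puts in advertised.listeners, so that address must be resolvable and reachable from Account B; resets right after the handshake often mean clients are being redirected to an in-cluster address. Below is a minimal sketch of a split listener setup, assuming a dedicated EXTERNAL port behind the NLB; the PrivateLink endpoint DNS name, ports, and paths are placeholders, not details from this report.

# Hypothetical broker config: advertise the PrivateLink endpoint name on a
# separate EXTERNAL listener so cross-account clients are not redirected to
# pod-internal addresses. All names and ports below are illustrative.
cat >> /opt/kafka/config/server.properties <<'EOF'
listeners=INTERNAL://0.0.0.0:9092,EXTERNAL://0.0.0.0:9094
advertised.listeners=INTERNAL://kafka-0.kafka-headless.kafka.svc.cluster.local:9092,EXTERNAL://vpce-0123-example.vpce-svc-0456.us-east-1.vpce.amazonaws.com:9094
listener.security.protocol.map=INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
inter.broker.listener.name=INTERNAL
EOF

Note that each broker needs a distinct EXTERNAL advertised address or port (for example, one NLB listener port per broker); otherwise metadata responses funnel every client to a single pod.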



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-13805) Upgrade vulnerable dependencies march 2022

2022-11-01 Thread Shivakumar (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shivakumar updated KAFKA-13805:
---
Reviewer: Abhishek

> Upgrade vulnerable dependencies march 2022
> --
>
> Key: KAFKA-13805
> URL: https://issues.apache.org/jira/browse/KAFKA-13805
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 2.8.1, 3.0.1
>Reporter: Shivakumar
>Priority: Blocker
>  Labels: security
>
> https://nvd.nist.gov/vuln/detail/CVE-2020-36518
> |Packages|Package Version|CVSS|Fix Status|
> |com.fasterxml.jackson.core_jackson-databind|2.10.5.1|7.5|fixed in 2.13.2.1|
> |com.fasterxml.jackson.core_jackson-databind|2.13.1|7.5|fixed in 2.13.2.1|
> Our security scan detected the vulnerabilities above.
> Please upgrade to the fixed version.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-14100) Upgrade vulnerable dependencies Kafka version 3.1.1 July 2022

2022-07-25 Thread Shivakumar (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shivakumar updated KAFKA-14100:
---
Description: 
||CVE ID ||Type||Severity||Packages||Package Version||CVSS||Fix Status||
|CVE-2022-2048|java|high|org.eclipse.jetty_jetty-io|9.4.44.v20210927|7.5|fixed in 11.0.9, 10.0.9, 9.4.47|

Our security scan detected the vulnerability above.

Please upgrade to the fixed version.

  was:
|Packages|Package Version|CVSS|Fix Status|
|com.fasterxml.jackson.core_jackson-databind|2.10.5.1|7.5|fixed in 2.14, 2.13.1, 2.12.6|

Our security scan detected the vulnerability above.

Please upgrade to a fixed version.


> Upgrade vulnerable dependencies Kafka version 3.1.1  July 2022
> --
>
> Key: KAFKA-14100
> URL: https://issues.apache.org/jira/browse/KAFKA-14100
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 2.8.1
>Reporter: Shivakumar
>Priority: Major
>  Labels: security
> Fix For: 3.0.1, 3.2.0, 3.1.1
>
>
> ||CVE ID ||Type||Severity||Packages||Package Version||CVSS||Fix Status||
> |CVE-2022-2048|java|high|org.eclipse.jetty_jetty-io|9.4.44.v20210927|7.5|fixed 
> in 11.0.9, 10.0.9, 9.4.47|
> Our security scan detected the vulnerability above.
> Please upgrade to the fixed version.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-14100) Upgrade vulnerable dependencies Kafka version 3.1.1 July 2022

2022-07-25 Thread Shivakumar (Jira)
Shivakumar created KAFKA-14100:
--

 Summary: Upgrade vulnerable dependencies Kafka version 3.1.1  July 
2022
 Key: KAFKA-14100
 URL: https://issues.apache.org/jira/browse/KAFKA-14100
 Project: Kafka
  Issue Type: Bug
Affects Versions: 2.8.1
Reporter: Shivakumar
 Fix For: 3.0.1, 3.2.0, 3.1.1


|Packages|Package Version|CVSS|Fix Status|
|com.fasterxml.jackson.core_jackson-databind|2.10.5.1|7.5|fixed in 2.14, 2.13.1, 2.12.6|

Our security scan detected the vulnerability above.

Please upgrade to a fixed version.
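As a hedged way to verify such a bump from a Kafka source checkout, one can list what the Connect runtime actually resolves for jetty (the module and configuration names below assume the stock Kafka Gradle layout):

# Confirm the resolved jetty version after editing gradle/dependencies.gradle.
./gradlew :connect:runtime:dependencies --configuration runtimeClasspath | grep jetty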



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-13726) Fix Vulnerability CVE-2022-23181 -Upgrade org.apache.tomcat.embed_tomcat-embed-core

2022-06-14 Thread Shivakumar (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17554353#comment-17554353
 ] 

Shivakumar commented on KAFKA-13726:


We are getting this vulnerability in Kafka *3.1.1* as well.

> Fix Vulnerability CVE-2022-23181 -Upgrade 
> org.apache.tomcat.embed_tomcat-embed-core
> ---
>
> Key: KAFKA-13726
> URL: https://issues.apache.org/jira/browse/KAFKA-13726
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 2.8.1
>Reporter: Chris Sabelstrom
>Priority: Major
>
> Our security scanner detected the following vulnerability. Please upgrade to 
> the version noted in the Fix Status column.
> |CVE ID|Severity|Packages|Package Version|CVSS|Fix Status|
> |CVE-2022-23181|high|org.apache.tomcat.embed_tomcat-embed-core|9.0.54|7|fixed 
> in 10.0.0, 9.0.1|



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (KAFKA-13805) Upgrade vulnerable dependencies march 2022

2022-04-07 Thread Shivakumar (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shivakumar updated KAFKA-13805:
---
Description: 
|Packages|Package Version|CVSS|Fix Status|
|com.fasterxml.jackson.core_jackson-databind|2.10.5.1|7.5|fixed in 2.13.0|
|com.fasterxml.jackson.core_jackson-databind|2.13.1|7.5|fixed in 2.13.0|

Our security scan detected the vulnerabilities above.

Please upgrade to the fixed version.

  was:
|Packages|Package Version|CVSS|Fix Status|
|com.fasterxml.jackson.core_jackson-databind|2.10.5.1|7.5|fixed in 2.14, 2.13.1, 2.12.6|

Our security scan detected the vulnerability above.

Please upgrade to a fixed version.


> Upgrade vulnerable dependencies march 2022
> --
>
> Key: KAFKA-13805
> URL: https://issues.apache.org/jira/browse/KAFKA-13805
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 2.8.1
>Reporter: Shivakumar
>Priority: Major
>  Labels: security
> Fix For: 3.0.1, 3.2.0, 3.1.1
>
>
> |Packages|Package Version|CVSS|Fix Status|
> |com.fasterxml.jackson.core_jackson-databind|2.10.5.1|7.5|fixed in 2.13.0|
> |com.fasterxml.jackson.core_jackson-databind|2.13.1|7.5|fixed in 2.13.0|
> Our security scan detected the vulnerabilities above.
> Please upgrade to the fixed version.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (KAFKA-13805) Upgrade vulnerable dependencies march 2022

2022-04-07 Thread Shivakumar (Jira)
Shivakumar created KAFKA-13805:
--

 Summary: Upgrade vulnerable dependencies march 2022
 Key: KAFKA-13805
 URL: https://issues.apache.org/jira/browse/KAFKA-13805
 Project: Kafka
  Issue Type: Bug
Affects Versions: 2.8.1
Reporter: Shivakumar
 Fix For: 3.0.1, 3.2.0, 3.1.1


|Packages|Package Version|CVSS|Fix Status|
|com.fasterxml.jackson.core_jackson-databind|2.10.5.1|7.5|fixed in 2.14, 2.13.1, 2.12.6|

Our security scan detected the vulnerability above.

Please upgrade to a fixed version.
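A hedged way to trace where the flagged jackson-databind version comes from, again assuming a Kafka source checkout with the stock Gradle layout:

# List the jackson-databind version resolved on the core runtime classpath;
# rerun after bumping the pin in gradle/dependencies.gradle to confirm the fix.
./gradlew :core:dependencies --configuration runtimeClasspath | grep jackson-databind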



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (KAFKA-13658) Upgrade vulnerable dependencies jan 2022

2022-02-08 Thread Shivakumar (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shivakumar updated KAFKA-13658:
---
Description: 
|Packages|Package Version|CVSS|Fix Status|
|com.fasterxml.jackson.core_jackson-databind|2.10.5.1|7.5|fixed in 2.14, 2.13.1, 2.12.6|

Our security scan detected the vulnerability above.

Please upgrade to a fixed version.

  was:
|Packages|Package Version|CVSS|Fix Status|
|io.netty_netty-codec  (CVE-2021-43797)|4.1.62.Final|6.5|fixed in 4.1.71|
|org.eclipse.jetty_jetty-server|9.4.43.v20210629|5.5|fixed in 9.4.44|
|org.eclipse.jetty_jetty-servlet|9.4.43.v20210629|5.5|fixed in 9.4.44|

Our security scan detected the vulnerabilities above.

Please upgrade to the fixed versions.


> Upgrade vulnerable dependencies jan 2022
> 
>
> Key: KAFKA-13658
> URL: https://issues.apache.org/jira/browse/KAFKA-13658
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 2.8.1
>Reporter: Shivakumar
>Assignee: Luke Chen
>Priority: Major
>  Labels: security
>
> |Packages|Package Version|CVSS|Fix Status|
> |com.fasterxml.jackson.core_jackson-databind|2.10.5.1|7.5|fixed in 2.14, 
> 2.13.1, 2.12.6|
> Our security scan detected the vulnerability above.
> Please upgrade to a fixed version.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (KAFKA-13658) Upgrade vulnerable dependencies jan 2022

2022-02-08 Thread Shivakumar (Jira)
Shivakumar created KAFKA-13658:
--

 Summary: Upgrade vulnerable dependencies jan 2022
 Key: KAFKA-13658
 URL: https://issues.apache.org/jira/browse/KAFKA-13658
 Project: Kafka
  Issue Type: Bug
Affects Versions: 2.8.1
Reporter: Shivakumar
Assignee: Luke Chen


|Packages|Package Version|CVSS|Fix Status|
|io.netty_netty-codec  (CVE-2021-43797)|4.1.62.Final|6.5|fixed in 4.1.71|
|org.eclipse.jetty_jetty-server|9.4.43.v20210629|5.5|fixed in 9.4.44|
|org.eclipse.jetty_jetty-servlet|9.4.43.v20210629|5.5|fixed in 9.4.44|

Our security scan detected the vulnerabilities above.

Please upgrade to the fixed versions.
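For an already-deployed broker, a hedged spot-check of the bundled jars (the install path is a placeholder):

# Verify the shipped netty and jetty jars are at or above the fixed releases.
ls /opt/kafka/libs | grep -E 'netty-codec|jetty-server|jetty-servlet'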



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (KAFKA-13579) Upgrade vulnerable dependencies

2022-01-07 Thread Shivakumar (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shivakumar updated KAFKA-13579:
---
Summary: Upgrade vulnerable dependencies  (was: Upgrade vulnerable 
dependencies for jetty packages)

> Upgrade vulnerable dependencies  
> -
>
> Key: KAFKA-13579
> URL: https://issues.apache.org/jira/browse/KAFKA-13579
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 2.8.1
>Reporter: Shivakumar
>Assignee: Luke Chen
>Priority: Major
>  Labels: security
>
> |Packages|Package Version|CVSS|Fix Status|
> |io.netty_netty-codec  (CVE-2021-43797)|4.1.62.Final|6.5|fixed in 4.1.71|
> |org.eclipse.jetty_jetty-server|9.4.43.v20210629|5.5|fixed in 9.4.44|
> |org.eclipse.jetty_jetty-servlet|9.4.43.v20210629|5.5|fixed in 9.4.44|
> Our security scan detected the vulnerabilities above.
> Please upgrade to the fixed versions.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (KAFKA-13579) Upgrade vulnerable dependencies for jetty packages

2022-01-07 Thread Shivakumar (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shivakumar updated KAFKA-13579:
---
Description: 
|Packages|Package Version|CVSS|Fix Status|
|io.netty_netty-codec  (CVE-2021-43797)|4.1.62.Final|6.5|fixed in 4.1.71|
|org.eclipse.jetty_jetty-server|9.4.43.v20210629|5.5|fixed in 9.4.44|
|org.eclipse.jetty_jetty-servlet|9.4.43.v20210629|5.5|fixed in 9.4.44|

Our security scan detected the vulnerabilities above.

Please upgrade to the fixed versions.

  was:
|Packages|Package Version|CVSS|Fix Status|
|org.eclipse.jetty_jetty-server|9.4.43.v20210629|5.5|fixed in 9.4.44|
|org.eclipse.jetty_jetty-servlet|9.4.43.v20210629|5.5|fixed in 9.4.44|

Our security scan detected the vulnerabilities above.

Please upgrade to the fixed versions.


> Upgrade vulnerable dependencies for jetty packages 
> ---
>
> Key: KAFKA-13579
> URL: https://issues.apache.org/jira/browse/KAFKA-13579
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 2.8.1
>Reporter: Shivakumar
>Priority: Major
>  Labels: security
>
> |Packages|Package Version|CVSS|Fix Status|
> |io.netty_netty-codec  (CVE-2021-43797)|4.1.62.Final|6.5|fixed in 4.1.71|
> |org.eclipse.jetty_jetty-server|9.4.43.v20210629|5.5|fixed in 9.4.44|
> |org.eclipse.jetty_jetty-servlet|9.4.43.v20210629|5.5|fixed in 9.4.44|
> Our security scan detected the vulnerabilities above.
> Please upgrade to the fixed versions.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (KAFKA-13579) Upgrade vulnerable dependencies for jetty packages

2022-01-07 Thread Shivakumar (Jira)
Shivakumar created KAFKA-13579:
--

 Summary: Upgrade vulnerable dependencies for jetty packages 
 Key: KAFKA-13579
 URL: https://issues.apache.org/jira/browse/KAFKA-13579
 Project: Kafka
  Issue Type: Bug
Affects Versions: 2.8.1
Reporter: Shivakumar


|Packages|Package Version|CVSS|Fix Status|
|org.eclipse.jetty_jetty-server|9.4.43.v20210629|5.5|fixed in 9.4.44|
|org.eclipse.jetty_jetty-servlet|9.4.43.v20210629|5.5|fixed in 9.4.44|

Our security scan detected the vulnerabilities above.

Please upgrade to the fixed versions.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (KAFKA-13077) Replication failing after unclean shutdown of ZK and all brokers

2021-12-18 Thread Shivakumar (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17461952#comment-17461952
 ] 

Shivakumar commented on KAFKA-13077:


[~junrao]: We have restarted ZooKeeper and the Kafka brokers, but we still end 
up with ISR=2 and Leader=2, and the other brokers remain out of the ISR.
The scenario is not resolved by waiting or by a rolling restart of the 
brokers.
We are looking for a solution that does not involve data loss from deleting 
the data directory.
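For anyone else stuck in this state: once the lagging brokers have caught up and rejoined the ISR, leadership can be rebalanced without touching any data directory via a preferred-replica election, sketched below with a placeholder bootstrap address. (A forced unclean election is the only other lever, and it risks exactly the data loss being avoided here.)

# Trigger a preferred leader election for all partitions; this only moves
# leadership to preferred replicas that are already back in the ISR.
kafka-leader-election.sh --bootstrap-server localhost:9092 --election-type PREFERRED --all-topic-partitions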

> Replication failing after unclean shutdown of ZK and all brokers
> 
>
> Key: KAFKA-13077
> URL: https://issues.apache.org/jira/browse/KAFKA-13077
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Christopher Auston
>Priority: Minor
>
> I am submitting this in the spirit of what can go wrong when an operator 
> violates the constraints Kafka depends on. I don't know if Kafka could or 
> should handle this more gracefully. I decided to file this issue because it 
> was easy to get the problem I'm reporting with Kubernetes StatefulSets (STS). 
> By "easy" I mean that I did not go out of my way to corrupt anything, I just 
> was not careful when restarting ZK and brokers.
> I violated the constraints of keeping Zookeeper stable and at least one 
> running in-sync replica. 
> I am running the bitnami/kafka helm chart on Amazon EKS.
> {quote}% kubectl get po kaf-kafka-0 -ojson |jq .spec.containers'[].image'
> "docker.io/bitnami/kafka:2.8.0-debian-10-r43"
> {quote}
> I started with 3 ZK instances and 3 brokers (both STS). I changed the 
> cpu/memory requests on both STS and kubernetes proceeded to restart ZK and 
> kafka instances at the same time. If I recall correctly there were some 
> crashes and several restarts but eventually all the instances were running 
> again. It's possible all ZK nodes and all brokers were unavailable at various 
> points.
> The problem I noticed was that two of the brokers were just continually 
> spitting out messages like:
> {quote}% kubectl logs kaf-kafka-0 --tail 10
> [2021-07-13 14:26:08,871] INFO [ProducerStateManager 
> partition=__transaction_state-0] Loading producer state from snapshot file 
> 'SnapshotFile(/bitnami/kafka/data/__transaction_state-0/0001.snapshot,1)'
>  (kafka.log.ProducerStateManager)
> [2021-07-13 14:26:08,871] WARN [Log partition=__transaction_state-0, 
> dir=/bitnami/kafka/data] *Non-monotonic update of high watermark from 
> (offset=2744 segment=[0:1048644]) to (offset=1 segment=[0:169])* 
> (kafka.log.Log)
> [2021-07-13 14:26:08,874] INFO [Log partition=__transaction_state-10, 
> dir=/bitnami/kafka/data] Truncating to offset 2 (kafka.log.Log)
> [2021-07-13 14:26:08,877] INFO [Log partition=__transaction_state-10, 
> dir=/bitnami/kafka/data] Loading producer state till offset 2 with message 
> format version 2 (kafka.log.Log)
> [2021-07-13 14:26:08,877] INFO [ProducerStateManager 
> partition=__transaction_state-10] Loading producer state from snapshot file 
> 'SnapshotFile(/bitnami/kafka/data/__transaction_state-10/0002.snapshot,2)'
>  (kafka.log.ProducerStateManager)
> [2021-07-13 14:26:08,877] WARN [Log partition=__transaction_state-10, 
> dir=/bitnami/kafka/data] Non-monotonic update of high watermark from 
> (offset=2930 segment=[0:1048717]) to (offset=2 segment=[0:338]) 
> (kafka.log.Log)
> [2021-07-13 14:26:08,880] INFO [Log partition=__transaction_state-20, 
> dir=/bitnami/kafka/data] Truncating to offset 1 (kafka.log.Log)
> [2021-07-13 14:26:08,882] INFO [Log partition=__transaction_state-20, 
> dir=/bitnami/kafka/data] Loading producer state till offset 1 with message 
> format version 2 (kafka.log.Log)
> [2021-07-13 14:26:08,882] INFO [ProducerStateManager 
> partition=__transaction_state-20] Loading producer state from snapshot file 
> 'SnapshotFile(/bitnami/kafka/data/__transaction_state-20/0001.snapshot,1)'
>  (kafka.log.ProducerStateManager)
> [2021-07-13 14:26:08,883] WARN [Log partition=__transaction_state-20, 
> dir=/bitnami/kafka/data] Non-monotonic update of high watermark from 
> (offset=2956 segment=[0:1048608]) to (offset=1 segment=[0:169]) 
> (kafka.log.Log)
> {quote}
> If I describe that topic I can see that several partitions have a leader of 2 
> and the ISR is just 2 (NOTE I added two more brokers and tried to reassign 
> the topic onto brokers 2,3,4 which you can see below). The new brokers also 
> spit out the messages about "non-monotonic update" just like the original 
> followers. This describe output is from the following day.
> {{% kafka-topics.sh ${=BS} -topic __transaction_state -describe}}
> {{Topic: __transaction_state TopicId: i7bBNCeuQMWl-ZMpzrnMAw PartitionCount: 
> 50 ReplicationFactor: 3 Configs: 
> 

[jira] [Commented] (KAFKA-13077) Replication failing after unclean shutdown of ZK and all brokers

2021-12-16 Thread Shivakumar (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17461241#comment-17461241
 ] 

Shivakumar commented on KAFKA-13077:


[~junrao], could you please suggest what we can do, or recommend any changes, 
to fix this?

> Replication failing after unclean shutdown of ZK and all brokers
> 
>
> Key: KAFKA-13077
> URL: https://issues.apache.org/jira/browse/KAFKA-13077
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Christopher Auston
>Priority: Minor

[jira] [Commented] (KAFKA-13077) Replication failing after unclean shutdown of ZK and all brokers

2021-12-16 Thread Shivakumar (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17461239#comment-17461239
 ] 

Shivakumar commented on KAFKA-13077:


Hi [~junrao], we did not save the DumpLogSegments output during the incident, 
but we were able to reproduce the error and captured the output below; it 
should match our original error.
kafka [ /var/lib/kafka/data/__consumer_offsets-46 ]$ ls -l
total 32
-rw-rw-r-- 1 kafka kafka        0 Dec  9 11:40 .index
-rw-rw-r-- 1 kafka kafka      877 Dec  9 11:40 .log
-rw-rw-r-- 1 kafka kafka       12 Dec  9 11:40 .timeindex
-rw-rw-r-- 1 kafka kafka 10485760 Dec 13 13:20 04804743.index
-rw-rw-r-- 1 kafka kafka      207 Dec 11 21:01 04804743.log
-rw-rw-r-- 1 kafka kafka       10 Dec 11 21:01 04804743.snapshot
-rw-rw-r-- 1 kafka kafka 10485756 Dec 13 13:20 04804743.timeindex
-rw-rw-r-- 1 kafka kafka       10 Dec 13 13:11 04804745.snapshot
-rw-r--r-- 1 kafka kafka      132 Dec 13 13:20 leader-epoch-checkpoint
-rw-rw-r-- 1 kafka kafka       43 Dec  1 11:13 partition.metadata
kafka [ /var/lib/kafka/data/__consumer_offsets-46 ]$ kafka-run-class.sh kafka.tools.DumpLogSegments --files 04804743.index
Dumping 04804743.index
offset: 4804743 position: 0
Mismatches in :/var/lib/kafka/data/__consumer_offsets-46/04804743.index
  Index offset: 4804743, log offset: 4804744
kafka [ /var/lib/kafka/data/__consumer_offsets-46 ]$ kafka-run-class.sh kafka.tools.DumpLogSegments --files 04804743.log
Dumping 04804743.log
Starting offset: 4804743
baseOffset: 4804743 lastOffset: 4804744 count: 2 baseSequence: -1 lastSequence: -1 producerId: -1 producerEpoch: -1 partitionLeaderEpoch: 368 isTransactional: false isControl: false position: 0 CreateTime: 1639256468568 size: 207 magic: 2 compresscodec: NONE crc: 2267717758 isvalid: true
kafka [ /var/lib/kafka/data/__consumer_offsets-46 ]$ kafka-run-class.sh kafka.tools.DumpLogSegments --files 04804743.timeindex
Dumping 04804743.timeindex
timestamp: 1639256468568 offset: 4804744
kafka [ /var/lib/kafka/data/__consumer_offsets-46 ]$ kafka-run-class.sh kafka.tools.DumpLogSegments --files 04804745.snapshot
Dumping 04804745.snapshot
kafka [ /var/lib/kafka/data/__consumer_offsets-46 ]$

> Replication failing after unclean shutdown of ZK and all brokers
> 
>
> Key: KAFKA-13077
> URL: https://issues.apache.org/jira/browse/KAFKA-13077
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 2.8.0
>Reporter: Christopher Auston
>Priority: Minor

[jira] [Comment Edited] (KAFKA-13077) Replication failing after unclean shutdown of ZK and all brokers

2021-12-09 Thread Shivakumar (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17456515#comment-17456515
 ] 

Shivakumar edited comment on KAFKA-13077 at 12/9/21, 3:44 PM:
--

Hi [~junrao],

Here is a summary of our issue; we hope you can help.

Kafka (2.8.1) and ZooKeeper (3.6.3) on EKS (Kubernetes 1.19)
Kafka cluster size = 3
ZK cluster size = 3

1) After a rolling restart of ZK, sometimes all partitions of a topic go out 
of sync, especially toward broker 2: ISR=2 and Leader=2, and the other brokers 
drop out of the ISR.
Topic: __consumer_offsets    PartitionCount: 50    ReplicationFactor: 3    Configs: compression.type=producer,cleanup.policy=compact,segment.bytes=104857600
    Topic: __consumer_offsets    Partition: 0    Leader: 2    Replicas: 0,1,2    Isr: 2
    Topic: __consumer_offsets    Partition: 1    Leader: 2    Replicas: 1,2,0    Isr: 2
    Topic: __consumer_offsets    Partition: 2    Leader: 2    Replicas: 2,0,1    Isr: 2,1,0
    Topic: __consumer_offsets    Partition: 3    Leader: 2    Replicas: 0,2,1    Isr: 2
    Topic: __consumer_offsets    Partition: 4    Leader: 2    Replicas: 1,0,2    Isr: 2
    Topic: __consumer_offsets    Partition: 5    Leader: 2    Replicas: 2,1,0    Isr: 2
    Topic: __consumer_offsets    Partition: 6    Leader: 2    Replicas: 0,1,2    Isr: 2
    Topic: __consumer_offsets    Partition: 7    Leader: 2    Replicas: 1,2,0    Isr: 2
    Topic: __consumer_offsets    Partition: 8    Leader: 2    Replicas: 2,0,1    Isr: 2
    Topic: __consumer_offsets    Partition: 9    Leader: 2    Replicas: 0,2,1    Isr: 2
    Topic: __consumer_offsets    Partition: 10    Leader: 2    Replicas: 1,0,2    Isr: 2
    Topic: __consumer_offsets    Partition: 11    Leader: 2    Replicas: 2,1,0    Isr: 2
    Topic: __consumer_offsets    Partition: 12    Leader: 2    Replicas: 0,1,2    Isr: 2
    Topic: __consumer_offsets    Partition: 13    Leader: 2    Replicas: 1,2,0    Isr: 2
    Topic: __consumer_offsets    Partition: 14    Leader: 2    Replicas: 2,0,1    Isr: 2
    Topic: __consumer_offsets    Partition: 15    Leader: 2    Replicas: 0,2,1    Isr: 2
    Topic: __consumer_offsets    Partition: 16    Leader: 2    Replicas: 1,0,2    Isr: 2
    Topic: __consumer_offsets    Partition: 17    Leader: 2    Replicas: 2,1,0    Isr: 2
    Topic: __consumer_offsets    Partition: 18    Leader: 2    Replicas: 0,1,2    Isr: 2
    Topic: __consumer_offsets    Partition: 19    Leader: 2    Replicas: 1,2,0    Isr: 2
    Topic: __consumer_offsets    Partition: 20    Leader: 2    Replicas: 2,0,1    Isr: 2
    Topic: __consumer_offsets    Partition: 21    Leader: 2    Replicas: 0,2,1    Isr: 2
    Topic: __consumer_offsets    Partition: 22    Leader: 2    Replicas: 1,0,2    Isr: 2
    Topic: __consumer_offsets    Partition: 23    Leader: 2    Replicas: 2,1,0    Isr: 2
    Topic: __consumer_offsets    Partition: 24    Leader: 2    Replicas: 0,1,2    Isr: 2
    Topic: __consumer_offsets    Partition: 25    Leader: 2    Replicas: 1,2,0    Isr: 2
    Topic: __consumer_offsets    Partition: 26    Leader: 2    Replicas: 2,0,1    Isr: 2
    Topic: __consumer_offsets    Partition: 27    Leader: 2    Replicas: 0,2,1    Isr: 2
    Topic: __consumer_offsets    Partition: 28    Leader: 2    Replicas: 1,0,2    Isr: 2
    Topic: __consumer_offsets    Partition: 29    Leader: 2    Replicas: 2,1,0    Isr: 2,1,0
    Topic: __consumer_offsets    Partition: 30    Leader: 2    Replicas: 0,1,2    Isr: 2
    Topic: __consumer_offsets    Partition: 31    Leader: 2    Replicas: 1,2,0    Isr: 2
    Topic: __consumer_offsets    Partition: 32    Leader: 2    Replicas: 2,0,1    Isr: 2
    Topic: __consumer_offsets    Partition: 33    Leader: 2    Replicas: 0,2,1    Isr: 2
    Topic: __consumer_offsets    Partition: 34    Leader: 2    Replicas: 1,0,2    Isr: 2
    Topic: __consumer_offsets    Partition: 35    Leader: 2    Replicas: 2,1,0    Isr: 2
    Topic: __consumer_offsets    Partition: 36    Leader: 2    Replicas: 0,1,2    Isr: 2
    Topic: __consumer_offsets    Partition: 37    Leader: 2    Replicas: 1,2,0    Isr: 2
    Topic: __consumer_offsets    Partition: 38    Leader: 2    Replicas: 2,0,1    Isr: 2
    Topic: __consumer_offsets    Partition: 39    Leader: 2    Replicas: 0,2,1    Isr: 2
    Topic: __consumer_offsets    Partition: 40    Leader: 2    Replicas: 1,0,2    Isr: 2
    Topic: __consumer_offsets    Partition: 41    Leader: 2    Replicas: 2,1,0    Isr: 2,1,0
    Topic: __consumer_offsets    Partition: 42    Leader: 2    Replicas: 0,1,2    Isr: 2
    Topic: __consumer_offsets    Partition: 43    Leader: 2    Replicas: 1,2,0    Isr: 2
    Topic: __consumer_offsets    Partition: 44    Leader: 2    Replicas: 2,0,1    Isr: 2
    Topic: __consumer_offsets    Partition: 45    Leader: 2    Replicas: 0,2,1    Isr: 2
    Topic: __consumer_offsets    Partition: 46    Leader: 2    
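Rather than eyeballing all 50 partitions in listings like the one above, a hedged shortcut for tracking recovery (the bootstrap address is a placeholder):

# Print only the partitions whose ISR is smaller than the replica set.
kafka-topics.sh --bootstrap-server localhost:9092 --describe --under-replicated-partitions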

[jira] [Commented] (KAFKA-13077) Replication failing after unclean shutdown of ZK and all brokers

2021-12-09 Thread Shivakumar (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17456515#comment-17456515
 ] 

Shivakumar commented on KAFKA-13077:


Hi [~junrao],

Here is a summary of our issue; we hope you can help.

Kafka (2.8.1) and ZooKeeper (3.6.3) on EKS (Kubernetes 1.19)
Kafka cluster size = 3
ZK cluster size = 3

1) After a rolling restart of ZK, sometimes all partitions of a topic go out 
of sync, especially toward broker 2: ISR=2 and Leader=2, and the other brokers 
drop out of the ISR.