Re: [PR] KAFKA-15417 flip joinSpuriousLookBackTimeMs and emit non-joined items [kafka]

2024-01-08 Thread via GitHub


VictorvandenHoven commented on PR #14426:
URL: https://github.com/apache/kafka/pull/14426#issuecomment-1880585360

   > @mjsax, @guozhangwang, can we merge this?
   
   Since it has been a couple of months, I suppose it will not be merged then?
   
   Can we discuss this?
   





[jira] [Created] (KAFKA-16089) Kafka Streams still leaking memory in 3.7

2024-01-08 Thread Lucas Brutschy (Jira)
Lucas Brutschy created KAFKA-16089:
--

 Summary: Kafka Streams still leaking memory in 3.7
 Key: KAFKA-16089
 URL: https://issues.apache.org/jira/browse/KAFKA-16089
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 3.7.0
Reporter: Lucas Brutschy
 Fix For: 3.7.0
 Attachments: graphviz (1).svg

In [https://github.com/apache/kafka/commit/58d6d2e5922df60cdc5c5c1bcd1c2d82dd2b71e2] a leak was fixed in the release candidate for 3.7.

However, Kafka Streams still seems to be leaking memory (just slower) after the fix.

Attached is the `jeprof` output right before a crash after ~11 hours.




[jira] [Updated] (KAFKA-16089) Kafka Streams still leaking memory in 3.7

2024-01-08 Thread Lucas Brutschy (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lucas Brutschy updated KAFKA-16089:
---
Priority: Blocker  (was: Major)

> Kafka Streams still leaking memory in 3.7
> -
>
> Key: KAFKA-16089
> URL: https://issues.apache.org/jira/browse/KAFKA-16089
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 3.7.0
>Reporter: Lucas Brutschy
>Priority: Blocker
> Fix For: 3.7.0
>
> Attachments: graphviz (1).svg
>
>
> In 
> [https://github.com/apache/kafka/commit/58d6d2e5922df60cdc5c5c1bcd1c2d82dd2b71e2]
>  a leak was fixed in the release candidate for 3.7.
>  
> However, Kafka Streams still seems to be leaking memory (just slower) after 
> the fix.
>  
> Attached is the `jeprof` output right before a crash after ~11 hours.





Re: [PR] MINOR: Fix shadow jar publishing for the clients module [kafka]

2024-01-08 Thread via GitHub


stanislavkozlovski merged PR #15127:
URL: https://github.com/apache/kafka/pull/15127





Re: [PR] KAFKA-16059: close more kafkaApis instances [kafka]

2024-01-08 Thread via GitHub


divijvaidya commented on PR #15132:
URL: https://github.com/apache/kafka/pull/15132#issuecomment-1880673686

   backported to 3.7





[jira] [Updated] (KAFKA-16082) JBOD: Possible dataloss when moving leader partition

2024-01-08 Thread Stanislav Kozlovski (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stanislav Kozlovski updated KAFKA-16082:

Fix Version/s: (was: 3.7.0)

> JBOD: Possible dataloss when moving leader partition
> 
>
> Key: KAFKA-16082
> URL: https://issues.apache.org/jira/browse/KAFKA-16082
> Project: Kafka
>  Issue Type: Bug
>  Components: jbod
>Affects Versions: 3.7.0
>Reporter: Proven Provenzano
>Assignee: Gaurav Narula
>Priority: Blocker
>
> There is a possible dataloss scenario
> when using JBOD,
> when moving the partition leader log from one directory to another on the 
> same broker,
> when after the destination log has caught up to the source log and after the 
> broker has sent an update to the partition assignment
> if the broker accepts and commits a new record for the partition and then the 
> broker restarts and the original partition leader log is lost
> then the destination log would not contain the new record.





[jira] [Updated] (KAFKA-16089) Kafka Streams still leaking memory in 3.7

2024-01-08 Thread Stanislav Kozlovski (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stanislav Kozlovski updated KAFKA-16089:

Fix Version/s: (was: 3.7.0)

> Kafka Streams still leaking memory in 3.7
> -
>
> Key: KAFKA-16089
> URL: https://issues.apache.org/jira/browse/KAFKA-16089
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 3.7.0
>Reporter: Lucas Brutschy
>Priority: Blocker
> Attachments: graphviz (1).svg
>
>
> In 
> [https://github.com/apache/kafka/commit/58d6d2e5922df60cdc5c5c1bcd1c2d82dd2b71e2]
>  a leak was fixed in the release candidate for 3.7.
>  
> However, Kafka Streams still seems to be leaking memory (just slower) after 
> the fix.
>  
> Attached is the `jeprof` output right before a crash after ~11 hours.





[jira] [Updated] (KAFKA-16088) Not reading active segments when RemoteFetch return Empty Records.

2024-01-08 Thread Divij Vaidya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Divij Vaidya updated KAFKA-16088:
-
Component/s: Tiered-Storage

>  Not reading active segments  when RemoteFetch return Empty Records.
> 
>
> Key: KAFKA-16088
> URL: https://issues.apache.org/jira/browse/KAFKA-16088
> Project: Kafka
>  Issue Type: Bug
>  Components: Tiered-Storage
>Reporter: Arpit Goyal
>Priority: Critical
>
> Please refer this comment for details 
> https://github.com/apache/kafka/pull/15060#issuecomment-1879657273





[jira] [Updated] (KAFKA-16088) Not reading active segments when RemoteFetch return Empty Records.

2024-01-08 Thread Divij Vaidya (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Divij Vaidya updated KAFKA-16088:
-
Labels: tiered-storage  (was: )

>  Not reading active segments  when RemoteFetch return Empty Records.
> 
>
> Key: KAFKA-16088
> URL: https://issues.apache.org/jira/browse/KAFKA-16088
> Project: Kafka
>  Issue Type: Bug
>  Components: Tiered-Storage
>Reporter: Arpit Goyal
>Priority: Critical
>  Labels: tiered-storage
>
> Please refer this comment for details 
> https://github.com/apache/kafka/pull/15060#issuecomment-1879657273






Re: [PR] KAFKA-16082: Avoid resuming future replica if current replica is in the same directory [kafka]

2024-01-08 Thread via GitHub


OmniaGM commented on code in PR #15136:
URL: https://github.com/apache/kafka/pull/15136#discussion_r142323


##
core/src/main/scala/kafka/log/LogManager.scala:
##
@@ -1156,6 +1156,32 @@ class LogManager(logDirs: Seq[File],
     }
   }
 
+  def maybeRecoverAbandonedFutureLog(topicPartition: TopicPartition, partitionAssignedDirectoryId: Option[Uuid], callback: () => Unit): Unit = {
+    logCreationOrDeletionLock synchronized {
+      for {
+        futureLog <- getLog(topicPartition, isFuture = true)
+        futureDirId <- directoryId(futureLog.parentDir)
+        assignedDirId <- partitionAssignedDirectoryId
+        if futureDirId == assignedDirId
+      } {
+        val sourceLog = futureLog

Review Comment:
   `sourceLog` can be moved into the `for` block. 
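
   For illustration, a minimal sketch of the suggested restructuring (grounded in the diff above): a value binding inside the `for` comprehension replaces the `val` in the body.
   
   ```
   for {
     futureLog <- getLog(topicPartition, isFuture = true)
     futureDirId <- directoryId(futureLog.parentDir)
     assignedDirId <- partitionAssignedDirectoryId
     if futureDirId == assignedDirId
     sourceLog = futureLog // value binding instead of a val in the body
   } {
     // ... recovery logic uses sourceLog here ...
   }
   ```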






Re: [PR] KAFKA-15388: Handling remote segment read in case of log compaction [kafka]

2024-01-08 Thread via GitHub


iit2009060 commented on PR #15060:
URL: https://github.com/apache/kafka/pull/15060#issuecomment-1880775216

   > > @satishd @divijvaidya @kamalcph I am able to reproduce the above scenario using the retention feature instead of log compaction.
   > > The overall problem is that we are sending MemoryRecords.Empty when unable to find an offset, even though active segments can still have the data.
   > > Consider the scenario below:
   > > 
   > > 1. Create topic test8 with partition 0 and with remote storage enabled.
   > > 2. Current status of the topic (offsets 0, 2, 3, 4). [screenshot]
   > > 3. When we make a fetch request with offset 1. [screenshot]
   > > 4. To delete remote segments, I set the configuration retention.ms=1000. [screenshot]
   > > 5. Once the data is deleted, reset the local and remote retention to 1 hour and produce some data. [screenshot]
   > > 7. When I try to fetch offset 1 from topic test8 partition 0, it is never able to respond. Ideally it should pick the data from the active segment, whose offsets start at 6. [screenshot]

Re: [PR] KAFKA-16082: Avoid resuming future replica if current replica is in the same directory [kafka]

2024-01-08 Thread via GitHub


OmniaGM commented on code in PR #15136:
URL: https://github.com/apache/kafka/pull/15136#discussion_r157092


##
core/src/test/scala/unit/kafka/server/ReplicaManagerTest.scala:
##
@@ -5592,6 +5592,63 @@ class ReplicaManagerTest {
     }
   }
 
+  @ParameterizedTest
+  @ValueSource(booleans = Array(true, false))
+  def testDeltaFollowerRecoverAbandonedFutureReplica(enableRemoteStorage: Boolean): Unit = {
+    // Given
+    val localId = 1
+    val topicPartition = new TopicPartition("foo", 0)
+
+    val mockReplicaFetcherManager = mock(classOf[ReplicaFetcherManager])
+    val mockReplicaAlterLogDirsManager = mock(classOf[ReplicaAlterLogDirsManager])
+    val replicaManager = setupReplicaManagerWithMockedPurgatories(
+      timer = new MockTimer(time),
+      brokerId = localId,
+      mockReplicaFetcherManager = Some(mockReplicaFetcherManager),
+      mockReplicaAlterLogDirsManager = Some(mockReplicaAlterLogDirsManager),
+      enableRemoteStorage = enableRemoteStorage,
+      shouldMockLog = true
+    )
+
+    val directoryId1 = Uuid.randomUuid()
+    val directoryId2 = Uuid.randomUuid()
+    val mockFutureLog = setupMockLog("/KAFKA-16082-test", isFuture = true)
+
+    val mockLogMgr = replicaManager.logManager
+    when(
+      mockLogMgr.getLog(topicPartition, isFuture = true)
+    ).thenReturn {
+      Some(mockFutureLog)
+    }
+    when(mockLogMgr.getLog(topicPartition)).thenReturn {
+      None
+    }
+
+    when(
+      mockLogMgr.directoryId(mockFutureLog.parentDir)
+    ).thenReturn {
+      Some(directoryId1)

Review Comment:
   same as above



##
core/src/test/scala/unit/kafka/server/ReplicaManagerTest.scala:
##
@@ -4831,15 +4831,15 @@ class ReplicaManagerTest {
     try {
       val foo0 = new TopicPartition("foo", 0)
       val emptyDelta = new TopicsDelta(TopicsImage.EMPTY)
-      val (fooPart, fooNew) = replicaManager.getOrCreatePartition(foo0, emptyDelta, FOO_UUID).get
+      val (fooPart, fooNew) = replicaManager.getOrCreatePartition(foo0, emptyDelta, FOO_UUID, false, None).get

Review Comment:
   Any reason why we are not setting `partitionAssignedDirectoryId` to `None` as a default in the `replicaManager.getOrCreatePartition` signature?
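
   For illustration, a sketch of what that default could look like; the parameter names and return type here are assumptions for illustration, not the PR's actual signature:
   
   ```
   def getOrCreatePartition(tp: TopicPartition,
                            delta: TopicsDelta,
                            topicId: Uuid,
                            isNew: Boolean = false,
                            partitionAssignedDirectoryId: Option[Uuid] = None): Option[(Partition, Boolean)] = {
     // ... existing body unchanged ...
   }
   
   // Existing call sites could then stay as:
   // replicaManager.getOrCreatePartition(foo0, emptyDelta, FOO_UUID).get
   ```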



##
core/src/test/scala/unit/kafka/server/ReplicaManagerTest.scala:
##
@@ -5592,6 +5592,63 @@ class ReplicaManagerTest {
     }
   }
 
+  @ParameterizedTest
+  @ValueSource(booleans = Array(true, false))
+  def testDeltaFollowerRecoverAbandonedFutureReplica(enableRemoteStorage: Boolean): Unit = {
+    // Given
+    val localId = 1
+    val topicPartition = new TopicPartition("foo", 0)
+
+    val mockReplicaFetcherManager = mock(classOf[ReplicaFetcherManager])
+    val mockReplicaAlterLogDirsManager = mock(classOf[ReplicaAlterLogDirsManager])
+    val replicaManager = setupReplicaManagerWithMockedPurgatories(
+      timer = new MockTimer(time),
+      brokerId = localId,
+      mockReplicaFetcherManager = Some(mockReplicaFetcherManager),
+      mockReplicaAlterLogDirsManager = Some(mockReplicaAlterLogDirsManager),
+      enableRemoteStorage = enableRemoteStorage,
+      shouldMockLog = true
+    )
+
+    val directoryId1 = Uuid.randomUuid()
+    val directoryId2 = Uuid.randomUuid()
+    val mockFutureLog = setupMockLog("/KAFKA-16082-test", isFuture = true)
+
+    val mockLogMgr = replicaManager.logManager
+    when(
+      mockLogMgr.getLog(topicPartition, isFuture = true)
+    ).thenReturn {

Review Comment:
   `{` is used to hold a block, but in this case it is a simple value. So maybe replace all `.thenReturn {Some(mockFutureLog)}` with `.thenReturn(Some(mockFutureLog))`.
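
   A minimal before/after of that suggestion:
   
   ```
   // before: braces read like a by-name block, but the argument is a plain value
   when(mockLogMgr.getLog(topicPartition, isFuture = true)).thenReturn {
     Some(mockFutureLog)
   }
   
   // after: parentheses make the simple-value argument explicit
   when(mockLogMgr.getLog(topicPartition, isFuture = true)).thenReturn(Some(mockFutureLog))
   when(mockLogMgr.getLog(topicPartition)).thenReturn(None)
   ```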



##
core/src/test/scala/unit/kafka/server/ReplicaManagerTest.scala:
##
@@ -5592,6 +5592,63 @@ class ReplicaManagerTest {
     }
   }
 
+  @ParameterizedTest
+  @ValueSource(booleans = Array(true, false))
+  def testDeltaFollowerRecoverAbandonedFutureReplica(enableRemoteStorage: Boolean): Unit = {
+    // Given
+    val localId = 1
+    val topicPartition = new TopicPartition("foo", 0)
+
+    val mockReplicaFetcherManager = mock(classOf[ReplicaFetcherManager])
+    val mockReplicaAlterLogDirsManager = mock(classOf[ReplicaAlterLogDirsManager])
+    val replicaManager = setupReplicaManagerWithMockedPurgatories(
+      timer = new MockTimer(time),
+      brokerId = localId,
+      mockReplicaFetcherManager = Some(mockReplicaFetcherManager),
+      mockReplicaAlterLogDirsManager = Some(mockReplicaAlterLogDirsManager),
+      enableRemoteStorage = enableRemoteStorage,
+      shouldMockLog = true
+    )
+
+    val directoryId1 = Uuid.randomUuid()
+    val directoryId2 = Uuid.randomUuid()
+    val mockFutureLog = setupMockLog("/KAFKA-16082-test", isFuture = true)
+
+    val mockLogMgr = replicaManager.logManager
+    when(
+      mockLogMgr.getLog(topicPartition, isFuture = true)
+    ).thenReturn {
+      Some(mockFutureLog)
+    }
+    when(mockLogMgr.getLog(topicPartition)).thenReturn {
+      None

Review Comment:

Re: [PR] KAFKA-16016: Add docker wrapper in core and remove docker utility script [kafka]

2024-01-08 Thread via GitHub


omkreddy commented on code in PR #15048:
URL: https://github.com/apache/kafka/pull/15048#discussion_r162085


##
core/src/main/scala/kafka/docker/KafkaDockerWrapper.scala:
##
@@ -0,0 +1,218 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package kafka.docker
+
+import kafka.tools.StorageTool
+import kafka.utils.Exit
+
+import java.nio.charset.StandardCharsets
+import java.nio.file.{Files, Paths, StandardCopyOption, StandardOpenOption}
+
+object KafkaDockerWrapper {
+  def main(args: Array[String]): Unit = {
+    if (args.length == 0) {
+      throw new RuntimeException(s"Error: No operation input provided. " +
+        s"Please provide a valid operation: 'setup'.")
+    }
+    val operation = args.head
+    val arguments = args.tail
+
+    operation match {
+      case "setup" =>
+        if (arguments.length != 3) {
+          val errMsg = "not enough arguments passed. Usage: " +
+            "setup <default-configs-dir> <mounted-configs-dir> <final-configs-dir>"
+          System.err.println(errMsg)
+          Exit.exit(1, Some(errMsg))
+        }
+        val defaultConfigsDir = arguments(0)
+        val mountedConfigsDir = arguments(1)
+        val finalConfigsDir = arguments(2)
+        try {
+          prepareConfigs(defaultConfigsDir, mountedConfigsDir, finalConfigsDir)
+        } catch {
+          case e: Throwable =>
+            val errMsg = s"error while preparing configs: ${e.getMessage}"
+            System.err.println(errMsg)
+            Exit.exit(1, Some(errMsg))
+        }
+
+        val formatCmd = formatStorageCmd(finalConfigsDir, envVars)
+        StorageTool.main(formatCmd)
+      case _ =>
+        throw new RuntimeException(s"Unknown operation $operation. " +
+          s"Please provide a valid operation: 'setup'.")
+    }
+  }
+
+  import Constants._
+
+  private def formatStorageCmd(configsDir: String, env: Map[String, String]): Array[String] = {

Review Comment:
   Can we create a JIRA to track this?







Re: [PR] KAFKA-16082: Avoid resuming future replica if current replica is in the same directory [kafka]

2024-01-08 Thread via GitHub


gaurav-narula commented on code in PR #15136:
URL: https://github.com/apache/kafka/pull/15136#discussion_r169837


##
core/src/main/scala/kafka/server/ReplicaManager.scala:
##
@@ -2773,6 +2776,12 @@ class ReplicaManager(val config: KafkaConfig,
         Some(partition, false)
 
       case HostedPartition.None =>
+        var isNew = true

Review Comment:
   I don't think that would work because `isNew` is mutated inside the callback 
which may be executed conditionally.
   
   Edit: Perhaps I can have `maybeRecoverAbandonedFutureLog` return a boolean 
instead
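
   For illustration, a sketch of the boolean-returning alternative floated here (the signature and call site are assumptions, not the PR's code):
   
   ```
   // LogManager: report whether an abandoned future log was recovered,
   // instead of signalling through a conditionally-executed callback
   def maybeRecoverAbandonedFutureLog(tp: TopicPartition,
                                      assignedDirId: Option[Uuid]): Boolean = { /* ... */ }
   
   // ReplicaManager caller:
   case HostedPartition.None =>
     val recovered = logManager.maybeRecoverAbandonedFutureLog(tp, partitionAssignedDirectoryId)
     val isNew = !recovered // no var mutated inside a callback
   ```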






Re: [PR] KAFKA-14588 ConfigEntityName move to server-common [kafka]

2024-01-08 Thread via GitHub


mimaison merged PR #14868:
URL: https://github.com/apache/kafka/pull/14868





Re: [PR] KAFKA-16016: Add docker wrapper in core and remove docker utility script [kafka]

2024-01-08 Thread via GitHub


omkreddy commented on PR #15048:
URL: https://github.com/apache/kafka/pull/15048#issuecomment-1880849914

   > I saw that but the action only allows me to pick a branch or tag. Ideally 
I'd like to run it on the pull request.
   
   @VedarthConfluent Can you please create a JIRA to track this?





[jira] [Created] (KAFKA-16090) Refactor call to storage tool from kafka docker wrapper

2024-01-08 Thread Vedarth Sharma (Jira)
Vedarth Sharma created KAFKA-16090:
--

 Summary: Refactor call to storage tool from kafka docker wrapper
 Key: KAFKA-16090
 URL: https://issues.apache.org/jira/browse/KAFKA-16090
 Project: Kafka
  Issue Type: Sub-task
Reporter: Vedarth Sharma
Assignee: Vedarth Sharma


Once the rewrite of the Kafka storage tool is done, refactor how we call the storage tool from the Kafka docker wrapper.





[jira] [Created] (KAFKA-16091) Pipeline to run docker build test on a PR

2024-01-08 Thread Vedarth Sharma (Jira)
Vedarth Sharma created KAFKA-16091:
--

 Summary: Pipeline to run docker build test on a PR
 Key: KAFKA-16091
 URL: https://issues.apache.org/jira/browse/KAFKA-16091
 Project: Kafka
  Issue Type: Sub-task
Reporter: Vedarth Sharma
Assignee: Vedarth Sharma


Create a new pipeline that allows running the docker sanity tests on a PR.





Re: [PR] KAFKA-16016: Add docker wrapper in core and remove docker utility script [kafka]

2024-01-08 Thread via GitHub


VedarthConfluent commented on code in PR #15048:
URL: https://github.com/apache/kafka/pull/15048#discussion_r1444502244


##
core/src/main/scala/kafka/docker/KafkaDockerWrapper.scala:
##
@@ -0,0 +1,218 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package kafka.docker
+
+import kafka.tools.StorageTool
+import kafka.utils.Exit
+
+import java.nio.charset.StandardCharsets
+import java.nio.file.{Files, Paths, StandardCopyOption, StandardOpenOption}
+
+object KafkaDockerWrapper {
+  def main(args: Array[String]): Unit = {
+    if (args.length == 0) {
+      throw new RuntimeException(s"Error: No operation input provided. " +
+        s"Please provide a valid operation: 'setup'.")
+    }
+    val operation = args.head
+    val arguments = args.tail
+
+    operation match {
+      case "setup" =>
+        if (arguments.length != 3) {
+          val errMsg = "not enough arguments passed. Usage: " +
+            "setup <default-configs-dir> <mounted-configs-dir> <final-configs-dir>"
+          System.err.println(errMsg)
+          Exit.exit(1, Some(errMsg))
+        }
+        val defaultConfigsDir = arguments(0)
+        val mountedConfigsDir = arguments(1)
+        val finalConfigsDir = arguments(2)
+        try {
+          prepareConfigs(defaultConfigsDir, mountedConfigsDir, finalConfigsDir)
+        } catch {
+          case e: Throwable =>
+            val errMsg = s"error while preparing configs: ${e.getMessage}"
+            System.err.println(errMsg)
+            Exit.exit(1, Some(errMsg))
+        }
+
+        val formatCmd = formatStorageCmd(finalConfigsDir, envVars)
+        StorageTool.main(formatCmd)
+      case _ =>
+        throw new RuntimeException(s"Unknown operation $operation. " +
+          s"Please provide a valid operation: 'setup'.")
+    }
+  }
+
+  import Constants._
+
+  private def formatStorageCmd(configsDir: String, env: Map[String, String]): Array[String] = {

Review Comment:
   Done! https://issues.apache.org/jira/browse/KAFKA-16090






Re: [PR] KAFKA-14588 ZK configuration moved to ZkConfig [kafka]

2024-01-08 Thread via GitHub


mimaison commented on code in PR #15075:
URL: https://github.com/apache/kafka/pull/15075#discussion_r1444500990


##
server-common/src/main/java/org/apache/kafka/server/metrics/ClientMetricsConfigs.java:
##


Review Comment:
   These files don't seem to be related to ZooKeeper configs, why are we moving 
them in this PR?



##
core/src/main/scala/kafka/server/DynamicConfig.scala:
##
@@ -35,87 +33,22 @@ import scala.jdk.CollectionConverters._
 object DynamicConfig {
 
   object Broker {
-    // Properties
-    val LeaderReplicationThrottledRateProp = "leader.replication.throttled.rate"

Review Comment:
   Do we have to move these fields in this PR? These are not ZooKeeper configs






Re: [PR] KAFKA-15388: Handling remote segment read in case of log compaction [kafka]

2024-01-08 Thread via GitHub


satishd commented on PR #15060:
URL: https://github.com/apache/kafka/pull/15060#issuecomment-1880861009

   @iit2009060 KAFKA-15388 is about addressing fetch requests for tiered-storage-enabled topics that had offsets compacted in a partition before their cleanup policy was changed to delete. This PR only covers the case where the next offset is available within the remote log segments. But there is a case where the offset is available in the subsequent local segments (not only the active segment but any of the local-only segments) that have not yet been copied to remote storage. So the solution does not address KAFKA-15388 completely.
   
   I am fine with merging this and having a follow-up PR for the mentioned scenario, so that KAFKA-15388 fully addresses topics compacted before tiered storage is enabled.





Re: [PR] KAFKA-16016: Add docker wrapper in core and remove docker utility script [kafka]

2024-01-08 Thread via GitHub


VedarthConfluent commented on PR #15048:
URL: https://github.com/apache/kafka/pull/15048#issuecomment-1880861940

   I have created a JIRA to track the new GitHub Actions pipeline requested above: https://issues.apache.org/jira/browse/KAFKA-16091





Re: [PR] KAFKA-16016: Add docker wrapper in core and remove docker utility script [kafka]

2024-01-08 Thread via GitHub


mimaison commented on PR #15048:
URL: https://github.com/apache/kafka/pull/15048#issuecomment-1880862505

   @omkreddy Does this work for you?
   
   1. Checkout the branch from the PR and go to the `docker` directory
   2. Run `python3 docker_build_test.py kafka/test --image-tag=3.6.0 
--image-type=jvm 
--kafka-url=https://archive.apache.org/dist/kafka/3.6.0/kafka_2.13-3.6.0.tgz`
   





Re: [PR] KAFKA-16016: Add docker wrapper in core and remove docker utility script [kafka]

2024-01-08 Thread via GitHub


mimaison commented on PR #15048:
URL: https://github.com/apache/kafka/pull/15048#issuecomment-1880864298

   +1 for cherry-picking on 3.7 if Stanislav agrees





Re: [PR] KAFKA-15853: Move KafkaConfig to server module [kafka]

2024-01-08 Thread via GitHub


mimaison commented on PR #15103:
URL: https://github.com/apache/kafka/pull/15103#issuecomment-1880865673

   Thanks for the PR. 
   It's really big, is there a way to split this into smaller PRs? For example 
we have https://github.com/apache/kafka/pull/15075 in progress to move the 
ZooKeeper configs. 





Re: [PR] KAFKA-14683 Migrate #testStartPaused to Mockito [kafka]

2024-01-08 Thread via GitHub


mimaison commented on PR #14663:
URL: https://github.com/apache/kafka/pull/14663#issuecomment-1880869214

   It would be good to get @gharris1727 / @C0urante to take a look too





Re: [PR] KAFKA-15388: Handling remote segment read in case of log compaction [kafka]

2024-01-08 Thread via GitHub


iit2009060 commented on PR #15060:
URL: https://github.com/apache/kafka/pull/15060#issuecomment-1880890006

   > @iit2009060 [KAFKA-15388](https://issues.apache.org/jira/browse/KAFKA-15388) is about addressing fetch requests for tiered-storage-enabled topics that had offsets compacted in a partition before their cleanup policy was changed to delete. This PR only covers the case where the next offset is available within the remote log segments. But there is a case where the offset is available in the subsequent local segments (not only the active segment but any of the local-only segments) that have not yet been copied to remote storage. So the solution does not address [KAFKA-15388](https://issues.apache.org/jira/browse/KAFKA-15388) completely.
   > 
   > I am fine with merging this and having a follow-up PR for the mentioned scenario, so that [KAFKA-15388](https://issues.apache.org/jira/browse/KAFKA-15388) fully addresses topics compacted before tiered storage is enabled.
   
   @satishd Yes, as you mentioned, the current PR does not cover reading from local log segments, i.e. data not available in RemoteLogSegments, where the reason can be
   1. Log compaction
   2. Retention
   
   This scenario touches multiple features and is not specifically a log-compaction issue. The intention behind creating a separate [ticket](https://issues.apache.org/jira/browse/KAFKA-16088) is to provide a dedicated space for addressing the broader issue, encompassing scenarios beyond the scope of the current PR.
   However, I am OK with merging this and covering the rest in a follow-up PR.





[jira] [Updated] (KAFKA-16089) Kafka Streams still leaking memory in 3.7

2024-01-08 Thread Stanislav Kozlovski (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stanislav Kozlovski updated KAFKA-16089:

Fix Version/s: 3.7.0

> Kafka Streams still leaking memory in 3.7
> -
>
> Key: KAFKA-16089
> URL: https://issues.apache.org/jira/browse/KAFKA-16089
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 3.7.0
>Reporter: Lucas Brutschy
>Priority: Blocker
> Fix For: 3.7.0
>
> Attachments: graphviz (1).svg
>
>
> In 
> [https://github.com/apache/kafka/commit/58d6d2e5922df60cdc5c5c1bcd1c2d82dd2b71e2]
>  a leak was fixed in the release candidate for 3.7.
>  
> However, Kafka Streams still seems to be leaking memory (just slower) after 
> the fix.
>  
> Attached is the `jeprof` output right before a crash after ~11 hours.





[jira] [Updated] (KAFKA-16082) JBOD: Possible dataloss when moving leader partition

2024-01-08 Thread Stanislav Kozlovski (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stanislav Kozlovski updated KAFKA-16082:

Fix Version/s: 3.7.0

> JBOD: Possible dataloss when moving leader partition
> 
>
> Key: KAFKA-16082
> URL: https://issues.apache.org/jira/browse/KAFKA-16082
> Project: Kafka
>  Issue Type: Bug
>  Components: jbod
>Affects Versions: 3.7.0
>Reporter: Proven Provenzano
>Assignee: Gaurav Narula
>Priority: Blocker
> Fix For: 3.7.0
>
>
> There is a possible dataloss scenario
> when using JBOD,
> when moving the partition leader log from one directory to another on the 
> same broker,
> when after the destination log has caught up to the source log and after the 
> broker has sent an update to the partition assignment
> if the broker accepts and commits a new record for the partition and then the 
> broker restarts and the original partition leader log is lost
> then the destination log would not contain the new record.





Re: [PR] KAFKA-16016: Add docker wrapper in core and remove docker utility script [kafka]

2024-01-08 Thread via GitHub


omkreddy commented on PR #15048:
URL: https://github.com/apache/kafka/pull/15048#issuecomment-1880911951

   > @omkreddy Does this work for you?
   > 
   > 1. Checkout the branch from the PR and go to the `docker` directory
   > 2. Run `python3 docker_build_test.py kafka/test --image-tag=3.6.0 
--image-type=jvm 
--kafka-url=https://archive.apache.org/dist/kafka/3.6.0/kafka_2.13-3.6.0.tgz`
   
   Hi @mimaison, it will not work right now, as the newly added `KafkaDockerWrapper.scala` classes are not part of 3.6.
   For now, I have tested with a custom image and a few changes to the PR. We can use the above cmd once we have 3.7 RC0.
   As you suggested, we need a docker build option for PRs. I have requested the same from @VedarthConfluent.
   





Re: [PR] KAFKA-16016: Add docker wrapper in core and remove docker utility script [kafka]

2024-01-08 Thread via GitHub


stanislavkozlovski commented on PR #15048:
URL: https://github.com/apache/kafka/pull/15048#issuecomment-1880919489

   Let's merge to 3.7





Re: [PR] KAFKA-16016: Add docker wrapper in core and remove docker utility script [kafka]

2024-01-08 Thread via GitHub


omkreddy merged PR #15048:
URL: https://github.com/apache/kafka/pull/15048





[PR] KAFKA-14505; [4/N] Wire transaction verification [kafka]

2024-01-08 Thread via GitHub


dajac opened a new pull request, #15142:
URL: https://github.com/apache/kafka/pull/15142

   This patch wires the transaction verification in the new group coordinator. 
It basically calls the verification path before scheduling the write operation. 
If the verification fails, the error is returned to the caller.
   
   Note that the patch uses `appendForGroup`. I suppose that we will move away 
from using it when https://github.com/apache/kafka/pull/15087 is merged.
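
   For illustration, a sketch of the described flow; the method and type names are assumptions, not the patch's API:
   
   ```
   def commitTransactionalOffsets(request: TxnOffsetCommitRequestData): CompletableFuture[Errors] = {
     // run the verification path first, then schedule the write only if it passes
     verifyTransaction(request.producerId, request.producerEpoch).thenCompose { error =>
       if (error == Errors.NONE)
         appendForGroup(request)                  // schedule the write operation
       else
         CompletableFuture.completedFuture(error) // return the verification error to the caller
     }
   }
   ```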
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   





Re: [PR] KAFKA-14588 ZK configuration moved to ZkConfig [kafka]

2024-01-08 Thread via GitHub


nizhikov commented on PR #15075:
URL: https://github.com/apache/kafka/pull/15075#issuecomment-1880988000

   Hello @mimaison,
   
   This PR is based on #14871 and includes the changes made in it.
   I think it will be better to review #14871 first and then return to these changes.
   
   I will add a "[WIP]" label to this PR.
   Sorry for any inconvenience.





Re: [PR] KAFKA-14588 Move DynamicConfig to server-commons [kafka]

2024-01-08 Thread via GitHub


nizhikov commented on PR #14871:
URL: https://github.com/apache/kafka/pull/14871#issuecomment-1880989880

   Hello @mimaison,
   please take a look here before reviewing #15075.





[PR] MINOR: remove classic group preparing rebalance sensor [kafka]

2024-01-08 Thread via GitHub


jeffkbkim opened a new pull request, #15143:
URL: https://github.com/apache/kafka/pull/15143

   *More detailed description of your change,
   if necessary. The PR title and PR message become
   the squashed commit message, so use a separate
   comment to ping reviewers.*
   
   *Summary of testing strategy (including rationale)
   for the feature or bug fix. Unit and/or integration
   tests are expected for any behaviour change and
   system tests should be considered for larger changes.*
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   





Re: [PR] KAFKA-15853: Move KafkaConfig to server module [kafka]

2024-01-08 Thread via GitHub


OmniaGM commented on PR #15103:
URL: https://github.com/apache/kafka/pull/15103#issuecomment-1881108764

   > Thanks for the PR. It's really big, is there a way to split this into 
smaller PRs? For example we have #15075 in progress to move the ZooKeeper 
configs.
   
   I can try to break it down; I think some interfaces can be moved out in a separate PR. The problem is that KafkaConfig is imported in many places and depends on too many other classes, both in core and outside of core. Let me see what I can do.





Re: [PR] KAFKA-15747: KRaft support in DynamicConnectionQuotaTest [kafka]

2024-01-08 Thread via GitHub


mimaison commented on code in PR #15028:
URL: https://github.com/apache/kafka/pull/15028#discussion_r1444736522


##
core/src/test/scala/integration/kafka/network/DynamicConnectionQuotaTest.scala:
##
@@ -58,11 +61,13 @@ class DynamicConnectionQuotaTest extends BaseRequestTest {
   @BeforeEach
   override def setUp(testInfo: TestInfo): Unit = {
     super.setUp(testInfo)
-    TestUtils.createTopic(zkClient, topic, brokerCount, brokerCount, servers)
+    admin = TestUtils.createAdminClient(brokers, listener)

Review Comment:
   This can be simplified with:
   ```
   admin = createAdminClient(listener)
   ```






[jira] [Resolved] (KAFKA-15325) Integrate topicId in OffsetFetch and OffsetCommit async consumer calls

2024-01-08 Thread Lianet Magrans (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lianet Magrans resolved KAFKA-15325.

Resolution: Duplicate

> Integrate topicId in OffsetFetch and OffsetCommit async consumer calls
> --
>
> Key: KAFKA-15325
> URL: https://issues.apache.org/jira/browse/KAFKA-15325
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Assignee: Lianet Magrans
>Priority: Major
>  Labels: kip-848, kip-848-client-support
> Fix For: 3.8.0
>
>
> KIP-848 introduces support for topicIds in the OffsetFetch and OffsetCommit 
> APIs. The consumer calls to those APIs should be updated to include topicIds 
> when available.





[jira] [Commented] (KAFKA-15558) Determine if Timer should be used elsewhere in PrototypeAsyncConsumer.updateFetchPositions()

2024-01-08 Thread Andrew Schofield (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-15558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17804324#comment-17804324
 ] 

Andrew Schofield commented on KAFKA-15558:
--

[~phuctran] I think my PR was in the same area and was a significant 
improvement, but it's not the same thing. I think there's a piece of work to 
improve the handling of timeouts in fetch.

> Determine if Timer should be used elsewhere in 
> PrototypeAsyncConsumer.updateFetchPositions()
> 
>
> Key: KAFKA-15558
> URL: https://issues.apache.org/jira/browse/KAFKA-15558
> Project: Kafka
>  Issue Type: Task
>  Components: clients, consumer
>Reporter: Kirk True
>Assignee: Phuc Hong Tran
>Priority: Major
>  Labels: consumer-threading-refactor, kip-848-preview
> Fix For: 3.8.0
>
>
> This is a followup ticket based on a question from [~junrao] when reviewing 
> the [fetch request manager pull 
> request|https://github.com/apache/kafka/pull/14406]:
> {quote}It still seems weird that we only use the timer for 
> {{{}refreshCommittedOffsetsIfNeeded{}}}, but not for other cases where we 
> don't have valid fetch positions. For example, if all partitions are in 
> {{AWAIT_VALIDATION}} state, it seems that {{PrototypeAsyncConsumer.poll()}} 
> will just go in a busy loop, which is not efficient.
> {quote}
> The goal here is to determine if we should also be propagating the Timer to 
> the validate positions and reset positions operations.
> Note: we should also investigate if the existing {{KafkaConsumer}} 
> implementation should be fixed, too.





Re: [PR] KAFKA-15746: KRaft support in ControllerMutationQuotaTest [kafka]

2024-01-08 Thread via GitHub


mimaison commented on code in PR #15038:
URL: https://github.com/apache/kafka/pull/15038#discussion_r1444747617


##
core/src/test/scala/unit/kafka/server/ControllerMutationQuotaTest.scala:
##
@@ -139,20 +143,23 @@ class ControllerMutationQuotaTest extends BaseRequestTest {
       val (_, errors) = createTopics(Map("topic" -> 1), StrictDeleteTopicsRequestVersion)
       assertEquals(Set(Errors.NONE), errors.values.toSet)
 
-      // Metric must be there with the correct config
-      waitQuotaMetric(principal.getName, ControllerMutationRate)
+      if (!isKRaftTest()) {

Review Comment:
   Why are we only doing these checks in ZK mode? Isn't the controller mutation 
quota also working in KRaft mode?






[jira] [Updated] (KAFKA-16089) Kafka Streams still leaking memory in 3.7

2024-01-08 Thread Stanislav Kozlovski (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stanislav Kozlovski updated KAFKA-16089:

Fix Version/s: (was: 3.7.0)

> Kafka Streams still leaking memory in 3.7
> -
>
> Key: KAFKA-16089
> URL: https://issues.apache.org/jira/browse/KAFKA-16089
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 3.7.0
>Reporter: Lucas Brutschy
>Priority: Blocker
> Attachments: graphviz (1).svg
>
>
> In 
> [https://github.com/apache/kafka/commit/58d6d2e5922df60cdc5c5c1bcd1c2d82dd2b71e2]
>  a leak was fixed in the release candidate for 3.7.
>  
> However, Kafka Streams still seems to be leaking memory (just slower) after 
> the fix.
>  
> Attached is the `jeprof` output right before a crash after ~11 hours.





[jira] [Updated] (KAFKA-16082) JBOD: Possible dataloss when moving leader partition

2024-01-08 Thread Stanislav Kozlovski (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stanislav Kozlovski updated KAFKA-16082:

Fix Version/s: (was: 3.7.0)

> JBOD: Possible dataloss when moving leader partition
> 
>
> Key: KAFKA-16082
> URL: https://issues.apache.org/jira/browse/KAFKA-16082
> Project: Kafka
>  Issue Type: Bug
>  Components: jbod
>Affects Versions: 3.7.0
>Reporter: Proven Provenzano
>Assignee: Gaurav Narula
>Priority: Blocker
>
> There is a possible dataloss scenario
> when using JBOD,
> when moving the partition leader log from one directory to another on the 
> same broker,
> when after the destination log has caught up to the source log and after the 
> broker has sent an update to the partition assignment
> if the broker accepts and commits a new record for the partition and then the 
> broker restarts and the original partition leader log is lost
> then the destination log would not contain the new record.





Re: [PR] KAFKA-15741: KRaft support in DescribeConsumerGroupTest [kafka]

2024-01-08 Thread via GitHub


mimaison commented on code in PR #14668:
URL: https://github.com/apache/kafka/pull/14668#discussion_r1444787636


##
core/src/test/scala/unit/kafka/admin/DescribeConsumerGroupTest.scala:
##
@@ -351,7 +367,9 @@ class DescribeConsumerGroupTest extends ConsumerGroupCommandTest {
     val (result, succeeded) = TestUtils.computeUntilTrue(service.collectGroupOffsets(group)) {
       case (state, assignments) =>
         val testGroupAssignments = assignments.toSeq.flatMap(_.filter(_.group == group))
+

Review Comment:
   Let's try to not make unnecessary changes like these.



##
core/src/test/scala/unit/kafka/admin/DescribeConsumerGroupTest.scala:
##
@@ -35,9 +35,11 @@ class DescribeConsumerGroupTest extends ConsumerGroupCommandTest {
   private val describeTypeState = Array(Array("--state"))
   private val describeTypes = describeTypeOffsets ++ describeTypeMembers ++ describeTypeState
 
-  @Test
-  def testDescribeNonExistingGroup(): Unit = {
-    TestUtils.createOffsetsTopic(zkClient, servers)
+  @ParameterizedTest(name = TestInfoUtils.TestWithParameterizedQuorumName)
+  @ValueSource(strings = Array("zk", "kraft"))
+  def testDescribeNonExistingGroup(quorum: String): Unit = {
+    createOffsetsTopic();

Review Comment:
   Nit: Can we remove the unneeded semicolon?
   Same below






[jira] [Resolved] (KAFKA-16016) Migrate utility scripts to kafka codebase

2024-01-08 Thread Vedarth Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vedarth Sharma resolved KAFKA-16016.

  Reviewer: Manikumar
Resolution: Fixed

> Migrate utility scripts to kafka codebase
> -
>
> Key: KAFKA-16016
> URL: https://issues.apache.org/jira/browse/KAFKA-16016
> Project: Kafka
>  Issue Type: Sub-task
>  Components: core
>Reporter: Vedarth Sharma
>Assignee: Vedarth Sharma
>Priority: Blocker
> Fix For: 3.8.0
>
>
> Migrate the logic implemented in golang to kafka codebase by creating a new 
> entrypoint for docker images





[jira] [Resolved] (KAFKA-15725) KRaft support in FetchRequestTest

2024-01-08 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-15725.

Fix Version/s: 3.7.0
   Resolution: Fixed

> KRaft support in FetchRequestTest
> -
>
> Key: KAFKA-15725
> URL: https://issues.apache.org/jira/browse/KAFKA-15725
> Project: Kafka
>  Issue Type: Task
>  Components: core
>Reporter: Sameer Tejani
>Priority: Minor
>  Labels: kraft, kraft-test, newbie
> Fix For: 3.7.0
>
>
> The following tests in FetchRequestTest in 
> core/src/test/scala/unit/kafka/server/FetchRequestTest.scala need to be 
> updated to support KRaft
> 45 : def testBrokerRespectsPartitionsOrderAndSizeLimits(): Unit = {
> 147 : def testFetchRequestV4WithReadCommitted(): Unit = {
> 165 : def testFetchRequestToNonReplica(): Unit = {
> 195 : def testLastFetchedEpochValidation(): Unit = {
> 200 : def testLastFetchedEpochValidationV12(): Unit = {
> 247 : def testCurrentEpochValidation(): Unit = {
> 252 : def testCurrentEpochValidationV12(): Unit = {
> 295 : def testEpochValidationWithinFetchSession(): Unit = {
> 300 : def testEpochValidationWithinFetchSessionV12(): Unit = {
> 361 : def testDownConversionWithConnectionFailure(): Unit = {
> 428 : def testDownConversionFromBatchedToUnbatchedRespectsOffset(): Unit = {
> 509 : def testCreateIncrementalFetchWithPartitionsInErrorV12(): Unit = {
> 568 : def testFetchWithPartitionsWithIdError(): Unit = {
> 610 : def testZStdCompressedTopic(): Unit = {
> 657 : def testZStdCompressedRecords(): Unit = {
> Scanned 783 lines. Found 0 KRaft tests out of 15 tests



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] KAFKA-14588 Move DynamicConfig to server-commons [kafka]

2024-01-08 Thread via GitHub


nizhikov commented on PR #14871:
URL: https://github.com/apache/kafka/pull/14871#issuecomment-1881211127

   @mimaison @ijuma Guys, I need your advice here.
   
   The `ConfigCommand` code is tightly coupled with other parts of the `core` 
module.
   In this PR I tried to move one of the dependencies, `DynamicConfig`, to 
`server-commons`.
   
   There is still one method that has not been moved:
   
   ```
 object Broker {
   DynamicBrokerConfig.addDynamicConfigs(BROKER_CONFIG_DEF)
   val nonDynamicProps = KafkaConfig.configNames.toSet -- 
BROKER_CONFIG_DEF.names.asScala
 }
   ```
   
   The dependencies are `DynamicConfig` -> `DynamicBrokerConfig` -> `KafkaConfig`, 
etc.
   
   
   Can you please advise me - should I continue and try to move the whole chain 
to Java?
   I can try to make a separate PR moving `KafkaConfig` and the other classes.
   Or is it too much for now, and should I take another approach for these 
changes?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (KAFKA-15832) Trigger client reconciliation based on manager poll

2024-01-08 Thread Lianet Magrans (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lianet Magrans updated KAFKA-15832:
---
Description: 
Currently the reconciliation logic on the client is triggered when a new target 
assignment is received and resolved, or when new unresolved target assignments 
are discovered in metadata.

This could be improved by triggering the reconciliation logic on each poll 
iteration, to reconcile whatever is ready to be reconciled. This would require 
changes to support poll on the MembershipManager, and integrate it with the 
current polling logic in the background thread.

As a result of this task, it should be ensured that the client always makes 
progress (not stuck with pending reconciliations in JOINING or RECONCILING), 
considering edge cases where the client could have an assignment ready to 
reconcile, but continue to receive null assignments from the coordinator 
indicating that nothing has changed from the last assignment sent.

  was:
Currently the reconciliation logic on the client is triggered when a new target 
assignment is received and resolved, or when new unresolved target assignments 
are discovered in metadata.

This could be improved by triggering the reconciliation logic on each poll 
iteration, to reconcile whatever is ready to be reconciled. This would require 
changes to support poll on the MembershipManager, and integrate it with the 
current polling logic in the background thread.


> Trigger client reconciliation based on manager poll
> ---
>
> Key: KAFKA-15832
> URL: https://issues.apache.org/jira/browse/KAFKA-15832
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Assignee: Lianet Magrans
>Priority: Major
>  Labels: kip-848, kip-848-client-support, kip-848-e2e, 
> kip-848-preview
> Fix For: 3.8.0
>
>
> Currently the reconciliation logic on the client is triggered when a new 
> target assignment is received and resolved, or when new unresolved target 
> assignments are discovered in metadata.
> This could be improved by triggering the reconciliation logic on each poll 
> iteration, to reconcile whatever is ready to be reconciled. This would 
> require changes to support poll on the MembershipManager, and integrate it 
> with the current polling logic in the background thread.
> As a result of this task, it should be ensured that the client always makes 
> progress (not stuck with pending reconciliations in JOINING or RECONCILING), 
> considering edge cases where the client could have an assignment ready to 
> reconcile, but continue to receive null assignments from the coordinator 
> indicating that nothing has changed from the last assignment sent.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] KAFKA-14588 Move DynamicConfig to server-commons [kafka]

2024-01-08 Thread via GitHub


ijuma commented on PR #14871:
URL: https://github.com/apache/kafka/pull/14871#issuecomment-1881235062

   KafkaConfig should move to `server`, not `server-common`. We can introduce 
new interfaces if needed to disentangle the dependencies. Moving a large part 
of `core` to `server-common` defeats the modularization effort.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-14588 Move DynamicConfig to server-commons [kafka]

2024-01-08 Thread via GitHub


ijuma commented on PR #14871:
URL: https://github.com/apache/kafka/pull/14871#issuecomment-1881237874

   Even `DynamicConfig` seems odd to be in `server-common` instead of `server`. 
Things that are specific to the broker should not be in `server-common`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (KAFKA-15832) Trigger client reconciliation based on manager poll

2024-01-08 Thread Lianet Magrans (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lianet Magrans updated KAFKA-15832:
---
Description: 
Currently the reconciliation logic on the client is triggered when a new target 
assignment is received and resolved, or when new unresolved target assignments 
are discovered in metadata.

This could be improved by triggering the reconciliation logic on each poll 
iteration, to reconcile whatever is ready to be reconciled. This would require 
changes to support poll on the MembershipManager, and integrate it with the 
current polling logic in the background thread.

As a result of this task, it should be ensured that the client always 
reconciles whatever it has pending that has not been removed by the 
coordinator. (This should address edge cases where a client might get stuck 
JOINING/RECONCILING, with a pending reconciliation, where null assignments are 
exchanged between the client and the coordinator, while the long-running 
reconciliation completes) 

  was:
Currently the reconciliation logic on the client is triggered when a new target 
assignment is received and resolved, or when new unresolved target assignments 
are discovered in metadata.

This could be improved by triggering the reconciliation logic on each poll 
iteration, to reconcile whatever is ready to be reconciled. This would require 
changes to support poll on the MembershipManager, and integrate it with the 
current polling logic in the background thread.

As a result of this task, it should be ensured that the client always makes 
progress (not stuck with pending reconciliations in JOINING or RECONCILING), 
considering edge cases where the client could have an assignment ready to 
reconcile, but continue to receive null assignments from the coordinator 
indicating that nothing has changed from the last assignment sent.


> Trigger client reconciliation based on manager poll
> ---
>
> Key: KAFKA-15832
> URL: https://issues.apache.org/jira/browse/KAFKA-15832
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Assignee: Lianet Magrans
>Priority: Major
>  Labels: kip-848, kip-848-client-support, kip-848-e2e, 
> kip-848-preview
> Fix For: 3.8.0
>
>
> Currently the reconciliation logic on the client is triggered when a new 
> target assignment is received and resolved, or when new unresolved target 
> assignments are discovered in metadata.
> This could be improved by triggering the reconciliation logic on each poll 
> iteration, to reconcile whatever is ready to be reconciled. This would 
> require changes to support poll on the MembershipManager, and integrate it 
> with the current polling logic in the background thread.
> As a result of this task, it should be ensured that the client always 
> reconciles whatever it has pending that has not been removed by the 
> coordinator. (This should address edge cases where a client might get stuck 
> JOINING/RECONCILING, with a pending reconciliation, where null assignments 
> are exchanged between the client and the coordinator, while the long-running 
> reconciliation completes) 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] KAFKA-14588 Move DynamicConfig to server-commons [kafka]

2024-01-08 Thread via GitHub


nizhikov commented on PR #14871:
URL: https://github.com/apache/kafka/pull/14871#issuecomment-1881277689

   @ijuma We can use direct calls like 
`QuotaConfigs.userAndClientQuotaConfigs().names()`, 
`QuotaConfigs.scramMechanismsPlusUserAndClientQuotaConfigs().names()` to obtain 
the config names for the help message in `ConfigCommand`.
   
   But I can't see how `DynamicConfig.Broker.names` can be created without 
accessing `KafkaConfig.configKeys`:
   
   ```
   object DynamicConfig {
   
 object Broker {
   ...
   DynamicBrokerConfig.addDynamicConfigs(brokerConfigDef)
 }
   }
   
   object DynamicBrokerConfig {
 private[server] def addDynamicConfigs(configDef: ConfigDef): Unit = {
   KafkaConfig.configKeys.forKeyValue { (configName, config) =>
 if (AllDynamicConfigs.contains(configName)) {
   configDef.define(config.name, config.`type`, config.defaultValue, 
config.validator,
 config.importance, config.documentation, config.group, 
config.orderInGroup, config.width,
 config.displayName, config.dependents, config.recommender)
 }
   }
 }
   }
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (KAFKA-15558) Determine if Timer should be used elsewhere in PrototypeAsyncConsumer.updateFetchPositions()

2024-01-08 Thread Phuc Hong Tran (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-15558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17804348#comment-17804348
 ] 

Phuc Hong Tran commented on KAFKA-15558:


[~schofielaj], can you elaborate a bit more about it? What I get from the 
original commenter is that we should also use a timer to control the runtime of 
operations like validating fetch positions and resetting fetch positions, which 
I saw you have already implemented.

> Determine if Timer should be used elsewhere in 
> PrototypeAsyncConsumer.updateFetchPositions()
> 
>
> Key: KAFKA-15558
> URL: https://issues.apache.org/jira/browse/KAFKA-15558
> Project: Kafka
>  Issue Type: Task
>  Components: clients, consumer
>Reporter: Kirk True
>Assignee: Phuc Hong Tran
>Priority: Major
>  Labels: consumer-threading-refactor, kip-848-preview
> Fix For: 3.8.0
>
>
> This is a followup ticket based on a question from [~junrao] when reviewing 
> the [fetch request manager pull 
> request|https://github.com/apache/kafka/pull/14406]:
> {quote}It still seems weird that we only use the timer for 
> {{{}refreshCommittedOffsetsIfNeeded{}}}, but not for other cases where we 
> don't have valid fetch positions. For example, if all partitions are in 
> {{AWAIT_VALIDATION}} state, it seems that {{PrototypeAsyncConsumer.poll()}} 
> will just go in a busy loop, which is not efficient.
> {quote}
> The goal here is to determine if we should also be propagating the Timer to 
> the validate positions and reset positions operations.
> Note: we should also investigate if the existing {{KafkaConsumer}} 
> implementation should be fixed, too.
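
A minimal sketch of the idea under discussion - bounding the validate/reset
work with a Timer - assuming the org.apache.kafka.common.utils Time/Timer
utilities; validateAndResetPositions() is a hypothetical stand-in for the
actual validate/reset logic:

```java
import java.time.Duration;
import org.apache.kafka.common.utils.Time;
import org.apache.kafka.common.utils.Timer;

public class UpdateFetchPositionsSketch {

    // Returns true if valid fetch positions were obtained before the timeout.
    static boolean updateFetchPositions(Duration timeout) {
        Timer timer = Time.SYSTEM.timer(timeout);
        while (!timer.isExpired()) {
            if (validateAndResetPositions()) {
                return true;
            }
            timer.update(); // refresh elapsed time before checking expiry again
        }
        return false; // timed out; the caller decides whether to retry or fail
    }

    // Hypothetical placeholder for the validate/reset-positions operations.
    private static boolean validateAndResetPositions() {
        return false;
    }
}
```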



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (KAFKA-15558) Determine if Timer should be used elsewhere in PrototypeAsyncConsumer.updateFetchPositions()

2024-01-08 Thread Phuc Hong Tran (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-15558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17804348#comment-17804348
 ] 

Phuc Hong Tran edited comment on KAFKA-15558 at 1/8/24 3:45 PM:


[~schofielaj], can you elaborate a bit more about the improvement we need? What 
I get from the original commenter is that we should also use a timer to control 
the runtime of operations like validating fetch positions and resetting fetch 
positions, which I saw you have already implemented.


was (Author: JIRAUSER301295):
[~schofielaj], can you elaborate a bit more about it? What I get from the 
original commenter is that we should also use a timer to control the runtime of 
operations like validating fetch positions and resetting fetch positions, which 
I saw you have already implemented.

> Determine if Timer should be used elsewhere in 
> PrototypeAsyncConsumer.updateFetchPositions()
> 
>
> Key: KAFKA-15558
> URL: https://issues.apache.org/jira/browse/KAFKA-15558
> Project: Kafka
>  Issue Type: Task
>  Components: clients, consumer
>Reporter: Kirk True
>Assignee: Phuc Hong Tran
>Priority: Major
>  Labels: consumer-threading-refactor, kip-848-preview
> Fix For: 3.8.0
>
>
> This is a followup ticket based on a question from [~junrao] when reviewing 
> the [fetch request manager pull 
> request|https://github.com/apache/kafka/pull/14406]:
> {quote}It still seems weird that we only use the timer for 
> {{{}refreshCommittedOffsetsIfNeeded{}}}, but not for other cases where we 
> don't have valid fetch positions. For example, if all partitions are in 
> {{AWAIT_VALIDATION}} state, it seems that {{PrototypeAsyncConsumer.poll()}} 
> will just go in a busy loop, which is not efficient.
> {quote}
> The goal here is to determine if we should also be propagating the Timer to 
> the validate positions and reset positions operations.
> Note: we should also investigate if the existing {{KafkaConsumer}} 
> implementation should be fixed, too.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] KAFKA-14588 Move DynamicConfig to server-commons [kafka]

2024-01-08 Thread via GitHub


ijuma commented on PR #14871:
URL: https://github.com/apache/kafka/pull/14871#issuecomment-1881328863

   It seems like the help message is wrong though, right? It will print the 
values based on the tool's version rather than the server's version. Properly 
separating this code will surface bugs like this. To get the correct data, the 
tool has to ask the server via RPC.
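   
   For illustration, a minimal sketch (placeholder bootstrap address and broker 
id) of asking the server via the Admin API's DescribeConfigs RPC, so the output 
reflects the server's version rather than the tool's:
   
   ```java
   import java.util.Collections;
   import java.util.Properties;
   import org.apache.kafka.clients.admin.Admin;
   import org.apache.kafka.clients.admin.Config;
   import org.apache.kafka.common.config.ConfigResource;
   
   public class DescribeBrokerConfigsSketch {
       public static void main(String[] args) throws Exception {
           Properties props = new Properties();
           props.put("bootstrap.servers", "localhost:9092"); // placeholder
           try (Admin admin = Admin.create(props)) {
               ConfigResource broker =
                   new ConfigResource(ConfigResource.Type.BROKER, "0"); // broker id placeholder
               Config config = admin.describeConfigs(Collections.singleton(broker))
                       .all().get().get(broker);
               // Entries come from the broker, so they match the server's version.
               config.entries().forEach(e -> System.out.println(e.name()));
           }
       }
   }
   ```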


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Resolved] (KAFKA-15515) Remove duplicated integration tests for new consumer

2024-01-08 Thread Lianet Magrans (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lianet Magrans resolved KAFKA-15515.

Fix Version/s: (was: 4.0.0)
   Resolution: Not A Problem

PlaintextAsyncConsumer.scala was discarded before merging as it was no longer 
needed (the existing PlaintextConsumer.scala was parameterized).

> Remove duplicated integration tests for new consumer
> 
>
> Key: KAFKA-15515
> URL: https://issues.apache.org/jira/browse/KAFKA-15515
> Project: Kafka
>  Issue Type: Test
>  Components: clients, consumer, unit tests
>Reporter: Lianet Magrans
>Priority: Major
>  Labels: consumer-threading-refactor
>
> This task involves removing the temporary `PlaintextAsyncConsumer` file 
> containing duplicated integration tests for the new consumer. The copy was 
> generated to catch regressions and validate functionality in the new consumer 
> while in development. It should be deleted when the new consumer is fully 
> implemented and the existing integration tests (`PlaintextConsumerTest`) can 
> be executed for both implementations.
>  
> Context:
>  
> For the current KafkaConsumer, a set of integration tests exist in the file 
> PlaintextConsumerTest. Those tests cannot be executed as such for the new 
> consumer implementation for 2 main reasons
> - the new consumer is being developed as a new PrototypeAsyncConsumer class, 
> in parallel to the existing KafkaConsumer. 
> - the new consumer is under development, so it does not support all the 
> consumer functionality yet. 
>  
> In order to be able to run the subsets of tests that the new consumer 
> supports while the implementation completes, it was decided to:
>  - make a copy of the `PlaintextConsumerTest` class, named 
> PlaintextAsyncConsumer.
> - leave all the existing integration tests that cover the simple consumer 
> case unchanged, and disable the tests that are not yet supported by the new 
> consumer. Disabled tests will be enabled as the async consumer
> evolves.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] KAFKA-14588 Move DynamicConfig to server-commons [kafka]

2024-01-08 Thread via GitHub


ijuma commented on PR #14871:
URL: https://github.com/apache/kafka/pull/14871#issuecomment-1881344819

   By the way, if some of this disentangling is too hard to do all at once - it 
may actually be easier to rewrite in Java and move it to the `server` module. 
Then we can take another step where we move the pieces from `server` to 
`server-common`. If we bloat `server-common`, I think it will be harder to make 
it slim again later.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (KAFKA-15832) Trigger client reconciliation based on manager poll

2024-01-08 Thread Lianet Magrans (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lianet Magrans updated KAFKA-15832:
---
Description: 
Currently the reconciliation logic on the client is triggered when a new target 
assignment is received and resolved, or when new unresolved target assignments 
are discovered in metadata.

This could be improved by triggering the reconciliation logic on each poll 
iteration, to reconcile whatever is ready to be reconciled. This would require 
changes to support poll on the MembershipManager, and integrate it with the 
current polling logic in the background thread.

As a result of this task, it should be ensured that the client always 
reconciles whatever it has pending that has not been removed by the 
coordinator. (This should address edge cases where a client might get stuck 
JOINING/RECONCILING, with a pending reconciliation, where null assignments are 
exchanged between the client and the coordinator, while the long-running 
reconciliation completes. Note that currently, the MembershipManager relies on 
assignment != null to trigger the reconciliation of pending assignments) 

  was:
Currently the reconciliation logic on the client is triggered when a new target 
assignment is received and resolved, or when new unresolved target assignments 
are discovered in metadata.

This could be improved by triggering the reconciliation logic on each poll 
iteration, to reconcile whatever is ready to be reconciled. This would require 
changes to support poll on the MembershipManager, and integrate it with the 
current polling logic in the background thread.

As a result of this task, it should be ensured that the client always 
reconciles whatever it has pending that has not been removed by the 
coordinator. (This should address edge cases where a client might get stuck 
JOINING/RECONCILING, with a pending reconciliation, where null assignments are 
exchanged between the client and the coordinator, while the long-running 
reconciliation completes) 


> Trigger client reconciliation based on manager poll
> ---
>
> Key: KAFKA-15832
> URL: https://issues.apache.org/jira/browse/KAFKA-15832
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Assignee: Lianet Magrans
>Priority: Major
>  Labels: kip-848, kip-848-client-support, kip-848-e2e, 
> kip-848-preview
> Fix For: 3.8.0
>
>
> Currently the reconciliation logic on the client is triggered when a new 
> target assignment is received and resolved, or when new unresolved target 
> assignments are discovered in metadata.
> This could be improved by triggering the reconciliation logic on each poll 
> iteration, to reconcile whatever is ready to be reconciled. This would 
> require changes to support poll on the MembershipManager, and integrate it 
> with the current polling logic in the background thread.
> As a result of this task, it should be ensured that the client always 
> reconciles whatever it has pending that has not been removed by the 
> coordinator. (This should address edge cases where a client might get stuck 
> JOINING/RECONCILING, with a pending reconciliation, where null assignments 
> are exchanged between the client and the coordinator, while the long-running 
> reconciliation completes. Note that currently, the MembershipManager relies 
> on assignment != null to trigger the reconciliation of pending assignments) 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-15839) Review topic ID integration in consumer membershipManager

2024-01-08 Thread Lianet Magrans (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lianet Magrans updated KAFKA-15839:
---
Description: 
TopicIdPartition is currently used in the reconciliation path. It could be used 
more, leaving topicPartitions only when necessary for the callbacks and the 
interaction with the subscription state, which does not fully support topic IDs 
yet.

Ensure that we properly handle topic re-creation (same name, diff topic IDs) in 
the reconciliation process (assignment cache, same assignment comparison, etc.)

  was:
It is currently used in the reconciliation path. It could be used more, leaving 
topicPartitions only when necessary for the callbacks and the interaction with 
the subscription state, which does not fully support topic IDs yet.

Ensure that we properly handle topic re-creation (same name, diff topic IDs) in 
the reconciliation process (assignment cache, same assignment comparison, etc.)

Summary: Review topic ID integration in consumer membershipManager  
(was: Improve TopicIdPartition integration in consumer membershipManager)

> Review topic ID integration in consumer membershipManager
> -
>
> Key: KAFKA-15839
> URL: https://issues.apache.org/jira/browse/KAFKA-15839
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Assignee: Lianet Magrans
>Priority: Major
>  Labels: kip-848, kip-848-client-support, kip-848-e2e, 
> kip-848-preview
> Fix For: 3.8.0
>
>
> TopicIdPartition is currently used in the reconciliation path. It could be used 
> more, leaving topicPartitions only when necessary for the callbacks and the 
> interaction with the subscription state, which does not fully support topic IDs 
> yet.
> Ensure that we properly handle topic re-creation (same name, diff topic IDs) 
> in the reconciliation process (assignment cache, same assignment comparison, 
> etc.)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[PR] Draft: Revert RocksDB change and disable state updater [kafka]

2024-01-08 Thread via GitHub


lucasbru opened a new pull request, #15144:
URL: https://github.com/apache/kafka/pull/15144

   This is an attempt to disable all features that are currently causing 
problems in the 3.7 release candidate.
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-14588 Move DynamicConfig to server-commons [kafka]

2024-01-08 Thread via GitHub


nizhikov commented on PR #14871:
URL: https://github.com/apache/kafka/pull/14871#issuecomment-1881376286

   > To get the correct data, the tool has to ask the server via RPC. Am I 
missing something?
   
   Different version of Kafka supports different dynamic properties.
   So, to create correct list of supported properties we must get it from 
server.
   
   If yes, I will create separate Jira for this.
   
   > Even DynamicConfig seems odd to be in server-common instead of server. 
   
   Got it. Thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (KAFKA-16054) Sudden 100% CPU on a broker

2024-01-08 Thread Oleksandr Shulgin (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-16054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17804353#comment-17804353
 ] 

Oleksandr Shulgin commented on KAFKA-16054:
---

I was trying to implement the same workaround in Kafka's Selector class, 
but instead I found something that may explain the issue that we are seeing.

It seems that it is caused by the combination of:
1. Setting timeout to 0 in the poll() method, on condition 
"madeReadProgressLastCall && dataInBuffers": 
https://github.com/apache/kafka/blob/3.6/clients/src/main/java/org/apache/kafka/common/network/Selector.java#L449-L450
2. Setting madeReadProgressLastCall to "true", even when nothing was actually 
read from a channel: 
https://github.com/apache/kafka/blob/3.6/clients/src/main/java/org/apache/kafka/common/network/Selector.java#L688

The flag madeReadProgressLastCall is apparently there to prevent Kafka user 
space from running out of memory due to all the buffered data, but in this case 
it causes kernel space to run out of TCP buffers...

Can someone more experienced with the code validate my assumptions, please?
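
For context, a minimal sketch of the classic workaround (modeled on the Netty
approach referenced in this issue, not on Kafka's actual Selector code): count
consecutive select() calls that return immediately with zero ready keys, and
rebuild the java.nio Selector once a threshold is crossed.

```java
import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

public class EpollSpinWorkaroundSketch {
    private static final int SPIN_THRESHOLD = 512; // arbitrary threshold

    private Selector selector;
    private int spuriousWakeups = 0;

    EpollSpinWorkaroundSketch() throws IOException {
        this.selector = Selector.open();
    }

    void pollOnce(long timeoutMs) throws IOException {
        long start = System.nanoTime();
        int ready = selector.select(timeoutMs);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        if (ready == 0 && timeoutMs > 0 && elapsedMs < timeoutMs) {
            // Returned early with nothing ready: possibly the epoll spin bug.
            if (++spuriousWakeups >= SPIN_THRESHOLD) {
                selector = rebuildSelector(selector);
                spuriousWakeups = 0;
            }
        } else {
            spuriousWakeups = 0;
        }
    }

    // Re-register all valid keys on a fresh Selector and close the old one.
    private static Selector rebuildSelector(Selector old) throws IOException {
        Selector fresh = Selector.open();
        for (SelectionKey key : old.keys()) {
            if (key.isValid()) {
                key.channel().register(fresh, key.interestOps(), key.attachment());
            }
        }
        old.close();
        return fresh;
    }
}
```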

> Sudden 100% CPU on a broker
> ---
>
> Key: KAFKA-16054
> URL: https://issues.apache.org/jira/browse/KAFKA-16054
> Project: Kafka
>  Issue Type: Bug
>  Components: network
>Affects Versions: 3.3.2, 3.6.1
> Environment: Amazon AWS, c6g.4xlarge arm64 16 vCPUs + 30 GB,  Amazon 
> Linux
>Reporter: Oleksandr Shulgin
>Priority: Critical
>  Labels: linux
>
> We have observed now for the 3rd time in production the issue where a Kafka 
> broker will suddenly jump to 100% CPU usage and will not recover on its own 
> until manually restarted.
> After a deeper investigation, we now believe that this is an instance of the 
> infamous epoll bug. See:
> [https://github.com/netty/netty/issues/327]
> [https://github.com/netty/netty/pull/565] (original workaround)
> [https://github.com/netty/netty/blob/4.1/transport/src/main/java/io/netty/channel/nio/NioEventLoop.java#L624-L632]
>  (same workaround in the current Netty code)
> The first occurrence in our production environment was on 2023-08-26 and the 
> other two — on 2023-12-10 and 2023-12-20.
> Each time the high CPU issue is also resulting in this other issue (misplaced 
> messages) I was asking about on the users mailing list in September, but to 
> date got not a single reply, unfortunately: 
> [https://lists.apache.org/thread/x1thr4r0vbzjzq5sokqgrxqpsbnnd3yy]
> We still do not know how this other issue is happening.
> When the high CPU happens, top(1) reports a number of "data-plane-kafka..." 
> threads consuming ~60% user and ~40% system CPU, and the thread dump contains 
> a lot of stack traces like the following one:
> "data-plane-kafka-network-thread-67111914-ListenerName(PLAINTEXT)-PLAINTEXT-10"
>  #76 prio=5 os_prio=0 cpu=346710.78ms elapsed=243315.54s 
> tid=0xa12d7690 nid=0x20c runnable [0xfffed87fe000]
> java.lang.Thread.State: RUNNABLE
>   at sun.nio.ch.EPoll.wait(java.base@17.0.9/Native Method)
>   at sun.nio.ch.EPollSelectorImpl.doSelect(java.base@17.0.9/EPollSelectorImpl.java:118)
>   at sun.nio.ch.SelectorImpl.lockAndDoSelect(java.base@17.0.9/SelectorImpl.java:129)
>   - locked <0x0006c1246410> (a sun.nio.ch.Util$2)
>   - locked <0x0006c1246318> (a sun.nio.ch.EPollSelectorImpl)
>   at sun.nio.ch.SelectorImpl.select(java.base@17.0.9/SelectorImpl.java:141)
>   at org.apache.kafka.common.network.Selector.select(Selector.java:874)
>   at org.apache.kafka.common.network.Selector.poll(Selector.java:465)
>   at kafka.network.Processor.poll(SocketServer.scala:1107)
>   at kafka.network.Processor.run(SocketServer.scala:1011)
>   at java.lang.Thread.run(java.base@17.0.9/Thread.java:840)
> At the same time the Linux kernel reports repeatedly "TCP: out of memory – 
> consider tuning tcp_mem".
> We are running relatively big machines in production — c6g.4xlarge with 30 GB 
> RAM and the auto-configured setting is: "net.ipv4.tcp_mem = 376608 502145 
> 753216", which corresponds to ~3 GB for the "high" parameter, assuming 4 KB 
> memory pages.
> We were able to reproduce the issue in our test environment (which is using 
> 4x smaller machines), simply by tuning the tcp_mem down by a factor of 10: 
> "sudo sysctl -w net.ipv4.tcp_mem='9234 12313 18469'". The strace of one of 
> the busy Kafka threads shows the following syscalls repeating constantly:
> epoll_pwait(15558, [{events=EPOLLOUT, data={u32=12286, u64=468381628432382}}], 1024, 300, NULL, 8) = 1
> fstat(12019, {st_mode=S_IFREG|0644, st_size=414428357, ...}) = 0
> fstat(12019, {st_mode=S_IFREG|0644, st_size=414428357, ...}) = 0
> sendfile(12286, 12019, [174899834], 947517) 

[PR] KAFKA-16089: Revert "KAFKA-14412: Better Rocks column family management" [kafka]

2024-01-08 Thread via GitHub


lucasbru opened a new pull request, #15145:
URL: https://github.com/apache/kafka/pull/15145

   This PR reverts #14852 and a follow-up fix, since it
   seems to be leaking memory still.
   
   This is still a draft PR, since I will let the reverted
   version soak for a little while longer, but I will ask
   for review / CI already.
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] Draft: Revert RocksDB change and disable state updater [kafka]

2024-01-08 Thread via GitHub


lucasbru closed pull request #15144: Draft: Revert RocksDB change and disable 
state updater
URL: https://github.com/apache/kafka/pull/15144


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] Draft: disable state updater in 3.7 [kafka]

2024-01-08 Thread via GitHub


lucasbru opened a new pull request, #15146:
URL: https://github.com/apache/kafka/pull/15146

   Several problems are still appearing while running 3.7 with
   the state updater. 
   
   This is a draft PR to revert the state updater. Waiting for
   a longer soak to confirm it, but would like to get approval
   and a passing CI run already.
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (KAFKA-15839) Review topic ID integration in consumer reconciliation process

2024-01-08 Thread Lianet Magrans (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lianet Magrans updated KAFKA-15839:
---
Summary: Review topic ID integration in consumer reconciliation process  
(was: Review topic ID integration in consumer membershipManager)

> Review topic ID integration in consumer reconciliation process
> --
>
> Key: KAFKA-15839
> URL: https://issues.apache.org/jira/browse/KAFKA-15839
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Assignee: Lianet Magrans
>Priority: Major
>  Labels: kip-848, kip-848-client-support, kip-848-e2e, 
> kip-848-preview
> Fix For: 3.8.0
>
>
> TopicIdPartition is currently used in the reconciliation path. It could be used 
> more, leaving topicPartitions only when necessary for the callbacks and the 
> interaction with the subscription state, which does not fully support topic IDs 
> yet.
> Ensure that we properly handle topic re-creation (same name, diff topic IDs) 
> in the reconciliation process (assignment cache, same assignment comparison, 
> etc.)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-15283) Add support for topic ID-related Consumer changes

2024-01-08 Thread Lianet Magrans (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-15283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17804360#comment-17804360
 ] 

Lianet Magrans commented on KAFKA-15283:


Not covered by my work for OffsetFetch and OffsetCommit v9 support, given that 
the v9 APIs introduced on the server side did not include topic ID as initially 
planned. This task is still needed, and it should be done in sync with the 
broker side, when it introduces topic ID in these API calls (probably in a v10 
version).

> Add support for topic ID-related Consumer changes
> -
>
> Key: KAFKA-15283
> URL: https://issues.apache.org/jira/browse/KAFKA-15283
> Project: Kafka
>  Issue Type: New Feature
>  Components: clients, consumer
>Reporter: Kirk True
>Priority: Major
>  Labels: kip-848, kip-848-client-support
> Fix For: 3.8.0
>
>
> Currently, {{KafkaConsumer}} keeps track of topic IDs in the in-memory 
> {{ConsumerMetadata}} object, and they are provided to the {{FETCH}} and 
> {{METADATA}} RPC calls.
> With KIP-848 the OffsetFetch and OffsetCommit will start using topic IDs in 
> the same way, so the new client implementation will provide it when issuing 
> those requests. Topic names should continue to be supported as needed by the 
> {{AdminClient}}.
> We should also review/clean-up the support for topic names in requests such 
> as the {{METADATA}} request (currently supporting topic names as well as 
> topic IDs on the client side).
> Tasks include:
>  * Introduce Topic ID in existing OffsetFetch and OffsetCommit API that will 
> be upgraded on the server to support topic ID
>  * Check topic ID propagation internally in the client based on RPCs 
> including it.
>  * Review existing support for topic names for potential clean-up if not needed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] KAFKA-15719: KRaft support in OffsetsForLeaderEpochRequestTest [kafka]

2024-01-08 Thread via GitHub


mimaison merged PR #15049:
URL: https://github.com/apache/kafka/pull/15049


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Resolved] (KAFKA-15719) KRaft support in OffsetsForLeaderEpochRequestTest

2024-01-08 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-15719.

Fix Version/s: 3.8.0
   Resolution: Fixed

> KRaft support in OffsetsForLeaderEpochRequestTest
> -
>
> Key: KAFKA-15719
> URL: https://issues.apache.org/jira/browse/KAFKA-15719
> Project: Kafka
>  Issue Type: Task
>  Components: core
>Reporter: Sameer Tejani
>Assignee: Zihao Lin
>Priority: Minor
>  Labels: kraft, kraft-test, newbie
> Fix For: 3.8.0
>
>
> The following tests in OffsetsForLeaderEpochRequestTest in 
> core/src/test/scala/unit/kafka/server/OffsetsForLeaderEpochRequestTest.scala 
> need to be updated to support KRaft
> 37 : def testOffsetsForLeaderEpochErrorCodes(): Unit = {
> 60 : def testCurrentEpochValidation(): Unit = {
> Scanned 127 lines. Found 0 KRaft tests out of 2 tests



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-15832) Trigger client reconciliation based on manager poll

2024-01-08 Thread Lianet Magrans (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lianet Magrans updated KAFKA-15832:
---
Parent: KAFKA-14048
Issue Type: Sub-task  (was: Improvement)

> Trigger client reconciliation based on manager poll
> ---
>
> Key: KAFKA-15832
> URL: https://issues.apache.org/jira/browse/KAFKA-15832
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Assignee: Lianet Magrans
>Priority: Major
>  Labels: kip-848, kip-848-client-support, kip-848-e2e, 
> kip-848-preview
> Fix For: 3.8.0
>
>
> Currently the reconciliation logic on the client is triggered when a new 
> target assignment is received and resolved, or when new unresolved target 
> assignments are discovered in metadata.
> This could be improved by triggering the reconciliation logic on each poll 
> iteration, to reconcile whatever is ready to be reconciled. This would 
> require changes to support poll on the MembershipManager, and integrate it 
> with the current polling logic in the background thread.
> As a result of this task, it should be ensured that the client always 
> reconciles whatever it has pending that has not been removed by the 
> coordinator. (This should address edge cases where a client might get stuck 
> JOINING/RECONCILING, with a pending reconciliation, where null assignments 
> are exchanged between the client and the coordinator, while the long-running 
> reconciliation completes. Note that currently, the MembershipManager relies 
> on assignment != null to trigger the reconciliation of pending assignments) 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-15283) Client support for OffsetFetch and OffsetCommit with topic ID

2024-01-08 Thread Lianet Magrans (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lianet Magrans updated KAFKA-15283:
---
Summary: Client support for OffsetFetch and OffsetCommit with topic ID  
(was: Add support for topic ID-related Consumer changes)

> Client support for OffsetFetch and OffsetCommit with topic ID
> -
>
> Key: KAFKA-15283
> URL: https://issues.apache.org/jira/browse/KAFKA-15283
> Project: Kafka
>  Issue Type: New Feature
>  Components: clients, consumer
>Reporter: Kirk True
>Priority: Major
>  Labels: kip-848, kip-848-client-support
> Fix For: 3.8.0
>
>
> Currently, {{KafkaConsumer}} keeps track of topic IDs in the in-memory 
> {{ConsumerMetadata}} object, and they are provided to the {{FETCH}} and 
> {{METADATA}} RPC calls.
> With KIP-848 the OffsetFetch and OffsetCommit will start using topic IDs in 
> the same way, so the new client implementation will provide it when issuing 
> those requests. Topic names should continue to be supported as needed by the 
> {{AdminClient}}.
> We should also review/clean-up the support for topic names in requests such 
> as the {{METADATA}} request (currently supporting topic names as well as 
> topic IDs on the client side).
> Tasks include:
>  * Introduce Topic ID in existing OffsetFetch and OffsetCommit API that will 
> be upgraded on the server to support topic ID
>  * Check topic ID propagation internally in the client based on RPCs 
> including it.
>  * Review existing support for topic names for potential clean-up if not needed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-15283) Client support for OffsetFetch and OffsetCommit with topic ID

2024-01-08 Thread Lianet Magrans (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lianet Magrans updated KAFKA-15283:
---
Parent: KAFKA-14048
Issue Type: Sub-task  (was: New Feature)

> Client support for OffsetFetch and OffsetCommit with topic ID
> -
>
> Key: KAFKA-15283
> URL: https://issues.apache.org/jira/browse/KAFKA-15283
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, consumer
>Reporter: Kirk True
>Priority: Major
>  Labels: kip-848, kip-848-client-support
> Fix For: 3.8.0
>
>
> Currently, {{KafkaConsumer}} keeps track of topic IDs in the in-memory 
> {{ConsumerMetadata}} object, and they are provided to the {{FETCH}} and 
> {{METADATA}} RPC calls.
> With KIP-848 the OffsetFetch and OffsetCommit will start using topic IDs in 
> the same way, so the new client implementation will provide it when issuing 
> those requests. Topic names should continue to be supported as needed by the 
> {{AdminClient}}.
> We should also review/clean-up the support for topic names in requests such 
> as the {{METADATA}} request (currently supporting topic names as well as 
> topic IDs on the client side).
> Tasks include:
>  * Introduce Topic ID in existing OffsetFetch and OffsetCommit API that will 
> be upgraded on the server to support topic ID
>  * Check topic ID propagation internally in the client based on RPCs 
> including it.
>  * Review existing support for topic names for potential clean-up if not needed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] KAFKA-14588 Move DynamicConfig to server-commons [kafka]

2024-01-08 Thread via GitHub


mimaison commented on PR #14871:
URL: https://github.com/apache/kafka/pull/14871#issuecomment-188143

   Note there's also https://github.com/apache/kafka/pull/15103 which is 
attempting to move KafkaConfig to the server module. You may want to sync with 
Omnia to avoid duplicating work.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (KAFKA-15847) Allow to resolve client metadata for specific topics

2024-01-08 Thread Lianet Magrans (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lianet Magrans updated KAFKA-15847:
---
Description: Currently, metadata updates requested through the metadata 
object request metadata for all topics. Consider allowing the partial updates 
that are already expressed as an intention in the Metadata class but not fully 
supported (investigate the background in case there were some specifics that led 
to this intention not being fully implemented).   (was: Currently metadata 
updates requested through the metadata object request metadata for all topics. 
Consider allowing the partial updates that are already expressed as an intention 
in the Metadata class but not fully supported. )

> Allow to resolve client metadata for specific topics
> 
>
> Key: KAFKA-15847
> URL: https://issues.apache.org/jira/browse/KAFKA-15847
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Priority: Major
>
> Currently metadata updates requested through the metadata object request 
> metadata for all topics. Consider allowing the partial updates that are 
> already expressed as an intention in the Metadata class but not fully 
> supported (investigate background in case there were some specifics that led 
> to this intention not being fully implemented) 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-15847) Consider partial metadata requests for client reconciliation

2024-01-08 Thread Lianet Magrans (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lianet Magrans updated KAFKA-15847:
---
Summary: Consider partial metadata requests for client reconciliation  
(was: Allow to resolve client metadata for specific topics)

> Consider partial metadata requests for client reconciliation
> 
>
> Key: KAFKA-15847
> URL: https://issues.apache.org/jira/browse/KAFKA-15847
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Priority: Major
>
> Currently metadata updates requested through the metadata object request 
> metadata for all topics. Consider allowing the partial updates that are 
> already expressed as an intention in the Metadata class but not fully 
> supported (investigate background in case there were some specifics that led 
> to this intention not being fully implemented) 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-15847) Consider partial metadata requests for client reconciliation

2024-01-08 Thread Lianet Magrans (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lianet Magrans updated KAFKA-15847:
---
Description: The new consumer implementing the KIP-848 protocol needs to resolve 
metadata for the topics received in the assignment. It does so by relying on 
the centralized metadata object. Currently, metadata updates requested through 
the metadata object request metadata for all topics. Consider allowing the 
partial updates that are already expressed as an intention in the Metadata 
class but not fully supported (investigate background in case there were some 
specifics that led to this intention not being fully implemented)   (was: 
Currently metadata updates requested through the metadata object request 
metadata for all topics. Consider allowing the partial updates that are already 
expressed as an intention in the Metadata class but not fully supported 
(investigate background in case there were some specifics that led to this 
intention not being fully implemented) )

> Consider partial metadata requests for client reconciliation
> 
>
> Key: KAFKA-15847
> URL: https://issues.apache.org/jira/browse/KAFKA-15847
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Priority: Major
>
> The new consumer implementing the KIP-848 protocol needs to resolve metadata for 
> the topics received in the assignment. It does so by relying on the centralized 
> metadata object. Currently, metadata updates requested through the metadata 
> object request metadata for all topics. Consider allowing the partial 
> updates that are already expressed as an intention in the Metadata class but 
> not fully supported (investigate background in case there were some specifics 
> that led to this intention not being fully implemented) 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-16033) Review client retry logic of OffsetFetch and OffsetCommit responses

2024-01-08 Thread Lianet Magrans (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lianet Magrans updated KAFKA-16033:
---
Summary: Review client retry logic of OffsetFetch and OffsetCommit 
responses  (was: Review retry logic of OffsetFetch and OffsetCommit responses)

> Review client retry logic of OffsetFetch and OffsetCommit responses
> ---
>
> Key: KAFKA-16033
> URL: https://issues.apache.org/jira/browse/KAFKA-16033
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Assignee: Lianet Magrans
>Priority: Major
>  Labels: kip-848-client-support
> Fix For: 3.8.0
>
>
> The retry logic for OffsetFetch and OffsetCommit requests lives in the 
> CommitRequestManager, and applies to requests issued from multiple components 
> (AsyncKafkaConsumer for commitSync and commitAsync, CommitRequestManager for 
> the regular auto-commits, MembershipManager for auto-commits before 
> rebalance, auto-commit before closing consumer). While this approach helps to 
> avoid having the retry logic in each caller, currently the CommitManager has 
> it in different places and it ends up being rather hard to follow.
> This task aims at reviewing the retry logic from a high level perspective 
> (multiple callers, with retry needs that have similarities and differences at 
> the same time). So the review should assess the similarities vs differences, 
> and then consider two options:
> 1. Keep retry logic centralized in the CommitManager, but fixed in a more 
> consistent way, applied the same way for all requests, depending on the 
> intention expressed by the caller. An advantage of this approach (current 
> approach + improvement) is that callers that require the same retry logic 
> could reuse it, keeping it in a single place (ex. commitSync from the 
> consumer retries in the same way as the auto-commit before rebalance). 
> 2. move retry logic to the caller. This aligns with the way it was done on 
> the legacy coordinator, but the main challenge seems to be not duplicating 
> the retry logic in callers that require the same.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-16032) Review client inconsistent error handling of OffsetFetch and OffsetCommit responses

2024-01-08 Thread Lianet Magrans (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lianet Magrans updated KAFKA-16032:
---
Summary: Review client inconsistent error handling of OffsetFetch and 
OffsetCommit responses  (was: Review inconsistent error handling of OffsetFetch 
and OffsetCommit responses)

> Review client inconsistent error handling of OffsetFetch and OffsetCommit 
> responses
> ---
>
> Key: KAFKA-16032
> URL: https://issues.apache.org/jira/browse/KAFKA-16032
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Assignee: Lianet Magrans
>Priority: Major
>  Labels: kip-848-client-support
> Fix For: 3.8.0
>
>
> OffsetFetch and OffsetCommit handle errors separately. There are 2 issues to 
> review around this:
>  - The logic is duplicated for some errors that are treated similarly (ex. 
> NOT_COORDINATOR)
>  - Some errors are not handled similarly in both requests (ex. 
> COORDINATOR_NOT_AVAILABLE handled and retried for OffsetCommit but not 
> OffsetFetch). Note that the specific errors handled by each request were kept 
> the same as in the legacy ConsumerCoordinator but this should be reviewed, in 
> an attempt to handle the same errors, in the same way, whenever possible.
> Note that the legacy approach handles expected errors for each path (FETCH 
> and COMMIT), retrying on those when needed, but does not retry on unexpected 
> retriable errors.
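
For illustration, a minimal sketch (hypothetical, not the actual
CommitRequestManager code) of classifying these errors in one shared place so
that the OffsetFetch and OffsetCommit paths treat the coordinator errors
identically:

```java
import org.apache.kafka.common.errors.RetriableException;
import org.apache.kafka.common.protocol.Errors;

public class OffsetErrorClassifierSketch {

    enum Outcome { SUCCESS, RETRY_AFTER_COORDINATOR_LOOKUP, RETRY_WITH_BACKOFF, FATAL }

    // Shared classification used by both the fetch and commit paths.
    static Outcome classify(Errors error) {
        switch (error) {
            case NONE:
                return Outcome.SUCCESS;
            case NOT_COORDINATOR:
            case COORDINATOR_NOT_AVAILABLE:
                // mark the coordinator unknown, rediscover it, then retry
                return Outcome.RETRY_AFTER_COORDINATOR_LOOKUP;
            case COORDINATOR_LOAD_IN_PROGRESS:
                return Outcome.RETRY_WITH_BACKOFF;
            default:
                return error.exception() instanceof RetriableException
                        ? Outcome.RETRY_WITH_BACKOFF
                        : Outcome.FATAL;
        }
    }
}
```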



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-16033) Review client retry logic of OffsetFetch and OffsetCommit responses

2024-01-08 Thread Lianet Magrans (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lianet Magrans updated KAFKA-16033:
---
Parent: KAFKA-14048
Issue Type: Sub-task  (was: Improvement)

> Review client retry logic of OffsetFetch and OffsetCommit responses
> ---
>
> Key: KAFKA-16033
> URL: https://issues.apache.org/jira/browse/KAFKA-16033
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Assignee: Lianet Magrans
>Priority: Major
>  Labels: kip-848-client-support
> Fix For: 3.8.0
>
>
> The retry logic for OffsetFetch and OffsetCommit requests lives in the 
> CommitRequestManager, and applies to requests issued from multiple components 
> (AsyncKafkaConsumer for commitSync and commitAsync, CommitRequestManager for 
> the regular auto-commits, MembershipManager for auto-commits before 
> rebalance, auto-commit before closing consumer). While this approach helps to 
> avoid having the retry logic in each caller, currently the CommitManager has 
> it in different places and it ends up being rather hard to follow.
> This task aims at reviewing the retry logic from a high level perspective 
> (multiple callers, with retry needs that have similarities and differences at 
> the same time). So the review should assess the similarities vs differences, 
> and then consider two options:
> 1. Keep retry logic centralized in the CommitManager, but fixed in a more 
> consistent way, applied the same way for all requests, depending on the 
> intention expressed by the caller. An advantage of this approach (current 
> approach + improvement) is that callers that require the same retry logic 
> could reuse it, keeping it in a single place (ex. commitSync from the 
> consumer retries in the same way as the auto-commit before rebalance). 
> 2. move retry logic to the caller. This aligns with the way it was done on 
> the legacy coordinator, but the main challenge seems to be not duplicating 
> the retry logic in callers that require the same.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-15538) Client support for java regex based subscription

2024-01-08 Thread Lianet Magrans (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lianet Magrans updated KAFKA-15538:
---
Parent: KAFKA-14048
Issue Type: Sub-task  (was: New Feature)

> Client support for java regex based subscription 
> -
>
> Key: KAFKA-15538
> URL: https://issues.apache.org/jira/browse/KAFKA-15538
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Priority: Major
>  Labels: kip-848, kip-848-client-support
> Fix For: 3.8.0
>
>
> When using subscribe with a java regex (Pattern), we need to resolve it on 
> the client side to send the broker a list of topic names to subscribe to.
> Context:
> The new consumer group protocol uses [Google 
> RE2/J|https://github.com/google/re2j] for regular expressions and introduces 
> new methods in the consumer API to subscribe using a `SubscriptionPattern`.
> Subscribing with a java `Pattern` will still be supported for a while but
> will eventually be removed.
>  * When the subscribe with SubscriptionPattern is used, the client should 
> just send the regex to the broker and it will be resolved on the server side.
>  * In the case of the subscribe with Pattern, the regex should be resolved on 
> the client side, as sketched below.
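>
> A minimal sketch of the client-side resolution for the java Pattern case
> (hypothetical helper class, not the actual consumer internals):
>
>     import java.util.List;
>     import java.util.Set;
>     import java.util.regex.Pattern;
>     import java.util.stream.Collectors;
>
>     // Resolve a java Pattern against the topic names known from metadata,
>     // producing the explicit topic list sent to the broker. The
>     // SubscriptionPattern (RE2/J) case skips this and ships the regex.
>     public class PatternResolutionSketch {
>
>         static List<String> resolve(Pattern pattern, Set<String> clusterTopics) {
>             return clusterTopics.stream()
>                     .filter(topic -> pattern.matcher(topic).matches())
>                     .sorted()
>                     .collect(Collectors.toList());
>         }
>
>         public static void main(String[] args) {
>             Set<String> topics = Set.of("orders", "orders-dlq", "payments");
>             // Prints [orders, orders-dlq]
>             System.out.println(resolve(Pattern.compile("orders.*"), topics));
>         }
>     }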



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-15542) Release member assignments on errors

2024-01-08 Thread Lianet Magrans (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lianet Magrans resolved KAFKA-15542.

Fix Version/s: 3.7.0
   (was: 3.8.0)
   Resolution: Fixed

Fixed in https://github.com/apache/kafka/pull/14690

> Release member assignments on errors
> 
>
> Key: KAFKA-15542
> URL: https://issues.apache.org/jira/browse/KAFKA-15542
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Assignee: Lianet Magrans
>Priority: Major
>  Labels: kip-848, kip-848-client-support, kip-848-e2e, 
> kip-848-preview
> Fix For: 3.7.0
>
>
> The member should release its assignment by triggering the onPartitionsLost
> flow from the HB manager when errors occur (both fencing and unrecoverable
> errors), as sketched below.
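>
> A rough sketch of that flow (hypothetical shape, not the real
> MembershipManager):
>
>     import java.util.Collection;
>     import java.util.HashSet;
>     import java.util.Set;
>
>     import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
>     import org.apache.kafka.common.TopicPartition;
>
>     // On a fencing or unrecoverable error, the member releases its whole
>     // assignment through the onPartitionsLost flow, not the revocation path.
>     public class ReleaseOnErrorSketch {
>         private final Set<TopicPartition> currentAssignment = new HashSet<>();
>         private final ConsumerRebalanceListener listener;
>
>         ReleaseOnErrorSketch(ConsumerRebalanceListener listener) {
>             this.listener = listener;
>         }
>
>         void releaseAssignmentOnError(boolean fenced) {
>             Collection<TopicPartition> lost = new HashSet<>(currentAssignment);
>             currentAssignment.clear();
>             // Lost partitions are released without committing offsets first.
>             listener.onPartitionsLost(lost);
>             // A real implementation would then transition to FENCED (and
>             // rejoin) or to FATAL, depending on the `fenced` flag.
>         }
>     }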



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-15835) Group commit/callbacks triggering logic

2024-01-08 Thread Lianet Magrans (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lianet Magrans updated KAFKA-15835:
---
Parent: KAFKA-14048
Issue Type: Sub-task  (was: Improvement)

> Group commit/callbacks triggering logic
> ---
>
> Key: KAFKA-15835
> URL: https://issues.apache.org/jira/browse/KAFKA-15835
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Assignee: Lianet Magrans
>Priority: Major
>  Labels: kip-848, kip-848-client-support, kip-848-e2e, 
> kip-848-preview
> Fix For: 3.8.0
>
>
> The new consumer reconciliation logic triggers a commit request, revocation 
> callback and assignment callbacks sequentially to ensure that they are 
> executed in that order. This means that we could require multiple iterations 
> of the poll loop to complete reconciling an assignment. 
> We could consider triggering them all together, to be executed in the same 
> poll iteration, while still making sure that they are executed in the right 
> order. Note that the sequence sometimes should not block on failures (e.g. if
> the commit fails, revocation proceeds anyway), and other times it does block
> (if revocation callbacks fail, onPartitionsAssigned is not called); a sketch
> of the chaining follows below.
> As part of this task, review the time boundaries for the commit request
> issued when the assignment changes. It is effectively time-bounded by the
> rebalance timeout enforced by the broker, so the initial approach is to use
> the same rebalance timeout as the boundary on the client.
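>
> A small sketch of the chaining idea (hypothetical names; CompletableFuture
> used only for illustration):
>
>     import java.util.concurrent.CompletableFuture;
>     import java.util.function.Supplier;
>
>     // Schedule commit -> revoke -> assign in one poll iteration while
>     // preserving order and the blocking semantics described above.
>     public class ReconcileChainSketch {
>
>         static CompletableFuture<Void> reconcile(
>                 Supplier<CompletableFuture<Void>> commit,
>                 Supplier<CompletableFuture<Void>> revokeCallback,
>                 Supplier<CompletableFuture<Void>> assignCallback) {
>             return commit.get()
>                     // A commit failure does not block: revocation proceeds.
>                     .exceptionally(error -> null)
>                     .thenCompose(ignored -> revokeCallback.get())
>                     // A revocation failure propagates, so the assign stage
>                     // (onPartitionsAssigned) is skipped.
>                     .thenCompose(ignored -> assignCallback.get());
>         }
>     }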



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-15846) Review consumer leave group request best effort

2024-01-08 Thread Lianet Magrans (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lianet Magrans updated KAFKA-15846:
---
Description: New consumer sends out a leave group request with a best 
effort approach. It transitions to LEAVING to indicate to the HB manager that 
the request must be sent, but it does not do any response handling or retrying 
(note that the response is still handled as any other HB response). After the 
first HB manager poll iteration while on LEAVING, the consumer transitions into 
UNSUBSCRIBED (no matter whether the request was actually sent out or not, e.g. 
due to the coordinator not being known). Review whether this is good enough as 
an effort to send the request, and consider the effect of responses that may be 
received and processed when they are no longer relevant  (was: New consumer 
sends out a leave group request with a best effort approach. It transitions to 
LEAVING to indicate to the HB manager that the request must be sent, but it 
does not do any response handling or retrying (note that the response is still 
handled as any other HB response). After the first HB manager poll iteration 
while on LEAVING, the consumer transitions into UNSUBSCRIBED (no matter whether 
the request was actually sent out or not, e.g. due to the coordinator not being 
known). Review whether this is good enough as an effort to send the request.)

> Review consumer leave group request best effort
> ---
>
> Key: KAFKA-15846
> URL: https://issues.apache.org/jira/browse/KAFKA-15846
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Priority: Major
>  Labels: kip-848, kip-848-client-support
> Fix For: 3.8.0
>
>
> New consumer sends out a leave group request with a best effort approach. It
> transitions to LEAVING to indicate to the HB manager that the request must be
> sent, but it does not do any response handling or retrying (note that the
> response is still handled as any other HB response). After the first HB
> manager poll iteration while on LEAVING, the consumer transitions into
> UNSUBSCRIBED (no matter whether the request was actually sent out or not,
> e.g. due to the coordinator not being known). Review whether this is good
> enough as an effort to send the request, and consider the effect of responses
> that may be received and processed when they are no longer relevant.
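>
> A compact sketch of the current behaviour (hypothetical types, not the real
> HeartbeatRequestManager):
>
>     import java.util.Optional;
>
>     // While LEAVING, a single poll builds the leave heartbeat only if the
>     // coordinator is known, then moves to UNSUBSCRIBED either way.
>     public class LeaveTransitionSketch {
>         enum MemberState { LEAVING, UNSUBSCRIBED }
>
>         MemberState state = MemberState.LEAVING;
>         boolean coordinatorKnown;
>
>         Optional<String> pollLeave() {
>             if (state != MemberState.LEAVING) return Optional.empty();
>             Optional<String> leaveRequest = coordinatorKnown
>                     ? Optional.of("leave-heartbeat-request")
>                     : Optional.empty();
>             state = MemberState.UNSUBSCRIBED; // transitions either way
>             return leaveRequest;
>         }
>     }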



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] KAFKA-14589 ConsumerGroupCommand options and case classes rewritten [kafka]

2024-01-08 Thread via GitHub


nizhikov commented on PR #14856:
URL: https://github.com/apache/kafka/pull/14856#issuecomment-1881507357

   @mimaison can you please take a look?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (KAFKA-15846) Review consumer leave group request best effort and response handling

2024-01-08 Thread Lianet Magrans (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lianet Magrans updated KAFKA-15846:
---
Summary: Review consumer leave group request best effort and response 
handling  (was: Review consumer leave group request best effort)

> Review consumer leave group request best effort and response handling
> -
>
> Key: KAFKA-15846
> URL: https://issues.apache.org/jira/browse/KAFKA-15846
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Priority: Major
>  Labels: kip-848, kip-848-client-support
> Fix For: 3.8.0
>
>
> New consumer sends out a leave group request with a best effort approach. It
> transitions to LEAVING to indicate to the HB manager that the request must be
> sent, but it does not do any response handling or retrying (note that the
> response is still handled as any other HB response). After the first HB
> manager poll iteration while on LEAVING, the consumer transitions into
> UNSUBSCRIBED (no matter whether the request was actually sent out or not,
> e.g. due to the coordinator not being known). Review whether this is good
> enough as an effort to send the request, and consider the effect of responses
> that may be received and processed when they are no longer relevant.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-15846) Review consumer leave group request best effort and response handling

2024-01-08 Thread Lianet Magrans (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lianet Magrans updated KAFKA-15846:
---
Parent: KAFKA-14048
Issue Type: Sub-task  (was: Improvement)

> Review consumer leave group request best effort and response handling
> -
>
> Key: KAFKA-15846
> URL: https://issues.apache.org/jira/browse/KAFKA-15846
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Priority: Major
>  Labels: kip-848, kip-848-client-support
> Fix For: 3.8.0
>
>
> New consumer sends out a leave group request with a best effort approach. It
> transitions to LEAVING to indicate to the HB manager that the request must be
> sent, but it does not do any response handling or retrying (note that the
> response is still handled as any other HB response). After the first HB
> manager poll iteration while on LEAVING, the consumer transitions into
> UNSUBSCRIBED (no matter whether the request was actually sent out or not,
> e.g. due to the coordinator not being known). Review whether this is good
> enough as an effort to send the request, and consider the effect of responses
> that may be received and processed when they are no longer relevant.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[PR] KAFKA-7957: Enable testMetricsReporterUpdate [kafka]

2024-01-08 Thread via GitHub


mimaison opened a new pull request, #15147:
URL: https://github.com/apache/kafka/pull/15147

   Ignore for now; I'm investigating this flaky test.
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (KAFKA-15843) Review consumer onPartitionsAssigned called with empty partitions

2024-01-08 Thread Lianet Magrans (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lianet Magrans updated KAFKA-15843:
---
Parent: KAFKA-14048
Issue Type: Sub-task  (was: Improvement)

> Review consumer onPartitionsAssigned called with empty partitions
> -
>
> Key: KAFKA-15843
> URL: https://issues.apache.org/jira/browse/KAFKA-15843
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Priority: Major
>  Labels: kip-848, kip-848-client-support, kip-848-e2e, 
> kip-848-preview
> Fix For: 3.8.0
>
>
> The legacy coordinator triggers onPartitionsAssigned with an empty assignment
> (which is not the case when triggering onPartitionsRevoked or Lost), and the
> new consumer implementation maintains the same principle. We should review
> this to fully understand whether it is really needed to call
> onPartitionsAssigned with an empty assignment (or whether it should behave
> consistently with onPartitionsRevoked/Lost).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-15846) Review consumer leave group request best effort and response handling

2024-01-08 Thread Lianet Magrans (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lianet Magrans resolved KAFKA-15846.

Fix Version/s: (was: 3.8.0)
   Resolution: Duplicate

> Review consumer leave group request best effort and response handling
> -
>
> Key: KAFKA-15846
> URL: https://issues.apache.org/jira/browse/KAFKA-15846
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Priority: Major
>  Labels: kip-848, kip-848-client-support
>
> New consumer sends out a leave group request with a best effort approach. It
> transitions to LEAVING to indicate to the HB manager that the request must be
> sent, but it does not do any response handling or retrying (note that the
> response is still handled as any other HB response). After the first HB
> manager poll iteration while on LEAVING, the consumer transitions into
> UNSUBSCRIBED (no matter whether the request was actually sent out or not,
> e.g. due to the coordinator not being known). Review whether this is good
> enough as an effort to send the request, and consider the effect of responses
> that may be received and processed when they are no longer relevant.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-15954) Review minimal effort approach on consumer last heartbeat on unsubscribe

2024-01-08 Thread Lianet Magrans (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lianet Magrans updated KAFKA-15954:
---
Parent: KAFKA-14048
Issue Type: Sub-task  (was: Improvement)

> Review minimal effort approach on consumer last heartbeat on unsubscribe
> 
>
> Key: KAFKA-15954
> URL: https://issues.apache.org/jira/browse/KAFKA-15954
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Assignee: Lianet Magrans
>Priority: Major
>  Labels: kip-848, kip-848-client-support
> Fix For: 3.8.0
>
>
> Currently the legacy and new consumers follow a minimal-effort approach when
> sending a leave group (legacy) or last heartbeat request (new consumer): the
> request is sent without waiting for or handling any response. This behaviour
> applies when the consumer is being closed or when it unsubscribes.
> For the case when the consumer is being closed (which is a "terminal" state),
> it makes sense to just follow a minimal-effort approach for "properly"
> leaving the group. But for the case of unsubscribe, it may make sense to put
> a little more effort into making sure that the last heartbeat is sent and
> received by the broker, as sketched below. Note that unsubscribe could be a
> temporary state, where the consumer might want to re-join the group at any
> time.
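>
> A sketch of what that extra effort could look like on unsubscribe
> (hypothetical names; close would keep the fire-and-forget behaviour):
>
>     import java.time.Duration;
>     import java.util.concurrent.CompletableFuture;
>     import java.util.concurrent.ExecutionException;
>     import java.util.concurrent.TimeUnit;
>     import java.util.concurrent.TimeoutException;
>
>     // Bound the wait for the last heartbeat instead of not waiting at all,
>     // so a later re-join starts from a state the broker has acknowledged.
>     public class LeaveOnUnsubscribeSketch {
>
>         static void awaitLastHeartbeat(CompletableFuture<Void> heartbeatAcked,
>                                        Duration timeout) {
>             try {
>                 heartbeatAcked.get(timeout.toMillis(), TimeUnit.MILLISECONDS);
>             } catch (TimeoutException | ExecutionException e) {
>                 // Still best effort: give up after the bound instead of
>                 // failing the unsubscribe call.
>             } catch (InterruptedException e) {
>                 Thread.currentThread().interrupt();
>             }
>         }
>     }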



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-15839) Review topic ID integration in consumer reconciliation process

2024-01-08 Thread Lianet Magrans (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lianet Magrans updated KAFKA-15839:
---
Parent: KAFKA-14048
Issue Type: Sub-task  (was: Improvement)

> Review topic ID integration in consumer reconciliation process
> --
>
> Key: KAFKA-15839
> URL: https://issues.apache.org/jira/browse/KAFKA-15839
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Assignee: Lianet Magrans
>Priority: Major
>  Labels: kip-848, kip-848-client-support, kip-848-e2e, 
> kip-848-preview
> Fix For: 3.8.0
>
>
> TopicIdPartition is currently used in the reconciliation path. It could be
> used more broadly, falling back to plain TopicPartitions only where necessary
> for the callbacks and for interaction with the subscription state, which does
> not fully support topic IDs yet.
> Ensure that we properly handle topic re-creation (same name, different topic
> IDs) in the reconciliation process (assignment cache, same-assignment
> comparison, etc.), as illustrated below.
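>
> A tiny sketch of why keying by topic ID matters for re-creation (the
> TopicIdPartition class is real; the usage here is illustrative):
>
>     import java.util.Set;
>
>     import org.apache.kafka.common.TopicIdPartition;
>     import org.apache.kafka.common.Uuid;
>
>     // A name-keyed comparison would consider these equal; an ID-keyed one
>     // correctly treats the re-created topic as a different assignment.
>     public class TopicIdCompareSketch {
>
>         static boolean sameAssignment(Set<TopicIdPartition> current,
>                                       Set<TopicIdPartition> target) {
>             return current.equals(target); // Uuid takes part in equality
>         }
>
>         public static void main(String[] args) {
>             TopicIdPartition before =
>                     new TopicIdPartition(Uuid.randomUuid(), 0, "orders");
>             TopicIdPartition after =
>                     new TopicIdPartition(Uuid.randomUuid(), 0, "orders");
>             // Prints false: same name and partition, different topic ID.
>             System.out.println(sameAssignment(Set.of(before), Set.of(after)));
>         }
>     }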



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-16092) Queues for Kafka

2024-01-08 Thread Andrew Schofield (Jira)
Andrew Schofield created KAFKA-16092:


 Summary: Queues for Kafka
 Key: KAFKA-16092
 URL: https://issues.apache.org/jira/browse/KAFKA-16092
 Project: Kafka
  Issue Type: Improvement
Reporter: Andrew Schofield
Assignee: Andrew Schofield


This Jira tracks the development of KIP-932: 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-932%3A+Queues+for+Kafka



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] KAFKA-16025: Fix orphaned locks when rebalancing and store cleanup race on unassigned task directories [kafka]

2024-01-08 Thread via GitHub


sanepal commented on code in PR #15088:
URL: https://github.com/apache/kafka/pull/15088#discussion_r1445046684


##
streams/src/main/java/org/apache/kafka/streams/processor/internals/TaskManager.java:
##
@@ -1229,10 +1229,21 @@ private void tryToLockAllNonEmptyTaskDirectories() {
 final String namedTopology = taskDir.namedTopology();
 try {
 final TaskId id = parseTaskDirectoryName(dir.getName(), namedTopology);
-if (stateDirectory.lock(id)) {
-lockedTaskDirectories.add(id);
-if (!allTasks.containsKey(id)) {
-log.debug("Temporarily locked unassigned task {} for the upcoming rebalance", id);
+boolean lockedEmptyDirectory = false;
+try {
+if (stateDirectory.lock(id)) {
+if (stateDirectory.directoryForTaskIsEmpty(id)) {
+lockedEmptyDirectory = true;

Review Comment:
   Hi @ableegoldman, can you take a look at the update please? Thank you



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (KAFKA-16089) Kafka Streams still leaking memory in 3.7

2024-01-08 Thread Stanislav Kozlovski (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stanislav Kozlovski updated KAFKA-16089:

Fix Version/s: 3.7.0

> Kafka Streams still leaking memory in 3.7
> -
>
> Key: KAFKA-16089
> URL: https://issues.apache.org/jira/browse/KAFKA-16089
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 3.7.0
>Reporter: Lucas Brutschy
>Priority: Blocker
> Fix For: 3.7.0
>
> Attachments: graphviz (1).svg
>
>
> In 
> [https://github.com/apache/kafka/commit/58d6d2e5922df60cdc5c5c1bcd1c2d82dd2b71e2]
>  a leak was fixed in the release candidate for 3.7.
>  
> However, Kafka Streams still seems to be leaking memory (just slower) after 
> the fix.
>  
> Attached is the `jeprof` output right before a crash after ~11 hours.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-16092) Queues for Kafka

2024-01-08 Thread Matthias J. Sax (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax updated KAFKA-16092:

Issue Type: New Feature  (was: Improvement)

> Queues for Kafka
> 
>
> Key: KAFKA-16092
> URL: https://issues.apache.org/jira/browse/KAFKA-16092
> Project: Kafka
>  Issue Type: New Feature
>Reporter: Andrew Schofield
>Assignee: Andrew Schofield
>Priority: Major
>
> This Jira tracks the development of KIP-932: 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-932%3A+Queues+for+Kafka



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-16082) JBOD: Possible dataloss when moving leader partition

2024-01-08 Thread Stanislav Kozlovski (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stanislav Kozlovski updated KAFKA-16082:

Fix Version/s: 3.7.0

> JBOD: Possible dataloss when moving leader partition
> 
>
> Key: KAFKA-16082
> URL: https://issues.apache.org/jira/browse/KAFKA-16082
> Project: Kafka
>  Issue Type: Bug
>  Components: jbod
>Affects Versions: 3.7.0
>Reporter: Proven Provenzano
>Assignee: Gaurav Narula
>Priority: Blocker
> Fix For: 3.7.0
>
>
> There is a possible data loss scenario when using JBOD and moving the
> partition leader log from one directory to another on the same broker:
> after the destination log has caught up to the source log and the broker has
> sent an update to the partition assignment, if the broker accepts and commits
> a new record for the partition and then restarts, losing the original
> partition leader log, the destination log will not contain the new record.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] MINOR: Update documentation.html with the 3.7 release [kafka]

2024-01-08 Thread via GitHub


stanislavkozlovski commented on code in PR #15010:
URL: https://github.com/apache/kafka/pull/15010#discussion_r1445107285


##
docs/documentation.html:
##
@@ -60,6 +60,7 @@ Kafka 3.6 Documentation
 3.3.X.
 3.4.X.
 3.5.X.
+3.6.X.

Review Comment:
   https://github.com/apache/kafka-site/pull/576



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] KAFKA-15807: Added support for compression of metrics (KIP-714) [kafka]

2024-01-08 Thread via GitHub


apoorvmittal10 opened a new pull request, #15148:
URL: https://github.com/apache/kafka/pull/15148

   Adds support for compression/decompression of metrics defined in KIP-714.
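
   As a rough illustration of the compression step (assumed shape only; the
   actual wiring in this PR may differ, and KIP-714 negotiates the compression
   type rather than fixing GZIP):

       import java.io.ByteArrayOutputStream;
       import java.io.IOException;
       import java.util.zip.GZIPOutputStream;

       // Compress a serialized metrics payload before it is attached to a
       // telemetry push; GZIP stands in for whichever codec is negotiated.
       public class MetricsCompressionSketch {

           static byte[] gzip(byte[] serializedMetrics) throws IOException {
               ByteArrayOutputStream out = new ByteArrayOutputStream();
               try (GZIPOutputStream gzipStream = new GZIPOutputStream(out)) {
                   gzipStream.write(serializedMetrics);
               }
               return out.toByteArray();
           }
       }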
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


