[jira] [Comment Edited] (KAFKA-17024) add integration test for TransactionsCommand

2024-06-22 Thread Kuan Po Tseng (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-17024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17859453#comment-17859453
 ] 

Kuan Po Tseng edited comment on KAFKA-17024 at 6/23/24 5:24 AM:


Hi [~chia7712], if you are not working on this one, I'm willing to solve it.
Thanks!


was (Author: brandboat):
Hi [~chia7712] , if you are not working on this one if you are not working on 
it. Thanks !

> add integration test for TransactionsCommand
> 
>
> Key: KAFKA-17024
> URL: https://issues.apache.org/jira/browse/KAFKA-17024
> Project: Kafka
>  Issue Type: Test
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
>Priority: Minor
>
> as title. currently we have only UT for TransactionsCommand



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-17024) add integration test for TransactionsCommand

2024-06-22 Thread Kuan Po Tseng (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-17024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17859453#comment-17859453
 ] 

Kuan Po Tseng commented on KAFKA-17024:
---

Hi [~chia7712], I'm willing to take this one if you are not working on it.
Thanks!

> add integration test for TransactionsCommand
> 
>
> Key: KAFKA-17024
> URL: https://issues.apache.org/jira/browse/KAFKA-17024
> Project: Kafka
>  Issue Type: Test
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
>Priority: Minor
>
> as title. currently we have only UT for TransactionsCommand



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-17024) add integration test for TransactionsCommand

2024-06-22 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-17024:
--

 Summary: add integration test for TransactionsCommand
 Key: KAFKA-17024
 URL: https://issues.apache.org/jira/browse/KAFKA-17024
 Project: Kafka
  Issue Type: Test
Reporter: Chia-Ping Tsai
Assignee: Chia-Ping Tsai


as title. currently we have only UT for TransactionsCommand



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (KAFKA-17017) AsyncConsumer#unsubscribe does not clean the assigned partitions

2024-06-22 Thread PoAn Yang (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

PoAn Yang reassigned KAFKA-17017:
-

Assignee: PoAn Yang  (was: Chia-Ping Tsai)

> AsyncConsumer#unsubscribe does not clean the assigned partitions
> 
>
> Key: KAFKA-17017
> URL: https://issues.apache.org/jira/browse/KAFKA-17017
> Project: Kafka
>  Issue Type: Bug
>Reporter: Chia-Ping Tsai
>Assignee: PoAn Yang
>Priority: Major
>  Labels: kip-848-client-support
>
> According to the docs [0], `Consumer#unsubscribe` should clean both the subscribed and 
> assigned partitions. However, there are two issues with `AsyncConsumer`:
> 1) if we don't set a group id, `AsyncConsumer#unsubscribe` [1] will be a no-op
> 2) if we set a group id, `AsyncConsumer` is always in the `UNSUBSCRIBED` state, and 
> so `MembershipManagerImpl#leaveGroup` [2] will be a no-op
> [0] 
> https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java#L759
> [1] 
> https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AsyncKafkaConsumer.java#L1479
> [2] 
> https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/internals/MembershipManagerImpl.java#L666
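
A minimal sketch of the documented expectation, assuming a reachable broker at localhost:9092 and a hypothetical topic "demo"; per the javadoc referenced in [0], both the subscription and the assignment should be empty after unsubscribe():

{code:java}
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class UnsubscribeExpectation {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "demo-group");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("demo"));
            consumer.poll(Duration.ofSeconds(1));   // join the group and receive an assignment
            consumer.unsubscribe();
            // Expected per the docs: both sets are empty. The bug above is that the
            // assignment is not cleared for AsyncConsumer in the two listed cases.
            System.out.println("subscription after unsubscribe: " + consumer.subscription());
            System.out.println("assignment after unsubscribe:   " + consumer.assignment());
        }
    }
}
{code}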



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-17021) Migrate AclCommandTest to new test infra

2024-06-22 Thread Chia-Ping Tsai (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-17021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856960#comment-17856960
 ] 

Chia-Ping Tsai commented on KAFKA-17021:


[~javakillah] I have assigned this to you. Also, please feel free to file two 
PRs if rewriting it in Java and moving to the new infra produces a large set of changes :)

> Migrate AclCommandTest to new test infra
> 
>
> Key: KAFKA-17021
> URL: https://issues.apache.org/jira/browse/KAFKA-17021
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
>Priority: Major
>
> as title. Also, it would be great to rewrite it in Java



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (KAFKA-17021) Migrate AclCommandTest to new test infra

2024-06-22 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai reassigned KAFKA-17021:
--

Assignee: Dmitry Werner  (was: Chia-Ping Tsai)

> Migrate AclCommandTest to new test infra
> 
>
> Key: KAFKA-17021
> URL: https://issues.apache.org/jira/browse/KAFKA-17021
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Chia-Ping Tsai
>Assignee: Dmitry Werner
>Priority: Major
>
> as title. Also, it would be great to rewrite it in Java



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-17021) Migrate AclCommandTest to new test infra

2024-06-22 Thread Dmitry Werner (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-17021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856957#comment-17856957
 ] 

Dmitry Werner commented on KAFKA-17021:
---

[~chia7712] Hello. If you have not started working on this, I would like to take 
this issue.

> Migrate AclCommandTest to new test infra
> 
>
> Key: KAFKA-17021
> URL: https://issues.apache.org/jira/browse/KAFKA-17021
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
>Priority: Major
>
> as title. Also, it would be great to rewrite it in Java



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-15713) KRaft support in AclCommandTest

2024-06-22 Thread Pavel Pozdeev (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Pozdeev updated KAFKA-15713:
--
Description: 
The following tests in AclCommandTest in 
core/src/test/scala/unit/kafka/admin/AclCommandTest.scala need to be updated to 
support KRaft

125 : def testAclCliWithAuthorizer(): Unit = {

130 : def testAclCliWithAdminAPI(): Unit = {

186 : def testProducerConsumerCliWithAuthorizer(): Unit = {

191 : def testProducerConsumerCliWithAdminAPI(): Unit = {

197 : def testAclCliWithClientId(): Unit = {

236 : def testAclsOnPrefixedResourcesWithAuthorizer(): Unit = {

241 : def testAclsOnPrefixedResourcesWithAdminAPI(): Unit = {

268 : def testInvalidAuthorizerProperty(): Unit = {

276 : def testPatternTypes(): Unit = {

Scanned 336 lines. Found 0 KRaft tests out of 9 tests

  was:
The following tests in SaslClientsWithInvalidCredentialsTest in 
core/src/test/scala/integration/kafka/api/SaslClientsWithInvalidCredentialsTest.scala
 need to be updated to support KRaft

125 : def testAclCliWithAuthorizer(): Unit = {

130 : def testAclCliWithAdminAPI(): Unit = {

186 : def testProducerConsumerCliWithAuthorizer(): Unit = {

191 : def testProducerConsumerCliWithAdminAPI(): Unit = {

197 : def testAclCliWithClientId(): Unit = {

236 : def testAclsOnPrefixedResourcesWithAuthorizer(): Unit = {

241 : def testAclsOnPrefixedResourcesWithAdminAPI(): Unit = {

268 : def testInvalidAuthorizerProperty(): Unit = {

276 : def testPatternTypes(): Unit = {

Scanned 336 lines. Found 0 KRaft tests out of 9 tests


> KRaft support in AclCommandTest
> ---
>
> Key: KAFKA-15713
> URL: https://issues.apache.org/jira/browse/KAFKA-15713
> Project: Kafka
>  Issue Type: Task
>  Components: core
>Reporter: Sameer Tejani
>Assignee: Pavel Pozdeev
>Priority: Minor
>  Labels: kraft, kraft-test, newbie
> Fix For: 3.9.0
>
>
> The following tests in AclCommandTest in 
> core/src/test/scala/unit/kafka/admin/AclCommandTest.scala need to be updated 
> to support KRaft
> 125 : def testAclCliWithAuthorizer(): Unit = {
> 130 : def testAclCliWithAdminAPI(): Unit = {
> 186 : def testProducerConsumerCliWithAuthorizer(): Unit = {
> 191 : def testProducerConsumerCliWithAdminAPI(): Unit = {
> 197 : def testAclCliWithClientId(): Unit = {
> 236 : def testAclsOnPrefixedResourcesWithAuthorizer(): Unit = {
> 241 : def testAclsOnPrefixedResourcesWithAdminAPI(): Unit = {
> 268 : def testInvalidAuthorizerProperty(): Unit = {
> 276 : def testPatternTypes(): Unit = {
> Scanned 336 lines. Found 0 KRaft tests out of 9 tests



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-17023) add PCollectionsImmutableMap to ConcurrentMapBenchmark

2024-06-22 Thread Chia-Ping Tsai (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-17023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856944#comment-17856944
 ] 

Chia-Ping Tsai commented on KAFKA-17023:


Noticed that `PCollectionsImmutableMap` is not declared thread-safe, so we 
need to use `volatile` to make it safe for multi-threaded use (see `AclCache` for an 
example).
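
A minimal sketch of that pattern, using a hypothetical holder class and a plain immutable java.util.Map standing in for PCollectionsImmutableMap: the map itself is never mutated, and the volatile field publishes each new snapshot to reader threads (similar in spirit to how AclCache republishes its state):

{code:java}
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder illustrating the copy-on-write + volatile pattern.
public class SnapshotHolder {
    // volatile guarantees that readers see the most recently published snapshot.
    private volatile Map<String, Long> snapshot = Map.of();

    public Long get(String key) {
        return snapshot.get(key);                           // lock-free read of the current snapshot
    }

    public synchronized void put(String key, Long value) {
        Map<String, Long> next = new HashMap<>(snapshot);   // copy the current snapshot
        next.put(key, value);                               // modify the copy
        snapshot = Map.copyOf(next);                        // publish a new immutable snapshot
    }
}
{code}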

> add PCollectionsImmutableMap to ConcurrentMapBenchmark
> --
>
> Key: KAFKA-17023
> URL: https://issues.apache.org/jira/browse/KAFKA-17023
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Chia-Ping Tsai
>Assignee: TaiJuWu
>Priority: Minor
>
> PCollectionsImmutableMap is used in the code base, so we should consider adding 
> it to the benchmark :)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (KAFKA-17023) add PCollectionsImmutableMap to ConcurrentMapBenchmark

2024-06-22 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai reassigned KAFKA-17023:
--

Assignee: TaiJuWu  (was: Chia-Ping Tsai)

> add PCollectionsImmutableMap to ConcurrentMapBenchmark
> --
>
> Key: KAFKA-17023
> URL: https://issues.apache.org/jira/browse/KAFKA-17023
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Chia-Ping Tsai
>Assignee: TaiJuWu
>Priority: Minor
>
> PCollectionsImmutableMap is used in the code base, so we should consider adding 
> it to the benchmark :)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-17023) add PCollectionsImmutableMap to ConcurrentMapBenchmark

2024-06-22 Thread TaiJuWu (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-17023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856934#comment-17856934
 ] 

TaiJuWu commented on KAFKA-17023:
-

Hi [~chia7712] , I want to take it. Please assign it to me. Thanks.

> add PCollectionsImmutableMap to ConcurrentMapBenchmark
> --
>
> Key: KAFKA-17023
> URL: https://issues.apache.org/jira/browse/KAFKA-17023
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
>Priority: Minor
>
> PCollectionsImmutableMap is used in the code base, so we should consider adding 
> it to the benchmark :)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-17023) add PCollectionsImmutableMap to ConcurrentMapBenchmark

2024-06-22 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-17023:
--

 Summary: add PCollectionsImmutableMap to ConcurrentMapBenchmark
 Key: KAFKA-17023
 URL: https://issues.apache.org/jira/browse/KAFKA-17023
 Project: Kafka
  Issue Type: Improvement
Reporter: Chia-Ping Tsai
Assignee: Chia-Ping Tsai


PCollectionsImmutableMap is used in the code base, so we should consider adding it 
to the benchmark :)
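
A rough sketch of what a benchmark case can look like, assuming JMH (which the jmh-benchmarks module already uses) and a ConcurrentHashMap baseline; the class name and sizes are illustrative, and a PCollectionsImmutableMap case would be an analogous @Benchmark method reading that map:

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

@State(Scope.Benchmark)
public class MapGetBenchmark {
    private static final int SIZE = 1_000;
    private final Map<Integer, Integer> concurrentMap = new ConcurrentHashMap<>();

    @Setup
    public void setup() {
        for (int i = 0; i < SIZE; i++) {
            concurrentMap.put(i, i);
        }
    }

    @Benchmark
    public Integer concurrentHashMapGet() {
        return concurrentMap.get(ThreadLocalRandom.current().nextInt(SIZE));
    }
}
{code}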



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (KAFKA-17022) Fix error-prone in KafkaApis#handleFetchRequest

2024-06-22 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai reassigned KAFKA-17022:
--

Assignee: TengYao Chi  (was: Chia-Ping Tsai)

> Fix error-prone in KafkaApis#handleFetchRequest 
> 
>
> Key: KAFKA-17022
> URL: https://issues.apache.org/jira/browse/KAFKA-17022
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Chia-Ping Tsai
>Assignee: TengYao Chi
>Priority: Minor
>
>  `createResponse` [0] references a variable from an outer scope, which is 
> error-prone since it may not be initialized when the function executes. We should 
> do a small refactor to pass `unconvertedFetchResponse` to `createResponse` explicitly.
> [0] 
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/KafkaApis.scala#L939



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-17022) Fix error-prone in KafkaApis#handleFetchRequest

2024-06-22 Thread TengYao Chi (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-17022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856931#comment-17856931
 ] 

TengYao Chi commented on KAFKA-17022:
-

Gentle ping [~chia7712], if you have not started working on this, I would like to 
handle this issue :)

> Fix error-prone in KafkaApis#handleFetchRequest 
> 
>
> Key: KAFKA-17022
> URL: https://issues.apache.org/jira/browse/KAFKA-17022
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
>Priority: Minor
>
>  `createResponse` [0] references a variable from an outer scope, which is 
> error-prone since it may not be initialized when the function executes. We should 
> do a small refactor to pass `unconvertedFetchResponse` to `createResponse` explicitly.
> [0] 
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/KafkaApis.scala#L939



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-17014) ScramFormatter should not use String for password.

2024-06-22 Thread Tsz-wo Sze (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-17014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856930#comment-17856930
 ] 

Tsz-wo Sze commented on KAFKA-17014:


[~bmilk], thanks for picking this up!

> ScramFormatter should not use String for password.
> --
>
> Key: KAFKA-17014
> URL: https://issues.apache.org/jira/browse/KAFKA-17014
> Project: Kafka
>  Issue Type: Improvement
>  Components: security
>Reporter: Tsz-wo Sze
>Assignee: dujian0068
>Priority: Major
>
> Since String is immutable, there is no easy way to erase a String password 
> after use. We should not use String for passwords. See also 
> https://stackoverflow.com/questions/8881291/why-is-char-preferred-over-string-for-passwords
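
A minimal sketch of the char[]-based handling that the linked answer recommends; the class and methods below are illustrative, not the actual ScramFormatter API:

{code:java}
import java.util.Arrays;
import javax.security.auth.callback.Callback;
import javax.security.auth.callback.CallbackHandler;
import javax.security.auth.callback.PasswordCallback;

// Illustrative only: obtain a password as char[], use it, then erase it.
public class PasswordHandling {
    public static byte[] deriveKey(CallbackHandler handler) throws Exception {
        PasswordCallback cb = new PasswordCallback("password: ", false);
        handler.handle(new Callback[]{cb});
        char[] password = cb.getPassword();   // a char[] copy, not an immutable String
        try {
            return hash(password);            // hypothetical derivation step
        } finally {
            Arrays.fill(password, '\0');      // erase the secret once it is no longer needed
            cb.clearPassword();               // also clears the callback's internal copy
        }
    }

    private static byte[] hash(char[] password) {
        byte[] out = new byte[password.length];
        for (int i = 0; i < password.length; i++) {
            out[i] = (byte) password[i];      // placeholder; real code would use a salted PBKDF2/HMAC
        }
        return out;
    }
}
{code}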



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-17022) Fix error-prone in KafkaApis#handleFetchRequest

2024-06-22 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-17022:
--

 Summary: Fix error-prone in KafkaApis#handleFetchRequest 
 Key: KAFKA-17022
 URL: https://issues.apache.org/jira/browse/KAFKA-17022
 Project: Kafka
  Issue Type: Improvement
Reporter: Chia-Ping Tsai


`createResponse` [0] references a variable from an outer scope, which is 
error-prone since it may not be initialized when the function executes. We should do a 
small refactor to pass `unconvertedFetchResponse` to `createResponse` explicitly.

[0] 
https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/KafkaApis.scala#L939
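
The real code is Scala, but the shape of the suggested refactor can be sketched generically; all names below are illustrative stand-ins, not the actual KafkaApis types:

{code:java}
import java.util.Objects;

public class CreateResponseSketch {
    // Before (error-prone): a nested callback captures an outer variable that may not
    // have been assigned yet when the callback runs, e.g.
    //   Supplier<Response> createResponse = () -> convert(unconvertedFetchResponse);
    //
    // After: the dependency becomes an explicit parameter, so the caller must supply a
    // value that already exists at the call site.
    static String createResponse(int throttleTimeMs, String unconvertedFetchResponse) {
        Objects.requireNonNull(unconvertedFetchResponse,
            "unconvertedFetchResponse must be built before creating the response");
        return unconvertedFetchResponse + " (throttleTimeMs=" + throttleTimeMs + ")";
    }
}
{code}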



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (KAFKA-17022) Fix error-prone in KafkaApis#handleFetchRequest

2024-06-22 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai reassigned KAFKA-17022:
--

Assignee: Chia-Ping Tsai

> Fix error-prone in KafkaApis#handleFetchRequest 
> 
>
> Key: KAFKA-17022
> URL: https://issues.apache.org/jira/browse/KAFKA-17022
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
>Priority: Minor
>
>  `createResponse` [0] references a variable from an outer scope, which is 
> error-prone since it may not be initialized when the function executes. We should 
> do a small refactor to pass `unconvertedFetchResponse` to `createResponse` explicitly.
> [0] 
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/KafkaApis.scala#L939



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (KAFKA-16707) Kafka Kraft : adding Principal Type in StandardACL for matching with KafkaPrincipal of connected client in order to defined ACL with a notion of group

2024-06-22 Thread Franck LEDAY (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-16707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856929#comment-17856929
 ] 

Franck LEDAY edited comment on KAFKA-16707 at 6/22/24 1:49 PM:
---

Hi Greg.

I'm pleased to get feedback on my JIRA.

To answer you, I hope my explanation won't be too confusing: you're right, I'm 
adding new Principal Types, BUT their scope is only the StandardACL part of the 
KRaft implementation, i.e. defining how the ACL checks the credentials of the 
client connected to the broker. Those new types are not meant to be used in the 
KafkaPrincipal of a connected client.

Also, I understand that you fear this could be used in a confusing way, so I 
suggest changing those StandardACL-specific Principal Types by adding a prefix 
like Acl to avoid confusion or collision; that would give something like:
 * Regex > AclRegex
 * StartsWith > AclStartsWith
 * EndsWith > AclEndsWith
 * Contains > AclContains
 * User > User, kept as before in order not to change the basic behavior.

With this information, do you believe I have to create a KIP?

(Note: I switched the PR to WIP as I will have less time for this subject over 
the next month than in previous months, but I'm still available for 
modifications and testing.)

(Also, as I don't remember any reference to Principal Types other than User in 
the Kafka docs, may I ask you to point me to where in the docs (and code) 
Principal Types other than User are checked?)


was (Author: handfreezer):
Hi Greg.

I'm pleased to get feedback on my JIRA.

To answer you, I hope my explanation won't be too confusing: you're right, I'm 
adding new Principal Types, BUT their scope is only the StandardACL part of the 
KRaft implementation, i.e. defining how the ACL checks the credentials of the 
client connected to the broker. Those new types are not meant to be used in the 
KafkaPrincipal of a connected client.

Also, I understand that you fear this could be used in a confusing way, so I 
suggest changing those StandardACL-specific Principal Types by adding a prefix 
like Acl to avoid confusion or collision; that would give something like:
 * Regex > AclRegex
 * StartsWith > AclStartsWith
 * EndsWith > AclEndsWith
 * Contains > AclContains
 * User > User, kept as before in order not to change the basic behavior.

With this information, do you believe I have to create a KIP?

(Note: I switched the PR to WIP as I will have less time for this subject over 
the next month than in previous months, but I'm still available for 
modifications and testing.)

 

> Kafka Kraft : adding Principal Type in StandardACL for matching with 
> KafkaPrincipal of connected client in order to defined ACL with a notion of 
> group
> --
>
> Key: KAFKA-16707
> URL: https://issues.apache.org/jira/browse/KAFKA-16707
> Project: Kafka
>  Issue Type: Improvement
>  Components: kraft, security
>Affects Versions: 3.7.0, 3.8.0, 3.7.1
>Reporter: Franck LEDAY
>Assignee: Franck LEDAY
>Priority: Major
>  Labels: KafkaPrincipal, acl, authorization, group, metadata, 
> security, user
>   Original Estimate: 0h
>  Remaining Estimate: 0h
>
> The default StandardAuthorizer in KRaft mode defines a KafkaPrincipal as 
> type=User plus a name, with optionally a special wildcard.
> The difficulty with this solution is that we can't define ACLs for a group of 
> KafkaPrincipals.
> There is currently a way to do so by defining RULEs to rewrite the 
> KafkaPrincipal name field, BUT, to introduce the notion of group this way, 
> you have to set rules that make you lose the unique part of the 
> KafkaPrincipal name of the connected client.
> The concept here, in the StandardAuthorizer of Kafka KRaft, is to add the 
> management of the KafkaPrincipal types:
>  * Regex
>  * StartsWith
>  * EndsWith
>  * Contains
>  * (User is still available and keeps working as before, to avoid any 
> regression/issue with current configurations)
> This would be done in the StandardAcl class of metadata/authorizer, and the 
> findResult method of StandardAuthorizerData would delegate the match to the 
> StandardAcl class (for performance reasons, see the explanation below).
> This way, you can still use RULEs to rewrite the KafkaPrincipal name of the 
> connected client (say you want to transform the DN of an SSL certificate: 
> cn=myCN,ou=myOU,c=FR becomes myCN@myOU), and then you can define a new ACL 
> with a principal like 'Regex:^.*@my[oO]U$' that will match all connected 
> clients with a certificate bound to ou=myOU. Note that in this particular case, 
> the same can be done with 'EndsWith:@myOU', and the type 'Contains' can work, but 
> I imagine more the usage of 

[jira] [Commented] (KAFKA-16707) Kafka Kraft : adding Principal Type in StandardACL for matching with KafkaPrincipal of connected client in order to defined ACL with a notion of group

2024-06-22 Thread Franck LEDAY (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-16707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856929#comment-17856929
 ] 

Franck LEDAY commented on KAFKA-16707:
--

Hi Greg.

I'm pleased to get feedback on my JIRA.

To answer you, I hope my explanation won't be too confusing: you're right, I'm 
adding new Principal Types, BUT their scope is only the StandardACL part of the 
KRaft implementation, i.e. defining how the ACL checks the credentials of the 
client connected to the broker. Those new types are not meant to be used in the 
KafkaPrincipal of a connected client.

Also, I understand that you fear this could be used in a confusing way, so I 
suggest changing those StandardACL-specific Principal Types by adding a prefix 
like Acl to avoid confusion or collision; that would give something like:
 * Regex > AclRegex
 * StartsWith > AclStartsWith
 * EndsWith > AclEndsWith
 * Contains > AclContains
 * User > User, kept as before in order not to change the basic behavior.

With this information, do you believe I have to create a KIP?

(Note: I switched the PR to WIP as I will have less time for this subject over 
the next month than in previous months, but I'm still available for 
modifications and testing.)

 

> Kafka Kraft : adding Principal Type in StandardACL for matching with 
> KafkaPrincipal of connected client in order to defined ACL with a notion of 
> group
> --
>
> Key: KAFKA-16707
> URL: https://issues.apache.org/jira/browse/KAFKA-16707
> Project: Kafka
>  Issue Type: Improvement
>  Components: kraft, security
>Affects Versions: 3.7.0, 3.8.0, 3.7.1
>Reporter: Franck LEDAY
>Assignee: Franck LEDAY
>Priority: Major
>  Labels: KafkaPrincipal, acl, authorization, group, metadata, 
> security, user
>   Original Estimate: 0h
>  Remaining Estimate: 0h
>
> The default StandardAuthorizer in KRaft mode defines a KafkaPrincipal as 
> type=User plus a name, with optionally a special wildcard.
> The difficulty with this solution is that we can't define ACLs for a group of 
> KafkaPrincipals.
> There is currently a way to do so by defining RULEs to rewrite the 
> KafkaPrincipal name field, BUT, to introduce the notion of group this way, 
> you have to set rules that make you lose the unique part of the 
> KafkaPrincipal name of the connected client.
> The concept here, in the StandardAuthorizer of Kafka KRaft, is to add the 
> management of the KafkaPrincipal types:
>  * Regex
>  * StartsWith
>  * EndsWith
>  * Contains
>  * (User is still available and keeps working as before, to avoid any 
> regression/issue with current configurations)
> This would be done in the StandardAcl class of metadata/authorizer, and the 
> findResult method of StandardAuthorizerData would delegate the match to the 
> StandardAcl class (for performance reasons, see the explanation below).
> This way, you can still use RULEs to rewrite the KafkaPrincipal name of the 
> connected client (say you want to transform the DN of an SSL certificate: 
> cn=myCN,ou=myOU,c=FR becomes myCN@myOU), and then you can define a new ACL 
> with a principal like 'Regex:^.*@my[oO]U$' that will match all connected 
> clients with a certificate bound to ou=myOU. Note that in this particular case, 
> the same can be done with 'EndsWith:@myOU', and the type 'Contains' can work, but 
> I imagine more the usage of this type for matching against a multi-group 
> definition in a KafkaPrincipal.
>  
> Note about the performance reason: for the moment, I have it implemented in a 
> fork of StandardAuthorizer/StandardAuthorizerData/StandardAcl set via the 
> property authorizer.class.name in a KRaft cluster with SSL authentication 
> required, and it tested fine. But, this way, every time an ACL is checked 
> against a KafkaPrincipal, I do a strcmp on the KafkaPrincipal type of the ACL 
> to determine the matching method to use. By implementing it in the 
> StandardAcl class, and then delegating the matching from 
> StandardAuthorizerData to the StandardAcl class, it is possible to analyse and 
> store the KafkaPrincipal matching method as an enum, and the KafkaPrincipal 
> name separately, in order to avoid redoing that work each time a match has to 
> be checked.
>  
> Here is the status of my implementation:
>  * I have this solution ('performance reason') implemented in a fork (then 
> branch) of the 3.7.0 github repo,
>  * I added a few unit tests, and gradlew metadata:test is working fine on all 
> tests except one (which is also failing on branch 3.7.0 without my changes),
>  * I added a few lines about it in security.html.
>  
> I'm opening the issue to discuss it with you, because I would like to create 
> a PR on Github for the next version.
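
A minimal sketch of the matching idea described above (parse the principal type once into an enum when the ACL is created, then reuse it on every check instead of re-comparing strings each time); the names are illustrative, not the actual StandardAcl code:

{code:java}
import java.util.regex.Pattern;

// Illustrative only: pre-parse the principal "type" so each authorization check
// does not have to re-compare strings to pick a matching strategy.
public class PrincipalMatcher {
    enum MatchType { USER, REGEX, STARTS_WITH, ENDS_WITH, CONTAINS }

    private final MatchType type;
    private final String name;
    private final Pattern pattern;   // compiled once, only when type == REGEX

    PrincipalMatcher(String principal) {          // e.g. "AclRegex:^.*@myOU$" or "User:alice"
        String[] parts = principal.split(":", 2);
        switch (parts[0]) {
            case "AclRegex":      type = MatchType.REGEX;       break;
            case "AclStartsWith": type = MatchType.STARTS_WITH; break;
            case "AclEndsWith":   type = MatchType.ENDS_WITH;   break;
            case "AclContains":   type = MatchType.CONTAINS;    break;
            default:              type = MatchType.USER;        break;
        }
        name = parts.length > 1 ? parts[1] : "";
        pattern = type == MatchType.REGEX ? Pattern.compile(name) : null;
    }

    boolean matches(String connectedPrincipalName) {
        switch (type) {
            case REGEX:       return pattern.matcher(connectedPrincipalName).matches();
            case STARTS_WITH: return connectedPrincipalName.startsWith(name);
            case ENDS_WITH:   return connectedPrincipalName.endsWith(name);
            case CONTAINS:    return connectedPrincipalName.contains(name);
            default:          return connectedPrincipalName.equals(name);  // User: exact match, as before
        }
    }
}
{code}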




[jira] [Updated] (KAFKA-17020) After enabling tiered storage, occasional residual logs are left in the replica

2024-06-22 Thread Jianbin Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jianbin Chen updated KAFKA-17020:
-
Attachment: image-2024-06-22-21-47-00-230.png
image-2024-06-22-21-46-42-917.png
image-2024-06-22-21-46-26-530.png
image-2024-06-22-21-46-12-371.png
image-2024-06-22-21-45-43-815.png
External issue URL: 
https://github.com/Aiven-Open/tiered-storage-for-apache-kafka/issues/562
   Description: 
After enabling tiered storage, occasional residual logs are left in the replica.
Based on the observed phenomenon, the index values of the rolled-out logs 
generated by the replica and the leader are not the same. As a result, the logs 
uploaded to S3 at the same time do not include the corresponding log files on 
the replica side, making it impossible to delete the local logs.
!image-2024-06-22-21-45-43-815.png!
leader config:
{code:java}
num.partitions=3
default.replication.factor=2
delete.topic.enable=true
auto.create.topics.enable=false
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=3
transaction.state.log.replication.factor=2
transaction.state.log.min.isr=1
offsets.retention.minutes=4320
log.roll.ms=8640
log.local.retention.ms=60
log.segment.bytes=536870912
num.replica.fetchers=1
log.retention.ms=1581120
remote.log.manager.thread.pool.size=4
remote.log.reader.threads=4
remote.log.metadata.topic.replication.factor=3
remote.log.storage.system.enable=true
remote.log.metadata.topic.retention.ms=18000
rsm.config.fetch.chunk.cache.class=io.aiven.kafka.tieredstorage.fetch.cache.DiskChunkCache
rsm.config.fetch.chunk.cache.path=/data01/kafka-tiered-storage-cache

# Pick some cache size, 16 GiB here:
rsm.config.fetch.chunk.cache.size=34359738368
rsm.config.fetch.chunk.cache.retention.ms=120
# # Prefetching size, 16 MiB here:
rsm.config.fetch.chunk.cache.prefetch.max.size=33554432
rsm.config.storage.backend.class=io.aiven.kafka.tieredstorage.storage.s3.S3Storage
rsm.config.storage.s3.bucket.name=
rsm.config.storage.s3.region=us-west-1
rsm.config.storage.aws.secret.access.key=
rsm.config.storage.aws.access.key.id=
rsm.config.chunk.size=8388608
remote.log.storage.manager.class.path=/home/admin/core-0.0.1-SNAPSHOT/:/home/admin/s3-0.0.1-SNAPSHOT/
remote.log.storage.manager.class.name=io.aiven.kafka.tieredstorage.RemoteStorageManager
remote.log.metadata.manager.class.name=org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager
remote.log.metadata.manager.listener.name=PLAINTEXT
rsm.config.upload.rate.limit.bytes.per.second=31457280
{code}
 replica config:
{code:java}
num.partitions=3
default.replication.factor=2
delete.topic.enable=true
auto.create.topics.enable=false
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=3
transaction.state.log.replication.factor=2
transaction.state.log.min.isr=1
offsets.retention.minutes=4320
log.roll.ms=8640
log.local.retention.ms=60
log.segment.bytes=536870912
num.replica.fetchers=1
log.retention.ms=1581120
remote.log.manager.thread.pool.size=4
remote.log.reader.threads=4
remote.log.metadata.topic.replication.factor=3
remote.log.storage.system.enable=true
#remote.log.metadata.topic.retention.ms=18000
rsm.config.fetch.chunk.cache.class=io.aiven.kafka.tieredstorage.fetch.cache.DiskChunkCache
rsm.config.fetch.chunk.cache.path=/data01/kafka-tiered-storage-cache
# Pick some cache size, 16 GiB here:
rsm.config.fetch.chunk.cache.size=34359738368
rsm.config.fetch.chunk.cache.retention.ms=120
# # # Prefetching size, 16 MiB here:
rsm.config.fetch.chunk.cache.prefetch.max.size=33554432
rsm.config.storage.backend.class=io.aiven.kafka.tieredstorage.storage.s3.S3Storage
rsm.config.storage.s3.bucket.name=
rsm.config.storage.s3.region=us-west-1
rsm.config.storage.aws.secret.access.key=
rsm.config.storage.aws.access.key.id=
rsm.config.chunk.size=8388608
remote.log.storage.manager.class.path=/home/admin/core-0.0.1-SNAPSHOT/*:/home/admin/s3-0.0.1-SNAPSHOT/*
remote.log.storage.manager.class.name=io.aiven.kafka.tieredstorage.RemoteStorageManager
remote.log.metadata.manager.class.name=org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager
remote.log.metadata.manager.listener.name=PLAINTEXT
rsm.config.upload.rate.limit.bytes.per.second=31457280 {code}
topic config:
{code:java}
Dynamic configs for topic xx are:
local.retention.ms=60 sensitive=false 
synonyms={DYNAMIC_TOPIC_CONFIG:local.retention.ms=60, 
STATIC_BROKER_CONFIG:log.local.retention.ms=60, 
DEFAULT_CONFIG:log.local.retention.ms=-2}
remote.storage.enable=true sensitive=false 
synonyms={DYNAMIC_TOPIC_CONFIG:remote.storage.enable=true}
retention.ms=1581120 sensitive=false 
synonyms={DYNAMIC_TOPIC_CONFIG:retention.ms=1581120, 
STATIC_BROKER_CONFIG:log.retention.ms=1581120, 

[jira] [Resolved] (KAFKA-12227) Add method "Producer#send" to return CompletionStage instead of Future

2024-06-22 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai resolved KAFKA-12227.

Resolution: Duplicate

>  Add method "Producer#send" to return CompletionStage instead of Future
> ---
>
> Key: KAFKA-12227
> URL: https://issues.apache.org/jira/browse/KAFKA-12227
> Project: Kafka
>  Issue Type: New Feature
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
>Priority: Major
>
> Producer and KafkaProducer return a java.util.concurrent.Future from their 
> send methods. This makes it challenging to write asynchronous non-blocking 
> code given Future's limited interface. Since Kafka now requires Java 8, we 
> now have the option of using CompletionStage and/or CompletableFuture that 
> were introduced to solve this issue. It's worth noting that the Kafka 
> AdminClient solved this issue by using org.apache.kafka.common.KafkaFuture as 
> Java 7 support was still required then.
>  
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-XXX%3A+Return+CompletableFuture+from+KafkaProducer.send
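
The issue is resolved as a duplicate of the linked KIP work, but the asked-for shape can already be approximated by wrapping the existing callback-based send; a minimal sketch, assuming an existing Producer<String, String>:

{code:java}
import java.util.concurrent.CompletableFuture;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public final class CompletableSend {
    // Adapt the Future/callback-based Producer#send into a CompletableFuture.
    public static CompletableFuture<RecordMetadata> send(Producer<String, String> producer,
                                                         ProducerRecord<String, String> record) {
        CompletableFuture<RecordMetadata> result = new CompletableFuture<>();
        producer.send(record, (metadata, exception) -> {
            if (exception != null) {
                result.completeExceptionally(exception);
            } else {
                result.complete(metadata);
            }
        });
        return result;
    }
}
{code}

Callers can then compose with thenApply/exceptionally instead of blocking on Future#get.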



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-17017) AsyncConsumer#unsubscribe does not clean the assigned partitions

2024-06-22 Thread Chia-Ping Tsai (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-17017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856928#comment-17856928
 ] 

Chia-Ping Tsai commented on KAFKA-17017:


[~lianetm] [~kirktrue] Could you please take a look? If 
AsyncConsumer#unsubscribe should not touch the assigned partitions, we can update 
the docs to highlight that behavior change.

> AsyncConsumer#unsubscribe does not clean the assigned partitions
> 
>
> Key: KAFKA-17017
> URL: https://issues.apache.org/jira/browse/KAFKA-17017
> Project: Kafka
>  Issue Type: Bug
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
>Priority: Major
>  Labels: kip-848-client-support
>
> According to the docs [0], `Consumer#unsubscribe` should clean both the subscribed and 
> assigned partitions. However, there are two issues with `AsyncConsumer`:
> 1) if we don't set a group id, `AsyncConsumer#unsubscribe` [1] will be a no-op
> 2) if we set a group id, `AsyncConsumer` is always in the `UNSUBSCRIBED` state, and 
> so `MembershipManagerImpl#leaveGroup` [2] will be a no-op
> [0] 
> https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java#L759
> [1] 
> https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AsyncKafkaConsumer.java#L1479
> [2] 
> https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/internals/MembershipManagerImpl.java#L666



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (KAFKA-17014) ScramFormatter should not use String for password.

2024-06-22 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai reassigned KAFKA-17014:
--

Assignee: dujian0068

> ScramFormatter should not use String for password.
> --
>
> Key: KAFKA-17014
> URL: https://issues.apache.org/jira/browse/KAFKA-17014
> Project: Kafka
>  Issue Type: Improvement
>  Components: security
>Reporter: Tsz-wo Sze
>Assignee: dujian0068
>Priority: Major
>
> Since String is immutable, there is no easy way to erase a String password 
> after use. We should not use String for passwords. See also 
> https://stackoverflow.com/questions/8881291/why-is-char-preferred-over-string-for-passwords



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-17021) Migrate AclCommandTest to new test infra

2024-06-22 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-17021:
--

 Summary: Migrate AclCommandTest to new test infra
 Key: KAFKA-17021
 URL: https://issues.apache.org/jira/browse/KAFKA-17021
 Project: Kafka
  Issue Type: Improvement
Reporter: Chia-Ping Tsai
Assignee: Chia-Ping Tsai


as title. Also, it would be great to rewrite it in Java



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-15713) KRaft support in AclCommandTest

2024-06-22 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai updated KAFKA-15713:
---
Summary: KRaft support in AclCommandTest  (was: KRaft support in 
SaslClientsWithInvalidCredentialsTest)

> KRaft support in AclCommandTest
> ---
>
> Key: KAFKA-15713
> URL: https://issues.apache.org/jira/browse/KAFKA-15713
> Project: Kafka
>  Issue Type: Task
>  Components: core
>Reporter: Sameer Tejani
>Assignee: Pavel Pozdeev
>Priority: Minor
>  Labels: kraft, kraft-test, newbie
> Fix For: 3.9.0
>
>
> The following tests in SaslClientsWithInvalidCredentialsTest in 
> core/src/test/scala/integration/kafka/api/SaslClientsWithInvalidCredentialsTest.scala
>  need to be updated to support KRaft
> 125 : def testAclCliWithAuthorizer(): Unit = {
> 130 : def testAclCliWithAdminAPI(): Unit = {
> 186 : def testProducerConsumerCliWithAuthorizer(): Unit = {
> 191 : def testProducerConsumerCliWithAdminAPI(): Unit = {
> 197 : def testAclCliWithClientId(): Unit = {
> 236 : def testAclsOnPrefixedResourcesWithAuthorizer(): Unit = {
> 241 : def testAclsOnPrefixedResourcesWithAdminAPI(): Unit = {
> 268 : def testInvalidAuthorizerProperty(): Unit = {
> 276 : def testPatternTypes(): Unit = {
> Scanned 336 lines. Found 0 KRaft tests out of 9 tests



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-15713) KRaft support in SaslClientsWithInvalidCredentialsTest

2024-06-22 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai resolved KAFKA-15713.

Fix Version/s: 3.9.0
   Resolution: Fixed

> KRaft support in SaslClientsWithInvalidCredentialsTest
> --
>
> Key: KAFKA-15713
> URL: https://issues.apache.org/jira/browse/KAFKA-15713
> Project: Kafka
>  Issue Type: Task
>  Components: core
>Reporter: Sameer Tejani
>Assignee: Pavel Pozdeev
>Priority: Minor
>  Labels: kraft, kraft-test, newbie
> Fix For: 3.9.0
>
>
> The following tests in SaslClientsWithInvalidCredentialsTest in 
> core/src/test/scala/integration/kafka/api/SaslClientsWithInvalidCredentialsTest.scala
>  need to be updated to support KRaft
> 125 : def testAclCliWithAuthorizer(): Unit = {
> 130 : def testAclCliWithAdminAPI(): Unit = {
> 186 : def testProducerConsumerCliWithAuthorizer(): Unit = {
> 191 : def testProducerConsumerCliWithAdminAPI(): Unit = {
> 197 : def testAclCliWithClientId(): Unit = {
> 236 : def testAclsOnPrefixedResourcesWithAuthorizer(): Unit = {
> 241 : def testAclsOnPrefixedResourcesWithAdminAPI(): Unit = {
> 268 : def testInvalidAuthorizerProperty(): Unit = {
> 276 : def testPatternTypes(): Unit = {
> Scanned 336 lines. Found 0 KRaft tests out of 9 tests



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (KAFKA-10840) Need way to catch auth issues in poll method of Java Kafka client

2024-06-22 Thread Sagar Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sagar Rao reassigned KAFKA-10840:
-

Assignee: Sagar Rao

> Need way to catch auth issues in poll method of Java Kafka client
> -
>
> Key: KAFKA-10840
> URL: https://issues.apache.org/jira/browse/KAFKA-10840
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Devin G. Bost
>Assignee: Sagar Rao
>Priority: Blocker
>  Labels: authentication, client
>
> We recently implemented SSL authentication at our company, and when certs 
> expire, the Kafka client poll method silently fails without throwing any kind 
> of exception. This is a problem because the data flow could stop at any time 
> (due to certificate expiration) without us being able to handle it. The auth 
> issue shows up in Kafka broker logs, but we don't see any indication on the 
> client-side that there was an auth issue. As a consequence, the auth failure 
> happens 10 times a second forever. 
> We need a way to know on the client-side if an auth issue is blocking the 
> connection to Kafka so we can handle the exception and refresh the certs 
> (keystore/truststore) when the certs expire. 
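
For reference, when the client does surface an authentication failure, it can be caught around poll() as sketched below (AuthenticationException is an existing client exception type); this illustrates the handling the reporter is asking for, not a claim that the silently failing SSL-expiry case above already throws it:

{code:java}
import java.time.Duration;
import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.common.errors.AuthenticationException;

public class PollWithAuthHandling {
    // Sketch: surface auth problems from poll() so the application can react,
    // e.g. reload the keystore/truststore and rebuild the consumer.
    static ConsumerRecords<String, String> pollOnce(Consumer<String, String> consumer) {
        try {
            return consumer.poll(Duration.ofMillis(500));
        } catch (AuthenticationException e) {
            // refresh certificates / alert operators here, then rethrow or retry
            throw e;
        }
    }
}
{code}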



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (KAFKA-17020) After enabling tiered storage, occasional residual logs are left in the replica

2024-06-22 Thread Jianbin Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jianbin Chen updated KAFKA-17020:
-
Description: 
After enabling tiered storage, occasional residual logs are left in the replica.
Based on the observed phenomenon, the index values of the rolled-out logs 
generated by the replica and the leader are not the same. As a result, the logs 
uploaded to S3 at the same time do not include the corresponding log files on 
the replica side, making it impossible to delete the local logs.
!https://private-user-images.githubusercontent.com/19943636/341939158-d0b87a7d-aca1-4700-b3e1-fceff0530c79.png!
leader config:
{code:java}
num.partitions=3
default.replication.factor=2
delete.topic.enable=true
auto.create.topics.enable=false
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=3
transaction.state.log.replication.factor=2
transaction.state.log.min.isr=1
offsets.retention.minutes=4320
log.roll.ms=8640
log.local.retention.ms=60
log.segment.bytes=536870912
num.replica.fetchers=1
log.retention.ms=1581120
remote.log.manager.thread.pool.size=4
remote.log.reader.threads=4
remote.log.metadata.topic.replication.factor=3
remote.log.storage.system.enable=true
remote.log.metadata.topic.retention.ms=18000
rsm.config.fetch.chunk.cache.class=io.aiven.kafka.tieredstorage.fetch.cache.DiskChunkCache
rsm.config.fetch.chunk.cache.path=/data01/kafka-tiered-storage-cache

# Pick some cache size, 16 GiB here:
rsm.config.fetch.chunk.cache.size=34359738368
rsm.config.fetch.chunk.cache.retention.ms=120
# # Prefetching size, 16 MiB here:
rsm.config.fetch.chunk.cache.prefetch.max.size=33554432
rsm.config.storage.backend.class=io.aiven.kafka.tieredstorage.storage.s3.S3Storage
rsm.config.storage.s3.bucket.name=
rsm.config.storage.s3.region=us-west-1
rsm.config.storage.aws.secret.access.key=
rsm.config.storage.aws.access.key.id=
rsm.config.chunk.size=8388608
remote.log.storage.manager.class.path=/home/admin/core-0.0.1-SNAPSHOT/:/home/admin/s3-0.0.1-SNAPSHOT/
remote.log.storage.manager.class.name=io.aiven.kafka.tieredstorage.RemoteStorageManager
remote.log.metadata.manager.class.name=org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager
remote.log.metadata.manager.listener.name=PLAINTEXT
rsm.config.upload.rate.limit.bytes.per.second=31457280
{code}
 replica config:
{code:java}
num.partitions=3
default.replication.factor=2
delete.topic.enable=true
auto.create.topics.enable=false
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=3
transaction.state.log.replication.factor=2
transaction.state.log.min.isr=1
offsets.retention.minutes=4320
log.roll.ms=8640
log.local.retention.ms=60
log.segment.bytes=536870912
num.replica.fetchers=1
log.retention.ms=1581120
remote.log.manager.thread.pool.size=4
remote.log.reader.threads=4
remote.log.metadata.topic.replication.factor=3
remote.log.storage.system.enable=true
#remote.log.metadata.topic.retention.ms=18000
rsm.config.fetch.chunk.cache.class=io.aiven.kafka.tieredstorage.fetch.cache.DiskChunkCache
rsm.config.fetch.chunk.cache.path=/data01/kafka-tiered-storage-cache
# Pick some cache size, 16 GiB here:
rsm.config.fetch.chunk.cache.size=34359738368
rsm.config.fetch.chunk.cache.retention.ms=120
# # # Prefetching size, 16 MiB here:
rsm.config.fetch.chunk.cache.prefetch.max.size=33554432

[jira] [Updated] (KAFKA-17020) After enabling tiered storage, occasional residual logs are left in the replica

2024-06-22 Thread Jianbin Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jianbin Chen updated KAFKA-17020:
-
Description: 
After enabling tiered storage, occasional residual logs are left in the replica.
Based on the observed phenomenon, the index values of the rolled-out logs 
generated by the replica and the leader are not the same. As a result, the logs 
uploaded to S3 at the same time do not include the corresponding log files on 
the replica side, making it impossible to delete the local logs.
!https://private-user-images.githubusercontent.com/19943636/341939158-d0b87a7d-aca1-4700-b3e1-fceff0530c79.png!
leader config:
num.partitions=3
default.replication.factor=2
delete.topic.enable=true
auto.create.topics.enable=false
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=3
transaction.state.log.replication.factor=2
transaction.state.log.min.isr=1
offsets.retention.minutes=4320
log.roll.ms=8640
log.local.retention.ms=60
log.segment.bytes=536870912
num.replica.fetchers=1
log.retention.ms=1581120
remote.log.manager.thread.pool.size=4
remote.log.reader.threads=4
remote.log.metadata.topic.replication.factor=3
remote.log.storage.system.enable=true
remote.log.metadata.topic.retention.ms=18000
rsm.config.fetch.chunk.cache.class=io.aiven.kafka.tieredstorage.fetch.cache.DiskChunkCache
rsm.config.fetch.chunk.cache.path=/data01/kafka-tiered-storage-cache
 # Pick some cache size, 16 GiB here:
rsm.config.fetch.chunk.cache.size=34359738368
rsm.config.fetch.chunk.cache.retention.ms=120
 # # # Prefetching size, 16 MiB here:
rsm.config.fetch.chunk.cache.prefetch.max.size=33554432
rsm.config.storage.backend.class=io.aiven.kafka.tieredstorage.storage.s3.S3Storage
rsm.config.storage.s3.bucket.name=
rsm.config.storage.s3.region=us-west-1
rsm.config.storage.aws.secret.access.key=
rsm.config.storage.aws.access.key.id=
rsm.config.chunk.size=8388608
remote.log.storage.manager.class.path=/home/admin/core-0.0.1-SNAPSHOT/{*}:/home/admin/s3-0.0.1-SNAPSHOT/{*}
remote.log.storage.manager.class.name=io.aiven.kafka.tieredstorage.RemoteStorageManager
remote.log.metadata.manager.class.name=org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager
remote.log.metadata.manager.listener.name=PLAINTEXT
rsm.config.upload.rate.limit.bytes.per.second=31457280
replica config:
num.partitions=3
default.replication.factor=2
delete.topic.enable=true
auto.create.topics.enable=false
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=3
transaction.state.log.replication.factor=2
transaction.state.log.min.isr=1
offsets.retention.minutes=4320
log.roll.ms=8640
log.local.retention.ms=60
log.segment.bytes=536870912
num.replica.fetchers=1
log.retention.ms=1581120
remote.log.manager.thread.pool.size=4
remote.log.reader.threads=4
remote.log.metadata.topic.replication.factor=3
remote.log.storage.system.enable=true
#remote.log.metadata.topic.retention.ms=18000
rsm.config.fetch.chunk.cache.class=io.aiven.kafka.tieredstorage.fetch.cache.DiskChunkCache
rsm.config.fetch.chunk.cache.path=/data01/kafka-tiered-storage-cache
 # Pick some cache size, 16 GiB here:
rsm.config.fetch.chunk.cache.size=34359738368
rsm.config.fetch.chunk.cache.retention.ms=120
 # # # Prefetching size, 16 MiB here:
rsm.config.fetch.chunk.cache.prefetch.max.size=33554432

[jira] (KAFKA-17020) After enabling tiered storage, occasional residual logs are left in the replica

2024-06-22 Thread Jianbin Chen (Jira)


[ https://issues.apache.org/jira/browse/KAFKA-17020 ]


Jianbin Chen deleted comment on KAFKA-17020:
--

was (Author: jianbin):
Restarting does not resolve this issue. The only solution is to delete the log 
folder corresponding to the replica where the log segment anomaly occurred and 
then resynchronize from the leader.
![image](https://github.com/Aiven-Open/tiered-storage-for-apache-kafka/assets/19943636/7256c156-6e90-4799-b0cf-a48c247c5b51)

> After enabling tiered storage, occasional residual logs are left in the 
> replica
> ---
>
> Key: KAFKA-17020
> URL: https://issues.apache.org/jira/browse/KAFKA-17020
> Project: Kafka
>  Issue Type: Wish
>Affects Versions: 3.7.0
>Reporter: Jianbin Chen
>Priority: Major
>
> After enabling tiered storage, occasional residual logs are left in the 
> replica.
> Based on the observed phenomenon, the index values of the rolled-out logs 
> generated by the replica and the leader are not the same. As a result, the 
> logs uploaded to S3 at the same time do not include the corresponding log 
> files on the replica side, making it impossible to delete the local logs.
> !https://private-user-images.githubusercontent.com/19943636/341939158-d0b87a7d-aca1-4700-b3e1-fceff0530c79.png!
> leader config:
> num.partitions=3
> default.replication.factor=2
> delete.topic.enable=true
> auto.create.topics.enable=false
> num.recovery.threads.per.data.dir=1
> offsets.topic.replication.factor=3
> transaction.state.log.replication.factor=2
> transaction.state.log.min.isr=1
> offsets.retention.minutes=4320
> log.roll.ms=8640
> log.local.retention.ms=60
> log.segment.bytes=536870912
> num.replica.fetchers=1
> log.retention.ms=1581120
> remote.log.manager.thread.pool.size=4
> remote.log.reader.threads=4
> remote.log.metadata.topic.replication.factor=3
> remote.log.storage.system.enable=true
> remote.log.metadata.topic.retention.ms=18000
> rsm.config.fetch.chunk.cache.class=io.aiven.kafka.tieredstorage.fetch.cache.DiskChunkCache
> rsm.config.fetch.chunk.cache.path=/data01/kafka-tiered-storage-cache
>  # Pick some cache size, 16 GiB here:
> rsm.config.fetch.chunk.cache.size=34359738368
> rsm.config.fetch.chunk.cache.retention.ms=120
>  # # # Prefetching size, 16 MiB here:
> rsm.config.fetch.chunk.cache.prefetch.max.size=33554432
> rsm.config.storage.backend.class=io.aiven.kafka.tieredstorage.storage.s3.S3Storage
> rsm.config.storage.s3.bucket.name=
> rsm.config.storage.s3.region=us-west-1
> rsm.config.storage.aws.secret.access.key=
> rsm.config.storage.aws.access.key.id=
> rsm.config.chunk.size=8388608
> remote.log.storage.manager.class.path=/home/admin/core-0.0.1-SNAPSHOT/{*}:/home/admin/s3-0.0.1-SNAPSHOT/{*}
> remote.log.storage.manager.class.name=io.aiven.kafka.tieredstorage.RemoteStorageManager
> remote.log.metadata.manager.class.name=org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager
> remote.log.metadata.manager.listener.name=PLAINTEXT
> rsm.config.upload.rate.limit.bytes.per.second=31457280
> replica config:
> num.partitions=3
> default.replication.factor=2
> delete.topic.enable=true
> auto.create.topics.enable=false
> num.recovery.threads.per.data.dir=1
> offsets.topic.replication.factor=3
> 

[jira] [Updated] (KAFKA-17020) After enabling tiered storage, occasional residual logs are left in the replica

2024-06-22 Thread Jianbin Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jianbin Chen updated KAFKA-17020:
-
Description: 
After enabling tiered storage, occasional residual logs are left in the replica.
Based on the observed phenomenon, the index values of the rolled-out logs 
generated by the replica and the leader are not the same. As a result, the logs 
uploaded to S3 at the same time do not include the corresponding log files on 
the replica side, making it impossible to delete the local logs.
!https://private-user-images.githubusercontent.com/19943636/341939158-d0b87a7d-aca1-4700-b3e1-fceff0530c79.png!
leader config:
num.partitions=3
default.replication.factor=2
delete.topic.enable=true
auto.create.topics.enable=false
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=3
transaction.state.log.replication.factor=2
transaction.state.log.min.isr=1
offsets.retention.minutes=4320
log.roll.ms=8640
log.local.retention.ms=60
log.segment.bytes=536870912
num.replica.fetchers=1
log.retention.ms=1581120
remote.log.manager.thread.pool.size=4
remote.log.reader.threads=4
remote.log.metadata.topic.replication.factor=3
remote.log.storage.system.enable=true
remote.log.metadata.topic.retention.ms=18000
rsm.config.fetch.chunk.cache.class=io.aiven.kafka.tieredstorage.fetch.cache.DiskChunkCache
rsm.config.fetch.chunk.cache.path=/data01/kafka-tiered-storage-cache
# Pick some cache size, 32 GiB here:
rsm.config.fetch.chunk.cache.size=34359738368
rsm.config.fetch.chunk.cache.retention.ms=120
# Prefetching size, 32 MiB here:
rsm.config.fetch.chunk.cache.prefetch.max.size=33554432
rsm.config.storage.backend.class=io.aiven.kafka.tieredstorage.storage.s3.S3Storage
rsm.config.storage.s3.bucket.name=
rsm.config.storage.s3.region=us-west-1
rsm.config.storage.aws.secret.access.key=
rsm.config.storage.aws.access.key.id=
rsm.config.chunk.size=8388608
remote.log.storage.manager.class.path=/home/admin/core-0.0.1-SNAPSHOT/*:/home/admin/s3-0.0.1-SNAPSHOT/*
remote.log.storage.manager.class.name=io.aiven.kafka.tieredstorage.RemoteStorageManager
remote.log.metadata.manager.class.name=org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager
remote.log.metadata.manager.listener.name=PLAINTEXT
rsm.config.upload.rate.limit.bytes.per.second=31457280
replica config:
num.partitions=3
default.replication.factor=2
delete.topic.enable=true
auto.create.topics.enable=false
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=3
transaction.state.log.replication.factor=2
transaction.state.log.min.isr=1
offsets.retention.minutes=4320
log.roll.ms=8640
log.local.retention.ms=60
log.segment.bytes=536870912
num.replica.fetchers=1
log.retention.ms=1581120
remote.log.manager.thread.pool.size=4
remote.log.reader.threads=4
remote.log.metadata.topic.replication.factor=3
remote.log.storage.system.enable=true
#remote.log.metadata.topic.retention.ms=18000
rsm.config.fetch.chunk.cache.class=io.aiven.kafka.tieredstorage.fetch.cache.DiskChunkCache
rsm.config.fetch.chunk.cache.path=/data01/kafka-tiered-storage-cache
# Pick some cache size, 32 GiB here:
rsm.config.fetch.chunk.cache.size=34359738368
rsm.config.fetch.chunk.cache.retention.ms=120
# Prefetching size, 32 MiB here:
rsm.config.fetch.chunk.cache.prefetch.max.size=33554432

[jira] [Commented] (KAFKA-17020) After enabling tiered storage, occasional residual logs are left in the replica

2024-06-22 Thread Jianbin Chen (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856903#comment-17856903
 ] 

Jianbin Chen commented on KAFKA-17020:
--

Restarting does not resolve this issue. The only solution is to delete the 
partition's log directory on the replica where the segment anomaly occurred and 
then resynchronize it from the leader.
![image](https://github.com/Aiven-Open/tiered-storage-for-apache-kafka/assets/19943636/7256c156-6e90-4799-b0cf-a48c247c5b51)
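A rough sketch of that workaround, assuming a tarball-style broker installation; 
the log directory, topic name and partition number are placeholders and do not 
come from this report:

 # 1. stop the affected broker
 $ bin/kafka-server-stop.sh
 # 2. remove the partition directory that holds the leftover segments
 $ rm -rf /data01/kafka-logs/my-topic-1
 # 3. restart the broker; the partition is re-fetched from the leader
 $ bin/kafka-server-start.sh -daemon config/server.properties

The exact steps (full broker restart vs. moving the replica elsewhere) depend on 
the deployment; this only illustrates the delete-and-resynchronize approach 
described in the comment above.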

> After enabling tiered storage, occasional residual logs are left in the 
> replica
> ---
>
> Key: KAFKA-17020
> URL: https://issues.apache.org/jira/browse/KAFKA-17020
> Project: Kafka
>  Issue Type: Wish
>Affects Versions: 3.7.0
>Reporter: Jianbin Chen
>Priority: Major
>
> After enabling tiered storage, residual log segments are occasionally left on 
> the replica.
> Based on what has been observed, the replica and the leader roll log segments 
> at different offsets, so the index values of their rolled segments do not 
> match. As a result, the segments uploaded to S3 do not line up with the 
> corresponding log files on the replica side, making it impossible to delete 
> the replica's local logs.
> (screenshot omitted: the original was an expiring GitHub private-user-images link and is no longer accessible)
> leader config:
> num.partitions=3
> default.replication.factor=2
> delete.topic.enable=true
> auto.create.topics.enable=false
> num.recovery.threads.per.data.dir=1
> offsets.topic.replication.factor=3
> transaction.state.log.replication.factor=2
> transaction.state.log.min.isr=1
> offsets.retention.minutes=4320
> log.roll.ms=8640
> log.local.retention.ms=60
> log.segment.bytes=536870912
> num.replica.fetchers=1
> log.retention.ms=1581120
> remote.log.manager.thread.pool.size=4
> remote.log.reader.threads=4
> remote.log.metadata.topic.replication.factor=3
> remote.log.storage.system.enable=true
> remote.log.metadata.topic.retention.ms=18000
> rsm.config.fetch.chunk.cache.class=io.aiven.kafka.tieredstorage.fetch.cache.DiskChunkCache
> rsm.config.fetch.chunk.cache.path=/data01/kafka-tiered-storage-cache
> # Pick some cache size, 32 GiB here:
> rsm.config.fetch.chunk.cache.size=34359738368
> rsm.config.fetch.chunk.cache.retention.ms=120
> # Prefetching size, 32 MiB here:
> rsm.config.fetch.chunk.cache.prefetch.max.size=33554432
> rsm.config.storage.backend.class=io.aiven.kafka.tieredstorage.storage.s3.S3Storage
> rsm.config.storage.s3.bucket.name=
> rsm.config.storage.s3.region=us-west-1
> rsm.config.storage.aws.secret.access.key=
> rsm.config.storage.aws.access.key.id=
> rsm.config.chunk.size=8388608
> remote.log.storage.manager.class.path=/home/admin/core-0.0.1-SNAPSHOT/*:/home/admin/s3-0.0.1-SNAPSHOT/*
> remote.log.storage.manager.class.name=io.aiven.kafka.tieredstorage.RemoteStorageManager
> remote.log.metadata.manager.class.name=org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager
> remote.log.metadata.manager.listener.name=PLAINTEXT
> rsm.config.upload.rate.limit.bytes.per.second=31457280
> replica config:
> num.partitions=3
> default.replication.factor=2
> delete.topic.enable=true
> auto.create.topics.enable=false
> num.recovery.threads.per.data.dir=1
> 

[jira] [Created] (KAFKA-17020) After enabling tiered storage, occasional residual logs are left in the replica

2024-06-22 Thread Jianbin Chen (Jira)
Jianbin Chen created KAFKA-17020:


 Summary: After enabling tiered storage, occasional residual logs 
are left in the replica
 Key: KAFKA-17020
 URL: https://issues.apache.org/jira/browse/KAFKA-17020
 Project: Kafka
  Issue Type: Wish
Affects Versions: 3.7.0
Reporter: Jianbin Chen


After enabling tiered storage, residual log segments are occasionally left on the replica.
Based on what has been observed, the replica and the leader roll log segments at 
different offsets, so the index values of their rolled segments do not match. As 
a result, the segments uploaded to S3 do not line up with the corresponding log 
files on the replica side, making it impossible to delete the replica's local logs.
(screenshot omitted: the original was an expiring GitHub private-user-images link and is no longer accessible)
leader config:
num.partitions=3
default.replication.factor=2
delete.topic.enable=true
auto.create.topics.enable=false
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=3
transaction.state.log.replication.factor=2
transaction.state.log.min.isr=1
offsets.retention.minutes=4320
log.roll.ms=8640
log.local.retention.ms=60
log.segment.bytes=536870912
num.replica.fetchers=1
log.retention.ms=1581120
remote.log.manager.thread.pool.size=4
remote.log.reader.threads=4
remote.log.metadata.topic.replication.factor=3
remote.log.storage.system.enable=true
remote.log.metadata.topic.retention.ms=18000
rsm.config.fetch.chunk.cache.class=io.aiven.kafka.tieredstorage.fetch.cache.DiskChunkCache
rsm.config.fetch.chunk.cache.path=/data01/kafka-tiered-storage-cache
# Pick some cache size, 32 GiB here:
rsm.config.fetch.chunk.cache.size=34359738368
rsm.config.fetch.chunk.cache.retention.ms=120
# Prefetching size, 32 MiB here:
rsm.config.fetch.chunk.cache.prefetch.max.size=33554432
rsm.config.storage.backend.class=io.aiven.kafka.tieredstorage.storage.s3.S3Storage
rsm.config.storage.s3.bucket.name=
rsm.config.storage.s3.region=us-west-1
rsm.config.storage.aws.secret.access.key=
rsm.config.storage.aws.access.key.id=
rsm.config.chunk.size=8388608
remote.log.storage.manager.class.path=/home/admin/core-0.0.1-SNAPSHOT/*:/home/admin/s3-0.0.1-SNAPSHOT/*
remote.log.storage.manager.class.name=io.aiven.kafka.tieredstorage.RemoteStorageManager
remote.log.metadata.manager.class.name=org.apache.kafka.server.log.remote.metadata.storage.TopicBasedRemoteLogMetadataManager
remote.log.metadata.manager.listener.name=PLAINTEXT
rsm.config.upload.rate.limit.bytes.per.second=31457280
replica config:
num.partitions=3
default.replication.factor=2
delete.topic.enable=true
auto.create.topics.enable=false
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=3
transaction.state.log.replication.factor=2
transaction.state.log.min.isr=1
offsets.retention.minutes=4320
log.roll.ms=8640
log.local.retention.ms=60
log.segment.bytes=536870912
num.replica.fetchers=1
log.retention.ms=1581120
remote.log.manager.thread.pool.size=4
remote.log.reader.threads=4
remote.log.metadata.topic.replication.factor=3
remote.log.storage.system.enable=true
#remote.log.metadata.topic.retention.ms=18000
rsm.config.fetch.chunk.cache.class=io.aiven.kafka.tieredstorage.fetch.cache.DiskChunkCache
rsm.config.fetch.chunk.cache.path=/data01/kafka-tiered-storage-cache
# Pick some cache size, 32 GiB here:
rsm.config.fetch.chunk.cache.size=34359738368