[jira] [Resolved] (KAFKA-16644) FK join emits duplicate tombstone on left-side delete

2024-04-30 Thread Matthias J. Sax (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax resolved KAFKA-16644.
-
Resolution: Duplicate

> FK join emits duplicate tombstone on left-side delete
> -
>
> Key: KAFKA-16644
> URL: https://issues.apache.org/jira/browse/KAFKA-16644
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 3.7.0
>Reporter: Matthias J. Sax
>Priority: Major
>
> We introduced a regression bug in 3.7.0 release via KAFKA-14748. When a 
> left-hand side record is deleted, the join now emits two tombstone records 
> instead of one.
> The problem was not detected via unit test, because the tests use a `Map` 
> instead of a `List` when verifying the result topic records 
> (https://github.com/apache/kafka/blob/trunk/streams/src/test/java/org/apache/kafka/streams/integration/KTableKTableForeignKeyJoinIntegrationTest.java#L250-L255).
> We should update all test cases to use `List` when reading from the output 
> topic, and of course fix the introduced bug: The 
> `SubscriptionSendProcessorSupplier` is sending two subscription records 
> instead of just a single one: 
> [https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/kstream/internals/foreignkeyjoin/SubscriptionSendProcessorSupplier.java#L133-L136]
>  
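To make the testing gap concrete, here is a minimal standalone illustration (not code from the ticket; the `KeyValue` record is a stand-in for a consumed record) of why verifying output records with a `Map` hides the duplicate tombstone that a `List` would catch:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MapVsListVerification {
    // Minimal stand-in for a consumed output-topic record; illustrative only.
    record KeyValue(String key, String value) {}

    public static void main(String[] args) {
        // The buggy join emits the same tombstone (null value) twice for one delete.
        List<KeyValue> outputTopic = new ArrayList<>();
        outputTopic.add(new KeyValue("lhs1", null));
        outputTopic.add(new KeyValue("lhs1", null)); // duplicate tombstone

        // Map-based verification collapses the duplicate key, so the test passes.
        Map<String, String> asMap = new HashMap<>();
        for (KeyValue kv : outputTopic) {
            asMap.put(kv.key(), kv.value());
        }
        System.out.println(asMap.size());        // prints 1 -- duplicate hidden

        // List-based verification preserves multiplicity, so the test would
        // fail and catch the regression.
        System.out.println(outputTopic.size());  // prints 2 -- duplicate visible
    }
}
```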



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #2861

2024-04-30 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-1033: Add Kafka Streams exception handler for exceptions occuring during processing

2024-04-30 Thread Matthias J. Sax

Thanks for the update.

I am wondering if we should use `ReadOnlyHeaders` instead of 
`ImmutableHeaders` as the interface name?


Also, the returned `Header` interface is technically not immutable 
either, because `Header#value()` returns a mutable byte-array... Would we 
need a `ReadOnlyHeader` interface?


If yes, it seems that `ReadOnlyHeaders` should not be a super-interface 
of `Headers` but it would rather be a standalone interface, and a 
wrapper for a `Headers` instance? And `ReadOnlyHeader` would return some 
immutable type instead of `byte[]` for the value()?


An alternative would be to deep-copy the value byte-array, which would not 
be free, but given that we are talking about exception handling, it 
would not be on the hot code path, and thus might be acceptable?



The above seems to increase the complexity significantly though. Hence, 
I have second thoughts on the immutability question:


Do we really need to worry about mutability after all, because in the 
end, KS runtime won't read the Headers instance after the handler was 
called, and if a user modifies the passed in headers, there won't be any 
actual damage (ie, no side effects)? For this case, it might even be ok 
to also not add `ImmutableHeaders` to begin with?




Sorry for the forth and back (yes, forth and back, because back and 
forth does not make sense -- it's not logical -- just trying to fix 
English :D) as I did bring up the immutability question in the first 
place...




-Matthias

On 4/25/24 5:56 AM, Loic Greffier wrote:

Hi Matthias,

I have updated the KIP regarding points 103 and 108.

103.
I have suggested a new `ImmutableHeaders` interface to deal with the
immutability concern of the headers, which is basically the `Headers`
interface without the write accesses.

public interface ImmutableHeaders {
    Header lastHeader(String key);
    Iterable<Header> headers(String key);
    Header[] toArray();
}

The `Headers` interface can be updated accordingly:

public interface Headers extends ImmutableHeaders, Iterable<Header> {
    //…
}
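To make the deep-copy alternative concrete, here is a rough standalone sketch. The `Header` record and `MutableHeaders` interface below are local stand-ins, not the actual Kafka interfaces; the point is only that a wrapper which copies the value byte-array prevents callers from mutating the underlying header:

```java
import java.util.Arrays;

public class ReadOnlyHeadersSketch {
    // Minimal local stand-ins for Kafka's Header/Headers; illustrative only.
    record Header(String key, byte[] value) {}

    interface MutableHeaders {
        Header lastHeader(String key);
    }

    // A standalone read-only view that deep-copies the value byte[] so the
    // caller cannot mutate the underlying header through the returned array.
    static final class ReadOnlyHeaders {
        private final MutableHeaders delegate;

        ReadOnlyHeaders(MutableHeaders delegate) {
            this.delegate = delegate;
        }

        byte[] lastHeaderValue(String key) {
            Header h = delegate.lastHeader(key);
            return h == null ? null : Arrays.copyOf(h.value(), h.value().length);
        }
    }

    public static void main(String[] args) {
        byte[] original = {1, 2, 3};
        MutableHeaders backing = key -> new Header(key, original);
        ReadOnlyHeaders view = new ReadOnlyHeaders(backing);

        byte[] copy = view.lastHeaderValue("trace-id");
        copy[0] = 42; // mutating the returned copy...
        System.out.println(original[0]); // ...leaves the original intact: prints 1
    }
}
```

The copy is the cost Matthias mentions; it only happens when the handler actually reads a value, which keeps it off the hot path.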

Loïc


Re: [DISCUSS] KIP-1035: StateStore managed changelog offsets

2024-04-30 Thread Matthias J. Sax

Thanks Bruno.



101: I think I understand this better now. But just want to make sure I 
do. What do you mean by "they can diverge" and "Recovering after a 
failure might load inconsistent offsets and positions."


The checkpoint is the offset from the changelog, while the position is 
the offset from the upstream source topic, right? -- In the end, the 
position is about IQ, and if we fail to update it, it only means that 
there is some gap when we might not be able to query a standby task, 
because we think it's not up-to-date enough even if it is, which would 
resolve itself soon? Ie, the position might "lag", but it's not 
"inconsistent". Do we believe that this lag would be highly problematic?




102: I am confused.

The position is maintained inside the state store, but is persisted in the .position file when the state store closes. 


This contradicts the KIP:


 these position offsets will be stored in RocksDB, in the same column family as 
the changelog offsets, instead of the .position file




My main concern is currently about rebalance metadata -- opening RocksDB 
stores seems to be very expensive, but if we follow the KIP:


We will do this under EOS by updating the .checkpoint file whenever a store is close()d. 


It seems having the offset inside RocksDB does not help us at all? In 
the end, when we crash, we don't want to lose the state, but if we 
update the .checkpoint only on a clean close, the .checkpoint might be 
stale (ie, it still contains the offset from when we opened the store 
after we got the task assigned).




-Matthias

On 4/30/24 2:40 AM, Bruno Cadonna wrote:

Hi all,

100
I think we already have such a wrapper. It is called 
AbstractReadWriteDecorator.



101
Currently, the position is checkpointed when an offset checkpoint is 
written. If we let the state store manage the committed offsets, we need 
to also let the state store manage the position, otherwise they 
might diverge. State store managed offsets can get flushed (i.e. 
checkpointed) to disk when the state store decides to flush its 
in-memory data structures, but the position is only checkpointed at 
commit time. Recovering after a failure might load inconsistent offsets 
and positions.



102
The position is maintained inside the state store, but is persisted in 
the .position file when the state store closes. The only public 
interface that uses the position is IQv2 in a read-only mode. So the 
position is only updated within the state store and read from IQv2. No 
need to add anything to the public StateStore interface.



103
Deprecating managesOffsets() right away might be a good idea.


104
I agree that we should try to support downgrades without wipes. At least 
Nick should state in the KIP why we do not support it.



Best,
Bruno




On 4/23/24 8:13 AM, Matthias J. Sax wrote:
Thanks for splitting out this KIP. The discussion shows that it is a 
complex beast by itself, so it is worth discussing on its own.



A couple of questions / comments:


100 `StateStore#commit()`: The JavaDoc says "must not be called by 
users" -- I would propose to put a guard in place for this, by either 
throwing an exception (preferable) or adding a no-op implementation 
(at least for our own stores, by wrapping them -- we cannot enforce it 
for custom stores I assume), and document this contract explicitly.



101 adding `.position` to the store: Why do we actually need this? The 
KIP says "To ensure consistency with the committed data and changelog 
offsets" but I am not sure if I can follow? Can you elaborate why 
leaving the `.position` file as-is won't work?



If it's possible at all, it will need to be done by 
creating temporary StateManagers and StateStores during rebalance. I think 
it is possible, and probably not too expensive, but the devil will be in 
the detail.


This sounds like a significant overhead to me. We know that opening a 
single RocksDB takes about 500ms, and thus opening RocksDB to get this 
information might slow down rebalances significantly.



102: It's unclear to me how `.position` information is added. The KIP 
only says: "position offsets will be stored in RocksDB, in the same 
column family as the changelog offsets". Do you intend to add this 
information to the map passed via `commit(final Map<TopicPartition, 
Long> changelogOffsets)`? The KIP should describe this in more detail. 
Also, if my assumption is correct, we might want to rename the 
parameter and also have a better JavaDoc description?



103: Should we make it mandatory (long-term) that all stores 
(including custom stores) manage their offsets internally? Maintaining 
both options and thus both code paths puts a burden on everyone and 
makes the code messy. I would strongly prefer if we could have a mid-term 
path to get rid of supporting both.  -- For this case, we should 
deprecate the newly added `managesOffsets()` method right away, to 
point out that we intend to remove it. If it's mandatory to maintain 
offsets for stores, we won't need this 
Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-04-30 Thread Matthias J. Sax
I like the idea of error codes. Not sure if the names are ideal? 
UNKNOWN_PROCESS_ID makes sense, but the other two seem a little bit 
difficult to understand?


Should we be very descriptive (and also try to avoid coupling it to the 
threading model -- important for the first error code):

 - ACTIVE_TASK_ASSIGNED_MULTIPLE_TIMES
 - ACTIVE_AND_STANDBY_ASSIGNED_TO_SAME_CLIENT (or _INSTANCE)

I think we also need to add NONE as option or make the error parameter 
an `Optional`?




OVERLAPPING_CLIENT : multiple KafkaStreams clients assigned with the same 
active task


Would also be an error if assigned to two consumers of the same 
client... Needs to be rephrased.





If any of these errors are detected, the StreamsPartitionAssignor will immediately 
"fail" the rebalance and retry it by scheduling an immediate followup rebalance.


Does this make sense? If we assume that the task assignment is 
deterministic, we would end up with an infinite retry loop? Also, 
assuming that a client leaves the group, we cannot assign some tasks any 
longer... I would rather throw a StreamsException and let the client crash.




-Matthias

On 4/30/24 12:22 PM, Sophie Blee-Goldman wrote:

One last thing: I added an error code enum to be returned from the
#onAssignmentComputed method in case of an invalid assignment. I created
one code for each of the invalid cases we described above. The downside is
that this means we'll have to go through a deprecation cycle if we want to
loosen up the restrictions on any of the enforced cases. The upside is that
we can very clearly mark what is an invalid assignment and this will
(hopefully) assist users who are new to customizing assignments by clearly
denoting the requirements, and returning a clear error if they are not
followed.

Of course the StreamsPartitionAssignor will also do a "fallback & retry" in
this case by returning the same assignment to the consumers and scheduling
a followup rebalance. I've added all of this to the TaskAssignor  and
#onAssignmentComputed javadocs, and added a section under "Public Changes"
as well.

Please let me know if there are any concerns, or if you have suggestions
for how else we can handle an invalid assignment

On Tue, Apr 30, 2024 at 11:39 AM Sophie Blee-Goldman 
wrote:


Thanks guys! I agree with what Lucas said about 117c, we can always loosen
a restriction later and I don't want to do anything now that might get in
the way of the new threading models.

With that I think we're all in agreement on 117. I'll update the KIP to
include what we've discussed

(and will fix the remaining #finalAssignment mention as well, thanks
Bruno. Glad to have such good proof readers! :P)

On Tue, Apr 30, 2024 at 8:35 AM Bruno Cadonna  wrote:


Hi again,

I forgot to ask whether you could add the agreement about handling
invalid assignment to the KIP.

Best,
Bruno

On 4/30/24 2:00 PM, Bruno Cadonna wrote:

Hi all,

I think we are converging!

117
a) fail: Since it is an invalid consumer assignment
b) pass: I agree that not assigning a task might be reasonable in some
situations
c) fail: For the reasons Lucas pointed out. I am missing a good use

case

here.
d) fail: It is invalid


Somewhere in the KIP you still use finalAssignment() instead of the
wonderful method name onAssignmentComputed() ;-)
"... interface also includes a method named finalAssignment which is
called with the final computed GroupAssignment ..."


Best,
Bruno


On 4/30/24 1:04 PM, Lucas Brutschy wrote:

Hi,

Looks like a great KIP to me!

I'm late, so I'm only going to comment on the last open point 117. I'm
against any fallbacks like "use the default assignor if the custom
assignment is invalid", as it's just going to hide bugs. For the 4
cases mentioned by Sophie:

117a) I'd fail immediately here, as it's an implementation bug, and
should not lead to a valid consumer group assignment.
117b) Agreed. This is a useful assignment and should be allowed.
117c) This is the tricky case. However, I'm leaning towards not
allowing this, unless we have a concrete use case. This will block us
from potentially using a single consumer for active and standby tasks
in the future. It's easier to drop the restriction later if we have a
concrete use case.
117d) Definitely fail immediately, as you said.

Cheers,
Lucas



On Mon, Apr 29, 2024 at 11:13 PM Sophie Blee-Goldman
 wrote:


Yeah I think that sums it up well. Either you computed a *possible*
assignment,
or you returned something that makes it literally impossible for the
StreamsPartitionAssignor to decipher/translate into an actual group
assignment, in which case it should just fail

That's more or less it for the open questions that have been raised
so far,
so I just want to remind folks that there's already a voting thread

for

this. I cast my vote a few minutes ago so it should resurface in
everyone's
inbox :)

On Thu, Apr 25, 2024 at 11:42 PM Rohan Desai 


wrote:


117: as Sophie laid out, there are two cases here right:
1. cases that are considered 

Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #2860

2024-04-30 Thread Apache Jenkins Server
See 




[jira] [Created] (KAFKA-16650) add integration test for Admin#abortTransaction

2024-04-30 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-16650:
--

 Summary: add integration test for Admin#abortTransaction
 Key: KAFKA-16650
 URL: https://issues.apache.org/jira/browse/KAFKA-16650
 Project: Kafka
  Issue Type: Improvement
Reporter: Chia-Ping Tsai
Assignee: Chia-Ping Tsai


It seems there are only a few unit tests. We should add integration tests 
including ZK, KRaft, and the new group coordinator for it.





[jira] [Created] (KAFKA-16649) Fix potential deadlock in DynamicBrokerConfig

2024-04-30 Thread Colin McCabe (Jira)
Colin McCabe created KAFKA-16649:


 Summary: Fix potential deadlock in DynamicBrokerConfig
 Key: KAFKA-16649
 URL: https://issues.apache.org/jira/browse/KAFKA-16649
 Project: Kafka
  Issue Type: Bug
Reporter: Colin McCabe








Re: [DISCUSS] KIP-1036: Extend RecordDeserializationException exception

2024-04-30 Thread Sophie Blee-Goldman
Actually one more thing, after thinking a bit about how this would be used
in practice, I'm inclined to agree that maybe distinguishing between key vs
value errors via subclasses is not the cleanest way to present the API.
Users would still essentially want to catch the general
RecordDeserializationException error since in practice, much of the
handling is likely to be the same between key and value errors. So then to
distinguish between these, they would end up doing an "instance of" check
on the exception type. Which feels like an awkward way to answer a question
that could have just been a simple API on the
RecordDeserializationException itself. What do you think about getting rid
of the subclasses and instead adding one more API to the
RecordDeserializationException that indicates whether it was a key or value
error?

This could return a simple boolean and be called #isKeyError or something,
but that feels kind of awkward too. Maybe a better alternative would be
something like this:

class RecordDeserializationException {
    enum DeserializationExceptionType {
        KEY,
        VALUE
    }

    public DeserializationExceptionType exceptionType();
}

I also worry that people don't always check for exception subtypes and
would easily miss the existence of the KeyDeserializationException and
ValueDeserializationException. Simply adding an API to the
RecordDeserializationException will make it much easier for users to notice
and react accordingly, if they care to do something different based on
whether the error happened during key or value deserialization.
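To sketch how a single catch site could branch on the proposed API (the classes below are local stand-ins mirroring the enum proposal above, not the real Kafka classes):

```java
public class DeserializationErrorDemo {
    // Local stand-ins mirroring the proposed API; illustrative only.
    enum DeserializationExceptionType { KEY, VALUE }

    static class RecordDeserializationException extends RuntimeException {
        private final DeserializationExceptionType type;

        RecordDeserializationException(DeserializationExceptionType type, String msg) {
            super(msg);
            this.type = type;
        }

        DeserializationExceptionType exceptionType() { return type; }
    }

    // One catch site, no instanceof checks on subclasses: the caller just
    // switches on the exception type.
    static String handle(RecordDeserializationException e) {
        return switch (e.exceptionType()) {
            case KEY   -> "skip record, bad key: " + e.getMessage();
            case VALUE -> "send to DLQ, bad value: " + e.getMessage();
        };
    }

    public static void main(String[] args) {
        RecordDeserializationException keyErr = new RecordDeserializationException(
            DeserializationExceptionType.KEY, "unreadable key bytes");
        System.out.println(handle(keyErr)); // prints: skip record, bad key: unreadable key bytes
    }
}
```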

Thoughts?

On Tue, Apr 30, 2024 at 1:45 PM Sophie Blee-Goldman 
wrote:

> Hey Fred, I think this is looking good but I just want to follow up on
> what Kirk asked earlier about having both the ByteBuffer and byte[] forms.
> Can't users just use the ByteBuffer versions and convert them into a byte[]
> themselves? In some cases it maybe makes sense to offer some additional
> APIs if there is complex processing involved in converting between returned
> types, but ByteBuffer --> byte[] seems pretty straightforward to me :)
>
> Generally speaking we try to keep the APIs as tight as possible and offer
> only what is necessary, and I'd rather leave off "syntactic sugar" type
> APIs unless there is a clear need. Put another way: it's easy to add
> additional methods if someone wants them, but it's much harder to remote
> methods since we have to go through a deprecation cycle. So I'd prefer to
> just keep only the ByteBuffer versions (or only the byte[] -- don't
> personally care which of the two)
>
> One more small nit: since we're deprecating the old exception constructor,
> can you list that in the "Compatibility, Deprecation, and Migration Plan"
> section?
>
>
>
> On Wed, Apr 24, 2024 at 5:35 AM Frédérik Rouleau
>  wrote:
>
>> Hi,
>>
>> I have updated the KIP now and the latest version of PR is available.
>>
>> About Kirk's questions:
>>
>> K11: Yes, both can have a deserialization exception but we deserialize the
>> key first, so if an error occurs then, we do not try to deserialize the
>> value. So the exception raised is always for key or value.
>>
>> K12: Not sure of concrete usage for now, just a sugar feature. I suppose
>> we
>> can imagine some use case where you need/want only the first bytes and do
>> not want to waste memory allocating the whole payload (SchemaRegistry's
>> schema Id or something similar).
>>
>> K13: The old constructor is not needed anymore. It is just for
>> compatibility until removed in a major version. As public we might have
>> some users using it even if I cannot see any valid reason for this.
>>
>> Thanks,
>> Fred
>>
>


Re: [DISCUSS] KIP-1036: Extend RecordDeserializationException exception

2024-04-30 Thread Sophie Blee-Goldman
Hey Fred, I think this is looking good but I just want to follow up on what
Kirk asked earlier about having both the ByteBuffer and byte[] forms. Can't
users just use the ByteBuffer versions and convert them into a byte[]
themselves? In some cases it maybe makes sense to offer some additional
APIs if there is complex processing involved in converting between returned
types, but ByteBuffer --> byte[] seems pretty straightforward to me :)

Generally speaking we try to keep the APIs as tight as possible and offer
only what is necessary, and I'd rather leave off "syntactic sugar" type
APIs unless there is a clear need. Put another way: it's easy to add
additional methods if someone wants them, but it's much harder to remove
methods since we have to go through a deprecation cycle. So I'd prefer to
just keep only the ByteBuffer versions (or only the byte[] -- don't
personally care which of the two)

One more small nit: since we're deprecating the old exception constructor,
can you list that in the "Compatibility, Deprecation, and Migration Plan"
section?



On Wed, Apr 24, 2024 at 5:35 AM Frédérik Rouleau
 wrote:

> Hi,
>
> I have updated the KIP now and the latest version of PR is available.
>
> About Kirk's questions:
>
> K11: Yes, both can have a deserialization exception but we deserialize the
> key first, so if an error occurs then, we do not try to deserialize the
> value. So the exception raised is always for key or value.
>
> K12: Not sure of concrete usage for now, just a sugar feature. I suppose we
> can imagine some use case where you need/want only the first bytes and do
> not want to waste memory allocating the whole payload (SchemaRegistry's
> schema Id or something similar).
>
> K13: The old constructor is not needed anymore. It is just for
> compatibility until removed in a major version. As public we might have
> some users using it even if I cannot see any valid reason for this.
>
> Thanks,
> Fred
>


Re: [PR] PayPal powered by Apache Kafka section Update [kafka-site]

2024-04-30 Thread via GitHub


bbejeck merged PR #590:
URL: https://github.com/apache/kafka-site/pull/590


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-04-30 Thread Sophie Blee-Goldman
One last thing: I added an error code enum to be returned from the
#onAssignmentComputed method in case of an invalid assignment. I created
one code for each of the invalid cases we described above. The downside is
that this means we'll have to go through a deprecation cycle if we want to
loosen up the restrictions on any of the enforced cases. The upside is that
we can very clearly mark what is an invalid assignment and this will
(hopefully) assist users who are new to customizing assignments by clearly
denoting the requirements, and returning a clear error if they are not
followed.

Of course the StreamsPartitionAssignor will also do a "fallback & retry" in
this case by returning the same assignment to the consumers and scheduling
a followup rebalance. I've added all of this to the TaskAssignor  and
#onAssignmentComputed javadocs, and added a section under "Public Changes"
as well.

Please let me know if there are any concerns, or if you have suggestions
for how else we can handle an invalid assignment

On Tue, Apr 30, 2024 at 11:39 AM Sophie Blee-Goldman 
wrote:

> Thanks guys! I agree with what Lucas said about 117c, we can always loosen
> a restriction later and I don't want to do anything now that might get in
> the way of the new threading models.
>
> With that I think we're all in agreement on 117. I'll update the KIP to
> include what we've discussed
>
> (and will fix the remaining #finalAssignment mention as well, thanks
> Bruno. Glad to have such good proof readers! :P)
>
> On Tue, Apr 30, 2024 at 8:35 AM Bruno Cadonna  wrote:
>
>> Hi again,
>>
>> I forgot to ask whether you could add the agreement about handling
>> invalid assignment to the KIP.
>>
>> Best,
>> Bruno
>>
>> On 4/30/24 2:00 PM, Bruno Cadonna wrote:
>> > Hi all,
>> >
>> > I think we are converging!
>> >
>> > 117
>> > a) fail: Since it is an invalid consumer assignment
>> > b) pass: I agree that not assigning a task might be reasonable in some
>> > situations
>> > c) fail: For the reasons Lucas pointed out. I am missing a good use
>> case
>> > here.
>> > d) fail: It is invalid
>> >
>> >
>> > Somewhere in the KIP you still use finalAssignment() instead of the
>> > wonderful method name onAssignmentComputed() ;-)
>> > "... interface also includes a method named finalAssignment which is
>> > called with the final computed GroupAssignment ..."
>> >
>> >
>> > Best,
>> > Bruno
>> >
>> >
>> > On 4/30/24 1:04 PM, Lucas Brutschy wrote:
>> >> Hi,
>> >>
>> >> Looks like a great KIP to me!
>> >>
>> >> I'm late, so I'm only going to comment on the last open point 117. I'm
>> >> against any fallbacks like "use the default assignor if the custom
>> >> assignment is invalid", as it's just going to hide bugs. For the 4
>> >> cases mentioned by Sophie:
>> >>
>> >> 117a) I'd fail immediately here, as it's an implementation bug, and
>> >> should not lead to a valid consumer group assignment.
>> >> 117b) Agreed. This is a useful assignment and should be allowed.
>> >> 117c) This is the tricky case. However, I'm leaning towards not
>> >> allowing this, unless we have a concrete use case. This will block us
>> >> from potentially using a single consumer for active and standby tasks
>> >> in the future. It's easier to drop the restriction later if we have a
>> >> concrete use case.
>> >> 117d) Definitely fail immediately, as you said.
>> >>
>> >> Cheers,
>> >> Lucas
>> >>
>> >>
>> >>
>> >> On Mon, Apr 29, 2024 at 11:13 PM Sophie Blee-Goldman
>> >>  wrote:
>> >>>
>> >>> Yeah I think that sums it up well. Either you computed a *possible*
>> >>> assignment,
>> >>> or you returned something that makes it literally impossible for the
>> >>> StreamsPartitionAssignor to decipher/translate into an actual group
>> >>> assignment, in which case it should just fail
>> >>>
>> >>> That's more or less it for the open questions that have been raised
>> >>> so far,
>> >>> so I just want to remind folks that there's already a voting thread
>> for
>> >>> this. I cast my vote a few minutes ago so it should resurface in
>> >>> everyone's
>> >>> inbox :)
>> >>>
>> >>> On Thu, Apr 25, 2024 at 11:42 PM Rohan Desai > >
>> >>> wrote:
>> >>>
>>  117: as Sophie laid out, there are two cases here right:
>>  1. cases that are considered invalid by the existing assignors but
>> are
>>  still valid assignments in the sense that they can be used to
>>  generate a
>>  valid consumer group assignment (from the perspective of the
>>  consumer group
>>  protocol). An assignment that excludes a task is one such example,
>> and
>>  Sophie pointed out a good use case for it. I also think it makes
>>  sense to
>>  allow these. It's hard to predict how a user might want to use the
>>  custom
>>  assignor, and its reasonable to expect them to use it with care and
>> not
>>  hand-hold them.
>>  2. cases that are not valid because it is impossible to compute a
>> valid
>>  consumer group assignment from them. In this case it seems totally

Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-04-30 Thread Sophie Blee-Goldman
Thanks guys! I agree with what Lucas said about 117c, we can always loosen
a restriction later and I don't want to do anything now that might get in
the way of the new threading models.

With that I think we're all in agreement on 117. I'll update the KIP to
include what we've discussed

(and will fix the remaining #finalAssignment mention as well, thanks Bruno.
Glad to have such good proof readers! :P)

On Tue, Apr 30, 2024 at 8:35 AM Bruno Cadonna  wrote:

> Hi again,
>
> I forgot to ask whether you could add the agreement about handling
> invalid assignment to the KIP.
>
> Best,
> Bruno
>
> On 4/30/24 2:00 PM, Bruno Cadonna wrote:
> > Hi all,
> >
> > I think we are converging!
> >
> > 117
> > a) fail: Since it is an invalid consumer assignment
> > b) pass: I agree that not assigning a task might be reasonable in some
> > situations
> > c) fail: For the reasons Lucas pointed out. I am missing a good use case
> > here.
> > d) fail: It is invalid
> >
> >
> > Somewhere in the KIP you still use finalAssignment() instead of the
> > wonderful method name onAssignmentComputed() ;-)
> > "... interface also includes a method named finalAssignment which is
> > called with the final computed GroupAssignment ..."
> >
> >
> > Best,
> > Bruno
> >
> >
> > On 4/30/24 1:04 PM, Lucas Brutschy wrote:
> >> Hi,
> >>
> >> Looks like a great KIP to me!
> >>
> >> I'm late, so I'm only going to comment on the last open point 117. I'm
> >> against any fallbacks like "use the default assignor if the custom
> >> assignment is invalid", as it's just going to hide bugs. For the 4
> >> cases mentioned by Sophie:
> >>
> >> 117a) I'd fail immediately here, as it's an implementation bug, and
> >> should not lead to a valid consumer group assignment.
> >> 117b) Agreed. This is a useful assignment and should be allowed.
> >> 117c) This is the tricky case. However, I'm leaning towards not
> >> allowing this, unless we have a concrete use case. This will block us
> >> from potentially using a single consumer for active and standby tasks
> >> in the future. It's easier to drop the restriction later if we have a
> >> concrete use case.
> >> 117d) Definitely fail immediately, as you said.
> >>
> >> Cheers,
> >> Lucas
> >>
> >>
> >>
> >> On Mon, Apr 29, 2024 at 11:13 PM Sophie Blee-Goldman
> >>  wrote:
> >>>
> >>> Yeah I think that sums it up well. Either you computed a *possible*
> >>> assignment,
> >>> or you returned something that makes it literally impossible for the
> >>> StreamsPartitionAssignor to decipher/translate into an actual group
> >>> assignment, in which case it should just fail
> >>>
> >>> That's more or less it for the open questions that have been raised
> >>> so far,
> >>> so I just want to remind folks that there's already a voting thread for
> >>> this. I cast my vote a few minutes ago so it should resurface in
> >>> everyone's
> >>> inbox :)
> >>>
> >>> On Thu, Apr 25, 2024 at 11:42 PM Rohan Desai 
> >>> wrote:
> >>>
>  117: as Sophie laid out, there are two cases here right:
>  1. cases that are considered invalid by the existing assignors but are
>  still valid assignments in the sense that they can be used to
>  generate a
>  valid consumer group assignment (from the perspective of the
>  consumer group
>  protocol). An assignment that excludes a task is one such example, and
>  Sophie pointed out a good use case for it. I also think it makes
>  sense to
>  allow these. It's hard to predict how a user might want to use the
>  custom
>  assignor, and its reasonable to expect them to use it with care and
> not
>  hand-hold them.
>  2. cases that are not valid because it is impossible to compute a
> valid
>  consumer group assignment from them. In this case it seems totally
>  reasonable to just throw a fatal exception that gets passed to the
>  uncaught
>  exception handler. If this case happens then there is some bug in the
>  user's assignor and its totally reasonable to fail the application
>  in that
>  case. We _could_ try to be more graceful and default to one of the
>  existing
>  assignors. But it's usually better to fail hard and fast when there
>  is some
>  illegal state detected imo.
> 
>  On Fri, Apr 19, 2024 at 4:18 PM Rohan Desai 
>  wrote:
> 
> > Bruno, I've incorporated your feedback into the KIP document.
> >
> > On Fri, Apr 19, 2024 at 3:55 PM Rohan Desai  >
> > wrote:
> >
> >> Thanks for the feedback Bruno! For the most part I think it makes
> >> sense,
> >> but leaving a couple follow-up thoughts/questions:
> >>
> >> re 4: I think Sophie's point was slightly different - that we
> >> might want
> >> to wrap the return type for `assign` in a class so that its easily
> >> extensible. This makes sense to me. Whether we do that or not, we
> can
>  have
> >> the return type be a Set instead of a Map as well.
> >>
> >> re 6: 

RE: lambda not returning?

2024-04-30 Thread John Hawkins
Hi,
The connector is up and running - it's reading in files (I'm using the Custenborder 
file adapter as a simple test adapter, but it fails with JDBC too).
I have seen issues when I've tried to use transformers that weren't configured 
correctly - sometimes they throw and I get an internal error thrown back to the 
REST client.
I've added instrumentation - not sure how to add the screenshots here!
It clearly shows that it's a single thread all the way down the stack. I've 
added annotations for when I'm just about to call the callback and when I've done 
it - the originating thread never takes up the challenge and times out after 90 
seconds.
As I say - I've also put this under a debugger, and when I try to step into the 
onCompletion method it just disappears. I have put instrumentation and extra 
trace in the FutureCallback#onCompletion method and it never shows.

Regards,
John.


John Hawkins
Senior Architect
   
p: +44 (0) 7879 992532
Follow Coliance:  LinkedIn  | YouTube  |  Facebook  |  Instagram

-Original Message-
From: Chris Egerton  
Sent: Tuesday, April 30, 2024 6:59 PM
To: dev@kafka.apache.org
Subject: Re: lambda not returning?

[EXTERNAL EMAIL : Use caution when opening attachments, clicking links or 
responding to requests]

Hi John,

Do you have logs demonstrating that the connector was able to start up 
successfully? If not, it's possible that instead of the callback being dropped, 
it was properly retained but never invoked because the connector was blocked.

Cheers,

Chris

On Tue, Apr 30, 2024 at 1:57 PM John Hawkins  
wrote:

> Hi folks,
>
> I’ve been working with the standalone kafka for quite a while now and 
> been debugging a problem I’m having when deploying new connectors 
> (using the PUT method but it happens in the POST as well).
>
>
>
> I get a 500 timeout whenever I try to create a new connector. It looks 
> to me like the lambda callback is being called but disappears into a 
> blackhole. So, the original caller – ConnectorsResources.java never 
> gets notified and so times-out.
>
> The herder never calls back…
>
>
>
> herder.putConnectorConfig(connector, connectorConfig, *true*, cb);
>
>
>
> or rather the callback does get #onComplete() called in the 
> doTransition method of the connectorworker but the callback disappears 
> – it never seems to work.
>
>
>
> I’m left wondering if the nested lambda calling is losing the memory 
> somehow and this could be a classloading issue somewhere that is 
> somehow losing the context of the original lambda?
>
>
>
> Has this been seen before?
>
>
>
> I’ve seen a few posts (not in kakfa) where lambda calls fails to 
> callback if there are still threads running in the call itself. From 
> what I can see there are plenty of threads still running in the call
>
> I’m using a 21 microsoft OpenJDK JRE but it also fails with IBMs version.
>
>
>
> Our project is dependent on using the kakfa standalone so I need to 
> get this to work.
>
>
>
> Thanks for your thoughts,
>
> John.
>
>
>
>
>
> John Hawkins
>
> *Senior Architect*
>
>
>
>
>
>
>
>
>
>
>
> *john.hawk...@coliance.co * | 
> www.coliance.co
>
> p: +44 (0) 7879 992532
>
> Follow Coliance:  LinkedIn
>   | YouTube 
>   |  
> Facebook   |  Instagram 
> 
>
>
>
>
>


Re: lambda not returning?

2024-04-30 Thread Chris Egerton
Hi John,

Do you have logs demonstrating that the connector was able to start up
successfully? If not, it's possible that instead of the callback being
dropped, it was properly retained but never invoked because the connector
was blocked.

Cheers,

Chris

On Tue, Apr 30, 2024 at 1:57 PM John Hawkins
 wrote:

> Hi folks,
>
> I’ve been working with the standalone kafka for quite a while now and been
> debugging a problem I’m having when deploying new connectors (using the PUT
> method but it happens in the POST as well).
>
>
>
> I get a 500 timeout whenever I try to create a new connector. It looks to
> me like the lambda callback is being called but disappears into a
> blackhole. So, the original caller – ConnectorsResources.java never gets
> notified and so times-out.
>
> The herder never calls back…
>
>
>
> herder.putConnectorConfig(connector, connectorConfig, *true*, cb);
>
>
>
> or rather the callback does get #onComplete() called in the doTransition
> method of the connectorworker but the callback disappears – it never seems
> to work.
>
>
>
> I’m left wondering if the nested lambda calling is losing the memory
> somehow and this could be a classloading issue somewhere that is somehow
> losing the context of the original lambda?
>
>
>
> Has this been seen before?
>
>
>
> I’ve seen a few posts (not in kakfa) where lambda calls fails to callback
> if there are still threads running in the call itself. From what I can see
> there are plenty of threads still running in the call
>
> I’m using a 21 microsoft OpenJDK JRE but it also fails with IBMs version.
>
>
>
> Our project is dependent on using the kakfa standalone so I need to get
> this to work.
>
>
>
> Thanks for your thoughts,
>
> John.
>
>
>
>
>
> John Hawkins
>
> *Senior Architect*
>
>
>
>
>
>
>
>
>
>
>
> *john.hawk...@coliance.co * | www.coliance.co
>
> p: +44 (0) 7879 992532
>
> Follow Coliance:  LinkedIn
>   | YouTube
>   |  Facebook
>   |  Instagram 
>
>
>
>
>


lambda not returning?

2024-04-30 Thread John Hawkins
Hi folks,
I've been working with the standalone kafka for quite a while now and been 
debugging a problem I'm having when deploying new connectors (using the PUT 
method but it happens in the POST as well).

I get a 500 timeout whenever I try to create a new connector. It looks to me 
like the lambda callback is being called but disappears into a blackhole. So, 
the original caller - ConnectorsResources.java never gets notified and so 
times-out.

The herder never calls back...

herder.putConnectorConfig(connector, connectorConfig, true, cb);

or rather the callback does get #onComplete() called in the doTransition method 
of the connectorworker but the callback disappears - it never seems to work.

I'm left wondering if the nested lambda calling is losing the memory somehow 
and this could be a classloading issue somewhere that is somehow losing the 
context of the original lambda?

Has this been seen before?

I've seen a few posts (not in Kafka) where lambda calls fail to call back if 
there are still threads running in the call itself. From what I can see there 
are plenty of threads still running in the call.
I'm using a Microsoft OpenJDK 21 JRE, but it also fails with IBM's version.

Our project is dependent on using the Kafka standalone so I need to get this to 
work.

Thanks for your thoughts,
John.
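For anyone trying to reproduce this, the pattern being described - a REST thread blocking on a callback-backed future that the herder is supposed to complete - can be sketched as a self-contained toy. Note this is illustrative only (the class and method names below are not Connect's actual FutureCallback/Herder API): if onCompletion() is never invoked, get() times out, which is exactly what a 500 after 90 seconds looks like to the client.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Toy model of the flow described above; names are illustrative, not Connect's.
public class CallbackTimeoutDemo {
    interface Callback<V> { void onCompletion(Throwable error, V result); }

    // A future that is completed by invoking the callback, as the REST layer expects.
    static class FutureCallback<V> extends CompletableFuture<V> implements Callback<V> {
        @Override
        public void onCompletion(Throwable error, V result) {
            if (error != null) completeExceptionally(error);
            else complete(result);
        }
    }

    static String awaitCreation(boolean herderCompletes, long timeoutMs) {
        FutureCallback<String> cb = new FutureCallback<>();
        if (herderCompletes) {
            // Simulates the herder completing the callback from another thread.
            new Thread(() -> cb.onCompletion(null, "created")).start();
        }
        try {
            return cb.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return "500: request timed out"; // what the REST client observes
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(awaitCreation(true, 5000));  // prints "created"
        System.out.println(awaitCreation(false, 100));  // prints "500: request timed out"
    }
}
```

If the real callback behaves like this, the debugger observation (onCompletion never reached) suggests the completion path is never running at all, rather than the future somehow losing its waiter.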


John Hawkins
Senior Architect




john.hawk...@coliance.co | www.coliance.co
p: +44 (0) 7879 992532
Follow Coliance:  LinkedIn  
| YouTube  |  
Facebook  |  Instagram




Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #2859

2024-04-30 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-1024: Make the restore behavior of GlobalKTables with custom processors configureable

2024-04-30 Thread Walker Carlson
Hey Bruno,

Thanks for the feedback. Sorry for the late reply, I was hoping others
might weigh in as well and it got away from me.

a) I like this but I think we should separate this out. This kip has
already dragged on more than it should and I think that is a big enough
change to get done by itself.

b) I'm a bit resistant to adding a new type of processor or store for this
change. It feels excessively complicated for what should be a small change.
I think that it might be good but I don't want to expand the scope more
than absolutely necessary here.

best,
walker
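For concreteness, idea (b) from Bruno's quoted mail below - a dedicated supplier type for global stores - might look roughly like the sketch here. All names are illustrative assumptions, not an agreed-upon API, and the Processor interface is a minimal stand-in for Kafka Streams' org.apache.kafka.streams.processor.api.Processor.

```java
// Hypothetical sketch only: a supplier typed on the input key/value (no
// Void, Void noise) that also carries the restore behavior.
public class GlobalStoreIdeaSketch {
    // Minimal stand-in for Kafka Streams' Processor interface.
    interface Processor<KIn, VIn, KOut, VOut> {
        void process(KIn key, VIn value);
    }

    interface GlobalProcessorSupplier<KIn, VIn> {
        Processor<KIn, VIn, Void, Void> get();

        // Whether the topology should re-run the processor over the input
        // topic on restore, instead of loading the store as-is.
        default boolean reprocessOnRestore() {
            return false;
        }
    }

    public static void main(String[] args) {
        GlobalProcessorSupplier<String, Long> supplier =
            () -> (key, value) -> { /* update the global store */ };
        System.out.println(supplier.reprocessOnRestore()); // prints "false"
    }
}
```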

On Wed, Apr 10, 2024 at 4:34 AM Bruno Cadonna  wrote:

> Hi Walker,
>
> Thanks for the updates!
>
>
> (1) While I like naming the methods differently, I have also to say that
> I do not like addIsomorphicGlobalStore() because it does not really tell
> what the method does. I could also not come up with a better name than
> addGlobalStoreWithReprocessingOnRestore(). However, I had two ideas on
> which I would like to have your opinion.
>
> (a) Add a new GlobalStoreBuilder in which users can set if the global
> state store should reprocess on restore. Additionally, to the option to
> enable or disable reprocessing on restore, you could also NOT offer a
> way to enable or disable logging in the GlobalStoreBuilder. Currently,
> if users enable logging for a store builder that they pass into
> addGlobalStore(), Kafka Streams needs to explicitly disable it again,
> which is not ideal.
>
> (b) Add a new GlobalProcessorSupplier in which users can set if the
> global state store should reprocess on restore. Another ugliness that
> could be fixed with this is passing Void, Void to ProcessorSupplier. The
> GlobalProcessorSupplier would just have two type parameters <KIn, VIn>.
> The nice aspect of this idea is that the option to enable/disable
> reprocessing on restore is only needed when a processor supplier is
> passed into the methods. That is not true for idea (a).
>
>
> (2) Yes, that was my intent.
>
>
> Best,
> Bruno
>
> On 4/9/24 9:33 PM, Walker Carlson wrote:
> > Hey all,
> >
> > (1) no I hadn't considered just naming the methods differently. I
> actually
> > really like this idea and am for it. Except we need 3 different methods
> > now. One for no processor, one for a processor that should restore and
> one
> > that reprocesses. How about `addCustomGlobalStore` and
> > `addIsomorphicGlobalStore` and then just `addGlobalStateStore` for the no
> > processor case? If everyone likes that I can add that to the KIP and
> rename
> > the methods.
> >
> > (2) we can have the built in case use StoreBuilder<KeyValueStore> and
> > manually check for the TimestampedKeyValueStore. That is fine with me.
> >
> > Bruno I hope that was what you were intending.
> >
> > (3) For the scala api, do we need to make it match the java api or are we
> > just making the minimum changes? as if we take point 1 I don't know how
> > much we need to change.
> >
> > Thanks,
> > Walker
> >
> >
> > On Tue, Apr 2, 2024 at 8:38 AM Matthias J. Sax  wrote:
> >
> >> One more thing:
> >>
> >> I was just looking into the WIP PR, and it seems we will also need to
> >> change `StreamsBuilder.scala`. The KIP needs to cover this changes as
> well.
> >>
> >>
> >> -Matthias
> >>
> >> On 4/1/24 10:33 PM, Bruno Cadonna wrote:
> >>> Hi Walker and Matthias,
> >>>
> >>> (2)
> >>> That is exactly my point about having a compile time error versus a
> >>> runtime error. The added flexibility as proposed by Matthias sounds
> good
> >>> to me.
> >>>
> >>> Regarding the Named parameter, I was not aware that the processor that
> >>> writes records to the global state store is named according to the name
> >>> passed in by Consumed. I thought Consumed strictly specifies the names
> >>> of source processors. So I am fine with not having an overload with a
> >>> Named parameter.
> >>>
> >>> Best,
> >>> Bruno
> >>>
> >>> On 3/31/24 11:30 AM, Matthias J. Sax wrote:
>  Two more follow up thoughts:
> 
>  (1) I am still not a big fan of the boolean parameter we introduce.
>  Did you consider to use different method names, like
>  `addReadOnlyGlobalStore()` (for the optimized method, that would not
>  reprocess data on restore), and maybe add `addModifiableGlobalStore()`
>  (not a good name, but we cannot re-use existing `addGlobalStore()` --
>  maybe somebody else has a good idea about a better `addXxxGlobalStore`
>  that would describe it well).
> 
>  (2) I was thinking about Bruno's comment to limit the scope the store
>  builder for the optimized case. I think we should actually do
>  something about it, because in the end, the runtime (ie, the
>  `Processor` we hard wire) would need to pick a store it supports and
>  cast to the corresponding store? If the cast fails, we hit a runtime
>  exception, but by putting the store we cast to into the signature we
>  can actually convert it into a compile time error what seems better.
>  -- If we 

Re: [DISCUSS] KIP-1028: Docker Official Image for Apache Kafka

2024-04-30 Thread Chris Egerton
Hi Vedarth and Krish,

Thanks for the KIP! I have to admit I'm a little skeptical; hopefully you
can help me understand the need for these additional images.

1) In the motivation section it's stated that "Several other Apache
projects, like Flink, Spark, Solr, have already released Docker Official
Images, with download figures ranging from 50 million to over 1 billion.
These numbers highlight the significant demand among users." But then
immediately afterwards, we learn that "Also the Docker Official Images are
always the top 1 search result, irrespective of the number of downloads."
Wouldn't a high number of downloads for an image naturally follow from
being the top search result? It seems like we can't necessarily assume that
Docker Official Images are inherently more desirable for users based solely
on download statistics.

2) Can you elaborate on the value that these new images would add from a
user's perspective? I'm hesitant to introduce another image, since it adds
to the cognitive burden of people who will inevitably have to answer the
question of "What are the differences between all of the available images
and which one is best for my use case?"

3) Would a separate Docker-owned repository be out of the question? I'm
guessing there are some trademark issues that might get in the way, but
it's worth exploring since the entire purpose of this KIP seems to be to
provide images that are vetted and designed by Docker more than by the
Apache Kafka contributors/committers/PMC.

I may have more questions later but wanted to get this initial round out
now without trying to list everything first.

Looking forward to your thoughts!

Cheers,

Chris

On Mon, Apr 22, 2024 at 2:14 PM Vedarth Sharma 
wrote:

> Hey folks,
>
> Thanks a lot for reviewing the KIP and providing feedback.
> The discussion thread seems resolved and KIP has been updated accordingly.
> We will be starting the voting thread for this KIP in the next few days.
> Please take a look at the KIP and let us know if any further discussion
> is needed.
>
> Thanks and regards,
> Vedarth
>
> On Fri, Apr 19, 2024 at 1:33 PM Manikumar 
> wrote:
>
> > Thanks Krish. KIP looks good to me.
> >
> > On Wed, Apr 17, 2024 at 1:38 PM Krish Vora 
> wrote:
> > >
> > > Hi Manikumar,
> > >
> > > Thanks for the comments.
> > >
> > > Maybe as part of the release process, RM can create a JIRA for this
> > > > task. This can be taken by RM or any comitter or any contributor
> (with
> > > > some help from commiters to run "Docker Image Preparation via GitHub
> > > > Actions:"
> > >
> > > This sounds like a good idea. This step would be beneficial. By
> creating
> > a
> > > JIRA ticket, it will also serve as a reminder to complete the
> > post-release
> > > steps for the Docker official images. Have updated the KIP with this
> > step.
> > >
> > > Is this using GitHub Actions workflow? or manual testing?
> > >
> > > This will be done by a Github Actions workflow, which will test the
> > static
> > > Docker Official Image assets for a specific release version.
> > >
> > > Is it mandatory for RM/comitters to raise the PR to Docker Hub’s
> > > > official images repository (or) can it be done by any contributor.
> > >
> > > I believe that it can be done by any contributor (ref: This link
> > >  >
> > > quotes "*Anyone can provide feedback, contribute code, suggest process
> > > changes, or even propose a new Official Image.*")
> > >
> > > Also I was thinking, once the KIP gets voted, we should try to release
> > > > kafka:3.7.0 (or 3.7.1) Docker Official image. This will help us to
> > > > validate the process and allow us to fix any changes suggested by
> > > > Dockerhub before the 3.8.0 release.
> > >
> > > This sounds like a great idea. This KIP proposes release of DOI as a
> > > post-release process, which can be done anytime post release. Since
> 3.7.0
> > > is already released, we can perform these steps for that release too.
> By
> > > the time the KIP gets implemented, if 3.7.1 is released, we could do
> > these
> > > steps for 3.7.1, instead of 3.7.0. This would allow us to make changes
> to
> > > the Dockerfiles and other assets based on feedback from Docker Hub
> before
> > > the release of version 3.8.0.
> > >
> > > Thanks,
> > > Krish.
> > >
> > > On Mon, Apr 15, 2024 at 12:59 PM Manikumar 
> > > wrote:
> > >
> > > > Hi Krish,
> > > >
> > > > Thanks for the updated KIP. a few comments below.
> > > >
> > > > > "These actions can be carried out by the RM or any contributor post
> > the
> > > > release process."
> > > > Maybe as part of the release process, RM can create a JIRA for this
> > > > task. This can be taken by RM or any comitter or any contributor
> (with
> > > > some help from commiters to run "Docker Image Preparation via GitHub
> > > > Actions:"
> > > >
> > > > > "Perform Docker build tests to ensure image integrity"
> > > > Is this using GitHub Actions workflow? or manual 

Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-04-30 Thread Bruno Cadonna

Hi again,

I forgot to ask whether you could add the agreement about handling 
invalid assignment to the KIP.


Best,
Bruno

On 4/30/24 2:00 PM, Bruno Cadonna wrote:

Hi all,

I think we are converging!

117
a) fail: Since it is an invalid consumer assignment
b) pass: I agree that not assigning a task might be reasonable in some 
situations
c) fail: For the reasons Lucas pointed out. I am missing a good use case 
here.

d) fail: It is invalid


Somewhere in the KIP you still use finalAssignment() instead of the 
wonderful method name onAssignmentComputed() ;-)
"... interface also includes a method named finalAssignment which is 
called with the final computed GroupAssignment ..."



Best,
Bruno


On 4/30/24 1:04 PM, Lucas Brutschy wrote:

Hi,

Looks like a great KIP to me!

I'm late, so I'm only going to comment on the last open point 117. I'm
against any fallbacks like "use the default assignor if the custom
assignment is invalid", as it's just going to hide bugs. For the 4
cases mentioned by Sophie:

117a) I'd fail immediately here, as it's an implementation bug, and
should not lead to a valid consumer group assignment.
117b) Agreed. This is a useful assignment and should be allowed.
117c) This is the tricky case. However, I'm leaning towards not
allowing this, unless we have a concrete use case. This will block us
from potentially using a single consumer for active and standby tasks
in the future. It's easier to drop the restriction later if we have a
concrete use case.
117d) Definitely fail immediately, as you said.

Cheers,
Lucas
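As a concrete reading of the four cases above, a validation step along the following lines could run before translating the custom assignment. This is a hedged sketch - the names and shapes here are illustrative, not the actual KIP-924 or StreamsPartitionAssignor API:

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative validator mirroring the 117a-d discussion: an active task
// assigned twice (117a) or an unknown task (117d) is a hard failure, while
// leaving a task unassigned (117b) is deliberately allowed.
public class AssignmentValidator {
    public enum Error { NONE, ACTIVE_ASSIGNED_TWICE, UNKNOWN_TASK }

    public static Error validate(Set<String> allTasks,
                                 Map<String, Set<String>> activeByClient) {
        Set<String> seenActive = new HashSet<>();
        for (Set<String> actives : activeByClient.values()) {
            for (String task : actives) {
                if (!allTasks.contains(task)) return Error.UNKNOWN_TASK;       // 117d: fail fast
                if (!seenActive.add(task)) return Error.ACTIVE_ASSIGNED_TWICE; // 117a: fail fast
            }
        }
        // 117b: tasks absent from seenActive may stay unassigned.
        return Error.NONE;
    }
}
```

On a non-NONE result the runtime would throw, letting the error reach the uncaught exception handler rather than silently falling back to a default assignor.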



On Mon, Apr 29, 2024 at 11:13 PM Sophie Blee-Goldman
 wrote:


Yeah I think that sums it up well. Either you computed a *possible* assignment,
or you returned something that makes it literally impossible for the
StreamsPartitionAssignor to decipher/translate into an actual group
assignment, in which case it should just fail

That's more or less it for the open questions that have been raised so far,
so I just want to remind folks that there's already a voting thread for
this. I cast my vote a few minutes ago so it should resurface in everyone's
inbox :)

On Thu, Apr 25, 2024 at 11:42 PM Rohan Desai 
wrote:


117: as Sophie laid out, there are two cases here right:
1. cases that are considered invalid by the existing assignors but are
still valid assignments in the sense that they can be used to generate a
valid consumer group assignment (from the perspective of the consumer group
protocol). An assignment that excludes a task is one such example, and
Sophie pointed out a good use case for it. I also think it makes sense to
allow these. It's hard to predict how a user might want to use the custom
assignor, and its reasonable to expect them to use it with care and not
hand-hold them.
2. cases that are not valid because it is impossible to compute a valid
consumer group assignment from them. In this case it seems totally
reasonable to just throw a fatal exception that gets passed to the uncaught
exception handler. If this case happens then there is some bug in the
user's assignor and its totally reasonable to fail the application in that
case. We _could_ try to be more graceful and default to one of the existing
assignors. But it's usually better to fail hard and fast when there is some
illegal state detected imo.

On Fri, Apr 19, 2024 at 4:18 PM Rohan Desai 
wrote:


Bruno, I've incorporated your feedback into the KIP document.

On Fri, Apr 19, 2024 at 3:55 PM Rohan Desai 
wrote:

Thanks for the feedback Bruno! For the most part I think it makes sense,
but leaving a couple follow-up thoughts/questions:

re 4: I think Sophie's point was slightly different - that we might want
to wrap the return type for `assign` in a class so that its easily
extensible. This makes sense to me. Whether we do that or not, we can have
the return type be a Set instead of a Map as well.

re 6: Yes, it's a callback that's called with the final assignment. I
like your suggested name.

On Fri, Apr 5, 2024 at 12:17 PM Rohan Desai 
wrote:


Thanks for the feedback Sophie!

re1: Totally agree. The fact that it's related to the partition assignor
is clear from just `task.assignor`. I'll update.
re3: This is a good point, and something I would find useful personally.
I think its worth adding an interface that lets the plugin observe the
final assignment. I'll add that.
re4: I like the new `NodeAssignment` type. I'll update the KIP with that.


On Thu, Nov 9, 2023 at 11:18 PM Rohan Desai 


wrote:


Thanks for the feedback so far! I think pretty much all of it is
reasonable. I'll reply to it inline:


1. All the API logic is granular at the Task level, except the
previousOwnerForPartition func. I’m not clear what’s the motivation
behind it, does our controller also want to change how the
partitions->tasks mapping is formed?
You're right that this is out of place. I've removed this method as
it's not needed by the task assignor.

2. Just on the API layering itself: it feels a 

Re: [DISCUSS] KIP-1039: Disable automatic topic creation for MirrorMaker2 consumers

2024-04-30 Thread Chris Egerton
Hi Aaron,

Thanks for publishing this KIP after the feedback on your PR. I'm generally
in favor but have a few questions:

1) Is the consumer mentioned in the KIP the one constructed by the
MirrorSourceConnector for polling replicated records from the source
cluster? If yes, are there any other consumers for which it would also make
sense to disable automatic topic creation?

2) Is broker-side automatic topic creation enabled intentionally in your
source cluster? I know that it is by default but in my experience it's
highly unusual to see it enabled in prod.

3) In that same vein, can you add disabling automatic topic creation
broker-side (setting "auto.create.topics.enable" to "false" in your broker
config(s)) as another rejected alternative? This is useful because it
provides a workaround for users running existing versions of MM2 that don't
have the changes proposed by your KIP.

4) Can you also provide the exact configuration changes that would be
required for the currently-mentioned rejected alternative ("An alternative
solution is to disable this option manually,"), which presumably refers to
altering the configuration of MM2 to manually force the changes to its
consumer(s) that this KIP proposes be applied by default? Again, this is
useful as documentation of a workaround for users running existing versions
of MM2.
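For what it's worth, I'd guess the manual override looks something like the fragment below - the property name is assumed from MM2's cluster-alias client override convention and the consumer's allow.auto.create.topics config, so please correct me if the KIP intends something different:

```properties
# Hypothetical MM2 config fragment ("source" is the source cluster alias).
# Forces the consumer(s) MM2 creates against the source cluster not to
# auto-create topics:
source.consumer.allow.auto.create.topics = false
```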

Thanks again for the KIP!

Cheers,

Chris

On Fri, Apr 19, 2024 at 6:33 AM aaron ai  wrote:

> Hi Omnia,
> Thanks for your feedback!
>
> Yes, another approach could be to disable this option manually. IMHO, it's
> hard to envision a scenario where enabling it would be necessary. BTW, I've
> already added this part into the KIP.
>
> On Fri, Apr 19, 2024 at 6:18 PM Omnia Ibrahim 
> wrote:
>
> > Hi Aaron,
> > You mentioned that there is no public interface changes however changing
> > the default value of a config should be considered as a public change.
> You
> > can check other KIP where we changed the default config value for a
> > reference.
> >
> > Can you please list what is the impact of changing the behaviour of the
> > topic creation along side  as well as is there any rejected alternatives
> > like can’t customer disable allow.auto.create.topics manually for example
> > as a workaround?
> >
> > Thanks
> > Omnia
> >
> > > On 19 Apr 2024, at 10:37, aaron ai  wrote:
> > >
> > > Hi all,
> > >
> > > Here is the KIP-1039: Disable automatic topic creation for MirrorMaker2
> > > consumers
> > > <
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1039%3A+Disable+automatic+topic+creation+for+MirrorMaker2+consumers
> > >>
> > >
> > > Looking forward to your feedback!
> > >
> > > Best regards,
> > > Aaron
> >
> >
>


Re: [DISCUSS] KIP-936 Throttle number of active PIDs

2024-04-30 Thread Omnia Ibrahim
Hi, 
Just bringing some offline discussion and recent updates to the KIP here to the 
mailing list.
Claude updated the KIP to use LayeredBloomFilter from Apache-commons 
https://commons.apache.org/proper/commons-collections/apidocs/org/apache/commons/collections4/bloomfilter/LayeredBloomFilter.html
 for the caching layer instead of implementing TimeBasedBloomFilter from 
scratch. 
Claude and I had some discussion regarding some configurations that impact the 
bloom filter. 
The KIP originally proposed to split each window in half; however, this has now 
been updated to use `producer.id.quota.cache.layer.count`, which controls how 
many layers each window will have. The default is 4 layers. I updated the 
Rejected Alternatives section to reflect why a 2-layered bloom filter has been 
rejected.
The KIP originally proposed to start a new bloom layer every 
producer.id.quota.window.size.seconds/2; now this will be triggered every 
producer.id.quota.window.size.seconds/producer.id.quota.cache.layer.count. I 
updated the diagram in the KIP to reflect this. 
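To make the layering concrete, here is a toy sketch of the rotation/refresh semantics: the window is split into layerCount layers, the oldest layer is dropped as time advances, and a PID counts as "new" only if it appears in no live layer. Plain HashSets stand in for the Bloom filter layers here - all names are illustrative, not from the KIP or commons-collections.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Illustrative model of the layered PID cache described above.
public class LayeredPidCache {
    private final long layerMs;
    private final int layerCount;
    private final Deque<Set<Long>> layers = new ArrayDeque<>();
    private long layerStart;

    public LayeredPidCache(long windowMs, int layerCount, long nowMs) {
        this.layerMs = windowMs / layerCount;
        this.layerCount = layerCount;
        this.layerStart = nowMs;
        layers.addLast(new HashSet<>());
    }

    /** Returns true iff this PID is not currently tracked (i.e. counts toward the quota). */
    public boolean track(long pid, long nowMs) {
        while (nowMs - layerStart >= layerMs) {       // rotate layers as time passes
            layers.addLast(new HashSet<>());
            if (layers.size() > layerCount) layers.removeFirst();
            layerStart += layerMs;
        }
        boolean seen = layers.stream().anyMatch(l -> l.contains(pid));
        layers.peekLast().add(pid);                   // refresh the PID in the newest layer
        return !seen;
    }

    public static void main(String[] args) {
        LayeredPidCache cache = new LayeredPidCache(60_000, 4, 0);
        System.out.println(cache.track(42L, 0));      // prints "true"  (new PID)
        System.out.println(cache.track(42L, 30_000)); // prints "false" (still tracked)
    }
}
```

Because every sighting is re-added to the newest layer, a PID that stays active is never double counted, which is exactly the duplication problem the track method fixes.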
The producer_ids_rate per user will drive the maximum number of PIDs in Shape 
https://commons.apache.org/proper/commons-collections/apidocs/org/apache/commons/collections4/bloomfilter/Shape.html
 instead of introducing a global config for the max size of Shape, as that goes 
against the other rejected alternatives. Now Shape can be 
`Shape.fromNP(producer_ids_rate, producer.id.quota.cache.false.positive.rate)`.
The KIP now proposes another config to control the bloom's false positive rate, 
producer.id.quota.cache.false.positive.rate; by default this is set to 0.001 
instead of being hardcoded. 

I think the KIP is now in better shape to ask others for a review. 

Omnia

> On 24 Apr 2024, at 14:48, Omnia Ibrahim  wrote:
> 
> Hi Claude, sorry that it took me a while to respond. I finally had time to 
> look into your implementation here 
> https://github.com/Claudenw/kafka/blob/KIP-936/storage/src/main/java/org/apache/kafka/storage/internals/log/ProducerIDQuotaManager.java#L121
>  and so far it makes sense. 
> 
>> So an early PID is added to the first filter and the associated metric is
>> updated.
>> that PID is seen multiple times over the next 60 minutes, but is not added
>> to the Bloom filters again.
>> once the 60 minutes elapses the first filter is cleared, or removed and a
>> new one started.  In any case the PID is no longer recorded in any extant
>> Bloom filter.
>> the PID is seen again and is added to the newest bloom filter and the
>> associated metric is updated.
>> 
>> I believe at this point the metric is incorrect, the PID has been counted
>> 2x, when it has been in use for the entire time.
> 
> You are right! My original design with a sliding-window bloom filter carries 
> this risk of duplication. I had a look into your track method and it makes sense.
>> 1. p.i.q.window.size.seconds the length of time that a window will
>>   exist.  This is also the maximum time between PID uses where the PID is
>>   considered to be the same.  Reuse of the PID after p.iq.window.size.seconds
>>   triggers recording the PID as a new PID.
>>   Define a new configuration option "producer.id.quota.window.count" as
>>   the number of windows active in window.size.seconds.
> This is correct. 
> 
>>   3. p.i.q.window.num, specified as 11 in the KIP.  I thought this was how
>>   many PIDs were expected in each window.  In the original KIP this means
>>   that we expect to see the PID 22 times (2 windows x 11).  In the Layered
>>   Bloom filter this would be the N value for the Shape.
> producer.id.quota.window.num is how many metrics samples we retain but not 
> how many PID we store in memory. 
>>   2. p.i.q.window.count (the new one), how many sections to break
>>   p.i.q.window.size.seconds into.  In the initial KIP this is 2.  This value
>>   gives us the timing for creating a new layer in the layered bloom filter.
>>   So every (p.i.q.window.size.seconds / p.i.q.window.count) seconds a new
>>   layer will be created and an old layer or layers will be removed.  The
>>   layered bloom filter will add layers to keep the probability of false
>>   positives in range.
>>   4. p.i.q.window.fpr (needs to be specified) the expected false positive
>>   rate.  Not sure how to express this in the config in a way that makes sense
>>   but something like 0.06 or the like.  This is the P value for the
>>   Shape.  See https://hur.st/bloomfilter for a Bloom filter 
> 
> For these two we might need to introduce them. How would they interact with 
> each other in the layered bloom filter? 
> 
> Thanks 
> 
>> On 16 Apr 2024, at 08:00, Claude Warren  wrote:
>> 
>> The difference between p.i.q.window.count and p.i.q.window.num:
>> 
>> To be honest, I may have misunderstood your definition of window num.  But
>> here is what I have in mind:
>> 
>> 
>>   1. 

Jenkins build is unstable: Kafka » Kafka Branch Builder » trunk #2858

2024-04-30 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-04-30 Thread Bruno Cadonna

Hi all,

I think we are converging!

117
a) fail: Since it is an invalid consumer assignment
b) pass: I agree that not assigning a task might be reasonable in some 
situations
c) fail: For the reasons Lucas pointed out. I am missing a good use case 
here.

d) fail: It is invalid


Somewhere in the KIP you still use finalAssignment() instead of the 
wonderful method name onAssignmentComputed() ;-)
"... interface also includes a method named finalAssignment which is 
called with the final computed GroupAssignment ..."



Best,
Bruno


On 4/30/24 1:04 PM, Lucas Brutschy wrote:

Hi,

Looks like a great KIP to me!

I'm late, so I'm only going to comment on the last open point 117. I'm
against any fallbacks like "use the default assignor if the custom
assignment is invalid", as it's just going to hide bugs. For the 4
cases mentioned by Sophie:

117a) I'd fail immediately here, as it's an implementation bug, and
should not lead to a valid consumer group assignment.
117b) Agreed. This is a useful assignment and should be allowed.
117c) This is the tricky case. However, I'm leaning towards not
allowing this, unless we have a concrete use case. This will block us
from potentially using a single consumer for active and standby tasks
in the future. It's easier to drop the restriction later if we have a
concrete use case.
117d) Definitely fail immediately, as you said.

Cheers,
Lucas



On Mon, Apr 29, 2024 at 11:13 PM Sophie Blee-Goldman
 wrote:


Yeah I think that sums it up well. Either you computed a *possible* assignment,
or you returned something that makes it literally impossible for the
StreamsPartitionAssignor to decipher/translate into an actual group
assignment, in which case it should just fail

That's more or less it for the open questions that have been raised so far,
so I just want to remind folks that there's already a voting thread for
this. I cast my vote a few minutes ago so it should resurface in everyone's
inbox :)

On Thu, Apr 25, 2024 at 11:42 PM Rohan Desai 
wrote:


117: as Sophie laid out, there are two cases here right:
1. cases that are considered invalid by the existing assignors but are
still valid assignments in the sense that they can be used to generate a
valid consumer group assignment (from the perspective of the consumer group
protocol). An assignment that excludes a task is one such example, and
Sophie pointed out a good use case for it. I also think it makes sense to
allow these. It's hard to predict how a user might want to use the custom
assignor, and its reasonable to expect them to use it with care and not
hand-hold them.
2. cases that are not valid because it is impossible to compute a valid
consumer group assignment from them. In this case it seems totally
reasonable to just throw a fatal exception that gets passed to the uncaught
exception handler. If this case happens then there is some bug in the
user's assignor and its totally reasonable to fail the application in that
case. We _could_ try to be more graceful and default to one of the existing
assignors. But it's usually better to fail hard and fast when there is some
illegal state detected imo.

On Fri, Apr 19, 2024 at 4:18 PM Rohan Desai 
wrote:


Bruno, I've incorporated your feedback into the KIP document.

On Fri, Apr 19, 2024 at 3:55 PM Rohan Desai 
wrote:


Thanks for the feedback Bruno! For the most part I think it makes sense,
but leaving a couple follow-up thoughts/questions:

re 4: I think Sophie's point was slightly different - that we might want
to wrap the return type for `assign` in a class so that its easily
extensible. This makes sense to me. Whether we do that or not, we can

have

the return type be a Set instead of a Map as well.

re 6: Yes, it's a callback that's called with the final assignment. I
like your suggested name.

On Fri, Apr 5, 2024 at 12:17 PM Rohan Desai 
wrote:


Thanks for the feedback Sophie!

re1: Totally agree. The fact that it's related to the partition

assignor

is clear from just `task.assignor`. I'll update.
re3: This is a good point, and something I would find useful

personally.

I think its worth adding an interface that lets the plugin observe the
final assignment. I'll add that.
re4: I like the new `NodeAssignment` type. I'll update the KIP with

that.


On Thu, Nov 9, 2023 at 11:18 PM Rohan Desai 
wrote:


Thanks for the feedback so far! I think pretty much all of it is
reasonable. I'll reply to it inline:


1. All the API logic is granular at the Task level, except the

previousOwnerForPartition func. I’m not clear what’s the motivation
behind it, does our controller also want to change how the
partitions->tasks mapping is formed?
You're right that this is out of place. I've removed this method as
it's not needed by the task assignor.


2. Just on the API layering itself: it feels a bit weird to have the

three built-in functions (defaultStandbyTaskAssignment etc) sitting in
the ApplicationMetadata class. If we consider them as some default

util

functions, how about introducing moving those into their own static util
methods to separate from the ApplicationMetadata “fact objects” ?

[jira] [Resolved] (KAFKA-16506) add the scala version of tool-related classes back to core module to follow KIP-906

2024-04-30 Thread PoAn Yang (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

PoAn Yang resolved KAFKA-16506.
---
Resolution: Not A Problem

Compared commands between the 2.8 and trunk branches in the folders 
core/src/main/scala/kafka/admin and tools/src/main. The only missing command is 
[PreferredReplicaLeaderElectionCommand.scala|https://github.com/apache/kafka/blob/2.8/core/src/main/scala/kafka/admin/PreferredReplicaLeaderElectionCommand.scala].
 It was deprecated in KAFKA-8405. Closing the Jira since we don't add any command 
back.

> add the scala version of tool-related classes back to core module to follow 
> KIP-906
> ---
>
> Key: KAFKA-16506
> URL: https://issues.apache.org/jira/browse/KAFKA-16506
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Chia-Ping Tsai
>Assignee: PoAn Yang
>Priority: Major
>
> According to 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-906%3A+Tools+migration+guidelines
>  , we have to deprecate the scala version of tool-related classes instead of 
> deleting them.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-04-30 Thread Lucas Brutschy
Hi,

Looks like a great KIP to me!

I'm late, so I'm only going to comment on the last open point 117. I'm
against any fallbacks like "use the default assignor if the custom
assignment is invalid", as it's just going to hide bugs. For the 4
cases mentioned by Sophie:

117a) I'd fail immediately here, as it's an implementation bug, and
should not lead to a valid consumer group assignment.
117b) Agreed. This is a useful assignment and should be allowed.
117c) This is the tricky case. However, I'm leaning towards not
allowing this, unless we have a concrete use case. This will block us
from potentially using a single consumer for active and standby tasks
in the future. It's easier to drop the restriction later if we have a
concrete use case.
117d) Definitely fail immediately, as you said.
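(For illustration only, a fail-fast validation along these lines — all names here are hypothetical, not the KIP-924 API — covering cases 117a/c/d as hard errors and tolerating 117b:)

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Fail-fast check of a custom task assignment: a task assigned twice as
// active (117a/d) or owned by one client as both active and standby (117c)
// is an IllegalStateException; unassigned tasks (117b) are only reported.
public class AssignmentCheck {

    static void validate(Set<String> allTasks,
                         Map<String, List<String>> active,
                         Map<String, List<String>> standby) {
        Set<String> seenActive = new HashSet<>();
        for (List<String> tasks : active.values()) {
            for (String t : tasks) {
                if (!seenActive.add(t)) {
                    throw new IllegalStateException("task assigned active twice: " + t);
                }
            }
        }
        // case 117c: same client owns a task as both active and standby
        active.forEach((client, tasks) -> {
            Set<String> sb = new HashSet<>(standby.getOrDefault(client, List.of()));
            for (String t : tasks) {
                if (sb.contains(t)) {
                    throw new IllegalStateException(
                        "client " + client + " has task " + t + " as active and standby");
                }
            }
        });
        // case 117b: unassigned tasks are tolerated, merely reported
        Set<String> unassigned = new HashSet<>(allTasks);
        seenActive.forEach(unassigned::remove);
        if (!unassigned.isEmpty()) {
            System.out.println("warning: unassigned tasks " + unassigned);
        }
    }

    public static void main(String[] args) {
        validate(Set.of("0_0", "0_1"),
                 Map.of("clientA", List.of("0_0"), "clientB", List.of("0_1")),
                 Map.of("clientA", List.of("0_1")));
        System.out.println("valid");
    }
}
```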

Cheers,
Lucas



On Mon, Apr 29, 2024 at 11:13 PM Sophie Blee-Goldman
 wrote:
>
> Yeah I think that sums it up well. Either you computed a *possible* 
> assignment,
> or you returned something that makes it literally impossible for the
> StreamsPartitionAssignor to decipher/translate into an actual group
> assignment, in which case it should just fail
>
> That's more or less it for the open questions that have been raised so far,
> so I just want to remind folks that there's already a voting thread for
> this. I cast my vote a few minutes ago so it should resurface in everyone's
> inbox :)
>
> On Thu, Apr 25, 2024 at 11:42 PM Rohan Desai 
> wrote:
>
> > 117: as Sophie laid out, there are two cases here right:
> > 1. cases that are considered invalid by the existing assignors but are
> > still valid assignments in the sense that they can be used to generate a
> > valid consumer group assignment (from the perspective of the consumer group
> > protocol). An assignment that excludes a task is one such example, and
> > Sophie pointed out a good use case for it. I also think it makes sense to
> > allow these. It's hard to predict how a user might want to use the custom
> > assignor, and its reasonable to expect them to use it with care and not
> > hand-hold them.
> > 2. cases that are not valid because it is impossible to compute a valid
> > consumer group assignment from them. In this case it seems totally
> > reasonable to just throw a fatal exception that gets passed to the uncaught
> > exception handler. If this case happens then there is some bug in the
> > user's assignor and its totally reasonable to fail the application in that
> > case. We _could_ try to be more graceful and default to one of the existing
> > assignors. But it's usually better to fail hard and fast when there is some
> > illegal state detected imo.
> >
> > On Fri, Apr 19, 2024 at 4:18 PM Rohan Desai 
> > wrote:
> >
> > > Bruno, I've incorporated your feedback into the KIP document.
> > >
> > > On Fri, Apr 19, 2024 at 3:55 PM Rohan Desai 
> > > wrote:
> > >
> > >> Thanks for the feedback Bruno! For the most part I think it makes sense,
> > >> but leaving a couple follow-up thoughts/questions:
> > >>
> > >> re 4: I think Sophie's point was slightly different - that we might want
> > >> to wrap the return type for `assign` in a class so that its easily
> > >> extensible. This makes sense to me. Whether we do that or not, we can
> > have
> > >> the return type be a Set instead of a Map as well.
> > >>
> > >> re 6: Yes, it's a callback that's called with the final assignment. I
> > >> like your suggested name.
> > >>
> > >> On Fri, Apr 5, 2024 at 12:17 PM Rohan Desai 
> > >> wrote:
> > >>
> > >>> Thanks for the feedback Sophie!
> > >>>
> > >>> re1: Totally agree. The fact that it's related to the partition
> > assignor
> > >>> is clear from just `task.assignor`. I'll update.
> > >>> re3: This is a good point, and something I would find useful
> > personally.
> > >>> I think its worth adding an interface that lets the plugin observe the
> > >>> final assignment. I'll add that.
> > >>> re4: I like the new `NodeAssignment` type. I'll update the KIP with
> > that.
> > >>>
> > >>> On Thu, Nov 9, 2023 at 11:18 PM Rohan Desai 
> > >>> wrote:
> > >>>
> >  Thanks for the feedback so far! I think pretty much all of it is
> >  reasonable. I'll reply to it inline:
> > 
> >  > 1. All the API logic is granular at the Task level, except the
> >  previousOwnerForPartition func. I’m not clear what’s the motivation
> >  behind it, does our controller also want to change how the
> >  partitions->tasks mapping is formed?
> >  You're right that this is out of place. I've removed this method as
> >  it's not needed by the task assignor.
> > 
> >  > 2. Just on the API layering itself: it feels a bit weird to have the
> >  three built-in functions (defaultStandbyTaskAssignment etc) sitting in
> >  the ApplicationMetadata class. If we consider them as some default
> > util
> >  functions, how about introducing moving those into their own static
> > util
> >  methods to separate from the ApplicationMetadata “fact objects” ?

[jira] [Created] (KAFKA-16648) Question: KIP-848 and KafkaTestKit.java

2024-04-30 Thread sanghyeok An (Jira)
sanghyeok An created KAFKA-16648:


 Summary: Question: KIP-848 and KafkaTestKit.java
 Key: KAFKA-16648
 URL: https://issues.apache.org/jira/browse/KAFKA-16648
 Project: Kafka
  Issue Type: Bug
Reporter: sanghyeok An
 Attachments: image-2024-04-30-19-19-12-316.png, 
image-2024-04-30-19-20-14-427.png

Hi, Kafka Team.
I am writing test code for the new rebalancing protocol proposed in KIP-848.

It works well in general. However, it does not work properly when creating 
an embedded broker using KafkaTestKit.java.

 
 
 
### Phenomena
 # Create a combined broker that acts as both controller and broker using 
KafkaTestKit.
 # The consumer calls subscribe() and poll() against the created broker.
 

At this time, the consumer sends heartbeat signals to the broker successfully. 
However, it never receives a partition assignment response from the broker.
 
### What are my broker configs? 
!image-2024-04-30-19-19-12-316.png|width=530,height=228!
 
### Actual Broker Config.
!image-2024-04-30-19-20-14-427.png|width=465,height=151!
I set controller.quorum.voters = 0@localhost:9093, but 0@0.0.0.0:0 is set instead, 
because of this code: 
[https://github.com/apache/kafka/blob/7c0a302c4da9d53a8fddc504a9fac8d8afecbec8/core/src/test/java/kafka/testkit/KafkaClusterTestKit.java#L305-L307]
 
 
 
### My opinion.
I am not familiar with the broker quorum, but it seems to be the problem.
 
I expect that when the consumer sends a poll request to the broker, the group 
coordinator broker assigns the topic/partitions and then runs the quorum protocol 
for each epoch number.
 
However, this seems not to work because the controller to vote for is represented 
as 0.0.0.0:0.
 
This setting does not work well when applied to containers in docker-compose.
Could this be the cause of the problem?
 
 
### Question
If {{controller.quorum.voters}} is set to {{0.0.0.0:0}} and I want to use 
consumer group rebalancing through KIP-848, what settings should be applied to 
the brokers and consumers?
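(For reference, a minimal KRaft combined-mode broker configuration looks roughly like the following — values are illustrative; the key point is that the voter address must be a concrete host:port, not 0.0.0.0:0:)

```properties
# Combined mode: this node is both broker and controller
process.roles=broker,controller
node.id=0
# Voter must resolve to a reachable address of the controller listener
controller.quorum.voters=0@localhost:9093
listeners=PLAINTEXT://:9092,CONTROLLER://:9093
controller.listener.names=CONTROLLER
```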



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [ANNOUNCE] New committer: Igor Soarez

2024-04-30 Thread ziming deng
Congrats Igor!

> On Apr 26, 2024, at 13:53, Christo Lolov  wrote:
> 
> Congratulations Igor :) !
> 
> On Thu, 25 Apr 2024 at 17:07, Igor Soarez  wrote:
> 
>> Thanks everyone, I'm very honoured to join!
>> 
>> --
>> Igor
>> 



Re: [DISCUSS] KIP-1035: StateStore managed changelog offsets

2024-04-30 Thread Bruno Cadonna

Hi all,

100
I think we already have such a wrapper. It is called 
AbstractReadWriteDecorator.



101
Currently, the position is checkpointed when an offset checkpoint is 
written. If we let the state store manage the committed offsets, we need 
to also let it manage the position, otherwise they 
might diverge. State store managed offsets can get flushed (i.e. 
checkpointed) to the disk when the state store decides to flush its 
in-memory data structures, but the position is only checkpointed at 
commit time. Recovering after a failure might load inconsistent offsets 
and positions.



102
The position is maintained inside the state store, but is persisted in 
the .position file when the state store closes. The only public 
interface that uses the position is IQv2 in a read-only mode. So the 
position is only updated within the state store and read from IQv2. No 
need to add anything to the public StateStore interface.



103
Deprecating managesOffsets() right away might be a good idea.


104
I agree that we should try to support downgrades without wipes. At least 
Nick should state in the KIP why we do not support it.



Best,
Bruno




On 4/23/24 8:13 AM, Matthias J. Sax wrote:
Thanks for splitting out this KIP. The discussion shows, that it is a 
complex beast by itself, so worth to discuss by its own.



Couple of question / comment:


100 `StateStore#commit()`: The JavaDoc says "must not be called by 
users" -- I would propose to put a guard in place for this, by either 
throwing an exception (preferable) or adding a no-op implementation (at 
least for our own stores, by wrapping them -- we cannot enforce it for 
custom stores I assume), and document this contract explicitly.
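(A rough sketch of such a guard wrapper — the interface and class names here are hypothetical stand-ins, not the actual Kafka Streams StateStore API:)

```java
import java.util.Map;

// Hypothetical minimal store interface standing in for the real StateStore;
// names are illustrative, not the actual Kafka Streams API.
interface OffsetCommittingStore {
    void commit(Map<Integer, Long> changelogOffsets);
}

// Decorator that blocks user-facing calls to commit(), as proposed above.
// The Streams runtime would keep a reference to the undecorated inner store.
public class CommitGuardSketch implements OffsetCommittingStore {
    private final OffsetCommittingStore inner;

    public CommitGuardSketch(OffsetCommittingStore inner) {
        this.inner = inner;
    }

    @Override
    public void commit(Map<Integer, Long> changelogOffsets) {
        throw new UnsupportedOperationException(
            "commit() may only be invoked by the Streams runtime, not by users");
    }

    // The runtime bypasses the guard via the undecorated inner store.
    public OffsetCommittingStore unwrap() {
        return inner;
    }

    public static void main(String[] args) {
        OffsetCommittingStore store = new CommitGuardSketch(offsets -> { });
        try {
            store.commit(Map.of(0, 42L));
        } catch (UnsupportedOperationException e) {
            System.out.println("guarded: " + e.getMessage());
        }
    }
}
```

This mirrors the existing decorator pattern (AbstractReadWriteDecorator mentioned above), so for built-in stores the guard could live in the same wrapping layer.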



101 adding `.position` to the store: Why do we actually need this? The 
KIP says "To ensure consistency with the committed data and changelog 
offsets" but I am not sure if I can follow? Can you elaborate why 
leaving the `.position` file as-is won't work?



If it's possible at all, it will need to be done by
creating temporary StateManagers and StateStores during rebalance. I 
think

it is possible, and probably not too expensive, but the devil will be in
the detail.


This sounds like a significant overhead to me. We know that opening a 
single RocksDB takes about 500ms, and thus opening RocksDB to get this 
information might slow down rebalances significantly.



102: It's unclear to me how `.position` information is added. The KIP 
only says: "position offsets will be stored in RocksDB, in the same 
column family as the changelog offsets". Do you intend to add this 
information to the map passed via `commit(final Map<TopicPartition, Long> changelogOffsets)`? The KIP should describe this in more detail. 
Also, if my assumption is correct, we might want to rename the parameter 
and also have a better JavaDoc description?



103: Should we make it mandatory (long-term) that all stores (including 
custom stores) manage their offsets internally? Maintaining both options 
and thus both code paths puts a burden on everyone and make the code 
messy. I would strongly prefer if we could have mid-term path to get rid 
of supporting both.  -- For this case, we should deprecate the newly 
added `managesOffsets()` method right away, to point out that we intend 
to remove it. If it's mandatory to maintain offsets for stores, we won't 
need this method any longer. In memory stores can just return null from 
#committedOffset().



104 "downgrading": I think it might be worth to add support for 
downgrading w/o the need to wipe stores? Leveraging `upgrade.from` 
parameter, we could build a two rolling bounce downgrade: (1) the new 
code is started with `upgrade.from` set to a lower version, telling the 
runtime to do the cleanup on `close()` -- (ie, ensure that all data is 
written into `.checkpoint` and `.position` file, and the newly added CL 
is deleted). In a second, rolling bounce, the old code would be able to 
open RocksDB. -- I understand that this implies much more work, but 
downgrade seems to be common enough that it might be worth it? Even if 
we did not always support this in the past, we have to face the fact 
that KS is getting more and more adopted and, as a more mature product, 
should support this.





-Matthias







On 4/21/24 11:58 PM, Bruno Cadonna wrote:

Hi all,

How should we proceed here?

1. with the plain .checkpoint file
2. with a way to use the state store interface on unassigned but 
locally existing task state


While I like option 2, I think option 1 is less risky and will give us 
the benefits of transactional state stores sooner. We should consider 
the interface approach afterwards, though.



Best,
Bruno



On 4/17/24 3:15 PM, Bruno Cadonna wrote:

Hi Nick and Sophie,

I think the task ID is not enough to create a state store that can 
read the offsets of non-assigned tasks for lag computation during 
rebalancing. The state store also needs the state directory so that 
it knows where to find the 

Re: [VOTE] KIP-1037: Allow WriteTxnMarkers API with Alter Cluster Permission

2024-04-30 Thread Nikhil Ramakrishnan
Hi all,

KIP-1037 is accepted with 4 +1s (3 binding and 1 non-binding votes).

Thank you all for casting your votes!


Nikhil




On Thu, Apr 25, 2024 at 10:28 AM Nikhil Ramakrishnan
 wrote:
>
> Thank you Justine, Christo, Andrew, and Luke for the vote! I will keep
> this vote open for about 3 more days in case there are any more
> comments or suggestions.
>
>
> Thanks,
> Nikhil
>
> On Wed, Apr 24, 2024 at 1:01 PM Luke Chen  wrote:
> >
> > Hi Nikhil,
> > Thanks for the KIP.
> >
> > +1 from me.
> >
> > Luke
> >
> > On Mon, Apr 22, 2024 at 7:41 PM Andrew Schofield 
> > wrote:
> >
> > > Hi Nikhil,
> > > Thanks for the KIP. Looks good to me.
> > >
> > > +1 (non-binding)
> > >
> > > Thanks,
> > > Andrew
> > >
> > > > On 22 Apr 2024, at 09:17, Christo Lolov  wrote:
> > > >
> > > > Heya Nikhil,
> > > >
> > > > Thanks for the proposal, as mentioned before it makes sense to me!
> > > >
> > > > +1 (binding)
> > > >
> > > > Best,
> > > > Christo
> > > >
> > > > On Sat, 20 Apr 2024 at 00:25, Justine Olshan
> > > 
> > > > wrote:
> > > >
> > > >> Hey Nikhil,
> > > >>
> > > >> I meant to comment on the discussion thread, but my draft took so long,
> > > you
> > > >> opened the vote.
> > > >>
> > > >> Regardless, I just wanted to say that it makes sense to me. +1 
> > > >> (binding)
> > > >>
> > > >> Justine
> > > >>
> > > >> On Fri, Apr 19, 2024 at 7:22 AM Nikhil Ramakrishnan <
> > > >> ramakrishnan.nik...@gmail.com> wrote:
> > > >>
> > > >>> Hi everyone,
> > > >>>
> > > >>> I would like to start a voting thread for KIP-1037: Allow
> > > >>> WriteTxnMarkers API with Alter Cluster Permission
> > > >>> (
> > > >>>
> > > >>
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1037%3A+Allow+WriteTxnMarkers+API+with+Alter+Cluster+Permission
> > > >>> )
> > > >>> as there have been no objections on the discussion thread.
> > > >>>
> > > >>> For comments or feedback please check the discussion thread here:
> > > >>> https://lists.apache.org/thread/bbkyt8mrc8xp3jfyvhph7oqtjxl29xmn
> > > >>>
> > > >>> Thanks,
> > > >>> Nikhil
> > > >>>
> > > >>
> > >
> > >


[jira] [Created] (KAFKA-16647) Remove setMetadataDirectory from BrokerNode/ControllerNode

2024-04-30 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-16647:
--

 Summary: Remove setMetadataDirectory from BrokerNode/ControllerNode
 Key: KAFKA-16647
 URL: https://issues.apache.org/jira/browse/KAFKA-16647
 Project: Kafka
  Issue Type: Improvement
Reporter: Chia-Ping Tsai
Assignee: Kuan Po Tseng


`TestKitNodes` does not enable callers to define the location of the "base folder". 
That makes sense since callers should not care about it, which means the 
location of the metadata folder should be transparent to callers. Hence, the setter 
for the metadata folder is useless.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[VOTE] KIP-1036: Extend RecordDeserializationException exception

2024-04-30 Thread Frédérik Rouleau
Hi all,

As there is no more activity for a while on the discuss thread, I think we
can start a vote.
The KIP is available on
https://cwiki.apache.org/confluence/display/KAFKA/KIP-1036%3A+Extend+RecordDeserializationException+exception


If you have some feedback or suggestions, please participate to the
discussion thread:
https://lists.apache.org/thread/or85okygtfywvnsfd37kwykkq5jq7fy5

Best regards,
Fred


[jira] [Created] (KAFKA-16646) Consider only running the CVE scanner action on apache/kafka and not in forks

2024-04-30 Thread Mickael Maison (Jira)
Mickael Maison created KAFKA-16646:
--

 Summary: Consider only running the CVE scanner action on 
apache/kafka and not in forks
 Key: KAFKA-16646
 URL: https://issues.apache.org/jira/browse/KAFKA-16646
 Project: Kafka
  Issue Type: Sub-task
Reporter: Mickael Maison


Currently the CVE scanner action is failing due to CVEs in the base image. It 
seems that anybody who has a fork is getting daily emails about it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-16645) CVEs in 3.7.0 docker image

2024-04-30 Thread Mickael Maison (Jira)
Mickael Maison created KAFKA-16645:
--

 Summary: CVEs in 3.7.0 docker image
 Key: KAFKA-16645
 URL: https://issues.apache.org/jira/browse/KAFKA-16645
 Project: Kafka
  Issue Type: Task
Affects Versions: 3.7.0
Reporter: Mickael Maison


Our Docker Image CVE Scanner GitHub action reports 2 high CVEs in our base 
image:

apache/kafka:3.7.0 (alpine 3.19.1)
==
Total: 2 (HIGH: 2, CRITICAL: 0)

Library   Vulnerability    Severity  Status  Installed  Fixed     Title
libexpat  CVE-2023-52425   HIGH      fixed   2.5.0-r2   2.6.0-r0  expat: parsing large tokens can trigger a denial of service
                                                                  https://avd.aquasec.com/nvd/cve-2023-52425
libexpat  CVE-2024-28757   HIGH      fixed   2.5.0-r2   2.6.2-r0  expat: XML Entity Expansion
                                                                  https://avd.aquasec.com/nvd/cve-2024-28757

Looking at the 
[KIP|https://cwiki.apache.org/confluence/display/KAFKA/KIP-975%3A+Docker+Image+for+Apache+Kafka#KIP975:DockerImageforApacheKafka-WhatifweobserveabugoracriticalCVEinthereleasedApacheKafkaDockerImage?]
 that introduced the docker images, it seems we should release a bugfix when 
high CVEs are detected. It would be good to investigate and assess whether 
Kafka is impacted or not.




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-10859) add @Test annotation to FileStreamSourceTaskTest.testInvalidFile and reduce the loop count to speedup the test

2024-04-30 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai resolved KAFKA-10859.

Resolution: Duplicate

fixed by 
https://github.com/apache/kafka/commit/3dacdc5694da5db283524889d2270695defebbaa

> add @Test annotation to FileStreamSourceTaskTest.testInvalidFile and reduce 
> the loop count to speedup the test
> --
>
> Key: KAFKA-10859
> URL: https://issues.apache.org/jira/browse/KAFKA-10859
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Chia-Ping Tsai
>Assignee: Tom Bentley
>Priority: Major
>  Labels: newbie
>
> FileStreamSourceTaskTest.testInvalidFile is missing a `@Test` annotation. Also, it 
> loops 100 times, which takes about 2 minutes to complete for a unit test.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)