Re: [DISCUSS] KIP-1020: Deprecate `window.size.ms` in `StreamsConfig`

2024-03-16 Thread Matthias J. Sax

Thanks for the summary. Sounds right to me. That is what I would propose.

As you pointed out, we of course still need to support the current 
confis, and we should log a warning when in use (even if the new one is 
in use IMHO) -- but that's more an implementation detail.


I agree that the new config should take preference in case both are 
specified. This should be pointed out in the KIP, as it's an important 
contract the user needs to understand.



-Matthias

On 3/14/24 6:18 PM, Sophie Blee-Goldman wrote:


Should we change it do `.serializer` and `.deserialize`?


That's a good point -- if we're going to split this up by defining the
config
in both the TimeWindowedSerializer and TimeWindowedDeserializer,
then it makes perfect sense to go a step further and actually define
only the relevant de/serializer class instead of the full serde

Just to put this all together, it sounds like the proposal is to:

1) Deprecate both these configs where they appear in StreamsConfig
(as per the original plan in the KIP, just reiterating it here)

2) Don't "define" either config in any specific client's Config class,
but just define a String variable with the config name in the relevant
de/serializer class, and maybe point people to it in the docs somewhere

3) We would add three new public String variables for three different
configs across two classes, specifically:

In TimeWindowedSerializer:
   - define a constant for "windowed.inner.serializer.class"
In TimeWindowedDeserializer:
   - define a constant for "windowed.inner.deserializer.class"
   - define a constant for "window.size.ms"

4) Lastly, we would update the windowed de/serializer implementations
to check for the new configs (ie "windowed.inner.de/serializer.class")
and use the provided de/serializer class, if one was given. If the new
configs are not present, they would fall back to the original/current
logic (ie that based on the old "windowed.inner.serde.class" config)

I think that's everything. Does this sound about right for where we want
to go with these configs?

On Thu, Mar 14, 2024 at 4:58 PM Matthias J. Sax  wrote:


By "don't add them" do you just mean we would not have any actual
variables defined anywhere for these configs (eg WINDOW_SIZE_MS)
and simply document -- somewhere -- that one can use the string
"window.size.ms" when configuring a command-line client with a
windowed serde?


Yes. That's the idea.




I assume you aren't proposing

to remove the ability to use and understand this config from the
implementations themselves, but correct me if that's wrong.


No, that would effectively break what we fixed with the original KIP :)




Are there any other configs in similar situations that we could look
to for precedent?


Not aware of any others, either.




If these are truly the first/only of their kind, I would vote to just

stick

them in the appropriate class. As for which class to put them in, I
think I'm convinced that "window.size.ms" should only go in the
TimeWindowedDeserializer rather than sticking them both in the
TimeWindowedSerde as I originally suggested. However, I would
even go a step further and not place the "inner.window.class.serde"
in the TimeWindowedSerde class either. To me, it actually makes
the most sense to define it in both the TimeWindowedSerializer
and the TimeWindowedDeserializer.


Not sure either. What you propose is fine with me. However, I am
wondering about the config names... Why is it `serde` for this case?
Should we change it do `.serializer` and `.deserialize`?



-Matthias


On 3/13/24 8:19 PM, Sophie Blee-Goldman wrote:

By "don't add them" do you just mean we would not have any actual
variables defined anywhere for these configs (eg WINDOW_SIZE_MS)
and simply document -- somewhere -- that one can use the string
"window.size.ms" when configuring a command-line client with a
windowed serde? Or something else? I assume you aren't proposing
to remove the ability to use and understand this config from the
implementations themselves, but correct me if that's wrong.

Are there any other configs in similar situations that we could look
to for precedent? I personally am not aware of any but by definition
I suppose these would be hard to discover unless you were actively
looking for them, so I'm wondering if there might be other "shadow
configs" elsewhere in the code base.

If these are truly the first/only of their kind, I would vote to just

stick

them in the appropriate class. As for which class to put them in, I
think I'm convinced that "window.size.ms" should only go in the
TimeWindowedDeserializer rather than sticking them both in the
TimeWindowedSerde as I originally suggested. However, I would
even go a step further and not place the "inner.window.class.serde"
in the TimeWindowedSerde class either. To me, it actually makes
the most sense to define it in both the TimeWindowedSerializer
and the TimeWindowedDeserializer.

The reason being that, as discussed above, the only use case for
these 

Re: [DISCUSS] KIP-956: Tiered Storage Quotas

2024-03-16 Thread Abhijeet Kumar
Hi Jorge,

The configs name was chosen to keep it consistent with the other existing
quota configs, such as
*replica.alter.log.dirs.io.max.bytes.per.second* as pointed out by Jun in
the thread.

Also, we can revisit the names of the components during implementation,
since those are not exposed to the user.

Please let me know if you have any further concerns.

Regards,
Abhijeet.



On Mon, Mar 11, 2024 at 6:11 PM Jorge Esteban Quilcate Otoya <
quilcate.jo...@gmail.com> wrote:

> Hi Abhijeet,
>
> Thanks for the KIP! Looks good to me. I just have a minor comments on
> naming:
>
> Would it be work to align the config names to existing quota names?
> e.g. `remote.log.manager.copy.byte.rate.quota` (or similar) instead of
> `remote.log.manager.copy.max.bytes.per.second`?
>
> Same for new components, could we use the same verbs as in the configs:
> - RLMCopyQuotaManager
> - RLMFetchQuotaManager
>
>
> On Fri, 8 Mar 2024 at 13:43, Abhijeet Kumar 
> wrote:
>
> > Thank you all for your comments. As all the comments in the thread are
> > addressed, I am starting a Vote thread for the KIP. Please have a look.
> >
> > Regards.
> >
> > On Thu, Mar 7, 2024 at 12:34 PM Luke Chen  wrote:
> >
> > > Hi Abhijeet,
> > >
> > > Thanks for the update and the explanation.
> > > I had another look, and it LGTM now!
> > >
> > > Thanks.
> > > Luke
> > >
> > > On Tue, Mar 5, 2024 at 2:50 AM Jun Rao 
> wrote:
> > >
> > > > Hi, Abhijeet,
> > > >
> > > > Thanks for the reply. Sounds good to me.
> > > >
> > > > Jun
> > > >
> > > >
> > > > On Sat, Mar 2, 2024 at 7:40 PM Abhijeet Kumar <
> > > abhijeet.cse@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi Jun,
> > > > >
> > > > > Thanks for pointing it out. It makes sense to me. We can have the
> > > > following
> > > > > metrics instead. What do you think?
> > > > >
> > > > >- remote-(fetch|copy)-throttle-time-avg (The average time in ms
> > > remote
> > > > >fetches/copies was throttled by a broker)
> > > > >- remote-(fetch|copy)-throttle-time--max (The maximum time in ms
> > > > remote
> > > > >fetches/copies was throttled by a broker)
> > > > >
> > > > > These are similar to fetch-throttle-time-avg and
> > > fetch-throttle-time-max
> > > > > metrics we have for Kafak Consumers?
> > > > > The Avg and Max are computed over the (sliding) window as defined
> by
> > > the
> > > > > configuration metrics.sample.window.ms on the server.
> > > > >
> > > > > (Also, I will update the config and metric names to be consistent)
> > > > >
> > > > > Regards.
> > > > >
> > > > > On Thu, Feb 29, 2024 at 2:51 AM Jun Rao 
> > > > wrote:
> > > > >
> > > > > > Hi, Abhijeet,
> > > > > >
> > > > > > Thanks for the reply.
> > > > > >
> > > > > > The issue with recording the throttle time as a gauge is that
> it's
> > > > > > transient. If the metric is not read immediately, the recorded
> > value
> > > > > could
> > > > > > be reset to 0. The admin won't realize that throttling has
> > happened.
> > > > > >
> > > > > > For client quotas, the throttle time is tracked as the average
> > > > > > throttle-time per user/client-id. This makes the metric less
> > > transient.
> > > > > >
> > > > > > Also, the configs use read/write whereas the metrics use
> > fetch/copy.
> > > > > Could
> > > > > > we make them consistent?
> > > > > >
> > > > > > Jun
> > > > > >
> > > > > > On Wed, Feb 28, 2024 at 6:49 AM Abhijeet Kumar <
> > > > > abhijeet.cse@gmail.com
> > > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Jun,
> > > > > > >
> > > > > > > Clarified the meaning of the two metrics. Also updated the KIP.
> > > > > > >
> > > > > > > kafka.log.remote:type=RemoteLogManager,
> > > name=RemoteFetchThrottleTime
> > > > ->
> > > > > > The
> > > > > > > duration of time required at a given moment to bring the
> observed
> > > > fetch
> > > > > > > rate within the allowed limit, by preventing further reads.
> > > > > > > kafka.log.remote:type=RemoteLogManager,
> > name=RemoteCopyThrottleTime
> > > > ->
> > > > > > The
> > > > > > > duration of time required at a given moment to bring the
> observed
> > > > > remote
> > > > > > > copy rate within the allowed limit, by preventing further
> copies.
> > > > > > >
> > > > > > > Regards.
> > > > > > >
> > > > > > > On Wed, Feb 28, 2024 at 12:28 AM Jun Rao
> >  > > >
> > > > > > wrote:
> > > > > > >
> > > > > > > > Hi, Abhijeet,
> > > > > > > >
> > > > > > > > Thanks for the explanation. Makes sense to me now.
> > > > > > > >
> > > > > > > > Just a minor comment. Could you document the exact meaning of
> > the
> > > > > > > following
> > > > > > > > two metrics? For example, is the time accumulated? If so, is
> it
> > > > from
> > > > > > the
> > > > > > > > start of the broker or over some window?
> > > > > > > >
> > > > > > > > kafka.log.remote:type=RemoteLogManager,
> > > > name=RemoteFetchThrottleTime
> > > > > > > > kafka.log.remote:type=RemoteLogManager,
> > > name=RemoteCopyThrottleTime
> > > > > > > >
> > > > > > > > Jun
> > > > > > > >
> > > > > > 

Jenkins build is still unstable: Kafka » Kafka Branch Builder » 3.6 #155

2024-03-16 Thread Apache Jenkins Server
See 




[jira] [Resolved] (KAFKA-16342) Fix compressed records

2024-03-16 Thread Luke Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Chen resolved KAFKA-16342.
---
Fix Version/s: 3.6.2
   3.8.0
   3.7.1
   Resolution: Fixed

> Fix compressed records
> --
>
> Key: KAFKA-16342
> URL: https://issues.apache.org/jira/browse/KAFKA-16342
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Luke Chen
>Assignee: Luke Chen
>Priority: Major
> Fix For: 3.6.2, 3.8.0, 3.7.1
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)