+ users@kafka

Hi users of Apache Kafka

With the upcoming 4.0 release, we have an opportunity to improve the
constraints and default values for various Kafka configurations.

We are soliciting your feedback and suggestions on configurations where the
default values and/or constraints should be adjusted. Please reply in this
thread directly.
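
To make feedback concrete, it can help to first check which of the broker
settings discussed further down in this thread you already override. Below is
a minimal sketch in Python, assuming a plain key=value server.properties
file. The helper names (`parse_properties`, `overridden`) and the sample
values are made up for illustration; the defaults are the ones documented for
Kafka 3.x brokers.

```python
# Defaults as documented for Kafka 3.x brokers. The thread rounds the socket
# buffer default to "100kb"; the documented value is 102400 bytes.
DISCUSSED_DEFAULTS = {
    "num.recovery.threads.per.data.dir": "1",
    "num.replica.fetchers": "1",
    "socket.send.buffer.bytes": "102400",
    "socket.receive.buffer.bytes": "102400",
}


def parse_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blank lines and # or ! comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def overridden(props: dict) -> dict:
    """Return {name: (default, configured)} for discussed settings that differ."""
    return {
        name: (default, props[name])
        for name, default in DISCUSSED_DEFAULTS.items()
        if name in props and props[name] != default
    }


# Hypothetical server.properties excerpt, for illustration only.
sample = """
# hypothetical server.properties excerpt
num.recovery.threads.per.data.dir=8
num.replica.fetchers=1
socket.send.buffer.bytes=1048576
"""

for name, (default, configured) in overridden(parse_properties(sample)).items():
    print(f"{name}: default {default}, configured {configured}")
```

If a setting shows up here on most of your clusters, that is a useful data
point to include in your reply.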

--
Divij Vaidya
Apache Kafka PMC



On Wed, Mar 13, 2024 at 12:56 PM Divij Vaidya <divijvaidy...@gmail.com>
wrote:

> Thanks for the discussion folks. I have started a KIP
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1030%3A+Change+constraints+and+default+values+for+various+configurations
> to keep track of the changes that we are discussing. Please consider this
> as a collaborative work-in-progress KIP and once it is ready to be
> published, we can start a discussion thread on it.
>
> I am also going to start a thread to solicit feedback from the users@
> mailing list.
>
> --
> Divij Vaidya
>
>
>
> On Wed, Mar 13, 2024 at 12:55 PM Christopher Shannon <
> christopher.l.shan...@gmail.com> wrote:
>
>> I think it's a great idea to raise a KIP to look at adjusting defaults and
>> minimum/maximum config values for version 4.0.
>>
>> As pointed out, the minimum values for segment.ms and segment.bytes don't
>> make sense and would probably bring down a cluster pretty quickly if set
>> that low, so version 4.0 is a good time to fix them and to look at the
>> other configs for adjustments as well.
>>
>> On Wed, Mar 13, 2024 at 4:39 AM Sergio Daniel Troiano
>> <sergio.troi...@adevinta.com.invalid> wrote:
>>
>> > hey guys,
>> >
>> > Regarding num.recovery.threads.per.data.dir: I agree. In our company we
>> > set it to the number of vCPUs, since recovery does not compete with
>> > traffic on a ready cluster.
>> >
>> >
>> > On Wed, 13 Mar 2024 at 09:29, Luke Chen <show...@gmail.com> wrote:
>> >
>> > > Hi Divij,
>> > >
>> > > Thanks for raising this.
>> > > The valid minimum value of 1 for `segment.ms` is completely unreasonable.
>> > > The same goes for `segment.bytes`, `metadata.log.segment.ms`, and
>> > > `metadata.log.segment.bytes`.
>> > >
>> > > In addition to that, there are also some config default values we'd like
>> > > to propose changing in v4.0.
>> > > We can collect more comments from the community and come up with a KIP
>> > > for them.
>> > >
>> > > 1. num.recovery.threads.per.data.dir:
>> > > The current default value is 1. But log recovery happens before
>> > > brokers are in the ready state, which means we should use all the
>> > > available resources to speed up log recovery and bring the broker to the
>> > > ready state sooner. The default value should be... maybe 4 (to be decided)?
>> > >
>> > > 2. Other configs whose defaults we might consider changing (open for
>> > > comments):
>> > >    2.1. num.replica.fetchers: the default is 1, but that's not enough when
>> > > there are multiple partitions in the cluster
>> > >    2.2. `socket.send.buffer.bytes`/`socket.receive.buffer.bytes`:
>> > > Currently, we set 100 KB as the default value, but that's not enough for
>> > > high-speed networks.
>> > >
>> > > Thank you.
>> > > Luke
>> > >
>> > >
>> > > On Tue, Mar 12, 2024 at 1:32 AM Divij Vaidya <divijvaidy...@gmail.com>
>> > > wrote:
>> > >
>> > > > Hey folks
>> > > >
>> > > > Before I file a KIP to change this in 4.0, I wanted to understand the
>> > > > historical context for the value of the following setting.
>> > > >
>> > > > Currently, the minimum threshold for segment.ms is set to 1ms [1].
>> > > >
>> > > > Segments are expensive. Every segment uses multiple file descriptors,
>> > > > and it's easy to hit OS limits when creating a large number of segments.
>> > > > A large number of segments also delays log loading on startup because of
>> > > > expensive operations such as iterating through all directories and
>> > > > conditionally loading all producer state.
>> > > >
>> > > > I am currently not aware of a reason why someone might want to work
>> > > > with a segment.ms of less than ~10s (a number chosen arbitrarily that
>> > > > looks sane).
>> > > >
>> > > > What was the historical context for setting the minimum threshold to 1ms
>> > > > for this setting?
>> > > >
>> > > > [1] https://kafka.apache.org/documentation.html#topicconfigs_segment.ms
>> > > >
>> > > > --
>> > > > Divij Vaidya
>> > > >
>> > >
>> >
>>
>
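
As a rough illustration of the segment.ms concern raised at the start of the
thread above, the arithmetic can be sketched in Python. This is only a
back-of-the-envelope upper bound under stated assumptions, not a model of the
broker: it assumes a partition that receives writes continuously, that a roll
happens at most once per segment.ms, and that each segment keeps three files
(.log, .index, .timeindex). The function names and the one-hour retention
window are hypothetical, picked for the example.

```python
# Rough upper bound on segment files for one partition under steady writes.
# Assumption: at most one roll per segment.ms while data keeps arriving, and
# three files per segment (.log, .index, .timeindex); other files are ignored.
FILES_PER_SEGMENT = 3


def segments_retained(segment_ms: int, retention_ms: int) -> int:
    """Upper bound on segments kept for one partition within the retention window."""
    if segment_ms < 1:
        raise ValueError("segment_ms must be >= 1")
    # Ceiling division: a partially filled window still needs one segment.
    return -(-retention_ms // segment_ms)


def files_retained(segment_ms: int, retention_ms: int) -> int:
    """Upper bound on segment files kept for one partition."""
    return segments_retained(segment_ms, retention_ms) * FILES_PER_SEGMENT


ONE_HOUR_MS = 60 * 60 * 1000          # hypothetical retention window
SEVEN_DAYS_MS = 7 * 24 * ONE_HOUR_MS  # documented default for segment.ms

# Compare the current minimum (1 ms), the ~10 s floor floated in the thread,
# and the default.
for segment_ms in (1, 10_000, SEVEN_DAYS_MS):
    print(
        f"segment.ms={segment_ms}: "
        f"up to {segments_retained(segment_ms, ONE_HOUR_MS)} segments, "
        f"{files_retained(segment_ms, ONE_HOUR_MS)} files per partition"
    )
```

Even with only one hour of retention, the 1 ms setting allows millions of
files for a single busy partition in this bound, while a 10 s floor keeps it
in the hundreds of segments.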
