[jira] [Commented] (KAFKA-16584) Make log processing summary configurable or debug

2024-05-07 Thread dujian (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-16584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17844506#comment-17844506
 ] 

dujian commented on KAFKA-16584:


Hello [~mjsax],

Thank you very much for being willing to help me create a KIP. However, this is 
my first KIP, and I am worried that it will need many modifications and updates, 
and that the repeated revisions would take up your time. That is why I have not 
provided a KIP document yet.

 

Now the assignee of 'https://issues.apache.org/jira/browse/INFRA-25451' can 
create a wiki ID for me, but a PMC member of the Kafka project must first send 
an email or comment on that issue. Do you know how to contact the PMC?

My email is 'dujian0...@gmail.com'.

> Make log processing summary configurable or debug
> -
>
> Key: KAFKA-16584
> URL: https://issues.apache.org/jira/browse/KAFKA-16584
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 3.6.2
>Reporter: Andras Hatvani
>Assignee: dujian
>Priority: Major
>  Labels: needs-kip, newbie
>
> Currently, *every two minutes for every stream thread*, statistics are 
> logged at INFO level.
> {code}
> 2024-04-18T09:18:23.790+02:00  INFO 33178 --- [service] [-StreamThread-1] 
> o.a.k.s.p.internals.StreamThread         : stream-thread 
> [service-149405a3-c7e3-4505-8bbd-c3bff226b115-StreamThread-1] Processed 0 
> total records, ran 0 punctuators, and committed 0 total tasks since the last 
> update {code}
> This is absolutely unnecessary and even harmful, since it fills the logs, and 
> thus storage space, with unwanted and useless data. Apart from this message, 
> the INFO logs are useful and helpful, so raising the log level to WARN is not 
> an option.
> Please either
> * change the logProcessingSummary to a DEBUG-level log, or
> * make it configurable so that it can be disabled.
> This is the relevant code: 
> https://github.com/apache/kafka/blob/aee9724ee15ed539ae73c09cc2c2eda83ae3c864/streams/src/main/java/org/apache/kafka/streams/processor/internals/StreamThread.java#L1073
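
As a rough illustration of the two options requested above, here is a minimal, 
hypothetical sketch of the gating logic; the class, method, and interval setting 
are illustrative assumptions, not an existing Kafka Streams API or accepted KIP:

{code:java}
import java.time.Duration;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * Sketch only: the periodic processing summary is emitted at DEBUG instead of
 * INFO, and only when a user-configurable interval is positive. A value of
 * zero or less disables the summary entirely.
 */
public class ProcessingSummaryLogger {
    private static final Logger log = LoggerFactory.getLogger(ProcessingSummaryLogger.class);

    private final long logSummaryIntervalMs; // hypothetical "log.summary.interval.ms" setting
    private long lastSummaryMs;

    public ProcessingSummaryLogger(final Duration logSummaryInterval) {
        this.logSummaryIntervalMs = logSummaryInterval.toMillis();
    }

    public void maybeLogSummary(final long nowMs,
                                final long records,
                                final long punctuators,
                                final long commits) {
        if (logSummaryIntervalMs <= 0) {
            return; // summary disabled by configuration
        }
        if (nowMs - lastSummaryMs >= logSummaryIntervalMs && log.isDebugEnabled()) {
            log.debug("Processed {} total records, ran {} punctuators, and committed {} total tasks since the last update",
                records, punctuators, commits);
            lastSummaryMs = nowMs;
        }
    }
}
{code}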



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-16584) Make log processing summary configurable or debug

2024-04-25 Thread dujian (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-16584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17841056#comment-17841056
 ] 

dujian commented on KAFKA-16584:


Hello [~mjsax],

Before creating a KIP I need to create a wiki ID, but self-registration at 
[https://cwiki.apache.org/confluence/signup.action] has been turned off. Can you 
help me?

> Make log processing summary configurable or debug
> -
>
> Key: KAFKA-16584
> URL: https://issues.apache.org/jira/browse/KAFKA-16584
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 3.6.2
>Reporter: Andras Hatvani
>Assignee: dujian
>Priority: Major
>  Labels: needs-kip, newbie
>
> Currently, *every two minutes for every stream thread*, statistics are 
> logged at INFO level.
> {code}
> 2024-04-18T09:18:23.790+02:00  INFO 33178 --- [service] [-StreamThread-1] 
> o.a.k.s.p.internals.StreamThread         : stream-thread 
> [service-149405a3-c7e3-4505-8bbd-c3bff226b115-StreamThread-1] Processed 0 
> total records, ran 0 punctuators, and committed 0 total tasks since the last 
> update {code}
> This is absolutely unnecessary and even harmful, since it fills the logs, and 
> thus storage space, with unwanted and useless data. Apart from this message, 
> the INFO logs are useful and helpful, so raising the log level to WARN is not 
> an option.
> Please either
> * change the logProcessingSummary to a DEBUG-level log, or
> * make it configurable so that it can be disabled.
> This is the relevant code: 
> https://github.com/apache/kafka/blob/aee9724ee15ed539ae73c09cc2c2eda83ae3c864/streams/src/main/java/org/apache/kafka/streams/processor/internals/StreamThread.java#L1073



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (KAFKA-16584) Make log processing summary configurable or debug

2024-04-25 Thread dujian (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dujian reassigned KAFKA-16584:
--

Assignee: dujian

> Make log processing summary configurable or debug
> -
>
> Key: KAFKA-16584
> URL: https://issues.apache.org/jira/browse/KAFKA-16584
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 3.6.2
>Reporter: Andras Hatvani
>Assignee: dujian
>Priority: Major
>  Labels: needs-kip, newbie
>
> Currently, *every two minutes for every stream thread*, statistics are 
> logged at INFO level.
> {code}
> 2024-04-18T09:18:23.790+02:00  INFO 33178 --- [service] [-StreamThread-1] 
> o.a.k.s.p.internals.StreamThread         : stream-thread 
> [service-149405a3-c7e3-4505-8bbd-c3bff226b115-StreamThread-1] Processed 0 
> total records, ran 0 punctuators, and committed 0 total tasks since the last 
> update {code}
> This is absolutely unnecessary and even harmful, since it fills the logs, and 
> thus storage space, with unwanted and useless data. Apart from this message, 
> the INFO logs are useful and helpful, so raising the log level to WARN is not 
> an option.
> Please either
> * change the logProcessingSummary to a DEBUG-level log, or
> * make it configurable so that it can be disabled.
> This is the relevant code: 
> https://github.com/apache/kafka/blob/aee9724ee15ed539ae73c09cc2c2eda83ae3c864/streams/src/main/java/org/apache/kafka/streams/processor/internals/StreamThread.java#L1073



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-16582) Feature Request: Introduce max.record.size Configuration Parameter for Producers

2024-04-25 Thread dujian (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-16582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17840684#comment-17840684
 ] 

dujian commented on KAFKA-16582:


Hello [~ramiz.mehran],

I have reproduced this problem and found that the latency keeps increasing as 
more messages are sent. However, I have not yet found the root cause.
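
In case it is useful, a minimal reproduction sketch along the lines of the steps 
in the issue description below; it assumes a local broker whose message.max.bytes 
(and the topic's max.message.bytes) has been raised to accept ~20 MB records, and 
the topic name and bootstrap address are placeholders:

{code:java}
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.ByteArraySerializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class LargeRecordLatencyRepro {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class.getName());
        props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "zstd");
        props.put(ProducerConfig.MAX_REQUEST_SIZE_CONFIG, 25 * 1024 * 1024);  // 25 MB
        props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 256L * 1024 * 1024);   // room for several large records

        // ~20 MB of zeros: under max.request.size and extremely compressible,
        // so many such records fit into one compressed request.
        byte[] value = new byte[20 * 1024 * 1024];

        try (KafkaProducer<String, byte[]> producer = new KafkaProducer<>(props)) {
            for (int i = 0; i < 100; i++) {
                long start = System.nanoTime();
                producer.send(new ProducerRecord<>("large-record-test", "key-" + i, value)).get();
                long latencyMs = (System.nanoTime() - start) / 1_000_000;
                System.out.println("record " + i + " send latency: " + latencyMs + " ms");
            }
        }
    }
}
{code}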

> Feature Request: Introduce max.record.size Configuration Parameter for 
> Producers
> 
>
> Key: KAFKA-16582
> URL: https://issues.apache.org/jira/browse/KAFKA-16582
> Project: Kafka
>  Issue Type: New Feature
>  Components: producer 
>Affects Versions: 3.6.2
>Reporter: Ramiz Mehran
>Priority: Major
>
> {*}Summary{*}:
> Currently, Kafka producers have a {{max.request.size}} configuration that 
> limits the size of the request sent to Kafka brokers, which includes both 
> compressed and uncompressed data sizes. However, it is also the maximum size 
> of an individual record before it is compressed. This can lead to 
> inefficiencies and unexpected behaviours, particularly when records are 
> significantly large before compression but fit multiple times into the 
> {{max.request.size}} after compression.
> {*}Problem{*}:
> During spikes in data transmission, especially with large records, the large 
> batches formed from compressed records cause increased latency and a 
> potential processing backlog, even when each request stays within 
> {{{}max.request.size{}}}. This problem is particularly pronounced when using 
> highly efficient compression algorithms like zstd, where the compressed size 
> may allow for large batches that are inefficient to process.
> {*}Proposed Solution{*}:
> Introduce a new producer configuration parameter: {{{}max.record.size{}}}. 
> This parameter will allow administrators to define the maximum size of a 
> record before it is compressed. This would help in managing expectations and 
> system behavior more predictably by separating uncompressed record size limit 
> from compressed request size limit.
> {*}Benefits{*}:
>  # {*}Predictability{*}: Producers can reject records that exceed the 
> {{max.record.size}} before spending resources on compression.
>  # {*}Efficiency{*}: Helps in maintaining efficient batch sizes and system 
> throughput, especially under high load conditions.
>  # {*}System Stability{*}: Avoids the potential for large batch processing 
> which can affect latency and throughput negatively.
> {*}Example{*}: Consider a scenario where the producer sends records up to 20 
> MB in size which, when compressed, fit into a batch under the 25 MB 
> {{max.request.size}} multiple times. These batches can be problematic to 
> process efficiently, even though they meet the current maximum request size 
> constraint. With {{{}max.record.size{}}} covering the per-record 
> (uncompressed) limit, {{max.request.size}} would only need to bound the 
> compressed request and could be lowered to, say, 5 MB, preventing very large 
> requests and the latency spikes they cause.
> {*}Steps to Reproduce{*}:
>  # Configure a Kafka producer with {{max.request.size}} set to 25 MB.
>  # Send multiple uncompressed records close to 20 MB that compress to less 
> than 25 MB.
>  # Observe the impact on Kafka broker performance and client side latency.
> {*}Expected Behavior{*}: The producer should allow administrators to set both 
> pre-compression record size limits and total request size limits post 
> compression.
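
To make the proposed semantics more concrete, here is a small application-side 
sketch of the pre-compression check that {{max.record.size}} is meant to perform; 
in the actual proposal the check would live inside the producer itself, and the 
class and method names here are illustrative only:

{code:java}
import java.nio.charset.StandardCharsets;

import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

/**
 * Rejects a record based on its uncompressed serialized size before it is
 * handed to the producer, independently of max.request.size.
 */
public class MaxRecordSizeGuard {
    private final long maxRecordSizeBytes;

    public MaxRecordSizeGuard(final long maxRecordSizeBytes) {
        this.maxRecordSizeBytes = maxRecordSizeBytes;
    }

    public void send(final Producer<String, String> producer,
                     final ProducerRecord<String, String> record) {
        long uncompressedSize = record.value() == null
            ? 0
            : record.value().getBytes(StandardCharsets.UTF_8).length;
        if (uncompressedSize > maxRecordSizeBytes) {
            // "Predictability" benefit: fail fast, before any compression or
            // batching work is done. A producer-internal implementation would
            // presumably throw RecordTooLargeException instead.
            throw new IllegalArgumentException("Record of " + uncompressedSize
                + " bytes exceeds max.record.size of " + maxRecordSizeBytes + " bytes");
        }
        producer.send(record);
    }
}
{code}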



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-16584) Make log processing summary configurable or debug

2024-04-24 Thread dujian (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-16584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17840631#comment-17840631
 ] 

dujian commented on KAFKA-16584:


Hello [~mjsax],

I would like to confirm whether this issue requires a code change. If so, could 
you assign it to me?

> Make log processing summary configurable or debug
> -
>
> Key: KAFKA-16584
> URL: https://issues.apache.org/jira/browse/KAFKA-16584
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 3.6.2
>Reporter: Andras Hatvani
>Priority: Major
>  Labels: needs-kip, newbie
>
> Currently, *every two minutes for every stream thread*, statistics are 
> logged at INFO level.
> {code}
> 2024-04-18T09:18:23.790+02:00  INFO 33178 --- [service] [-StreamThread-1] 
> o.a.k.s.p.internals.StreamThread         : stream-thread 
> [service-149405a3-c7e3-4505-8bbd-c3bff226b115-StreamThread-1] Processed 0 
> total records, ran 0 punctuators, and committed 0 total tasks since the last 
> update {code}
> This is absolutely unnecessary and even harmful, since it fills the logs, and 
> thus storage space, with unwanted and useless data. Apart from this message, 
> the INFO logs are useful and helpful, so raising the log level to WARN is not 
> an option.
> Please either
> * change the logProcessingSummary to a DEBUG-level log, or
> * make it configurable so that it can be disabled.
> This is the relevant code: 
> https://github.com/apache/kafka/blob/aee9724ee15ed539ae73c09cc2c2eda83ae3c864/streams/src/main/java/org/apache/kafka/streams/processor/internals/StreamThread.java#L1073



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (KAFKA-16582) Feature Request: Introduce max.record.size Configuration Parameter for Producers

2024-04-22 Thread dujian (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dujian reassigned KAFKA-16582:
--

Assignee: (was: dujian)

> Feature Request: Introduce max.record.size Configuration Parameter for 
> Producers
> 
>
> Key: KAFKA-16582
> URL: https://issues.apache.org/jira/browse/KAFKA-16582
> Project: Kafka
>  Issue Type: New Feature
>  Components: producer 
>Affects Versions: 3.6.2
>Reporter: Ramiz Mehran
>Priority: Major
>
> {*}Summary{*}:
> Currently, Kafka producers have a {{max.request.size}} configuration that 
> limits the size of the request sent to Kafka brokers, which includes both 
> compressed and uncompressed data sizes. However, it is also the maximum size 
> of an individual record before it is compressed. This can lead to 
> inefficiencies and unexpected behaviours, particularly when records are 
> significantly large before compression but fit multiple times into the 
> {{max.request.size}} after compression.
> {*}Problem{*}:
> During spikes in data transmission, especially with large records, the large 
> batches formed from compressed records cause increased latency and a 
> potential processing backlog, even when each request stays within 
> {{{}max.request.size{}}}. This problem is particularly pronounced when using 
> highly efficient compression algorithms like zstd, where the compressed size 
> may allow for large batches that are inefficient to process.
> {*}Proposed Solution{*}:
> Introduce a new producer configuration parameter: {{{}max.record.size{}}}. 
> This parameter will allow administrators to define the maximum size of a 
> record before it is compressed. This would help in managing expectations and 
> system behavior more predictably by separating uncompressed record size limit 
> from compressed request size limit.
> {*}Benefits{*}:
>  # {*}Predictability{*}: Producers can reject records that exceed the 
> {{max.record.size}} before spending resources on compression.
>  # {*}Efficiency{*}: Helps in maintaining efficient batch sizes and system 
> throughput, especially under high load conditions.
>  # {*}System Stability{*}: Avoids the potential for large batch processing 
> which can affect latency and throughput negatively.
> {*}Example{*}: Consider a scenario where the producer sends records up to 20 
> MB in size which, when compressed, fit into a batch under the 25 MB 
> {{max.request.size}} multiple times. These batches can be problematic to 
> process efficiently, even though they meet the current maximum request size 
> constraint. With {{{}max.record.size{}}} covering the per-record 
> (uncompressed) limit, {{max.request.size}} would only need to bound the 
> compressed request and could be lowered to, say, 5 MB, preventing very large 
> requests and the latency spikes they cause.
> {*}Steps to Reproduce{*}:
>  # Configure a Kafka producer with {{max.request.size}} set to 25 MB.
>  # Send multiple uncompressed records close to 20 MB that compress to less 
> than 25 MB.
>  # Observe the impact on Kafka broker performance and client side latency.
> {*}Expected Behavior{*}: The producer should allow administrators to set both 
> pre-compression record size limits and total request size limits post 
> compression.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (KAFKA-16582) Feature Request: Introduce max.record.size Configuration Parameter for Producers

2024-04-22 Thread dujian (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dujian reassigned KAFKA-16582:
--

Assignee: dujian

> Feature Request: Introduce max.record.size Configuration Parameter for 
> Producers
> 
>
> Key: KAFKA-16582
> URL: https://issues.apache.org/jira/browse/KAFKA-16582
> Project: Kafka
>  Issue Type: New Feature
>  Components: producer 
>Affects Versions: 3.6.2
>Reporter: Ramiz Mehran
>Assignee: dujian
>Priority: Major
>
> {*}Summary{*}:
> Currently, Kafka producers have a {{max.request.size}} configuration that 
> limits the size of the request sent to Kafka brokers, which includes both 
> compressed and uncompressed data sizes. However, it is also the maximum size 
> of an individual record before it is compressed. This can lead to 
> inefficiencies and unexpected behaviours, particularly when records are 
> significantly large before compression but fit multiple times into the 
> {{max.request.size}} after compression.
> {*}Problem{*}:
> During spikes in data transmission, especially with large records, the large 
> batches formed from compressed records cause increased latency and a 
> potential processing backlog, even when each request stays within 
> {{{}max.request.size{}}}. This problem is particularly pronounced when using 
> highly efficient compression algorithms like zstd, where the compressed size 
> may allow for large batches that are inefficient to process.
> {*}Proposed Solution{*}:
> Introduce a new producer configuration parameter: {{{}max.record.size{}}}. 
> This parameter will allow administrators to define the maximum size of a 
> record before it is compressed. This would help in managing expectations and 
> system behavior more predictably by separating uncompressed record size limit 
> from compressed request size limit.
> {*}Benefits{*}:
>  # {*}Predictability{*}: Producers can reject records that exceed the 
> {{max.record.size}} before spending resources on compression.
>  # {*}Efficiency{*}: Helps in maintaining efficient batch sizes and system 
> throughput, especially under high load conditions.
>  # {*}System Stability{*}: Avoids the potential for large batch processing 
> which can affect latency and throughput negatively.
> {*}Example{*}: Consider a scenario where the producer sends records up to 20 
> MB in size which, when compressed, fit into a batch under the 25 MB 
> {{max.request.size}} multiple times. These batches can be problematic to 
> process efficiently, even though they meet the current maximum request size 
> constraint. With {{{}max.record.size{}}} covering the per-record 
> (uncompressed) limit, {{max.request.size}} would only need to bound the 
> compressed request and could be lowered to, say, 5 MB, preventing very large 
> requests and the latency spikes they cause.
> {*}Steps to Reproduce{*}:
>  # Configure a Kafka producer with {{max.request.size}} set to 25 MB.
>  # Send multiple uncompressed records close to 20 MB that compress to less 
> than 25 MB.
>  # Observe the impact on Kafka broker performance and client side latency.
> {*}Expected Behavior{*}: The producer should allow administrators to set both 
> pre-compression record size limits and total request size limits post 
> compression.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (KAFKA-16582) Feature Request: Introduce max.record.size Configuration Parameter for Producers

2024-04-22 Thread dujian (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dujian reassigned KAFKA-16582:
--

Assignee: (was: dujian)

> Feature Request: Introduce max.record.size Configuration Parameter for 
> Producers
> 
>
> Key: KAFKA-16582
> URL: https://issues.apache.org/jira/browse/KAFKA-16582
> Project: Kafka
>  Issue Type: New Feature
>  Components: producer 
>Affects Versions: 3.6.2
>Reporter: Ramiz Mehran
>Priority: Major
>
> {*}Summary{*}:
> Currently, Kafka producers have a {{max.request.size}} configuration that 
> limits the size of the request sent to Kafka brokers, which includes both 
> compressed and uncompressed data sizes. However, it is also the maximum size 
> of an individual record before it is compressed. This can lead to 
> inefficiencies and unexpected behaviours, particularly when records are 
> significantly large before compression but fit multiple times into the 
> {{max.request.size}} after compression.
> {*}Problem{*}:
> During spikes in data transmission, especially with large records, the large 
> batches formed from compressed records cause increased latency and a 
> potential processing backlog, even when each request stays within 
> {{{}max.request.size{}}}. This problem is particularly pronounced when using 
> highly efficient compression algorithms like zstd, where the compressed size 
> may allow for large batches that are inefficient to process.
> {*}Proposed Solution{*}:
> Introduce a new producer configuration parameter: {{{}max.record.size{}}}. 
> This parameter will allow administrators to define the maximum size of a 
> record before it is compressed. This would help in managing expectations and 
> system behavior more predictably by separating uncompressed record size limit 
> from compressed request size limit.
> {*}Benefits{*}:
>  # {*}Predictability{*}: Producers can reject records that exceed the 
> {{max.record.size}} before spending resources on compression.
>  # {*}Efficiency{*}: Helps in maintaining efficient batch sizes and system 
> throughput, especially under high load conditions.
>  # {*}System Stability{*}: Avoids the potential for large batch processing 
> which can affect latency and throughput negatively.
> {*}Example{*}: Consider a scenario where the producer sends records up to 20 
> MB in size which, when compressed, fit into a batch under the 25 MB 
> {{max.request.size}} multiple times. These batches can be problematic to 
> process efficiently, even though they meet the current maximum request size 
> constraint. With {{{}max.record.size{}}} covering the per-record 
> (uncompressed) limit, {{max.request.size}} would only need to bound the 
> compressed request and could be lowered to, say, 5 MB, preventing very large 
> requests and the latency spikes they cause.
> {*}Steps to Reproduce{*}:
>  # Configure a Kafka producer with {{max.request.size}} set to 25 MB.
>  # Send multiple uncompressed records close to 20 MB that compress to less 
> than 25 MB.
>  # Observe the impact on Kafka broker performance and client side latency.
> {*}Expected Behavior{*}: The producer should allow administrators to set both 
> pre-compression record size limits and total request size limits post 
> compression.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (KAFKA-16582) Feature Request: Introduce max.record.size Configuration Parameter for Producers

2024-04-22 Thread dujian (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dujian reassigned KAFKA-16582:
--

Assignee: dujian

> Feature Request: Introduce max.record.size Configuration Parameter for 
> Producers
> 
>
> Key: KAFKA-16582
> URL: https://issues.apache.org/jira/browse/KAFKA-16582
> Project: Kafka
>  Issue Type: New Feature
>  Components: producer 
>Affects Versions: 3.6.2
>Reporter: Ramiz Mehran
>Assignee: dujian
>Priority: Major
>
> {*}Summary{*}:
> Currently, Kafka producers have a {{max.request.size}} configuration that 
> limits the size of the request sent to Kafka brokers, which includes both 
> compressed and uncompressed data sizes. However, it is also the maximum size 
> of an individual record before it is compressed. This can lead to 
> inefficiencies and unexpected behaviours, particularly when records are 
> significantly large before compression but fit multiple times into the 
> {{max.request.size}} after compression.
> {*}Problem{*}:
> During spikes in data transmission, especially with large records, the large 
> batches formed from compressed records cause increased latency and a 
> potential processing backlog, even when each request stays within 
> {{{}max.request.size{}}}. This problem is particularly pronounced when using 
> highly efficient compression algorithms like zstd, where the compressed size 
> may allow for large batches that are inefficient to process.
> {*}Proposed Solution{*}:
> Introduce a new producer configuration parameter: {{{}max.record.size{}}}. 
> This parameter will allow administrators to define the maximum size of a 
> record before it is compressed. This would help in managing expectations and 
> system behavior more predictably by separating uncompressed record size limit 
> from compressed request size limit.
> {*}Benefits{*}:
>  # {*}Predictability{*}: Producers can reject records that exceed the 
> {{max.record.size}} before spending resources on compression.
>  # {*}Efficiency{*}: Helps in maintaining efficient batch sizes and system 
> throughput, especially under high load conditions.
>  # {*}System Stability{*}: Avoids the potential for large batch processing 
> which can affect latency and throughput negatively.
> {*}Example{*}: Consider a scenario where the producer sends records up to 20 
> MB in size which, when compressed, fit into a batch under the 25 MB 
> {{max.request.size}} multiple times. These batches can be problematic to 
> process efficiently, even though they meet the current maximum request size 
> constraint. With {{{}max.record.size{}}} covering the per-record 
> (uncompressed) limit, {{max.request.size}} would only need to bound the 
> compressed request and could be lowered to, say, 5 MB, preventing very large 
> requests and the latency spikes they cause.
> {*}Steps to Reproduce{*}:
>  # Configure a Kafka producer with {{max.request.size}} set to 25 MB.
>  # Send multiple uncompressed records close to 20 MB that compress to less 
> than 25 MB.
>  # Observe the impact on Kafka broker performance and client side latency.
> {*}Expected Behavior{*}: The producer should allow administrators to set both 
> pre-compression record size limits and total request size limits post 
> compression.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)