[jira] [Commented] (KAFKA-16584) Make log processing summary configurable or debug
[ https://issues.apache.org/jira/browse/KAFKA-16584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844506#comment-17844506 ]

dujian commented on KAFKA-16584:

hello [~mjsax], thank you very much for offering to help me create the KIP. Since this is my first KIP, I am worried that it will need many revisions and updates, and that the repeated back-and-forth would take up your time, so I have not provided a KIP document yet. The assignee of https://issues.apache.org/jira/browse/INFRA-25451 can now create a wiki ID for me, but a PMC member of the project (Kafka) must first send an email or comment on that issue. Do you know how to contact the PMC? My email is 'dujian0...@gmail.com'.

> Make log processing summary configurable or debug
> --------------------------------------------------
>
>                 Key: KAFKA-16584
>                 URL: https://issues.apache.org/jira/browse/KAFKA-16584
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 3.6.2
>            Reporter: Andras Hatvani
>            Assignee: dujian
>            Priority: Major
>              Labels: needs-kip, newbie
>
> Currently, statistics are logged at INFO level *every two minutes for every stream thread*:
> {code}
> 2024-04-18T09:18:23.790+02:00 INFO 33178 --- [service] [-StreamThread-1] o.a.k.s.p.internals.StreamThread : stream-thread [service-149405a3-c7e3-4505-8bbd-c3bff226b115-StreamThread-1] Processed 0 total records, ran 0 punctuators, and committed 0 total tasks since the last update
> {code}
> This is unnecessary and even harmful, since it fills the logs, and thus storage space, with useless data. The remaining INFO logs are useful and helpful, so raising the log level to WARN is not an option.
> Please either
> * change the logProcessingSummary to a DEBUG-level log, or
> * make it configurable so that it can be disabled.
> This is the relevant code:
> https://github.com/apache/kafka/blob/aee9724ee15ed539ae73c09cc2c2eda83ae3c864/streams/src/main/java/org/apache/kafka/streams/processor/internals/StreamThread.java#L1073
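For illustration, a minimal self-contained sketch of the behavior the ticket asks for: the summary is emitted at DEBUG instead of INFO, and the interval is configurable and can be disabled. The class name, the config semantics, and the disable-on-non-positive convention are assumptions made here for the example, not the actual Streams internals or the eventual KIP API.

{code}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * Hypothetical sketch only. A real fix would live inside
 * o.a.k.s.p.internals.StreamThread and take its interval from a
 * Streams config whose name the KIP would define.
 */
public class ProcessingSummaryLogger {
    private static final Logger log = LoggerFactory.getLogger(ProcessingSummaryLogger.class);

    private final long intervalMs; // assumed convention: <= 0 disables the summary entirely
    private long lastLogMs;

    public ProcessingSummaryLogger(long intervalMs) {
        this.intervalMs = intervalMs;
    }

    public void maybeLogSummary(long nowMs, long records, long punctuators, long commits) {
        if (intervalMs <= 0 || nowMs - lastLogMs < intervalMs) {
            return; // disabled, or the interval has not elapsed yet
        }
        lastLogMs = nowMs;
        // DEBUG instead of INFO: the periodic summary vanishes at the default
        // log level but stays available when troubleshooting.
        log.debug("Processed {} total records, ran {} punctuators, and committed {} total tasks since the last update",
                records, punctuators, commits);
    }
}
{code}

Until a fix lands, a workaround is to tune the o.a.k.s.p.internals.StreamThread logger in the logging backend, which, as the ticket notes, also suppresses that logger's other, useful INFO messages.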
[jira] [Commented] (KAFKA-16584) Make log processing summary configurable or debug
[ https://issues.apache.org/jira/browse/KAFKA-16584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17841056#comment-17841056 ]

dujian commented on KAFKA-16584:

hello [~mjsax], before creating a KIP I must create a wiki ID, but the registration function at [https://cwiki.apache.org/confluence/signup.action] has been turned off. Can you help me?
[jira] [Assigned] (KAFKA-16584) Make log processing summary configurable or debug
[ https://issues.apache.org/jira/browse/KAFKA-16584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dujian reassigned KAFKA-16584:
------------------------------

    Assignee: dujian
[jira] [Commented] (KAFKA-16582) Feature Request: Introduce max.record.size Configuration Parameter for Producers
[ https://issues.apache.org/jira/browse/KAFKA-16582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17840684#comment-17840684 ]

dujian commented on KAFKA-16582:

hello [~ramiz.mehran], I have reproduced this problem and found that the latency keeps growing as more messages are sent, but I have not yet found the root cause.

> Feature Request: Introduce max.record.size Configuration Parameter for Producers
> ---------------------------------------------------------------------------------
>
>                 Key: KAFKA-16582
>                 URL: https://issues.apache.org/jira/browse/KAFKA-16582
>             Project: Kafka
>          Issue Type: New Feature
>          Components: producer
>    Affects Versions: 3.6.2
>            Reporter: Ramiz Mehran
>            Priority: Major
>
> *Summary:*
> Currently, Kafka producers have a {{max.request.size}} configuration that limits the size of the request sent to Kafka brokers, and it applies to both compressed and uncompressed data: it caps the compressed request, but it is also the maximum size of an individual record before compression. This can lead to inefficiencies and unexpected behaviour, particularly when records are large before compression but fit into {{max.request.size}} multiple times after compression.
> *Problem:*
> During spikes in data transmission, especially with large records, latency increases and a processing backlog can build up even when the compressed data stays within {{max.request.size}}, because compressed records form very large batches. The problem is particularly pronounced with highly efficient compression algorithms like zstd, where the compressed size may allow batches that are inefficient to process.
> *Proposed Solution:*
> Introduce a new producer configuration parameter, {{max.record.size}}, that lets administrators define the maximum size of a record before compression. Separating the uncompressed record size limit from the compressed request size limit makes system behaviour more predictable.
> *Benefits:*
> # *Predictability:* Producers can reject records that exceed {{max.record.size}} before spending resources on compression.
> # *Efficiency:* Helps maintain efficient batch sizes and system throughput, especially under high load.
> # *System Stability:* Avoids very large batches, which hurt latency and throughput.
> *Example:* Consider a producer sending records up to 20 MB that, once compressed, fit into a batch under the 25 MB {{max.request.size}} multiple times. Such batches can be inefficient to process even though they satisfy the current request size constraint. With {{max.record.size}} capping individual records, {{max.request.size}} could be lowered to limit only the compressed request, say to 5 MB, preventing very large requests and the latency spikes they cause.
> *Steps to Reproduce:*
> # Configure a Kafka producer with {{max.request.size}} set to 25 MB.
> # Send multiple uncompressed records close to 20 MB each that compress to less than 25 MB.
> # Observe the impact on Kafka broker performance and client-side latency.
> *Expected Behavior:* The producer should allow administrators to set both a pre-compression record size limit and a post-compression total request size limit.
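For illustration, a minimal sketch of how an application could approximate the proposed check today on top of the public producer API. The class, the cap value, and the choice of RecordTooLargeException are assumptions made for the example; no {{max.record.size}} config exists in the producer yet, and a real implementation would enforce the limit inside the producer, where the exact serialized size is known.

{code}
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.errors.RecordTooLargeException;

public final class MaxRecordSizeGuard {
    // Hypothetical application-level cap on the *uncompressed* record size,
    // i.e. what the proposed max.record.size config would enforce.
    private static final int MAX_RECORD_SIZE = 5 * 1024 * 1024; // 5 MB

    public static void sendGuarded(KafkaProducer<byte[], byte[]> producer,
                                   ProducerRecord<byte[], byte[]> record) {
        int keySize = record.key() == null ? 0 : record.key().length;
        int valueSize = record.value() == null ? 0 : record.value().length;
        int uncompressedSize = keySize + valueSize;
        if (uncompressedSize > MAX_RECORD_SIZE) {
            // Reject before the producer spends CPU compressing the record
            // and before it can inflate a batch.
            throw new RecordTooLargeException("Uncompressed record of " + uncompressedSize
                    + " bytes exceeds the application-level cap of " + MAX_RECORD_SIZE + " bytes");
        }
        producer.send(record);
    }
}
{code}

Note this guard only sees the serialized key and value (headers and record overhead are ignored), which is part of why the ticket asks for the check inside the producer itself.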
[jira] [Commented] (KAFKA-16584) Make log processing summary configurable or debug
[ https://issues.apache.org/jira/browse/KAFKA-16584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17840631#comment-17840631 ]

dujian commented on KAFKA-16584:

hello [~mjsax], I would like to confirm whether this issue requires a code change. If so, could you assign it to me?
[jira] [Assigned] (KAFKA-16582) Feature Request: Introduce max.record.size Configuration Parameter for Producers
[ https://issues.apache.org/jira/browse/KAFKA-16582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dujian reassigned KAFKA-16582:
------------------------------

    Assignee: (was: dujian)
[jira] [Assigned] (KAFKA-16582) Feature Request: Introduce max.record.size Configuration Parameter for Producers
[ https://issues.apache.org/jira/browse/KAFKA-16582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dujian reassigned KAFKA-16582:
------------------------------

    Assignee: dujian
[jira] [Assigned] (KAFKA-16582) Feature Request: Introduce max.record.size Configuration Parameter for Producers
[ https://issues.apache.org/jira/browse/KAFKA-16582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dujian reassigned KAFKA-16582:
------------------------------

    Assignee: (was: dujian)
[jira] [Assigned] (KAFKA-16582) Feature Request: Introduce max.record.size Configuration Parameter for Producers
[ https://issues.apache.org/jira/browse/KAFKA-16582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dujian reassigned KAFKA-16582:
------------------------------

    Assignee: dujian

-- 
This message was sent by Atlassian Jira (v8.20.10#820010)