[jira] [Commented] (KAFKA-4923) Add Exactly-Once Semantics to Streams

2017-05-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16019149#comment-16019149
 ] 

ASF GitHub Bot commented on KAFKA-4923:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2974


> Add Exactly-Once Semantics to Streams
> -
>
> Key: KAFKA-4923
> URL: https://issues.apache.org/jira/browse/KAFKA-4923
> Project: Kafka
>  Issue Type: New Feature
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
>  Labels: exactly-once, kip
> Fix For: 0.11.0.0
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-129%3A+Streams+Exactly-Once+Semantics



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2974: KAFKA-4923: Add Exactly-Once Semantics to Streams ...

2017-05-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2974


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (KAFKA-5063) Flaky ResetIntegrationTest.testReprocessingFromScratchAfterResetWithIntermediateUserTopic

2017-05-21 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-5063:
-
Labels: unit-test  (was: )

> Flaky 
> ResetIntegrationTest.testReprocessingFromScratchAfterResetWithIntermediateUserTopic
> -
>
> Key: KAFKA-5063
> URL: https://issues.apache.org/jira/browse/KAFKA-5063
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, unit tests
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
>  Labels: unit-test
> Fix For: 0.11.0.0
>
>
> {noformat}
> org.apache.kafka.streams.integration.ResetIntegrationTest > 
> testReprocessingFromScratchAfterResetWithIntermediateUserTopic FAILED
> java.lang.AssertionError: 
> Expected: <[KeyValue(2983939126775, 1), KeyValue(2983939126815, 1), 
> KeyValue(2983939126835, 1), KeyValue(2983939126795, 1), 
> KeyValue(2983939126815, 1), KeyValue(2983939126855, 1), 
> KeyValue(2983939126835, 1), KeyValue(2983939126795, 1), 
> KeyValue(2983939126815, 2), KeyValue(2983939126835, 2), 
> KeyValue(2983939126855, 1), KeyValue(2983939126875, 1), 
> KeyValue(2983939126835, 2), KeyValue(2983939126855, 2), 
> KeyValue(2983939126895, 1), KeyValue(2983939126875, 1), 
> KeyValue(2983939126855, 2), KeyValue(2983939126875, 2), 
> KeyValue(2983939126895, 1), KeyValue(2983939126915, 1), 
> KeyValue(2983939126875, 2), KeyValue(2983939126895, 2), 
> KeyValue(2983939126915, 1), KeyValue(2983939126935, 1), 
> KeyValue(2983939126875, 3), KeyValue(2983939126895, 2), 
> KeyValue(2983939126915, 2), KeyValue(2983939126935, 1), 
> KeyValue(2983939126875, 3), KeyValue(2983939126895, 3), 
> KeyValue(2983939126915, 2), KeyValue(2983939126935, 2), 
> KeyValue(2983939126875, 4), KeyValue(2983939126895, 3), 
> KeyValue(2983939126915, 3), KeyValue(2983939126935, 2), 
> KeyValue(2983939126875, 4), KeyValue(2983939126895, 4), 
> KeyValue(2983939126915, 3), KeyValue(2983939126935, 3)]>
>  but: was <[KeyValue(2983939126775, 1), KeyValue(2983939126815, 1), 
> KeyValue(2983939126835, 1), KeyValue(2983939126795, 1), 
> KeyValue(2983939126815, 1), KeyValue(2983939126855, 1), 
> KeyValue(2983939126835, 1), KeyValue(2983939126795, 1), 
> KeyValue(2983939126815, 2), KeyValue(2983939126835, 2), 
> KeyValue(2983939126855, 1), KeyValue(2983939126875, 1), 
> KeyValue(2983939126835, 2), KeyValue(2983939126855, 2), 
> KeyValue(2983939126895, 1), KeyValue(2983939126875, 1), 
> KeyValue(2983939126855, 2), KeyValue(2983939126875, 2), 
> KeyValue(2983939126895, 1), KeyValue(2983939126915, 1), 
> KeyValue(2983939126875, 2), KeyValue(2983939126895, 2), 
> KeyValue(2983939126915, 1), KeyValue(2983939126935, 1), 
> KeyValue(2983939126875, 3), KeyValue(2983939126895, 2), 
> KeyValue(2983939126915, 2), KeyValue(2983939126935, 1), 
> KeyValue(2983939126875, 3), KeyValue(2983939126895, 3), 
> KeyValue(2983939126915, 2), KeyValue(2983939126935, 2), 
> KeyValue(2983939126875, 4)]>
> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8)
> at 
> org.apache.kafka.streams.integration.ResetIntegrationTest.testReprocessingFromScratchAfterResetWithIntermediateUserTopic(ResetIntegrationTest.java:190)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-3502) Build is killed during kafka streams tests due to `pure virtual method called` error

2017-05-21 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-3502:
-
Component/s: streams

> Build is killed during kafka streams tests due to `pure virtual method 
> called` error
> 
>
> Key: KAFKA-3502
> URL: https://issues.apache.org/jira/browse/KAFKA-3502
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams, unit tests
>Reporter: Ashish Singh
>  Labels: transient-unit-test-failure
> Fix For: 0.10.2.0
>
>
> Build failed due to failure in streams' test. Not clear which test led to 
> this.
> Jenkins console: 
> https://builds.apache.org/job/kafka-trunk-git-pr-jdk7/3210/console
> {code}
> org.apache.kafka.streams.kstream.internals.KTableFilterTest > testValueGetter 
> PASSED
> org.apache.kafka.streams.kstream.internals.KStreamFlatMapTest > testFlatMap 
> PASSED
> org.apache.kafka.streams.kstream.internals.KTableAggregateTest > testAggBasic 
> PASSED
> org.apache.kafka.streams.kstream.internals.KStreamFlatMapValuesTest > 
> testFlatMapValues PASSED
> org.apache.kafka.streams.kstream.KStreamBuilderTest > testMerge PASSED
> org.apache.kafka.streams.kstream.KStreamBuilderTest > testFrom PASSED
> org.apache.kafka.streams.kstream.KStreamBuilderTest > testNewName PASSED
> pure virtual method called
> terminate called without an active exception
> :streams:test FAILED
> FAILURE: Build failed with an exception.
> * What went wrong:
> Execution failed for task ':streams:test'.
> > Process 'Gradle Test Executor 4' finished with non-zero exit value 134
> {code}
> Tried reproducing the issue locally, but could not.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-3502) Build is killed during kafka streams tests due to `pure virtual method called` error

2017-05-21 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-3502:
-
Component/s: unit tests

> Build is killed during kafka streams tests due to `pure virtual method 
> called` error
> 
>
> Key: KAFKA-3502
> URL: https://issues.apache.org/jira/browse/KAFKA-3502
> Project: Kafka
>  Issue Type: Sub-task
>  Components: unit tests
>Reporter: Ashish Singh
>  Labels: transient-unit-test-failure
> Fix For: 0.10.2.0
>
>
> Build failed due to failure in streams' test. Not clear which test led to 
> this.
> Jenkins console: 
> https://builds.apache.org/job/kafka-trunk-git-pr-jdk7/3210/console
> {code}
> org.apache.kafka.streams.kstream.internals.KTableFilterTest > testValueGetter 
> PASSED
> org.apache.kafka.streams.kstream.internals.KStreamFlatMapTest > testFlatMap 
> PASSED
> org.apache.kafka.streams.kstream.internals.KTableAggregateTest > testAggBasic 
> PASSED
> org.apache.kafka.streams.kstream.internals.KStreamFlatMapValuesTest > 
> testFlatMapValues PASSED
> org.apache.kafka.streams.kstream.KStreamBuilderTest > testMerge PASSED
> org.apache.kafka.streams.kstream.KStreamBuilderTest > testFrom PASSED
> org.apache.kafka.streams.kstream.KStreamBuilderTest > testNewName PASSED
> pure virtual method called
> terminate called without an active exception
> :streams:test FAILED
> FAILURE: Build failed with an exception.
> * What went wrong:
> Execution failed for task ':streams:test'.
> > Process 'Gradle Test Executor 4' finished with non-zero exit value 134
> {code}
> Tried reproducing the issue locally, but could not.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-3502) Build is killed during kafka streams tests due to `pure virtual method called` error

2017-05-21 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16019117#comment-16019117
 ] 

Guozhang Wang commented on KAFKA-3502:
--

Have not seen this happen again since KAFKA-5198 was merged; resolving for now.

> Build is killed during kafka streams tests due to `pure virtual method 
> called` error
> 
>
> Key: KAFKA-3502
> URL: https://issues.apache.org/jira/browse/KAFKA-3502
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Ashish Singh
>  Labels: transient-unit-test-failure
> Fix For: 0.10.2.0
>
>
> Build failed due to failure in streams' test. Not clear which test led to 
> this.
> Jenkins console: 
> https://builds.apache.org/job/kafka-trunk-git-pr-jdk7/3210/console
> {code}
> org.apache.kafka.streams.kstream.internals.KTableFilterTest > testValueGetter 
> PASSED
> org.apache.kafka.streams.kstream.internals.KStreamFlatMapTest > testFlatMap 
> PASSED
> org.apache.kafka.streams.kstream.internals.KTableAggregateTest > testAggBasic 
> PASSED
> org.apache.kafka.streams.kstream.internals.KStreamFlatMapValuesTest > 
> testFlatMapValues PASSED
> org.apache.kafka.streams.kstream.KStreamBuilderTest > testMerge PASSED
> org.apache.kafka.streams.kstream.KStreamBuilderTest > testFrom PASSED
> org.apache.kafka.streams.kstream.KStreamBuilderTest > testNewName PASSED
> pure virtual method called
> terminate called without an active exception
> :streams:test FAILED
> FAILURE: Build failed with an exception.
> * What went wrong:
> Execution failed for task ':streams:test'.
> > Process 'Gradle Test Executor 4' finished with non-zero exit value 134
> {code}
> Tried reproducing the issue locally, but could not.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5153) KAFKA Cluster : 0.10.2.0 : Servers Getting disconnected : Service Impacting

2017-05-21 Thread Qingsong Xie (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16019115#comment-16019115
 ] 

Qingsong Xie commented on KAFKA-5153:
-

[~arpan.khagram0...@gmail.com] Hi, did this issue occur again?

> KAFKA Cluster : 0.10.2.0 : Servers Getting disconnected : Service Impacting
> ---
>
> Key: KAFKA-5153
> URL: https://issues.apache.org/jira/browse/KAFKA-5153
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.2.0
> Environment: RHEL 6
> Java Version  1.8.0_91-b14
>Reporter: Arpan
>Priority: Critical
> Attachments: server_1_72server.log, server_2_73_server.log, 
> server_3_74Server.log, server.properties, ThreadDump_1493564142.dump, 
> ThreadDump_1493564177.dump, ThreadDump_1493564249.dump
>
>
> Hi Team, 
> I was earlier referring to issue KAFKA-4477 because the problem I am facing 
> is similar. I tried to find the same reference in the release docs as well, 
> but did not see anything in 0.10.1.1 or 0.10.2.0. I am currently using 
> 2.11_0.10.2.0.
> I have a 3-node cluster for Kafka, plus a ZK cluster on the same set of 
> servers in cluster mode. We have around 240GB of data transferred through 
> Kafka every day. What we are observing is a server disconnecting from the 
> cluster and the ISR shrinking, which starts impacting service.
> I have also observed the file descriptor count increasing a bit: in normal 
> circumstances we have not observed an FD count above 500, but when the 
> issue started we were observing it in the range of 650-700 on all 3 
> servers. Attaching thread dumps of all 3 servers from when we started 
> facing the issue recently.
> The issue vanishes once you bounce the nodes, and the setup does not run 
> more than 5 days without this issue. Attaching server logs as well.
> Kindly let me know if you need any additional information. Attaching 
> server.properties as well for one of the servers (it's similar on all 3 
> servers).
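
As a side note on monitoring the FD growth described above: a minimal Java 
sketch for watching a JVM's descriptor usage, assuming a HotSpot-based JVM 
(the `UnixOperatingSystemMXBean` interface is JDK-specific, not part of 
Kafka):

{code}
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import com.sun.management.UnixOperatingSystemMXBean;

public class FdMonitor {
    public static void main(String[] args) {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            UnixOperatingSystemMXBean unixOs = (UnixOperatingSystemMXBean) os;
            // Open vs. maximum file descriptors for this JVM process; the
            // reporter saw the broker climb from ~500 to 650-700 open FDs.
            System.out.printf("open FDs: %d / max FDs: %d%n",
                    unixOs.getOpenFileDescriptorCount(),
                    unixOs.getMaxFileDescriptorCount());
        }
    }
}
{code}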



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-3995) Split the ProducerBatch and resend when received RecordTooLargeException

2017-05-21 Thread Jiangjie Qin (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiangjie Qin updated KAFKA-3995:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Split the ProducerBatch and resend when received RecordTooLargeException
> 
>
> Key: KAFKA-3995
> URL: https://issues.apache.org/jira/browse/KAFKA-3995
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Affects Versions: 0.10.0.0
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
>
> We recently saw a few cases where RecordTooLargeException was thrown because 
> the compressed message sent by KafkaProducer exceeded the max message size.
> The root cause is that the compressor estimates the batch size using an 
> estimated compression ratio based on heuristic compression ratio statistics. 
> This does not work well for traffic with highly variable compression ratios. 
> For example, suppose the batch size is set to 1MB and the max message size 
> is 1MB. Initially the producer is sending messages (each message is 1MB) to 
> topic_1, whose data can be compressed to 1/10 of the original size. After a 
> while the estimated compression ratio in the compressor will be trained to 
> 1/10 and the producer will put 10 messages into one batch. Now the producer 
> starts to send messages (each message is also 1MB) to topic_2, whose 
> messages can only be compressed to 1/5 of the original size. The producer 
> will still use 1/10 as the estimated compression ratio and put 10 messages 
> into a batch. That batch would be 2MB after compression, which exceeds the 
> maximum message size. In this case the user does not have many options 
> other than resending everything, or closing the producer if they care 
> about ordering.
> This is especially an issue for services like MirrorMaker, whose producer 
> is shared by many different topics.
> To solve this issue, we can probably add a configuration 
> "enable.compression.ratio.estimation" to the producer. When this 
> configuration is set to false, we stop estimating the compressed size and 
> instead close the batch once the uncompressed bytes in the batch reach the 
> batch size.
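
The arithmetic behind the failure mode above, as a runnable sketch; note 
that `enable.compression.ratio.estimation` is only *proposed* in this ticket 
and is not an existing producer config:

{code}
import java.util.Properties;

public class CompressionEstimateExample {
    public static void main(String[] args) {
        int messages = 10;
        long messageBytes = 1 << 20;    // 1MB per message
        long maxMessageBytes = 1 << 20; // broker max message size

        // Trained on topic_1 (ratio 1/10): 10 * 1MB * 0.10 = 1MB, fits.
        // topic_2 actually compresses at 1/5, so the same batch overflows:
        double actualRatio = 0.20;
        long compressed = (long) (messages * messageBytes * actualRatio);
        System.out.println(compressed + " bytes > " + maxMessageBytes
                + " -> RecordTooLargeException"); // 2097152 > 1048576

        Properties props = new Properties();
        // Hypothetical opt-out proposed in this ticket -- NOT a real setting:
        // close the batch on uncompressed size, not estimated compressed size.
        props.put("enable.compression.ratio.estimation", "false");
    }
}
{code}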



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-3995) Split the ProducerBatch and resend when received RecordTooLargeException

2017-05-21 Thread Jiangjie Qin (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiangjie Qin updated KAFKA-3995:

Fix Version/s: 0.11.0.0

> Split the ProducerBatch and resend when received RecordTooLargeException
> 
>
> Key: KAFKA-3995
> URL: https://issues.apache.org/jira/browse/KAFKA-3995
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Affects Versions: 0.10.0.0
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
> Fix For: 0.11.0.0
>
>
> We recently saw a few cases where RecordTooLargeException was thrown because 
> the compressed message sent by KafkaProducer exceeded the max message size.
> The root cause is that the compressor estimates the batch size using an 
> estimated compression ratio based on heuristic compression ratio statistics. 
> This does not work well for traffic with highly variable compression ratios. 
> For example, suppose the batch size is set to 1MB and the max message size 
> is 1MB. Initially the producer is sending messages (each message is 1MB) to 
> topic_1, whose data can be compressed to 1/10 of the original size. After a 
> while the estimated compression ratio in the compressor will be trained to 
> 1/10 and the producer will put 10 messages into one batch. Now the producer 
> starts to send messages (each message is also 1MB) to topic_2, whose 
> messages can only be compressed to 1/5 of the original size. The producer 
> will still use 1/10 as the estimated compression ratio and put 10 messages 
> into a batch. That batch would be 2MB after compression, which exceeds the 
> maximum message size. In this case the user does not have many options 
> other than resending everything, or closing the producer if they care 
> about ordering.
> This is especially an issue for services like MirrorMaker, whose producer 
> is shared by many different topics.
> To solve this issue, we can probably add a configuration 
> "enable.compression.ratio.estimation" to the producer. When this 
> configuration is set to false, we stop estimating the compressed size and 
> instead close the batch once the uncompressed bytes in the batch reach the 
> batch size.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5296) Unable to write to some partitions of newly created topic in 10.2

2017-05-21 Thread huxi (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16019074#comment-16019074
 ] 

huxi commented on KAFKA-5296:
-

With the 0.10.2 client, could you try increasing `request.timeout.ms` a bit 
to see whether this issue happens again?
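
For reference, a minimal sketch of that tweak on a plain Java producer; the 
60-second value is illustrative only:

{code}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;

public class TimeoutTuningExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        // In 0.10.2 the default request.timeout.ms is 30s, which matches the
        // "30039 ms has passed since batch creation plus linger time" expiry
        // in the report; raising it gives batches more time before expiring.
        props.put(ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG, "60000");
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // send records as usual
        }
    }
}
{code}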

> Unable to write to some partitions of newly created topic in 10.2
> -
>
> Key: KAFKA-5296
> URL: https://issues.apache.org/jira/browse/KAFKA-5296
> Project: Kafka
>  Issue Type: Bug
>Reporter: Abhisek Saikia
>
> We are using Kafka 0.10.2; the cluster ran fine for a month with 50 topics, 
> and now we are having issues producing messages to newly created topics. 
> The create-topic command succeeds, but producers throw errors while 
> writing to some partitions. 
> Error in producer:
> java.util.concurrent.ExecutionException: 
> org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for 
> [topic1]-8: 30039 ms has passed since batch creation plus linger time
>   at 
> org.apache.kafka.clients.producer.internals.FutureRecordMetadata.valueOrError(FutureRecordMetadata.java:70)
>  ~[kafka-clients-0.10.2.0.jar:na]
>   at 
> org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:57)
>  ~[kafka-clients-0.10.2.0.jar:na]
>   at 
> org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:25)
>  ~[kafka-clients-0.10.2.0.jar:na]
> On the broker side, I don't see any topic-partition folder getting created 
> on the broker that is the leader for the partition. 
> While using the 0.8 client, the write used to hang when it started writing 
> to a partition having this issue. With 0.10.2 the producer hang is resolved.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5299) MirrorMaker with New.consumer doesn't consume message from multiple topics whitelisted

2017-05-21 Thread huxi (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16019072#comment-16019072
 ] 

huxi commented on KAFKA-5299:
-

How did you specify the whitelist? It should be a valid regular expression.
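
To illustrate, a small Java check of the difference between a regex 
alternation and a comma-separated list (the assumption here is that the 
whitelist is compiled directly as a `java.util.regex.Pattern`, as huxi's 
question implies):

{code}
import java.util.regex.Pattern;

public class WhitelistCheck {
    public static void main(String[] args) {
        // Multiple topics are expressed with regex alternation, not commas.
        Pattern alternation = Pattern.compile("topicA|topicB"); // matches both
        Pattern commaList   = Pattern.compile("topicA,topicB"); // matches neither

        for (String topic : new String[]{"topicA", "topicB"}) {
            System.out.printf("%s -> alternation: %b, comma-list: %b%n", topic,
                    alternation.matcher(topic).matches(),
                    commaList.matcher(topic).matches());
        }
    }
}
{code}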

> MirrorMaker with New.consumer doesn't consume message from multiple topics 
> whitelisted 
> ---
>
> Key: KAFKA-5299
> URL: https://issues.apache.org/jira/browse/KAFKA-5299
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.1
>Reporter: Jyoti
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2765: MINOR: Fix race condition in TestVerifiableProduce...

2017-05-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2765


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #3112: HOTFIX: In Connect test with auto topic creation d...

2017-05-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3112


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (KAFKA-5186) Avoid expensive initialization of producer state when upgrading

2017-05-21 Thread Jason Gustafson (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson updated KAFKA-5186:
---
Status: Patch Available  (was: In Progress)

> Avoid expensive initialization of producer state when upgrading
> ---
>
> Key: KAFKA-5186
> URL: https://issues.apache.org/jira/browse/KAFKA-5186
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core, producer 
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>Priority: Blocker
>  Labels: exactly-once
> Fix For: 0.11.0.0
>
>
> Currently the producer state is always loaded upon broker initialization. If 
> we don't find a snapshot file to load from, then we scan the log segments 
> from the beginning to rebuild the state. Of course, when users upgrade to the 
> new version, there will be no snapshot file, so the upgrade could be quite 
> intensive. It would be nice to avoid this by assuming instead that the 
> absence of a snapshot file means that the producer state should start clean 
> and we can avoid the expensive scanning.
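
A minimal sketch of the idea, with hypothetical names (the actual logic 
lives in the broker's log-loading path, not in a standalone class like 
this):

{code}
import java.io.File;

public class ProducerStateLoadSketch {
    // Hypothetical illustration: treat a missing snapshot file as "state
    // starts clean" instead of rebuilding it by scanning every log segment.
    static void loadProducerState(File snapshotFile) {
        if (snapshotFile.exists()) {
            loadFromSnapshot(snapshotFile); // cheap: read the snapshot
        } else {
            // before: rebuild by scanning all segments -- expensive on upgrade
            startWithEmptyState();          // proposed: assume a clean state
        }
    }

    static void loadFromSnapshot(File f) { /* elided */ }
    static void startWithEmptyState() { /* elided */ }
}
{code}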



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5186) Avoid expensive initialization of producer state when upgrading

2017-05-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16019054#comment-16019054
 ] 

ASF GitHub Bot commented on KAFKA-5186:
---

GitHub user hachikuji opened a pull request:

https://github.com/apache/kafka/pull/3113

KAFKA-5186: Avoid expensive log scan to build producer state when upgrading



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hachikuji/kafka KAFKA-5186

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3113.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3113


commit 90b4051e7984b5611fa8e2395fa3d5643b3c1f0b
Author: Jason Gustafson 
Date:   2017-05-22T00:51:04Z

KAFKA-5186: Avoid expensive log scan to build producer state when upgrading




> Avoid expensive initialization of producer state when upgrading
> ---
>
> Key: KAFKA-5186
> URL: https://issues.apache.org/jira/browse/KAFKA-5186
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core, producer 
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>Priority: Blocker
>  Labels: exactly-once
> Fix For: 0.11.0.0
>
>
> Currently the producer state is always loaded upon broker initialization. If 
> we don't find a snapshot file to load from, then we scan the log segments 
> from the beginning to rebuild the state. Of course, when users upgrade to the 
> new version, there will be no snapshot file, so the upgrade could be quite 
> intensive. It would be nice to avoid this by assuming instead that the 
> absence of a snapshot file means that the producer state should start clean 
> and we can avoid the expensive scanning.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3113: KAFKA-5186: Avoid expensive log scan to build prod...

2017-05-21 Thread hachikuji
GitHub user hachikuji opened a pull request:

https://github.com/apache/kafka/pull/3113

KAFKA-5186: Avoid expensive log scan to build producer state when upgrading



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hachikuji/kafka KAFKA-5186

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3113.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3113


commit 90b4051e7984b5611fa8e2395fa3d5643b3c1f0b
Author: Jason Gustafson 
Date:   2017-05-22T00:51:04Z

KAFKA-5186: Avoid expensive log scan to build producer state when upgrading




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #3112: HOTFIX: In Connect test with auto topic creation d...

2017-05-21 Thread ewencp
GitHub user ewencp opened a pull request:

https://github.com/apache/kafka/pull/3112

HOTFIX: In Connect test with auto topic creation disabled, ensure 
precreated topic is always used



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ewencp/kafka hotfix-precreate-topic

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3112.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3112






---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3995) Split the ProducerBatch and resend when received RecordTooLargeException

2017-05-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16019046#comment-16019046
 ] 

ASF GitHub Bot commented on KAFKA-3995:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2638


> Split the ProducerBatch and resend when received RecordTooLargeException
> 
>
> Key: KAFKA-3995
> URL: https://issues.apache.org/jira/browse/KAFKA-3995
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Affects Versions: 0.10.0.0
>Reporter: Jiangjie Qin
>Assignee: Jiangjie Qin
>
> We recently saw a few cases where RecordTooLargeException was thrown because 
> the compressed message sent by KafkaProducer exceeded the max message size.
> The root cause is that the compressor estimates the batch size using an 
> estimated compression ratio based on heuristic compression ratio statistics. 
> This does not work well for traffic with highly variable compression ratios. 
> For example, suppose the batch size is set to 1MB and the max message size 
> is 1MB. Initially the producer is sending messages (each message is 1MB) to 
> topic_1, whose data can be compressed to 1/10 of the original size. After a 
> while the estimated compression ratio in the compressor will be trained to 
> 1/10 and the producer will put 10 messages into one batch. Now the producer 
> starts to send messages (each message is also 1MB) to topic_2, whose 
> messages can only be compressed to 1/5 of the original size. The producer 
> will still use 1/10 as the estimated compression ratio and put 10 messages 
> into a batch. That batch would be 2MB after compression, which exceeds the 
> maximum message size. In this case the user does not have many options 
> other than resending everything, or closing the producer if they care 
> about ordering.
> This is especially an issue for services like MirrorMaker, whose producer 
> is shared by many different topics.
> To solve this issue, we can probably add a configuration 
> "enable.compression.ratio.estimation" to the producer. When this 
> configuration is set to false, we stop estimating the compressed size and 
> instead close the batch once the uncompressed bytes in the batch reach the 
> batch size.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2638: KAFKA-3995: fix compression ratio estimation.

2017-05-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2638


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-4208) Add Record Headers

2017-05-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16018987#comment-16018987
 ] 

ASF GitHub Bot commented on KAFKA-4208:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2991


> Add Record Headers
> --
>
> Key: KAFKA-4208
> URL: https://issues.apache.org/jira/browse/KAFKA-4208
> Project: Kafka
>  Issue Type: New Feature
>  Components: clients, core
>Reporter: Michael Andre Pearce
>Priority: Critical
> Fix For: 0.11.0.0
>
>
> Currently headers are not natively supported unlike many transport and 
> messaging platforms or standard, this is to add support for headers to kafka
> This JIRA is related to KIP found here:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-82+-+Add+Record+Headers
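
Once this ships in 0.11.0.0, attaching a header on the producer side looks 
roughly like the following sketch against the KIP-82 API (topic and header 
names are illustrative):

{code}
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.header.internals.RecordHeader;

public class HeadersExample {
    public static void main(String[] args) {
        // A single "trace-id" header rides along with the record from the
        // producer, through the broker, to consumers.
        ProducerRecord<String, String> record = new ProducerRecord<>(
                "my-topic", null, "key", "value",
                Collections.singletonList(new RecordHeader(
                        "trace-id", "abc123".getBytes(StandardCharsets.UTF_8))));
        System.out.println(record.headers());
    }
}
{code}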



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2991: KAFKA-4208: Add Record Headers

2017-05-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2991


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


RE: [DISCUSS] KIP-148: Add a connect timeout for client

2017-05-21 Thread David
Hi Guozhang,


Thanks for the clarification. On point 2, I think the key thing is not that 
users can control the maximum wait time in code, but that the network client 
should be able to detect that a connection attempt cannot complete and try a 
good node instead. In the producer sender, even though selector.poll can 
time out, the next iteration still does not close the previous in-flight 
connection attempt and try another good node.


In our test environment, QA shut down one of the leader nodes. The 
producer's requests to that node time out, the connection is closed, and the 
producer then requests fresh metadata. But sometimes the node chosen for the 
metadata request is also the shutdown node. When connecting to that node to 
fetch metadata, the client is in the connecting phase: the network client 
marks the node's state as CONNECTING, but because the node is down, the 
socket never learns that the connection attempt is broken. Though 
selector.poll takes a timeout parameter, it does not close the connection, 
so on the next pass "networkclient.maybeUpdate" checks isAnyNodeConnecting 
and therefore will not connect to any good node to fetch the metadata. It 
takes several minutes before the client notices the connection attempt has 
timed out and tries another node.


So I want to add a connect.timeout parameter, so that the selector can 
detect that a connection attempt has timed out and close the connection. The 
timeout value currently passed to `selector.poll()` does not seem able to do 
this (see the sketch after this message).


Thanks,
David
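
A minimal sketch of what David proposes, assuming a hypothetical 
connect.timeout.ms applied per connection attempt (class and method names 
are illustrative, not the actual NetworkClient/Selector code):

{code}
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class ConnectTimeoutSketch {
    private final long connectTimeoutMs; // the proposed connect.timeout.ms
    private final Map<String, Long> connectingSince = new HashMap<>();

    ConnectTimeoutSketch(long connectTimeoutMs) {
        this.connectTimeoutMs = connectTimeoutMs;
    }

    // Record when a node enters the CONNECTING state.
    void onConnectionAttempt(String nodeId, long nowMs) {
        connectingSince.put(nodeId, nowMs);
    }

    // Called from the poll loop: close attempts that exceeded the timeout so
    // that maybeUpdate's isAnyNodeConnecting check stops blocking the switch
    // to a healthy node.
    void expireStalledConnects(long nowMs) {
        Iterator<Map.Entry<String, Long>> it =
                connectingSince.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMs - entry.getValue() > connectTimeoutMs) {
                it.remove();
                closeAndBlackout(entry.getKey()); // free the slot, try another node
            }
        }
    }

    void closeAndBlackout(String nodeId) {
        /* elided: close the channel and mark the node disconnected */
    }
}
{code}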






------ Original Message ------
From: "Guozhang Wang"
Date: 2017-05-16 1:51
To: "dev@kafka.apache.org"
Subject: Re: [DISCUSS] KIP-148: Add a connect timeout for client



Hi David,

I may have been a bit confused before; just clarifying a few things:

1. As you mentioned, a client will always try to first establish the
connection with a broker node before it tries to send any request to it.
And after the connection is established, it will either continuously send
many requests (e.g. produce) or just a single request (e.g. metadata) to the
broker, so these two phases are indeed different.

2. In the connected phase, connections.max.idle.ms is used to
auto-disconnect the socket if no requests have been sent / received during
that period of time. In the connecting phase, we always try to create the
socket via "socketChannel.connect" in a non-blocking call and then check
whether the connection has been established; all the callers of this
function (in either producer or consumer) have a timeout parameter as in
`selector.poll()`, and that timeout is set either by calculations based on
metadata expiration time and backoff for producer#sender, or by values
passed directly from consumer#poll(timeout). So although there is no config
directly controlling that, users can still control in code how long to wait
at most.

I originally thought your scenario was more about the connected phase, but
now I feel you are talking about the connecting phase. For that case, I
still feel the timeout value currently passed in `selector.poll()`, which is
controllable from user code, should be sufficient?


Guozhang




On Sun, May 14, 2017 at 2:37 AM,  <254479...@qq.com> wrote:

> Hi Guozhang,
>
>
> Sorry for the delay, and thanks for the question. They seem like two
> different parameters to me:
> connect.timeout.ms: only works during the connecting phase; once the
> connected phase begins, this parameter is no longer used.
> connections.max.idle.ms: currently does not work in the connecting phase
> (only connections for which select returns readyKeys > 0 are added to the
> expired-connection manager); after connecting, it checks from time to time
> whether the connection is still alive.
>
>
> Even if we changed connections.max.idle.ms to also cover the connecting
> phase, we could not set this parameter to a small value such as 5 seconds,
> because a client that is busy sending messages to other nodes would be
> disconnected within 5 seconds; that is why the default value of
> connections.max.idle.ms is set to a larger time. We should have two
> parameters controlling the connecting-phase behavior and the
> connected-phase behavior, don't you think?
>
>
> Thanks,
>
>
> David
>
>
>
>
> ------ Original Message ------
> From: "Guozhang Wang"
> Date: 2017-05-06 7:52
> To: "dev@kafka.apache.org"
>
> Subject: Re: [DISCUSS] KIP-148: Add a connect timeout for client
>
>
>
> Hello David,
>
> Thanks for the KIP. For the described issue, I'm wondering if it can be
> resolved by tuning the CONNECTIONS_MAX_IDLE_MS_CONFIG (
> connections.max.idle.ms) on the client side? Default is 9 minutes.
>
>
> Guozhang
>
> On Tue, May 2, 2017 at 8:22 AM,  <254479...@qq.com> wrote:
>
> > Hi all,
> >
> > Currently in our test environment, we found that after one of the broker
> > nodes crashes (reboot or OS crash), the client may still be connecting to
> > the crashed node to send metadata requests or other requests, and it
> > needs several minutes