[jira] [Updated] (KAFKA-5154) Kafka Streams throws NPE during rebalance

2017-05-15 Thread Lukas Gemela (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Gemela updated KAFKA-5154:

Attachment: clio_reduced.gz

> Kafka Streams throws NPE during rebalance
> -
>
> Key: KAFKA-5154
> URL: https://issues.apache.org/jira/browse/KAFKA-5154
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Lukas Gemela
> Attachments: clio_reduced.gz, clio.txt.gz
>
>
> Please see the attached log; Kafka Streams throws a NullPointerException during 
> rebalance, which is caught by our custom exception handler.
> {noformat}
> 2017-04-30T17:44:17,675 INFO  kafka-coordinator-heartbeat-thread | hades 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-04-30T17:44:27,395 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-04-30T17:44:27,941 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare()
>  @393 - Revoking previously assigned partitions [poseidonIncidentFeed-27, 
> poseidonIncidentFeed-29, poseidonIncidentFeed-30, poseidonIncidentFeed-18] 
> for group hades
> 2017-04-30T17:44:27,947 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:48,468 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:53,628 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:09,587 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:11,961 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @375 - Successfully joined group hades with generation 99
> 2017-04-30T17:45:13,126 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete()
>  @252 - Setting newly assigned partitions [poseidonIncidentFeed-11, 
> poseidonIncidentFeed-27, poseidonIncidentFeed-25, poseidonIncidentFeed-29, 
> poseidonIncidentFeed-19, poseidonIncidentFeed-18] for group hades
> 2017-04-30T17:46:37,254 INFO  kafka-coordinator-heartbeat-thread | hades 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-04-30T18:04:25,993 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-04-30T18:04:29,401 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare()
>  @393 - Revoking previously assigned partitions [poseidonIncidentFeed-11, 
> poseidonIncidentFeed-27, poseidonIncidentFeed-25, poseidonIncidentFeed-29, 
> poseidonIncidentFeed-19, poseidonIncidentFeed-18] for group hades
> 2017-04-30T18:05:10,877 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-05-01T00:01:55,707 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-05-01T00:01:59,027 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-05-01T00:01:59,031 ERROR StreamThread-1 
> org.apache.kafka.streams.processor.internals.StreamThread.run() @376 - 
> stream-thread [StreamThread-1] Streams application error during processing:
>  java.lang.NullPointerException
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:619)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:368)
>  [kafka-streams-0.10.2.0.jar!/:?]
> 2017-05-01T00:02:00,038 INFO  StreamThread-1 
> org.apache.kafka.clients.producer.KafkaProducer.close() @689 - Closing the 
> Kafka producer with timeoutMillis = 9223372036854775807 ms.
> 
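For reference, a custom handler like the one mentioned above is typically wired in 
through KafkaStreams#setUncaughtExceptionHandler. The sketch below shows the 
0.10.2-era shape of that wiring; the application id matches the group in the log, 
but the bootstrap server, topic, and processing logic are placeholders, not the 
reporter's actual code.

{code}
import java.util.Properties;

import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStreamBuilder;

public class StreamsErrorHandlerSketch {
    public static void main(final String[] args) {
        final Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "hades");             // app id / group from the log
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker

        final KStreamBuilder builder = new KStreamBuilder();
        // Placeholder topology; the reporter's real processing is not shown in the ticket.
        builder.stream("poseidonIncidentFeed").foreach((key, value) -> { });

        final KafkaStreams streams = new KafkaStreams(builder, props);

        // Streams hands unhandled StreamThread exceptions (such as the NPE in the
        // log above) to this handler instead of letting the thread die silently.
        streams.setUncaughtExceptionHandler((thread, throwable) ->
                System.err.println("Stream thread " + thread.getName() + " failed: " + throwable));

        streams.start();
    }
}
{code}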

[GitHub] kafka pull request #3050: (trunk) KAFKA-5232 Fix Log.parseTopicPartitionName...

2017-05-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3050


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #3046: MINOR: Consolidate Topic classes

2017-05-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3046


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5154) Kafka Streams throws NPE during rebalance

2017-05-15 Thread Lukas Gemela (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16010143#comment-16010143
 ] 

Lukas Gemela commented on KAFKA-5154:
-

[~mjsax] logs attached (clio_reduced)

It's just the log from the two days when the exception occurred; if you need to go 
further back in history, just ask (the complete log is 1.5 GB in total).

Thanks!

L.

> Kafka Streams throws NPE during rebalance
> -
>
> Key: KAFKA-5154
> URL: https://issues.apache.org/jira/browse/KAFKA-5154
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Lukas Gemela
> Attachments: clio_reduced.gz, clio.txt.gz
>
>
> Please see the attached log; Kafka Streams throws a NullPointerException during 
> rebalance, which is caught by our custom exception handler.
> {noformat}
> 2017-04-30T17:44:17,675 INFO  kafka-coordinator-heartbeat-thread | hades 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-04-30T17:44:27,395 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-04-30T17:44:27,941 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare()
>  @393 - Revoking previously assigned partitions [poseidonIncidentFeed-27, 
> poseidonIncidentFeed-29, poseidonIncidentFeed-30, poseidonIncidentFeed-18] 
> for group hades
> 2017-04-30T17:44:27,947 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:48,468 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:53,628 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:09,587 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:11,961 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @375 - Successfully joined group hades with generation 99
> 2017-04-30T17:45:13,126 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete()
>  @252 - Setting newly assigned partitions [poseidonIncidentFeed-11, 
> poseidonIncidentFeed-27, poseidonIncidentFeed-25, poseidonIncidentFeed-29, 
> poseidonIncidentFeed-19, poseidonIncidentFeed-18] for group hades
> 2017-04-30T17:46:37,254 INFO  kafka-coordinator-heartbeat-thread | hades 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-04-30T18:04:25,993 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-04-30T18:04:29,401 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare()
>  @393 - Revoking previously assigned partitions [poseidonIncidentFeed-11, 
> poseidonIncidentFeed-27, poseidonIncidentFeed-25, poseidonIncidentFeed-29, 
> poseidonIncidentFeed-19, poseidonIncidentFeed-18] for group hades
> 2017-04-30T18:05:10,877 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-05-01T00:01:55,707 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-05-01T00:01:59,027 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-05-01T00:01:59,031 ERROR StreamThread-1 
> org.apache.kafka.streams.processor.internals.StreamThread.run() @376 - 
> stream-thread [StreamThread-1] Streams application error during processing:
>  java.lang.NullPointerException
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:619)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:368)
>  [kafka-streams-0.10.2.0.jar!/:?]
> 

[jira] [Commented] (KAFKA-5232) Kafka broker fails to start if a topic containing dot in its name is marked for delete but hasn't been deleted during previous uptime

2017-05-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16010146#comment-16010146
 ] 

ASF GitHub Bot commented on KAFKA-5232:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3050


> Kafka broker fails to start if a topic containing dot in its name is marked 
> for delete but hasn't been deleted during previous uptime
> -
>
> Key: KAFKA-5232
> URL: https://issues.apache.org/jira/browse/KAFKA-5232
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.2.0, 0.10.2.1
>Reporter: jaikiran pai
>Assignee: jaikiran pai
>Priority: Critical
> Fix For: 0.11.0.0, 0.10.2.2
>
>
> We are using 0.10.2.0 (but this is reproducible even with 0.10.2.1 and the 
> latest upstream) in our environments. Our topic names contain one or more dot 
> characters, so we have topics like {{foo.bar-testtopic}}. Topic deletion is 
> enabled on the broker(s) and our application deletes topics as and when 
> necessary.
> We just ran into a case today where, for some reason, the Kafka broker either 
> had to be taken down or went down on its own. Some of the topics which were 
> being deleted (i.e. a deletion-marker folder was created for them) were left 
> around after the Kafka broker had gone down. So the Kafka logs dir had 
> directories like 
> {{foo.bar-testtopic-0.bb7981c216b845648edfe6e2b0a5c050-delete}}.
> When we restarted the Kafka broker, it refused to start and kept shutting 
> down and running into this exception:
> {code}
> [2017-05-12 21:36:27,876] ERROR There was an error in one of the threads 
> during logs loading: java.lang.StringIndexOutOfBoundsException: String index 
> out of range: -1 (kafka.log.LogManager)
> [2017-05-12 21:36:27,900] FATAL [Kafka Server 0], Fatal error during 
> KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
> java.lang.StringIndexOutOfBoundsException: String index out of range: -1
>   at java.lang.String.substring(String.java:1967)
>   at kafka.log.Log$.parseTopicPartitionName(Log.scala:1146)
>   at kafka.log.LogManager.$anonfun$loadLogs$10(LogManager.scala:153)
>   at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:57)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> [2017-05-12 21:36:27,950] INFO [Kafka Server 0], shutting down 
> (kafka.server.KafkaServer)
> {code}
> The only way we could get past this was pointing the Kafka broker to a different 
> Kafka logs directory, which effectively meant a lot of our topics were no longer 
> accessible to the application.
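To see why such a directory name is problematic, here is a minimal, hypothetical 
sketch (not the broker's actual Scala parsing code) of how a naive 
"topic-partition" split breaks on a "-delete" marker directory for a dotted topic 
name; the real code hit a related edge case and threw 
StringIndexOutOfBoundsException, as shown in the stack trace above.

{code}
import java.util.Arrays;

public class TopicDirNameSketch {

    /** Naive "<topic>-<partition>" split on the last '-'. */
    static int naivePartition(final String dirName) {
        final int dash = dirName.lastIndexOf('-');
        // For a "-delete" marker directory the text after the last '-' is
        // "delete", not a partition number, so this parse fails.
        return Integer.parseInt(dirName.substring(dash + 1));
    }

    public static void main(final String[] args) {
        for (final String dir : Arrays.asList(
                "foo.bar-testtopic-0",
                "foo.bar-testtopic-0.bb7981c216b845648edfe6e2b0a5c050-delete")) {
            try {
                System.out.println(dir + " -> partition " + naivePartition(dir));
            } catch (RuntimeException e) {
                System.out.println(dir + " -> parse failed: " + e);
            }
        }
    }
}
{code}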



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-4965) set internal.leave.group.on.close to false in KafkaStreams

2017-05-15 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-4965:
---
Fix Version/s: 0.10.2.2

> set internal.leave.group.on.close to false in KafkaStreams
> --
>
> Key: KAFKA-4965
> URL: https://issues.apache.org/jira/browse/KAFKA-4965
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.11.0.0
>Reporter: Damian Guy
>Assignee: Damian Guy
> Fix For: 0.11.0.0, 0.10.2.2
>
>
> We added the internal Consumer Config {{leave.group.on.close}} so that we can 
> avoid unnecessary rebalances during bounces in streams etc. We just need to 
> set this config to false.
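For context, consumer-level settings reach the consumers that Streams creates via 
the consumer prefix on StreamsConfig. The sketch below only illustrates that 
mechanism; internal.leave.group.on.close is an internal flag that Streams sets 
itself, so with this change in place applications are not expected to set it, and 
the application id and bootstrap server shown are placeholders.

{code}
import java.util.Properties;

import org.apache.kafka.streams.StreamsConfig;

public class LeaveGroupOverrideSketch {
    public static void main(final String[] args) {
        final Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");    // placeholder
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder

        // Conceptually, the change amounts to Streams forwarding this consumer-level
        // override to the consumers it creates, so that a bounced instance does not
        // trigger an immediate rebalance by leaving the group on close.
        props.put(StreamsConfig.consumerPrefix("internal.leave.group.on.close"), "false");

        System.out.println(props);
    }
}
{code}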



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-5240) Make persistent checkpointedOffsets optionable

2017-05-15 Thread Lukas Gemela (JIRA)
Lukas Gemela created KAFKA-5240:
---

 Summary: Make persistent checkpointedOffsets optionable
 Key: KAFKA-5240
 URL: https://issues.apache.org/jira/browse/KAFKA-5240
 Project: Kafka
  Issue Type: Improvement
  Components: streams
Affects Versions: 0.10.2.0
Reporter: Lukas Gemela


Looking at the ProcessorStateManager class, the Kafka Streams library tries to 
persist the current offset for each partition to a file. It has to use a locking 
mechanism to be sure that there is no other thread/task running which is 
modifying the .checkpoint file. It does that even if you don't use a persistent 
store in your topology (which is a bit confusing).

From my understanding, this is because you want to make active state 
restorations faster and avoid seeking from the beginning 
(ProcessorStateManager:217).

We actually run everything in a Docker environment and we don't restart our 
microservices - we just start another Docker container and delete the old one. We 
don't use persistent stores and we don't want our microservices to write 
anything to the filesystem.

We always set an aggressive [compact, delete] policy on the Kafka Streams 
internal topics to have them compacted as much as possible, and therefore we 
don't need fast recovery either - we always have to replay the whole topic no 
matter what.

Would it be possible to make writing to the filesystem optional?

Thanks!

L.
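A sketch of what the requested switch could look like from the application side. 
The property name state.checkpoint.enabled is purely hypothetical (no such config 
exists in Kafka Streams); it is only meant to illustrate the ask above.

{code}
import java.util.Properties;

import org.apache.kafka.streams.StreamsConfig;

public class CheckpointToggleSketch {
    // Hypothetical property name -- purely an illustration of the request above;
    // no such config exists in Kafka Streams.
    static final String CHECKPOINT_ENABLED = "state.checkpoint.enabled";

    public static void main(final String[] args) {
        final Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "clio");              // placeholder app id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        // A stateless, container-per-deploy service could opt out of writing
        // .checkpoint files (and the associated directory locking) entirely.
        props.put(CHECKPOINT_ENABLED, "false");

        final boolean checkpointEnabled = Boolean.parseBoolean(
                props.getProperty(CHECKPOINT_ENABLED, "true"));
        System.out.println("would write .checkpoint files: " + checkpointEnabled);
    }
}
{code}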



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [DISCUSS]: KIP-149: Enabling key access in ValueTransformer, ValueMapper, and ValueJoiner

2017-05-15 Thread Damian Guy
Thanks for the KIP.

I'm not convinced by the `RichFunction` approach. Do we really want to give
every DSL method access to the `ProcessorContext`? It has a bunch of
methods on it that seem inappropriate for some of the DSL methods, e.g.,
`register`, `getStateStore`, `forward`, `schedule`, etc. It is far too
broad. I think it would be better to have a narrower interface like the
`RecordContext` - remembering that it is easier to add methods/interfaces later
than to remove them.

On Sat, 13 May 2017 at 22:26 Matthias J. Sax  wrote:

> Jeyhun,
>
> I am not an expert on Lambdas. Can you elaborate a little bit? I cannot
> follow the explanation in the KIP to see what the problem is.
>
> For updating the KIP title I don't know -- guess it's up to you. Maybe a
> committer can comment on this?
>
>
> Further comments:
>
>  - The KIP gets a little hard to read -- can you maybe reformat the wiki
> page a little bit? I think using `CodeBlock` would help.
>
>  - What about KStream-KTable joins? You don't have overloads added for
> them. Why? (Even if I still hope that we don't need to add any new
> overloads.)
>
>  - Why do we need `AbstractRichFunction`?
>
>  - What about the interfaces Initializer, ForeachAction, Merger, Predicate, and
> Reducer? I don't want to say we should/need to add to all, but we should
> discuss all of them and add where it does make sense (e.g.,
> RichForeachAction does make sense IMHO).
>
>
> Btw: I like the hierarchy `ValueXX` -- `ValueXXWithKey` -- `RichValueXX`
> in general -- but why can't we do all this with interfaces only?
>
>
>
> -Matthias
>
>
>
> On 5/11/17 7:00 AM, Jeyhun Karimov wrote:
> > Hi,
> >
> > Thanks for your comments. I think we cannot extend the two interfaces if
> we
> > want to keep lambdas. I updated the KIP [1]. Maybe I should change the
> > title, because now we are not limiting the KIP to only ValueMapper,
> > ValueTransformer and ValueJoiner.
> > Please feel free to comment.
> >
> > [1]
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-149%3A+Enabling+key+access+in+ValueTransformer%2C+ValueMapper%2C+and+ValueJoiner
> >
> >
> > Cheers,
> > Jeyhun
> >
> > On Tue, May 9, 2017 at 7:36 PM Matthias J. Sax 
> > wrote:
> >
> >> If `ValueMapperWithKey` extends `ValueMapper` we don't need the new
> >> overload.
> >>
> >> And yes, we need to do one check -- but this happens when building the
> >> topology. At runtime (I mean after KafkaStreams#start()) we don't need any
> >> check, as we will always use `ValueMapperWithKey`.
> >>
> >>
> >> -Matthias
> >>
> >>
> >> On 5/9/17 2:55 AM, Jeyhun Karimov wrote:
> >>> Hi,
> >>>
> >>> Thanks for feedback.
> >>> Then we need to overload the method
> >>>KStream<K, VR> mapValues(ValueMapper<? super V, ? extends VR>
> >>> mapper);
> >>> with
> >>>KStream<K, VR> mapValues(ValueMapperWithKey<? super K, ? super V, VR>
> >>> mapper);
> >>>
> >>> and at runtime (inside the processor) we still have to check whether it is
> >>> ValueMapper
> >>> or ValueMapperWithKey before wrapping it into the rich function.
> >>>
> >>>
> >>> Please correct me if I am wrong.
> >>>
> >>> Cheers,
> >>> Jeyhun
> >>>
> >>>
> >>>
> >>>
> >>> On Tue, May 9, 2017 at 10:56 AM Michal Borowiecki <
> >>> michal.borowie...@openbet.com> wrote:
> >>>
>  +1 :)
> 
> 
>  On 08/05/17 23:52, Matthias J. Sax wrote:
> > Hi,
> >
> > I was reading the updated KIP and I am wondering, if we should do the
> > design a little different.
> >
> > Instead of distinguishing between a RichFunction and non-RichFunction at
> > runtime level, we would use RichFunctions all the time. Thus, on the DSL
> > entry level, if a user provides a non-RichFunction, we wrap it by a
> > RichFunction that is fully implemented by Streams. This RichFunction
> > would just forward the call omitting the key:
> >
> > Just to sketch the idea (incomplete code snippet):
> >
> >> public class StreamsRichValueMapper implements RichValueMapper {
> >>private ValueMapper userProvidedMapper; // set by constructor
> >>
> >>public VR apply(final K key, final V1 value1, final V2 value2) {
> >>// forward the call, omitting the key
> >>return userProvidedMapper.apply(value1, value2);
> >>}
> >> }
> >
> >  From a performance point of view, I am not sure if the
> > "if(isRichFunction)" including casts etc. would have more overhead than
> > this approach (we would do more nested method calls for non-RichFunction,
> > which should be more common than RichFunctions).
> >
> > This approach should not affect lambdas (or do I miss something?) and
> > might be cleaner, as we could have one more top-level interface
> > `RichFunction` with methods `init()` and `close()` and also interfaces
> > for `RichValueMapper` etc. (thus, no abstract classes required).
> >
> >
> > Any thoughts?
> >
> >
> > -Matthias
> >
> >
> > On 5/6/17 5:29 PM, Jeyhun Karimov wrote:
> >> Hi,
> >>
> >> Thanks for 
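For readers following the wrapping idea sketched above, here is a slightly 
fuller, self-contained version. The interfaces are simplified local stand-ins for 
the shapes discussed in the KIP, and the wrapper class name and generics are 
assumptions, not the final API.

{code}
// Simplified local stand-ins for the interfaces discussed in the KIP.
interface ValueMapper<V, VR> {
    VR apply(V value);
}

interface ValueMapperWithKey<K, V, VR> {
    VR apply(K readOnlyKey, V value);
}

// Streams-internal adapter from the discussion: wrap a plain ValueMapper so the
// DSL can treat everything as a ValueMapperWithKey and simply drop the key.
final class WrappedValueMapper<K, V, VR> implements ValueMapperWithKey<K, V, VR> {
    private final ValueMapper<V, VR> userProvidedMapper;

    WrappedValueMapper(final ValueMapper<V, VR> userProvidedMapper) {
        this.userProvidedMapper = userProvidedMapper;
    }

    @Override
    public VR apply(final K readOnlyKey, final V value) {
        // Forward the call, omitting the key.
        return userProvidedMapper.apply(value);
    }
}

public class RichFunctionWrappingSketch {
    public static void main(final String[] args) {
        ValueMapper<String, Integer> lengths = String::length;   // user lambdas keep working
        ValueMapperWithKey<Long, String, Integer> wrapped = new WrappedValueMapper<>(lengths);
        System.out.println(wrapped.apply(42L, "hello"));         // prints 5
    }
}
{code}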

[GitHub] kafka pull request #3032: KAFKA-4965: set internal.leave.group.on.close to f...

2017-05-15 Thread dguy
Github user dguy closed the pull request at:

https://github.com/apache/kafka/pull/3032


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-4965) set internal.leave.group.on.close to false in KafkaStreams

2017-05-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16010260#comment-16010260
 ] 

ASF GitHub Bot commented on KAFKA-4965:
---

Github user dguy closed the pull request at:

https://github.com/apache/kafka/pull/3032


> set internal.leave.group.on.close to false in KafkaStreams
> --
>
> Key: KAFKA-4965
> URL: https://issues.apache.org/jira/browse/KAFKA-4965
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.11.0.0
>Reporter: Damian Guy
>Assignee: Damian Guy
> Fix For: 0.11.0.0
>
>
> We added the internal Consumer Config {{leave.group.on.close}} so that we can 
> avoid unnecessary rebalances during bounces in streams etc. We just need to 
> set this config to false.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Build failed in Jenkins: kafka-trunk-jdk8 #1524

2017-05-15 Thread Apache Jenkins Server
See 


Changes:

[ismael] MINOR: Consolidate Topic classes

--
[...truncated 863.93 KB...]
kafka.utils.CommandLineUtilsTest > testParseEmptyArg PASSED

kafka.utils.CommandLineUtilsTest > testParseSingleArg STARTED

kafka.utils.CommandLineUtilsTest > testParseSingleArg PASSED

kafka.utils.CommandLineUtilsTest > testParseArgs STARTED

kafka.utils.CommandLineUtilsTest > testParseArgs PASSED

kafka.utils.CommandLineUtilsTest > testParseEmptyArgAsValid STARTED

kafka.utils.CommandLineUtilsTest > testParseEmptyArgAsValid PASSED

kafka.utils.ReplicationUtilsTest > testUpdateLeaderAndIsr STARTED

kafka.utils.ReplicationUtilsTest > testUpdateLeaderAndIsr PASSED

kafka.utils.ReplicationUtilsTest > testGetLeaderIsrAndEpochForPartition STARTED

kafka.utils.ReplicationUtilsTest > testGetLeaderIsrAndEpochForPartition PASSED

kafka.utils.JsonTest > testJsonEncoding STARTED

kafka.utils.JsonTest > testJsonEncoding PASSED

kafka.utils.ShutdownableThreadTest > testShutdownWhenCalledAfterThreadStart 
STARTED

kafka.utils.ShutdownableThreadTest > testShutdownWhenCalledAfterThreadStart 
PASSED

kafka.utils.SchedulerTest > testMockSchedulerNonPeriodicTask STARTED

kafka.utils.SchedulerTest > testMockSchedulerNonPeriodicTask PASSED

kafka.utils.SchedulerTest > testMockSchedulerPeriodicTask STARTED

kafka.utils.SchedulerTest > testMockSchedulerPeriodicTask PASSED

kafka.utils.SchedulerTest > testNonPeriodicTask STARTED

kafka.utils.SchedulerTest > testNonPeriodicTask PASSED

kafka.utils.SchedulerTest > testRestart STARTED

kafka.utils.SchedulerTest > testRestart PASSED

kafka.utils.SchedulerTest > testReentrantTaskInMockScheduler STARTED

kafka.utils.SchedulerTest > testReentrantTaskInMockScheduler PASSED

kafka.utils.SchedulerTest > testPeriodicTask STARTED

kafka.utils.SchedulerTest > testPeriodicTask PASSED

kafka.utils.ZkUtilsTest > testAbortedConditionalDeletePath STARTED

kafka.utils.ZkUtilsTest > testAbortedConditionalDeletePath PASSED

kafka.utils.ZkUtilsTest > testSuccessfulConditionalDeletePath STARTED

kafka.utils.ZkUtilsTest > testSuccessfulConditionalDeletePath PASSED

kafka.utils.ZkUtilsTest > testPersistentSequentialPath STARTED

kafka.utils.ZkUtilsTest > testPersistentSequentialPath PASSED

kafka.utils.ZkUtilsTest > testClusterIdentifierJsonParsing STARTED

kafka.utils.ZkUtilsTest > testClusterIdentifierJsonParsing PASSED

kafka.utils.IteratorTemplateTest > testIterator STARTED

kafka.utils.IteratorTemplateTest > testIterator PASSED

kafka.utils.UtilsTest > testGenerateUuidAsBase64 STARTED

kafka.utils.UtilsTest > testGenerateUuidAsBase64 PASSED

kafka.utils.UtilsTest > testAbs STARTED

kafka.utils.UtilsTest > testAbs PASSED

kafka.utils.UtilsTest > testReplaceSuffix STARTED

kafka.utils.UtilsTest > testReplaceSuffix PASSED

kafka.utils.UtilsTest > testCircularIterator STARTED

kafka.utils.UtilsTest > testCircularIterator PASSED

kafka.utils.UtilsTest > testReadBytes STARTED

kafka.utils.UtilsTest > testReadBytes PASSED

kafka.utils.UtilsTest > testCsvList STARTED

kafka.utils.UtilsTest > testCsvList PASSED

kafka.utils.UtilsTest > testReadInt STARTED

kafka.utils.UtilsTest > testReadInt PASSED

kafka.utils.UtilsTest > testUrlSafeBase64EncodeUUID STARTED

kafka.utils.UtilsTest > testUrlSafeBase64EncodeUUID PASSED

kafka.utils.UtilsTest > testCsvMap STARTED

kafka.utils.UtilsTest > testCsvMap PASSED

kafka.utils.UtilsTest > testInLock STARTED

kafka.utils.UtilsTest > testInLock PASSED

kafka.utils.UtilsTest > testSwallow STARTED

kafka.utils.UtilsTest > testSwallow PASSED

kafka.producer.AsyncProducerTest > testFailedSendRetryLogic STARTED

kafka.producer.AsyncProducerTest > testFailedSendRetryLogic PASSED

kafka.producer.AsyncProducerTest > testQueueTimeExpired STARTED

kafka.producer.AsyncProducerTest > testQueueTimeExpired PASSED

kafka.producer.AsyncProducerTest > testPartitionAndCollateEvents STARTED

kafka.producer.AsyncProducerTest > testPartitionAndCollateEvents PASSED

kafka.producer.AsyncProducerTest > testBatchSize STARTED

kafka.producer.AsyncProducerTest > testBatchSize PASSED

kafka.producer.AsyncProducerTest > testSerializeEvents STARTED

kafka.producer.AsyncProducerTest > testSerializeEvents PASSED

kafka.producer.AsyncProducerTest > testProducerQueueSize STARTED

kafka.producer.AsyncProducerTest > testProducerQueueSize PASSED

kafka.producer.AsyncProducerTest > testRandomPartitioner STARTED

kafka.producer.AsyncProducerTest > testRandomPartitioner PASSED

kafka.producer.AsyncProducerTest > testInvalidConfiguration STARTED

kafka.producer.AsyncProducerTest > testInvalidConfiguration PASSED

kafka.producer.AsyncProducerTest > testInvalidPartition STARTED

kafka.producer.AsyncProducerTest > testInvalidPartition PASSED

kafka.producer.AsyncProducerTest > testNoBroker STARTED

kafka.producer.AsyncProducerTest > testNoBroker PASSED

kafka.producer.AsyncProducerTest > 

[jira] [Commented] (KAFKA-5025) FetchRequestTest should use batches with more than one message

2017-05-15 Thread Umesh Chaudhary (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16010242#comment-16010242
 ] 

Umesh Chaudhary commented on KAFKA-5025:


Hi [~ijuma], I executed the above test and observed that *produceData* actually 
returns multiple records, i.e. different keys with different values. 
Is that not the expected behaviour of this method? Please correct me if my 
understanding needs correction. 

> FetchRequestTest should use batches with more than one message
> --
>
> Key: KAFKA-5025
> URL: https://issues.apache.org/jira/browse/KAFKA-5025
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core, producer 
>Reporter: Ismael Juma
>Assignee: Umesh Chaudhary
> Fix For: 0.11.0.0
>
>
> As part of the message format changes for KIP-98, 
> FetchRequestTest.produceData was changed to always use record batches 
> containing a single message. We should restructure the test so that it's more 
> realistic. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Build failed in Jenkins: kafka-trunk-jdk7 #2194

2017-05-15 Thread Apache Jenkins Server
See 


Changes:

[ismael] KAFKA-5232; Fix Log.parseTopicPartitionName to handle deleted topics

--
[...truncated 857.39 KB...]
kafka.log.ProducerStateManagerTest > testFirstUnstableOffsetAfterTruncation 
PASSED

kafka.log.ProducerStateManagerTest > testTakeSnapshot STARTED

kafka.log.ProducerStateManagerTest > testTakeSnapshot PASSED

kafka.log.ProducerStateManagerTest > testDeleteSnapshotsBefore STARTED

kafka.log.ProducerStateManagerTest > testDeleteSnapshotsBefore PASSED

kafka.log.ProducerStateManagerTest > 
testNonMatchingTxnFirstOffsetMetadataNotCached STARTED

kafka.log.ProducerStateManagerTest > 
testNonMatchingTxnFirstOffsetMetadataNotCached PASSED

kafka.log.ProducerStateManagerTest > testFirstUnstableOffsetAfterEviction 
STARTED

kafka.log.ProducerStateManagerTest > testFirstUnstableOffsetAfterEviction PASSED

kafka.log.ProducerStateManagerTest > testNoValidationOnFirstEntryWhenLoadingLog 
STARTED

kafka.log.ProducerStateManagerTest > testNoValidationOnFirstEntryWhenLoadingLog 
PASSED

kafka.log.ProducerStateManagerTest > testEvictUnretainedPids STARTED

kafka.log.ProducerStateManagerTest > testEvictUnretainedPids PASSED

kafka.log.ProducerStateManagerTest > 
testProducersWithOngoingTransactionsDontExpire STARTED

kafka.log.ProducerStateManagerTest > 
testProducersWithOngoingTransactionsDontExpire PASSED

kafka.log.ProducerStateManagerTest > testBasicIdMapping STARTED

kafka.log.ProducerStateManagerTest > testBasicIdMapping PASSED

kafka.log.ProducerStateManagerTest > updateProducerTransactionState STARTED

kafka.log.ProducerStateManagerTest > updateProducerTransactionState PASSED

kafka.log.ProducerStateManagerTest > testRecoverFromSnapshot STARTED

kafka.log.ProducerStateManagerTest > testRecoverFromSnapshot PASSED

kafka.log.ProducerStateManagerTest > testFirstUnstableOffset STARTED

kafka.log.ProducerStateManagerTest > testFirstUnstableOffset PASSED

kafka.log.ProducerStateManagerTest > testTxnFirstOffsetMetadataCached STARTED

kafka.log.ProducerStateManagerTest > testTxnFirstOffsetMetadataCached PASSED

kafka.log.ProducerStateManagerTest > testCoordinatorFencedAfterReload STARTED

kafka.log.ProducerStateManagerTest > testCoordinatorFencedAfterReload PASSED

kafka.log.ProducerStateManagerTest > testControlRecordBumpsEpoch STARTED

kafka.log.ProducerStateManagerTest > testControlRecordBumpsEpoch PASSED

kafka.log.ProducerStateManagerTest > testPidExpirationTimeout STARTED

kafka.log.ProducerStateManagerTest > testPidExpirationTimeout PASSED

kafka.log.ProducerStateManagerTest > testOldEpochForControlRecord STARTED

kafka.log.ProducerStateManagerTest > testOldEpochForControlRecord PASSED

kafka.log.ProducerStateManagerTest > testStartOffset STARTED

kafka.log.ProducerStateManagerTest > testStartOffset PASSED

kafka.log.ProducerStateManagerTest > 
testNonTransactionalAppendWithOngoingTransaction STARTED

kafka.log.ProducerStateManagerTest > 
testNonTransactionalAppendWithOngoingTransaction PASSED

kafka.log.ProducerStateManagerTest > testSkipSnapshotIfOffsetUnchanged STARTED

kafka.log.ProducerStateManagerTest > testSkipSnapshotIfOffsetUnchanged PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[0] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[0] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[1] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[1] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[2] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[2] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[3] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[3] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[4] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[4] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[5] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[5] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[6] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[6] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[7] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[7] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[8] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[8] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[9] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[9] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[10] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[10] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[11] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[11] PASSED

kafka.log.BrokerCompressionTest > 

Build failed in Jenkins: kafka-trunk-jdk8 #1525

2017-05-15 Thread Apache Jenkins Server
See 


Changes:

[ismael] KAFKA-5232; Fix Log.parseTopicPartitionName to handle deleted topics

--
[...truncated 860.83 KB...]
kafka.utils.CommandLineUtilsTest > testParseEmptyArg PASSED

kafka.utils.CommandLineUtilsTest > testParseSingleArg STARTED

kafka.utils.CommandLineUtilsTest > testParseSingleArg PASSED

kafka.utils.CommandLineUtilsTest > testParseArgs STARTED

kafka.utils.CommandLineUtilsTest > testParseArgs PASSED

kafka.utils.CommandLineUtilsTest > testParseEmptyArgAsValid STARTED

kafka.utils.CommandLineUtilsTest > testParseEmptyArgAsValid PASSED

kafka.utils.ReplicationUtilsTest > testUpdateLeaderAndIsr STARTED

kafka.utils.ReplicationUtilsTest > testUpdateLeaderAndIsr PASSED

kafka.utils.ReplicationUtilsTest > testGetLeaderIsrAndEpochForPartition STARTED

kafka.utils.ReplicationUtilsTest > testGetLeaderIsrAndEpochForPartition PASSED

kafka.utils.JsonTest > testJsonEncoding STARTED

kafka.utils.JsonTest > testJsonEncoding PASSED

kafka.utils.ShutdownableThreadTest > testShutdownWhenCalledAfterThreadStart 
STARTED

kafka.utils.ShutdownableThreadTest > testShutdownWhenCalledAfterThreadStart 
PASSED

kafka.utils.SchedulerTest > testMockSchedulerNonPeriodicTask STARTED

kafka.utils.SchedulerTest > testMockSchedulerNonPeriodicTask PASSED

kafka.utils.SchedulerTest > testMockSchedulerPeriodicTask STARTED

kafka.utils.SchedulerTest > testMockSchedulerPeriodicTask PASSED

kafka.utils.SchedulerTest > testNonPeriodicTask STARTED

kafka.utils.SchedulerTest > testNonPeriodicTask PASSED

kafka.utils.SchedulerTest > testRestart STARTED

kafka.utils.SchedulerTest > testRestart PASSED

kafka.utils.SchedulerTest > testReentrantTaskInMockScheduler STARTED

kafka.utils.SchedulerTest > testReentrantTaskInMockScheduler PASSED

kafka.utils.SchedulerTest > testPeriodicTask STARTED

kafka.utils.SchedulerTest > testPeriodicTask PASSED

kafka.utils.ZkUtilsTest > testAbortedConditionalDeletePath STARTED

kafka.utils.ZkUtilsTest > testAbortedConditionalDeletePath PASSED

kafka.utils.ZkUtilsTest > testSuccessfulConditionalDeletePath STARTED

kafka.utils.ZkUtilsTest > testSuccessfulConditionalDeletePath PASSED

kafka.utils.ZkUtilsTest > testPersistentSequentialPath STARTED

kafka.utils.ZkUtilsTest > testPersistentSequentialPath PASSED

kafka.utils.ZkUtilsTest > testClusterIdentifierJsonParsing STARTED

kafka.utils.ZkUtilsTest > testClusterIdentifierJsonParsing PASSED

kafka.utils.IteratorTemplateTest > testIterator STARTED

kafka.utils.IteratorTemplateTest > testIterator PASSED

kafka.utils.UtilsTest > testGenerateUuidAsBase64 STARTED

kafka.utils.UtilsTest > testGenerateUuidAsBase64 PASSED

kafka.utils.UtilsTest > testAbs STARTED

kafka.utils.UtilsTest > testAbs PASSED

kafka.utils.UtilsTest > testReplaceSuffix STARTED

kafka.utils.UtilsTest > testReplaceSuffix PASSED

kafka.utils.UtilsTest > testCircularIterator STARTED

kafka.utils.UtilsTest > testCircularIterator PASSED

kafka.utils.UtilsTest > testReadBytes STARTED

kafka.utils.UtilsTest > testReadBytes PASSED

kafka.utils.UtilsTest > testCsvList STARTED

kafka.utils.UtilsTest > testCsvList PASSED

kafka.utils.UtilsTest > testReadInt STARTED

kafka.utils.UtilsTest > testReadInt PASSED

kafka.utils.UtilsTest > testUrlSafeBase64EncodeUUID STARTED

kafka.utils.UtilsTest > testUrlSafeBase64EncodeUUID PASSED

kafka.utils.UtilsTest > testCsvMap STARTED

kafka.utils.UtilsTest > testCsvMap PASSED

kafka.utils.UtilsTest > testInLock STARTED

kafka.utils.UtilsTest > testInLock PASSED

kafka.utils.UtilsTest > testSwallow STARTED

kafka.utils.UtilsTest > testSwallow PASSED

kafka.producer.AsyncProducerTest > testFailedSendRetryLogic STARTED

kafka.producer.AsyncProducerTest > testFailedSendRetryLogic PASSED

kafka.producer.AsyncProducerTest > testQueueTimeExpired STARTED

kafka.producer.AsyncProducerTest > testQueueTimeExpired PASSED

kafka.producer.AsyncProducerTest > testPartitionAndCollateEvents STARTED

kafka.producer.AsyncProducerTest > testPartitionAndCollateEvents PASSED

kafka.producer.AsyncProducerTest > testBatchSize STARTED

kafka.producer.AsyncProducerTest > testBatchSize PASSED

kafka.producer.AsyncProducerTest > testSerializeEvents STARTED

kafka.producer.AsyncProducerTest > testSerializeEvents PASSED

kafka.producer.AsyncProducerTest > testProducerQueueSize STARTED

kafka.producer.AsyncProducerTest > testProducerQueueSize PASSED

kafka.producer.AsyncProducerTest > testRandomPartitioner STARTED

kafka.producer.AsyncProducerTest > testRandomPartitioner PASSED

kafka.producer.AsyncProducerTest > testInvalidConfiguration STARTED

kafka.producer.AsyncProducerTest > testInvalidConfiguration PASSED

kafka.producer.AsyncProducerTest > testInvalidPartition STARTED

kafka.producer.AsyncProducerTest > testInvalidPartition PASSED

kafka.producer.AsyncProducerTest > testNoBroker STARTED

kafka.producer.AsyncProducerTest > testNoBroker PASSED


[jira] [Updated] (KAFKA-5118) Improve message for Kafka failed startup with non-Kafka data in data.dirs

2017-05-15 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-5118:
---
Fix Version/s: 0.10.2.2

> Improve message for Kafka failed startup with non-Kafka data in data.dirs
> -
>
> Key: KAFKA-5118
> URL: https://issues.apache.org/jira/browse/KAFKA-5118
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 0.10.2.0
>Reporter: Dustin Cote
>Assignee: huxi
>Priority: Minor
> Fix For: 0.11.0.0, 0.10.2.2
>
>
> Today, if you try to start up a broker with some non-Kafka data in the 
> data.dirs, you end up with a cryptic message:
> {code}
> [2017-04-21 13:35:08,122] ERROR There was an error in one of the threads 
> during logs loading: java.lang.StringIndexOutOfBoundsException: String index 
> out of range: -1 (kafka.log.LogManager) 
> [2017-04-21 13:35:08,124] FATAL [Kafka Server 3], Fatal error during 
> KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer) 
> java.lang.StringIndexOutOfBoundsException: String index out of range: -1 
> {code}
> It'd be better if we could tell the user to look for non-Kafka data in the 
> data.dirs and print out the offending directory that caused the problem in 
> the first place.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Build failed in Jenkins: kafka-trunk-jdk7 #2193

2017-05-15 Thread Apache Jenkins Server
See 


Changes:

[ismael] MINOR: Consolidate Topic classes

--
[...truncated 859.32 KB...]
kafka.log.ProducerStateManagerTest > testFirstUnstableOffsetAfterTruncation 
PASSED

kafka.log.ProducerStateManagerTest > testTakeSnapshot STARTED

kafka.log.ProducerStateManagerTest > testTakeSnapshot PASSED

kafka.log.ProducerStateManagerTest > testDeleteSnapshotsBefore STARTED

kafka.log.ProducerStateManagerTest > testDeleteSnapshotsBefore PASSED

kafka.log.ProducerStateManagerTest > 
testNonMatchingTxnFirstOffsetMetadataNotCached STARTED

kafka.log.ProducerStateManagerTest > 
testNonMatchingTxnFirstOffsetMetadataNotCached PASSED

kafka.log.ProducerStateManagerTest > testFirstUnstableOffsetAfterEviction 
STARTED

kafka.log.ProducerStateManagerTest > testFirstUnstableOffsetAfterEviction PASSED

kafka.log.ProducerStateManagerTest > testNoValidationOnFirstEntryWhenLoadingLog 
STARTED

kafka.log.ProducerStateManagerTest > testNoValidationOnFirstEntryWhenLoadingLog 
PASSED

kafka.log.ProducerStateManagerTest > testEvictUnretainedPids STARTED

kafka.log.ProducerStateManagerTest > testEvictUnretainedPids PASSED

kafka.log.ProducerStateManagerTest > 
testProducersWithOngoingTransactionsDontExpire STARTED

kafka.log.ProducerStateManagerTest > 
testProducersWithOngoingTransactionsDontExpire PASSED

kafka.log.ProducerStateManagerTest > testBasicIdMapping STARTED

kafka.log.ProducerStateManagerTest > testBasicIdMapping PASSED

kafka.log.ProducerStateManagerTest > updateProducerTransactionState STARTED

kafka.log.ProducerStateManagerTest > updateProducerTransactionState PASSED

kafka.log.ProducerStateManagerTest > testRecoverFromSnapshot STARTED

kafka.log.ProducerStateManagerTest > testRecoverFromSnapshot PASSED

kafka.log.ProducerStateManagerTest > testFirstUnstableOffset STARTED

kafka.log.ProducerStateManagerTest > testFirstUnstableOffset PASSED

kafka.log.ProducerStateManagerTest > testTxnFirstOffsetMetadataCached STARTED

kafka.log.ProducerStateManagerTest > testTxnFirstOffsetMetadataCached PASSED

kafka.log.ProducerStateManagerTest > testCoordinatorFencedAfterReload STARTED

kafka.log.ProducerStateManagerTest > testCoordinatorFencedAfterReload PASSED

kafka.log.ProducerStateManagerTest > testControlRecordBumpsEpoch STARTED

kafka.log.ProducerStateManagerTest > testControlRecordBumpsEpoch PASSED

kafka.log.ProducerStateManagerTest > testPidExpirationTimeout STARTED

kafka.log.ProducerStateManagerTest > testPidExpirationTimeout PASSED

kafka.log.ProducerStateManagerTest > testOldEpochForControlRecord STARTED

kafka.log.ProducerStateManagerTest > testOldEpochForControlRecord PASSED

kafka.log.ProducerStateManagerTest > testStartOffset STARTED

kafka.log.ProducerStateManagerTest > testStartOffset PASSED

kafka.log.ProducerStateManagerTest > 
testNonTransactionalAppendWithOngoingTransaction STARTED

kafka.log.ProducerStateManagerTest > 
testNonTransactionalAppendWithOngoingTransaction PASSED

kafka.log.ProducerStateManagerTest > testSkipSnapshotIfOffsetUnchanged STARTED

kafka.log.ProducerStateManagerTest > testSkipSnapshotIfOffsetUnchanged PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[0] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[0] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[1] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[1] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[2] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[2] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[3] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[3] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[4] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[4] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[5] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[5] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[6] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[6] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[7] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[7] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[8] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[8] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[9] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[9] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[10] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[10] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[11] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[11] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[12] 

[jira] [Commented] (KAFKA-5238) BrokerTopicMetrics can be recreated after topic is deleted

2017-05-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16010472#comment-16010472
 ] 

ASF GitHub Bot commented on KAFKA-5238:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3042


> BrokerTopicMetrics can be recreated after topic is deleted
> --
>
> Key: KAFKA-5238
> URL: https://issues.apache.org/jira/browse/KAFKA-5238
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ismael Juma
>
> As part of KAFKA-3258, we added code to remove metrics during topic deletion. 
> This works fine as long as there are no fetch requests in the purgatory. If 
> there are, however, we'll recreate the metrics when we call 
> `ReplicaManager.appendToLocalLog`.
> This can be reproduced by updating 
> MetricsTest.testBrokerTopicMetricsUnregisteredAfterDeletingTopic() in the 
> following way:
> {code}
> @Test
>   def testBrokerTopicMetricsUnregisteredAfterDeletingTopic() {
> val topic = "test-broker-topic-metric"
> AdminUtils.createTopic(zkUtils, topic, 2, 1)
> // Produce a few messages and consume them to create the metrics
> TestUtils.produceMessages(servers, topic, nMessages)
> TestUtils.consumeTopicRecords(servers, topic, nMessages)
> assertTrue("Topic metrics don't exist", topicMetricGroups(topic).nonEmpty)
> assertNotNull(BrokerTopicStats.getBrokerTopicStats(topic))
> AdminUtils.deleteTopic(zkUtils, topic)
> TestUtils.verifyTopicDeletion(zkUtils, topic, 1, servers)
> Thread.sleep(1)
> assertEquals("Topic metrics exists after deleteTopic", Set.empty, 
> topicMetricGroups(topic))
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-5241) GlobalKTable does not checkpoint offsets after restoring state

2017-05-15 Thread Tommy Becker (JIRA)
Tommy Becker created KAFKA-5241:
---

 Summary: GlobalKTable does not checkpoint offsets after restoring 
state
 Key: KAFKA-5241
 URL: https://issues.apache.org/jira/browse/KAFKA-5241
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 0.10.2.1
Reporter: Tommy Becker
Priority: Minor


I'm experimenting with an application that uses a relatively large 
GlobalKTable, and noticed that Streams was not checkpointing its offsets on 
close(). This is because although 
{{org.apache.kafka.streams.processor.internals.GlobalStateManagerImpl#restoreState}}
 updates the checkpoint map, the actual checkpointing itself is guarded by a 
check that the offsets passed from the {{GlobalStateUpdateTask}} are not empty. 
This is frustrating because if the topic backing the global table is both large 
(therefore taking a long time to restore) and infrequently written, then 
Streams rebuilds the table from scratch every time the application is started.
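A minimal sketch of the guard being described, assuming a hypothetical 
writeCheckpointFile helper; this is not the actual GlobalStateManagerImpl code, 
only an illustration of why a restore-only run never persists its offsets unless 
close() flushes them.

{code}
import java.util.HashMap;
import java.util.Map;

public class GlobalCheckpointSketch {
    private final Map<String, Long> checkpointableOffsets = new HashMap<>();

    /** Restoring a large global table fills the in-memory checkpoint map. */
    void restoreState() {
        checkpointableOffsets.put("global-topic-0", 1_000_000L); // offsets recovered during restore
    }

    /**
     * If checkpointing is only triggered when the update task hands over new
     * offsets, a restore-only run (empty map here) never writes the file and
     * the whole restore is repeated on the next start.
     */
    void checkpoint(final Map<String, Long> offsetsFromUpdateTask) {
        if (offsetsFromUpdateTask.isEmpty()) {
            return; // restored offsets are silently dropped
        }
        checkpointableOffsets.putAll(offsetsFromUpdateTask);
        writeCheckpointFile(checkpointableOffsets);
    }

    /** Proposed behaviour on close(): always flush whatever was restored. */
    void close() {
        writeCheckpointFile(checkpointableOffsets);
    }

    private void writeCheckpointFile(final Map<String, Long> offsets) {
        System.out.println("checkpointing " + offsets); // stand-in for the real file write
    }
}
{code}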



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5241) GlobalKTable does not checkpoint offsets after restoring state

2017-05-15 Thread Tommy Becker (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16010389#comment-16010389
 ] 

Tommy Becker commented on KAFKA-5241:
-

I have a patch for this and will take the issue, if someone can grant me 
permission to assign it to myself.

> GlobalKTable does not checkpoint offsets after restoring state
> --
>
> Key: KAFKA-5241
> URL: https://issues.apache.org/jira/browse/KAFKA-5241
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.1
>Reporter: Tommy Becker
>Priority: Minor
>
> I'm experimenting with an application that uses a relatively large 
> GlobalKTable, and noticed that Streams was not checkpointing its offsets on 
> close(). This is because although 
> {{org.apache.kafka.streams.processor.internals.GlobalStateManagerImpl#restoreState}}
>  updates the checkpoint map, the actual checkpointing itself is guarded by a 
> check that the offsets passed from the {{GlobalStateUpdateTask}} are not 
> empty. This is frustrating because if the topic backing the global table is 
> both large (therefore taking a long time to restore) and infrequently 
> written, then Streams rebuilds the table from scratch every time the 
> application is started.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5232) Kafka broker fails to start if a topic containing dot in its name is marked for delete but hasn't been deleted during previous uptime

2017-05-15 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-5232:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Issue resolved by pull request 3043
[https://github.com/apache/kafka/pull/3043]

> Kafka broker fails to start if a topic containing dot in its name is marked 
> for delete but hasn't been deleted during previous uptime
> -
>
> Key: KAFKA-5232
> URL: https://issues.apache.org/jira/browse/KAFKA-5232
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.2.0, 0.10.2.1
>Reporter: jaikiran pai
>Assignee: jaikiran pai
>Priority: Critical
> Fix For: 0.11.0.0, 0.10.2.2
>
>
> We are using 0.10.2.0 (but this is reproducible even with 0.10.2.1 and the 
> latest upstream) in our environments. Our topic names contain one or more dot 
> characters, so we have topics like {{foo.bar-testtopic}}. Topic deletion is 
> enabled on the broker(s) and our application deletes topics as and when 
> necessary.
> We just ran into a case today where, for some reason, the Kafka broker either 
> had to be taken down or went down on its own. Some of the topics which were 
> being deleted (i.e. a deletion-marker folder was created for them) were left 
> around after the Kafka broker had gone down. So the Kafka logs dir had 
> directories like 
> {{foo.bar-testtopic-0.bb7981c216b845648edfe6e2b0a5c050-delete}}.
> When we restarted the Kafka broker, it refused to start and kept shutting 
> down and running into this exception:
> {code}
> [2017-05-12 21:36:27,876] ERROR There was an error in one of the threads 
> during logs loading: java.lang.StringIndexOutOfBoundsException: String index 
> out of range: -1 (kafka.log.LogManager)
> [2017-05-12 21:36:27,900] FATAL [Kafka Server 0], Fatal error during 
> KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
> java.lang.StringIndexOutOfBoundsException: String index out of range: -1
>   at java.lang.String.substring(String.java:1967)
>   at kafka.log.Log$.parseTopicPartitionName(Log.scala:1146)
>   at kafka.log.LogManager.$anonfun$loadLogs$10(LogManager.scala:153)
>   at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:57)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> [2017-05-12 21:36:27,950] INFO [Kafka Server 0], shutting down 
> (kafka.server.KafkaServer)
> {code}
> The only way we could get past this was pointing the Kafka broker to a different 
> Kafka logs directory, which effectively meant a lot of our topics were no longer 
> accessible to the application.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3054: KAFKA-5241: GlobalKTable does not checkpoint offse...

2017-05-15 Thread twbecker
GitHub user twbecker opened a pull request:

https://github.com/apache/kafka/pull/3054

KAFKA-5241: GlobalKTable does not checkpoint offsets after restoring state

Ensure checkpointable offsets for GlobalKTables are always written on close.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/twbecker/kafka KAFKA-5241

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3054.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3054


commit 4f7263768ef73d18cf6b90894f4a0286bc79ea14
Author: Tommy Becker 
Date:   2017-05-15T12:13:07Z

Fix KAFKA-5241.

Ensure checkpointable offsets for GlobalKTables are always written on close.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3984) Broker doesn't retry reconnecting to an expired Zookeeper connection

2017-05-15 Thread Dan (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16010436#comment-16010436
 ] 

Dan commented on KAFKA-3984:


We encountered the same problem in 0.10.1.1.  Is there any plan to fix it?

> Broker doesn't retry reconnecting to an expired Zookeeper connection
> 
>
> Key: KAFKA-3984
> URL: https://issues.apache.org/jira/browse/KAFKA-3984
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.1
>Reporter: Braedon Vickers
>
> We've been having issues with the network connectivity of our Kafka cluster, 
> and this seems to be triggering an issue where the brokers stop trying to 
> reconnect to Zookeeper, leaving us with a broken cluster even when the 
> network has recovered.
> When network issues begin we see {{java.net.NoRouteToHostException}} 
> exceptions from {{org.apache.zookeeper.ClientCnxn}} as it attempts to 
> re-establish the connection. If the network issue resolves itself while we 
> are only getting these errors the broker seems to reconnect fine.
> However, a lot of the time we end up with a message like this:
> {code}[2016-07-22 00:21:44,181] FATAL Could not establish session with 
> zookeeper (kafka.server.KafkaHealthcheck)
> org.I0Itec.zkclient.exception.ZkException: Unable to connect to <zookeeper hosts>
>   at org.I0Itec.zkclient.ZkConnection.connect(ZkConnection.java:71)
>   at org.I0Itec.zkclient.ZkClient.reconnect(ZkClient.java:1279)
> ...
> Caused by: java.net.UnknownHostException: 
>   at java.net.InetAddress.getAllByName(InetAddress.java:1126)
>   at java.net.InetAddress.getAllByName(InetAddress.java:1192)
>   at 
> org.apache.zookeeper.client.StaticHostProvider.<init>(StaticHostProvider.java:61)
>   at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:445)
> ...
> {code}
> (apologies for the partial stack traces - I'm having to try and reconstruct 
> them from a less than ideal centralised logging setup.)
> If this happens, the broker stops trying to reconnect to Zookeeper, and we 
> have to restart it.
> It looks like while the {{org.apache.zookeeper.Zookeeper}} client's state 
> isn't {{Expired}} it will keep retrying the connection, and will recover OK 
> when the network is back. However, once it changes to {{Expired}} (not 
> entirely sure how that happens - based on the session timeout perhaps?) 
> zkclient closes the existing client and attempts to create a new one. If the 
> network is still down, the client constructor throws a 
> {{java.net.UnknownHostException}}, zkclient calls 
> {{handleSessionEstablishmentError()}} on {{KafkaHealthcheck}}, 
> {{KafkaHealthcheck.handleSessionEstablishmentError()}} logs a "Fatal" error 
> and does nothing else.
> It seems like some form of retry needs to happen here, or the broker is stuck 
> with no Zookeeper connection 
> indefinitely. {{KafkaHealthcheck.handleSessionEstablishmentError()}} used to 
> kill the JVM, but that was removed in 
> https://issues.apache.org/jira/browse/KAFKA-2405. Killing the JVM would be 
> better than doing nothing, as then your init system could restart it, 
> allowing it to recover once the network was back.
> Our cluster is running 0.9.0.1, so not sure if it affects 0.10.0.0 as well. 
> However, it seems likely, as there doesn't seem to be any code changes in 
> kafka or zkclient that would affect this behaviour.
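
A rough sketch of the two remedies suggested above (retry with backoff, or fail fast so an init system can restart the broker). The class and method names below are hypothetical and are not the actual KafkaHealthcheck code:

{code}
// Sketch only: instead of logging a "Fatal" error and doing nothing, either keep
// retrying the reconnect with a capped backoff, or exit so the init system
// restarts the broker. The reconnect action itself is supplied by the caller.
final class SessionErrorPolicy {
    void handleSessionEstablishmentError(Throwable error, Runnable reconnect) {
        long backoffMs = 1_000L;
        for (int attempt = 0; attempt < 10; attempt++) {
            try {
                reconnect.run();                 // e.g. re-create the ZooKeeper client
                return;                          // connected again, nothing more to do
            } catch (RuntimeException e) {
                sleepQuietly(backoffMs);
                backoffMs = Math.min(backoffMs * 2, 60_000L);   // cap the delay
            }
        }
        // still no connection: fail fast so systemd/supervisord can restart the broker
        Runtime.getRuntime().halt(1);
    }

    private static void sleepQuietly(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
    }
}
{code}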



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5226) NullPointerException (NPE) in SourceNodeRecordDeserializer.deserialize

2017-05-15 Thread Ian Springer (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ian Springer updated KAFKA-5226:

Attachment: kafka.log

Here's the full log file.

> NullPointerException (NPE) in SourceNodeRecordDeserializer.deserialize
> --
>
> Key: KAFKA-5226
> URL: https://issues.apache.org/jira/browse/KAFKA-5226
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.1
> Environment: 64-bit Amazon Linux, JDK8
>Reporter: Ian Springer
> Attachments: kafka.log
>
>
> I saw the following NPE in our Kafka Streams app, which has 3 nodes running 
> on 3 separate machines. Out of hundreds of messages processed, the NPE only 
> occurred twice. I am not sure of the cause, so I am unable to reproduce it. 
> I'm hoping the Kafka Streams team can guess the cause based on the stack 
> trace. If I can provide any additional details about our app, please let me 
> know.
>  
> {code}
> INFO  2017-05-10 02:58:26,021 org.apache.kafka.common.utils.AppInfoParser  
> Kafka version : 0.10.2.1
> INFO  2017-05-10 02:58:26,021 org.apache.kafka.common.utils.AppInfoParser  
> Kafka commitId : e89bffd6b2eff799
> INFO  2017-05-10 02:58:26,031 o.s.context.support.DefaultLifecycleProcessor  
> Starting beans in phase 0
> INFO  2017-05-10 02:58:26,075 org.apache.kafka.streams.KafkaStreams  
> stream-client [evergage-app-bd9c9868-4b9b-4d2e-850f-9b5bec1fc0a9] State 
> transition from CREATED to RUNNING.
> INFO  2017-05-10 02:58:26,075 org.apache.kafka.streams.KafkaStreams  
> stream-client [evergage-app-bd9c9868-4b9b-4d2e-850f-9b5bec1fc0a9] Started 
> Kafka Stream process
> INFO  2017-05-10 02:58:26,086 o.a.k.c.consumer.internals.AbstractCoordinator  
> Discovered coordinator p1kaf1.prod.apptegic.com:9092 (id: 2147482646 rack: 
> null) for group evergage-app.
> INFO  2017-05-10 02:58:26,126 o.a.k.c.consumer.internals.ConsumerCoordinator  
> Revoking previously assigned partitions [] for group evergage-app
> INFO  2017-05-10 02:58:26,126 org.apache.kafka.streams.KafkaStreams  
> stream-client [evergage-app-bd9c9868-4b9b-4d2e-850f-9b5bec1fc0a9] State 
> transition from RUNNING to REBALANCING.
> INFO  2017-05-10 02:58:26,127 o.a.k.c.consumer.internals.AbstractCoordinator  
> (Re-)joining group evergage-app
> INFO  2017-05-10 02:58:27,712 o.a.k.c.consumer.internals.AbstractCoordinator  
> Successfully joined group evergage-app with generation 18
> INFO  2017-05-10 02:58:27,716 o.a.k.c.consumer.internals.ConsumerCoordinator  
> Setting newly assigned partitions [us.app.Trigger-0] for group evergage-app
> INFO  2017-05-10 02:58:27,716 org.apache.kafka.streams.KafkaStreams  
> stream-client [evergage-app-bd9c9868-4b9b-4d2e-850f-9b5bec1fc0a9] State 
> transition from REBALANCING to REBALANCING.
> INFO  2017-05-10 02:58:27,729 
> o.a.kafka.streams.processor.internals.StreamTask  task [0_0] Initializing 
> state stores
> INFO  2017-05-10 02:58:27,731 
> o.a.kafka.streams.processor.internals.StreamTask  task [0_0] Initializing 
> processor nodes of the topology
> INFO  2017-05-10 02:58:27,742 org.apache.kafka.streams.KafkaStreams  
> stream-client [evergage-app-bd9c9868-4b9b-4d2e-850f-9b5bec1fc0a9] State 
> transition from REBALANCING to RUNNING.
> [14 hours pass...]
> INFO  2017-05-10 16:21:27,476 o.a.k.c.consumer.internals.ConsumerCoordinator  
> Revoking previously assigned partitions [us.app.Trigger-0] for group 
> evergage-app
> INFO  2017-05-10 16:21:27,477 org.apache.kafka.streams.KafkaStreams  
> stream-client [evergage-app-bd9c9868-4b9b-4d2e-850f-9b5bec1fc0a9] State 
> transition from RUNNING to REBALANCING.
> INFO  2017-05-10 16:21:27,482 o.a.k.c.consumer.internals.AbstractCoordinator  
> (Re-)joining group evergage-app
> INFO  2017-05-10 16:21:27,489 o.a.k.c.consumer.internals.AbstractCoordinator  
> Successfully joined group evergage-app with generation 19
> INFO  2017-05-10 16:21:27,489 o.a.k.c.consumer.internals.ConsumerCoordinator  
> Setting newly assigned partitions [us.app.Trigger-0] for group evergage-app
> INFO  2017-05-10 16:21:27,489 org.apache.kafka.streams.KafkaStreams  
> stream-client [evergage-app-bd9c9868-4b9b-4d2e-850f-9b5bec1fc0a9] State 
> transition from REBALANCING to REBALANCING.
> INFO  2017-05-10 16:21:27,489 
> o.a.kafka.streams.processor.internals.StreamTask  task [0_0] Initializing 
> processor nodes of the topology
> INFO  2017-05-10 16:21:27,493 org.apache.kafka.streams.KafkaStreams  
> stream-client [evergage-app-bd9c9868-4b9b-4d2e-850f-9b5bec1fc0a9] State 
> transition from REBALANCING to RUNNING.
> INFO  2017-05-10 16:21:30,584 o.a.k.c.consumer.internals.ConsumerCoordinator  
> Revoking previously assigned partitions [us.app.Trigger-0] for group 
> evergage-app
> INFO  2017-05-10 16:21:30,584 org.apache.kafka.streams.KafkaStreams  
> 

Re: [DISCUSS] KIP-150 - Kafka-Streams Cogroup

2017-05-15 Thread Kyle Winkelman
Damian Guy, could you let me know if you plan to review this further? There
is no rush, but if you don't have any additional comments I could start the
voting and finish my WIP PR.

Thanks,
Kyle

On May 9, 2017 11:07 AM, "Kyle Winkelman"  wrote:

> Eno, is there anyone else that is an expert in the kafka streams realm
> that I should reach out to for input?
>
> I believe Damian Guy is still planning on reviewing this more in depth so
> I will wait for his inputs before continuing.
>
> On May 9, 2017 7:30 AM, "Eno Thereska"  wrote:
>
>> Thanks Kyle, good arguments.
>>
>> Eno
>>
>> > On May 7, 2017, at 5:06 PM, Kyle Winkelman 
>> wrote:
>> >
>> > *- minor: could you add an exact example (similar to what Jay’s example
>> is,
>> > or like your Spark/Pig pointers had) to make this super concrete?*
>> > I have added a more concrete example to the KIP.
>> >
>> > *- my main concern is that we’re exposing this optimization to the DSL.
>> In
>> > an ideal world, an optimizer would take the existing DSL and do the
>> right
>> > thing under the covers (create just one state store, arrange the nodes
>> > etc). The original DSL had a bunch of small, composable pieces (group,
>> > aggregate, join) that this proposal groups together. I’d like to hear
>> your
>> > thoughts on whether it’s possible to do this optimization with the
>> current
>> > DSL, at the topology builder level.*
>> > You would have to make a lot of checks to understand if it is even
>> possible
>> > to make this optimization:
>> > 1. Make sure they are all KTableKTableOuterJoins
>> > 2. None of the intermediate KTables are used for anything else.
>> > 3. None of the intermediate stores are used. (This may be impossible
>> > especially if they use KafkaStreams#store after the topology has already
>> > been built.)
>> > You would then need to make decisions during the optimization:
>> > 1. Your new initializer would be the composite of all the individual
>> > initializers and the valueJoiners.
>> > 2. I am having a hard time thinking about how you would turn the
>> > aggregators and valueJoiners into an aggregator that would work on the
>> > final object, but this may be possible.
>> > 3. Which state store would you use? The ones declared would be for the
>> > aggregate values. None of the declared ones would be guaranteed to hold
>> the
>> > final object. This would mean you must create a new state store and not
>> > create any of the declared ones.
>> >
>> > The main argument I have against it is even if it could be done I don't
>> > know that we would want to have this be an optimization in the
>> background
>> > because the user would still be required to think about all of the
>> > intermediate values that they shouldn't need to worry about if they only
>> > care about the final object.
>> >
>> > In my opinion cogroup is a common enough case that it should be part of
>> the
>> > composable pieces (group, aggregate, join) because we want to allow
>> people
>> > to join two or more streams in an easy way. Right now I don't
>> think
>> > we give them ways of handling this use case easily.
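
To make that concrete, here is a sketch of the workaround the current DSL forces when combining two streams into a single aggregate (0.10.2-era method names; treat the exact overloads as approximate): each input needs its own intermediate aggregate with its own initializer, serde and store, and the partial results must then be joined back together.

{code}
// Sketch only, against the 0.10.2-era DSL: two inputs, two intermediate tables,
// two stores, then an outer join to build the object we actually wanted.
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.kstream.KStreamBuilder;
import org.apache.kafka.streams.kstream.KTable;

public class TwoStreamAggregateExample {
    public static void main(String[] args) {
        KStreamBuilder builder = new KStreamBuilder();

        // intermediate table #1: clicks per user, with its own initializer, serde and store
        KTable<String, String> clicks = builder
                .stream(Serdes.String(), Serdes.String(), "clicks")
                .groupByKey(Serdes.String(), Serdes.String())
                .aggregate(() -> "", (user, click, agg) -> agg + click,
                           Serdes.String(), "clicks-store");

        // intermediate table #2: purchases per user, with another store
        KTable<String, String> purchases = builder
                .stream(Serdes.String(), Serdes.String(), "purchases")
                .groupByKey(Serdes.String(), Serdes.String())
                .aggregate(() -> "", (user, buy, agg) -> agg + buy,
                           Serdes.String(), "purchases-store");

        // finally join the partial aggregates into the combined result
        KTable<String, String> combined =
                clicks.outerJoin(purchases, (c, p) -> "clicks=" + c + ", purchases=" + p);

        combined.to(Serdes.String(), Serdes.String(), "combined-output");
        // a KafkaStreams instance built from this builder plus a StreamsConfig would run it
    }
}
{code}

With a cogroup operator, the two aggregate calls and the outerJoin would collapse into a single aggregation over one shared store.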
>> >
>> > *-I think there will be scope for several such optimizations in the
>> future
>> > and perhaps at some point we need to think about decoupling the 1:1
>> mapping
>> > from the DSL into the physical topology.*
>> > I would argue that cogroup is not just an optimization; it is a new way
>> for
>> > the users to look at accomplishing a problem that requires multiple
>> > streams. I may sound like a broken record but I don't think users should
>> > have to build the N-1 intermediate tables and deal with their
>> initializers,
>> > serdes and stores if all they care about is the final object.
>> > Now if for example someone uses cogroup but doesn't supply additional
>> > streams and aggregators this case is equivalent to a single grouped
>> stream
>> > making an aggregate call. This case is what I view as an optimization,
>> we
>> > could remove the KStreamCogroup and act as if there was just a call to
>> > KGroupedStream#aggregate instead of calling KGroupedStream#cogroup. (I
>> > would prefer to just write a warning saying that this is not how
>> cogroup is
>> > to be used.)
>> >
>> > Thanks,
>> > Kyle
>> >
>> > On Sun, May 7, 2017 at 5:41 AM, Eno Thereska 
>> wrote:
>> >
>> >> Hi Kyle,
>> >>
>> >> Thanks for the KIP again. A couple of comments:
>> >>
>> >> - minor: could you add an exact example (similar to what Jay’s example
>> is,
>> >> or like your Spark/Pig pointers had) to make this super concrete?
>> >>
>> >> - my main concern is that we’re exposing this optimization to the DSL.
>> In
>> >> an ideal world, an optimizer would take the existing DSL and do the
>> right
>> >> thing under the covers (create just one state store, arrange the nodes
>> >> etc). The original DSL had a bunch of small, composable pieces (group,
>> >> aggregate, join) that this 

[jira] [Commented] (KAFKA-5241) GlobalKTable does not checkpoint offsets after restoring state

2017-05-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16010417#comment-16010417
 ] 

ASF GitHub Bot commented on KAFKA-5241:
---

GitHub user twbecker opened a pull request:

https://github.com/apache/kafka/pull/3054

KAFKA-5241: GlobalKTable does not checkpoint offsets after restoring state

Ensure checkpointable offsets for GlobalKTables are always written on close.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/twbecker/kafka KAFKA-5241

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3054.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3054


commit 4f7263768ef73d18cf6b90894f4a0286bc79ea14
Author: Tommy Becker 
Date:   2017-05-15T12:13:07Z

Fix KAFKA-5241.

Ensure checkpointable offsets for GlobalKTables are always written on close.




> GlobalKTable does not checkpoint offsets after restoring state
> --
>
> Key: KAFKA-5241
> URL: https://issues.apache.org/jira/browse/KAFKA-5241
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.1
>Reporter: Tommy Becker
>Priority: Minor
>
> I'm experimenting with an application that uses a relatively large 
> GlobalKTable, and noticed that streams was not checkpointing its offsets on 
> close(). This is because although  
> {{org.apache.kafka.streams.processor.internals.GlobalStateManagerImpl#restoreState}}
>  updates the checkpoint map, the actual checkpointing itself is guarded by a 
> check that the offsets passed from the {{GlobalStateUpdateTask}} are not 
> empty. This is frustrating because if the topic backing the global table is 
> both large (therefore taking a long time to restore) and infrequently 
> written, then streams rebuilds the table from scratch every time the 
> application is started.
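
A minimal sketch of the idea behind the fix (hypothetical names, not the actual GlobalStateManagerImpl code): keep the offsets learned during restore in the checkpoint map and write that map unconditionally on close, instead of skipping the write when no new offsets arrived afterwards.

{code}
// Sketch only: "always checkpoint on close", with hypothetical names and a
// deliberately simple file format standing in for the real checkpoint writer.
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.kafka.common.TopicPartition;

class GlobalCheckpointSketch {
    private final Map<TopicPartition, Long> checkpointableOffsets = new HashMap<>();
    private final File checkpointFile;

    GlobalCheckpointSketch(File checkpointFile) {
        this.checkpointFile = checkpointFile;
    }

    // populated while restoring the GlobalKTable from its backing topic
    void restoredTo(TopicPartition tp, long offset) {
        checkpointableOffsets.put(tp, offset);
    }

    // populated during normal processing; may legitimately stay empty
    void update(Map<TopicPartition, Long> newOffsets) {
        checkpointableOffsets.putAll(newOffsets);
    }

    // always flush what we know, so a restart can resume instead of re-reading the topic
    void close() throws IOException {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<TopicPartition, Long> e : checkpointableOffsets.entrySet()) {
            lines.add(e.getKey().topic() + " " + e.getKey().partition() + " " + e.getValue());
        }
        Files.write(checkpointFile.toPath(), lines, StandardCharsets.UTF_8);
    }
}
{code}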



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3042: MINOR: Fix one flaky test in MetricsTest and impro...

2017-05-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3042


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Permission to edit Kafka wiki page

2017-05-15 Thread Arseniy Tashoyan
Hello Kafka developers,

Could you give me permission to update Kafka System Tools page?
https://cwiki.apache.org/confluence/display/KAFKA/System+Tools
I am going to update this page in scope of
https://github.com/apache/kafka/pull/3051

Arseniy


Jenkins build is back to normal : kafka-trunk-jdk7 #2195

2017-05-15 Thread Apache Jenkins Server
See 




[jira] [Updated] (KAFKA-5200) If a replicated topic is deleted with one broker down, it can't be recreated

2017-05-15 Thread Edoardo Comar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edoardo Comar updated KAFKA-5200:
-
Summary: If a replicated topic is deleted with one broker down, it can't be 
recreated  (was: Deleting topic when one broker is down will prevent topic to 
be re-creatable)

> If a replicated topic is deleted with one broker down, it can't be recreated
> 
>
> Key: KAFKA-5200
> URL: https://issues.apache.org/jira/browse/KAFKA-5200
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Reporter: Edoardo Comar
>
> In a cluster with 5 brokers, replication factor=3, min in sync=2,
> one broker went down 
> A user's app remained of course unaware of that and deleted a topic that 
> (unknowingly) had a replica on the dead broker.
> The topic went into 'pending delete' mode.
> The user then tried to recreate the topic - which failed, so his app was left 
> stuck - no working topic and no ability to create one.
> The reassignment tool fails to move the replica out of the dead broker - 
> specifically because the broker with the partition replica to move is dead :-)
> Incidentally the confluent-rebalancer docs say
> http://docs.confluent.io/current/kafka/post-deployment.html#scaling-the-cluster
> > Supports moving partitions away from dead brokers
> It'd be nice to similarly improve the opensource reassignment tool



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3055: Kip 138

2017-05-15 Thread mihbor
GitHub user mihbor opened a pull request:

https://github.com/apache/kafka/pull/3055

Kip 138

Test coverage is not yet satisfactory, but I'd like to start getting feedback 
already.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mihbor/kafka KIP-138

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3055.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3055


commit 7635ee1a3c5681e7a460b28962f7c8baf5a4094e
Author: Michal Borowiecki 
Date:   2017-05-15T10:49:36Z

KAFKA-5233 KIP-138 WIP stream time only

commit 2c61c0979aa11dddc28937d6b169cbdcaac62171
Author: Michal Borowiecki 
Date:   2017-05-15T11:15:24Z

KAFKA-5233 KIP-138 WIP added support for system time punctuation

commit f4c77375fc3531e91c3b6d1d6e94546c6fcf50de
Author: Michal Borowiecki 
Date:   2017-05-15T12:46:12Z

KAFKA-5233 KIP-138 WIP refactored

commit 925f7f34d54b31344fc4d9b81b9f6c4669f224ff
Author: Michal Borowiecki 
Date:   2017-05-15T13:06:52Z

KAFKA-5233 KIP-138 WIP fixes

commit ab4d99d0005ede916110f250a06bbd8952bf035d
Author: Michal Borowiecki 
Date:   2017-05-15T13:09:10Z

KAFKA-5233 KIP-138 WIP trivia

commit 6b04e90153c72835002745e8faa8776315b8948e
Author: Michal Borowiecki 
Date:   2017-05-15T14:35:10Z

KAFKA-5233 KIP-138 checkstyle




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (KAFKA-5242) state task directory locked forever

2017-05-15 Thread Lukas Gemela (JIRA)
Lukas Gemela created KAFKA-5242:
---

 Summary: state task directory locked forever
 Key: KAFKA-5242
 URL: https://issues.apache.org/jira/browse/KAFKA-5242
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 0.10.2.0
Reporter: Lukas Gemela
 Attachments: clio_170511.log

From time to time, during rebalance we are getting a lot of exceptions saying 

{code}
org.apache.kafka.streams.errors.LockException: task [0_0] Failed to lock the 
state directory: /app/db/clio/0_0
at 
org.apache.kafka.streams.processor.internals.ProcessorStateManager.<init>(ProcessorStateManager.java:102)
 ~[kafka-streams-0.10.2.0.jar!/:?]
at 
org.apache.kafka.streams.processor.internals.AbstractTask.<init>(AbstractTask.java:73)
 ~[kafka-streams-0.10.2.0.jar!/:?]
at 
org.apache.kafka.streams.processor.internals.StreamTask.<init>(StreamTask.java:108)
 ~[kafka-streams-0.10.2.0.jar!/:?]
at 
org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:834)
 ~[kafka-streams-0.10.2.0.jar!/:?]
at 
org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:1207)
 ~[kafka-streams-0.10.2.0.jar!/:?]
at 
org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:1180)
 [kafka-streams-0.10.2.0.jar!/:?]
at 
org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:937)
 [kafka-streams-0.10.2.0.jar!/:?]
at 
org.apache.kafka.streams.processor.internals.StreamThread.access$500(StreamThread.java:69)
 [kafka-streams-0.10.2.0.jar!/:?]
at 
org.apache.kafka.streams.processor.internals.StreamThread$1.onPartitionsAssigned(StreamThread.java:236)
 [kafka-streams-0.10.2.0.jar!/:?]
at 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:255)
 [kafka-clients-0.10.2.0.jar!/:?]
at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:339)
 [kafka-clients-0.10.2.0.jar!/:?]
at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:303)
 [kafka-clients-0.10.2.0.jar!/:?]
at 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:286)
 [kafka-clients-0.10.2.0.jar!/:?]
at 
org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1030)
 [kafka-clients-0.10.2.0.jar!/:?]
at 
org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:995) 
[kafka-clients-0.10.2.0.jar!/:?]
at 
org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:582)
 [kafka-streams-0.10.2.0.jar!/:?]
at 
org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:368)
 [kafka-streams-0.10.2.0.jar!/:?]
{code}

(see attached logfile)

By looking at the code, it looks like some old tasks are never being closed and 
the lock is never released.

Also, the backoff strategy in StreamThread$AbstractTaskCreator.retryWithBackoff 
can run endlessly - after 20 iterations it takes 6 hours until the next attempt 
to start a task. 
I've noticed the latest code contains a check for rebalanceTimeoutMs, but that still 
does not solve the problem, especially in case MAX_POLL_INTERVAL_MS_CONFIG is 
set to Integer.MAX_VALUE. 

I would personally make that backoff strategy a bit more configurable, with a 
maximum number of retries; once that is exceeded, the exception would be propagated 
to the custom client exception handler like any other exception.
(I can provide a patch)
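
A minimal sketch of the kind of bounded backoff being proposed (hypothetical names, not the actual StreamThread code): exponential delay capped at a maximum, plus a retry limit after which the exception is rethrown so it reaches the exception handler instead of looping for hours.

{code}
// Sketch only: bounded exponential backoff with a retry limit.
import java.util.concurrent.Callable;

final class RetryWithBackoff {
    static <T> T call(Callable<T> task, int maxRetries, long initialBackoffMs, long maxBackoffMs)
            throws Exception {
        long backoffMs = initialBackoffMs;
        for (int attempt = 0; ; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {        // e.g. a LockException while re-creating a task
                if (attempt >= maxRetries) {
                    throw e;               // give up: surfaces to the caller's exception handler
                }
                Thread.sleep(backoffMs);
                backoffMs = Math.min(backoffMs * 2, maxBackoffMs);   // cap the delay
            }
        }
    }
}
{code}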



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-3455) Connect custom processors with the streams DSL

2017-05-15 Thread Michal Borowiecki (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16010530#comment-16010530
 ] 

Michal Borowiecki commented on KAFKA-3455:
--

Hi [~bobbycalderwood],

Can you please describe your use-case, where it would be useful to re-use 
Processor/Transformer implementations?

As to Transformer.punctuate return value having to be null, the javadoc was in 
error but has been fixed on trunk (to be released).

Changing the method signature of Transformer.punctuate would be a 
backward-incompatible change, however, 
[KIP-138|https://cwiki.apache.org/confluence/display/KAFKA/KIP-138%3A+Change+punctuate+semantics]
 will deprecate both methods in favour of a functional interface passed to 
ProcessorContext.schedule(), so it's a small step in the direction you're 
suggesting.

I think AutoCloseable is a false friend in this case. The intention behind 
AutoCloseable is for objects created in a try-with-resources statement to be 
closed when execution exits that statement. However, the Processor is being 
created when you are *defining* the topology and must not be closed from that 
same block of code, since it's used as long as the topology is actually 
*running*, which is happening in different threads.

As to init() and close() I think it would make sense to have them pulled out, 
however, again due to backwards-compatibility it's not as simple as it sounds.
Fortunately, once Java 7 compatibility is dropped, it will be possible to 
change their definition to a default method with an empty body. I think that 
would be backwards-compatible. That would leave only one abstract method for 
Processor and Transformer, process() and transform(), respectively. Since these 
are actually *different* from each other, I'd say that then there'd be no 
repetition.

Would that help your use-cases?
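
For illustration, a minimal sketch of what such default methods could look like on a simplified, hypothetical processor interface (not the actual Processor API change):

{code}
// Sketch only: init() and close() become optional via Java 8 default methods,
// leaving process() as the single abstract method implementations must provide.
interface SimpleProcessor<K, V> {
    default void init() { }           // no-op unless the implementation needs setup
    void process(K key, V value);     // the one required method
    default void close() { }          // no-op unless the implementation holds resources
}

// An implementation then only has to supply process():
class UppercaseProcessor implements SimpleProcessor<String, String> {
    @Override
    public void process(String key, String value) {
        System.out.println(key + " -> " + value.toUpperCase());
    }
}
{code}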

> Connect custom processors with the streams DSL
> --
>
> Key: KAFKA-3455
> URL: https://issues.apache.org/jira/browse/KAFKA-3455
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Affects Versions: 0.10.0.1
>Reporter: Jonathan Bender
>  Labels: user-experience
> Fix For: 0.11.0.0
>
>
> From the kafka users email thread, we discussed the idea of connecting custom 
> processors with topologies defined from the Streams DSL (and being able to 
> sink data from the processor).  Possibly this could involve exposing the 
> underlying processor's name in the streams DSL so it can be connected with 
> the standard processor API.
> {quote}
> Thanks for the feedback. This is definitely something we wanted to support
> in the Streams DSL.
> One tricky thing, though, is that some operations do not translate to a
> single processor, but a sub-graph of processors (think of a stream-stream
> join, which is translated to actually 5 processors for windowing / state
> queries / merging, each with a different internal name). So how to define
> the API to return the processor name needs some more thinking.
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5200) Deleting topic when one broker is down will prevent topic to be re-creatable

2017-05-15 Thread Edoardo Comar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16010601#comment-16010601
 ] 

Edoardo Comar commented on KAFKA-5200:
--

Thanks [~huxi_2b], unfortunately such steps would imply significant downtime, 
which is not acceptable to us.

We actually tested a much less intrusive way to handle this occurrence, 
i.e. delete the zookeeper info about the topic while the cluster is still 
running (minus the dead broker of course)
and then force *only the controller broker* to restart.

Even if this is less intrusive, it still means that for a short-ish time two 
brokers are down.
With replication-factor 3 and min.insync.replicas=2 this implies an outage for some 
clients 
which remains unacceptable.



> Deleting topic when one broker is down will prevent topic to be re-creatable
> 
>
> Key: KAFKA-5200
> URL: https://issues.apache.org/jira/browse/KAFKA-5200
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Reporter: Edoardo Comar
>
> In a cluster with 5 brokers, replication factor=3, min in sync=2,
> one broker went down 
> A user's app remained of course unaware of that and deleted a topic that 
> (unknowingly) had a replica on the dead broker.
> The topic went into 'pending delete' mode.
> The user then tried to recreate the topic - which failed, so his app was left 
> stuck - no working topic and no ability to create one.
> The reassignment tool fails to move the replica out of the dead broker - 
> specifically because the broker with the partition replica to move is dead :-)
> Incidentally the confluent-rebalancer docs say
> http://docs.confluent.io/current/kafka/post-deployment.html#scaling-the-cluster
> > Supports moving partitions away from dead brokers
> It'd be nice to similarly improve the opensource reassignment tool



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3056: HOTFIX: AddOffsetsToTxnResponse using incorrect sc...

2017-05-15 Thread dguy
GitHub user dguy opened a pull request:

https://github.com/apache/kafka/pull/3056

HOTFIX: AddOffsetsToTxnResponse using incorrect schema in parse

The parse method was incorrectly referring to 
`ApiKeys.ADD_PARTITIONS_TO_TXN`

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dguy/kafka hotfix-add-offsets-to-txn-response

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3056.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3056


commit 8e166e3653a0396deb714a46b36480192b9c7af9
Author: Damian Guy 
Date:   2017-05-15T14:36:04Z

use correct schema




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (KAFKA-5240) Make persistent checkpointedOffsets optionable

2017-05-15 Thread Lukas Gemela (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Gemela updated KAFKA-5240:

Description: 
Looking at the ProcessorStateManager class, the Kafka Streams library tries to 
persist the current offset for each partition to a file. It has to use a locking 
mechanism to be sure that there is no other thread/task running which is 
modifying the .checkpoint file. It does that even if you don't use a persistent 
store in your topology (which is a bit confusing).

From my understanding this is because you want to make active state 
restorations faster and not to seek from the beginning 
(ProcessorStateManager:217).

We actually run everything in a docker environment and we don't restart our 
microservices - we just run another docker container and delete the old one. We 
don't use persistent stores and we don't want our microservices to 
write anything to the filesystem. 

We always set an aggressive [compact, delete] cleanup policy to have the Kafka Streams 
internal topics compacted as much as possible, and therefore we 
don't need a fast recovery either - we always have to replay the whole topic no 
matter what. 

We also had some issues with file locking - see 
https://issues.apache.org/jira/browse/KAFKA-5242

Would it be possible to make writing to the file system optional?

Thanks!

L.

  was:
By looking at the ProcessorStateManager class kafka streams library tries to 
persist current offset for each partition to the file. It has to use locking 
mechanism to be sure that there is no other thread/task running which is 
modifying the .checkpoint file. It does that even if you don't use persistent 
store in your topology (which is a bit confusing)

>From my understanding this is because you want to make active state 
>restorations faster and not to seek from the beginning 
>(ProcessorStateManager:217)

We actually run everything in docker environment and we don't restart our 
microservices - we just run another docker container and delete the old one. We 
don't use persistent stores and we don't want to have our microservices to 
write anything to the filesystem. 

We always set aggressive [compact. delete] policy to get the kafka streams 
internal topics to have them compacted as much as possible and therefore we 
don't need a fast recovery either - we always have to replay whole topic no 
matter what. 

Would it be possible to make writing to the file system optionable?

Thanks!

L.


> Make persistent checkpointedOffsets optionable
> --
>
> Key: KAFKA-5240
> URL: https://issues.apache.org/jira/browse/KAFKA-5240
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Lukas Gemela
>
> By looking at the ProcessorStateManager class kafka streams library tries to 
> persist current offset for each partition to the file. It has to use locking 
> mechanism to be sure that there is no other thread/task running which is 
> modifying the .checkpoint file. It does that even if you don't use persistent 
> store in your topology (which is a bit confusing)
> From my understanding this is because you want to make active state 
> restorations faster and not to seek from the beginning 
> (ProcessorStateManager:217)
> We actually run everything in docker environment and we don't restart our 
> microservices - we just run another docker container and delete the old one. 
> We don't use persistent stores and we don't want to have our microservices to 
> write anything to the filesystem. 
> We always set aggressive [compact. delete] policy to get the kafka streams 
> internal topics to have them compacted as much as possible and therefore we 
> don't need a fast recovery either - we always have to replay whole topic no 
> matter what. 
> We also had some issues with file locking - see 
> https://issues.apache.org/jira/browse/KAFKA-5242
> Would it be possible to make writing to the file system optional?
> Thanks!
> L.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Build failed in Jenkins: kafka-trunk-jdk8 #1526

2017-05-15 Thread Apache Jenkins Server
See 


Changes:

[ismael] MINOR: Fix one flaky test in MetricsTest and improve checks for another

--
[...truncated 2.76 MB...]
kafka.server.KafkaConfigTest > testLogRollTimeMsProvided PASSED

kafka.server.KafkaConfigTest > testUncleanLeaderElectionDefault STARTED

kafka.server.KafkaConfigTest > testUncleanLeaderElectionDefault PASSED

kafka.server.KafkaConfigTest > testInvalidAdvertisedListenersProtocol STARTED

kafka.server.KafkaConfigTest > testInvalidAdvertisedListenersProtocol PASSED

kafka.server.KafkaConfigTest > testUncleanElectionEnabled STARTED

kafka.server.KafkaConfigTest > testUncleanElectionEnabled PASSED

kafka.server.KafkaConfigTest > testAdvertisePortDefault STARTED

kafka.server.KafkaConfigTest > testAdvertisePortDefault PASSED

kafka.server.KafkaConfigTest > testVersionConfiguration STARTED

kafka.server.KafkaConfigTest > testVersionConfiguration PASSED

kafka.server.KafkaConfigTest > testEqualAdvertisedListenersProtocol STARTED

kafka.server.KafkaConfigTest > testEqualAdvertisedListenersProtocol PASSED

kafka.server.IsrExpirationTest > testIsrExpirationForSlowFollowers STARTED

kafka.server.IsrExpirationTest > testIsrExpirationForSlowFollowers PASSED

kafka.server.IsrExpirationTest > testIsrExpirationForStuckFollowers STARTED

kafka.server.IsrExpirationTest > testIsrExpirationForStuckFollowers PASSED

kafka.server.IsrExpirationTest > testIsrExpirationIfNoFetchRequestMade STARTED

kafka.server.IsrExpirationTest > testIsrExpirationIfNoFetchRequestMade PASSED

kafka.server.ReplicationQuotasTest > 
shouldBootstrapTwoBrokersWithLeaderThrottle STARTED

kafka.server.ReplicationQuotasTest > 
shouldBootstrapTwoBrokersWithLeaderThrottle PASSED

kafka.server.ReplicationQuotasTest > shouldThrottleOldSegments STARTED

kafka.server.ReplicationQuotasTest > shouldThrottleOldSegments PASSED

kafka.server.ReplicationQuotasTest > 
shouldBootstrapTwoBrokersWithFollowerThrottle STARTED

kafka.server.ReplicationQuotasTest > 
shouldBootstrapTwoBrokersWithFollowerThrottle PASSED

kafka.server.ServerStartupTest > testBrokerStateRunningAfterZK STARTED

kafka.server.ServerStartupTest > testBrokerStateRunningAfterZK PASSED

kafka.server.ServerStartupTest > testBrokerCreatesZKChroot STARTED

kafka.server.ServerStartupTest > testBrokerCreatesZKChroot PASSED

kafka.server.ServerStartupTest > testConflictBrokerStartupWithSamePort STARTED

kafka.server.ServerStartupTest > testConflictBrokerStartupWithSamePort PASSED

kafka.server.ServerStartupTest > testConflictBrokerRegistration STARTED

kafka.server.ServerStartupTest > testConflictBrokerRegistration PASSED

kafka.server.ServerStartupTest > testBrokerSelfAware STARTED

kafka.server.ServerStartupTest > testBrokerSelfAware PASSED

kafka.server.ProduceRequestTest > testSimpleProduceRequest STARTED

kafka.server.ProduceRequestTest > testSimpleProduceRequest PASSED

kafka.server.ProduceRequestTest > testCorruptLz4ProduceRequest STARTED

kafka.server.ProduceRequestTest > testCorruptLz4ProduceRequest PASSED

kafka.server.ReplicaManagerTest > testHighWaterMarkDirectoryMapping STARTED

kafka.server.ReplicaManagerTest > testHighWaterMarkDirectoryMapping PASSED

kafka.server.ReplicaManagerTest > 
testFetchBeyondHighWatermarkReturnEmptyResponse STARTED

kafka.server.ReplicaManagerTest > 
testFetchBeyondHighWatermarkReturnEmptyResponse PASSED

kafka.server.ReplicaManagerTest > testIllegalRequiredAcks STARTED

kafka.server.ReplicaManagerTest > testIllegalRequiredAcks PASSED

kafka.server.ReplicaManagerTest > testClearPurgatoryOnBecomingFollower STARTED

kafka.server.ReplicaManagerTest > testClearPurgatoryOnBecomingFollower PASSED

kafka.server.ReplicaManagerTest > testHighwaterMarkRelativeDirectoryMapping 
STARTED

kafka.server.ReplicaManagerTest > testHighwaterMarkRelativeDirectoryMapping 
PASSED

kafka.server.ReplicaManagerTest > testReadCommittedFetchLimitedAtLSO STARTED

kafka.server.ReplicaManagerTest > testReadCommittedFetchLimitedAtLSO PASSED

kafka.server.KafkaMetricReporterClusterIdTest > testClusterIdPresent STARTED

kafka.server.KafkaMetricReporterClusterIdTest > testClusterIdPresent PASSED

kafka.server.CreateTopicsRequestWithPolicyTest > testValidCreateTopicsRequests 
STARTED

kafka.server.CreateTopicsRequestWithPolicyTest > testValidCreateTopicsRequests 
PASSED

kafka.server.CreateTopicsRequestWithPolicyTest > testErrorCreateTopicsRequests 
STARTED

kafka.server.CreateTopicsRequestWithPolicyTest > testErrorCreateTopicsRequests 
PASSED

kafka.server.OffsetCommitTest > testUpdateOffsets STARTED

kafka.server.OffsetCommitTest > testUpdateOffsets PASSED

kafka.server.OffsetCommitTest > testLargeMetadataPayload STARTED

kafka.server.OffsetCommitTest > testLargeMetadataPayload PASSED

kafka.server.OffsetCommitTest > testOffsetsDeleteAfterTopicDeletion STARTED

kafka.server.OffsetCommitTest > testOffsetsDeleteAfterTopicDeletion PASSED


[jira] [Reopened] (KAFKA-5182) Transient failure: RequestQuotaTest.testResponseThrottleTime

2017-05-15 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma reopened KAFKA-5182:


This is still happening:

{code}
java.util.concurrent.ExecutionException: java.lang.AssertionError: Response not 
throttled: Client JOIN_GROUP apiKey JOIN_GROUP requests 3 requestTime 
0.009157322803159052 throttleTime 0.0
Stacktrace
java.util.concurrent.ExecutionException: java.lang.AssertionError: Response not 
throttled: Client JOIN_GROUP apiKey JOIN_GROUP requests 3 requestTime 
0.009157322803159052 throttleTime 0.0
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:206)
at 
kafka.server.RequestQuotaTest.$anonfun$waitAndCheckResults$1(RequestQuotaTest.scala:327)
at scala.collection.immutable.List.foreach(List.scala:389)
at 
scala.collection.generic.TraversableForwarder.foreach(TraversableForwarder.scala:35)
at 
scala.collection.generic.TraversableForwarder.foreach$(TraversableForwarder.scala:35)
at scala.collection.mutable.ListBuffer.foreach(ListBuffer.scala:44)
at 
kafka.server.RequestQuotaTest.waitAndCheckResults(RequestQuotaTest.scala:325)
at 
kafka.server.RequestQuotaTest.testResponseThrottleTime(RequestQuotaTest.scala:105)
{code}

https://builds.apache.org/blue/organizations/jenkins/kafka-trunk-jdk8/detail/kafka-trunk-jdk8/1526/tests/

> Transient failure: RequestQuotaTest.testResponseThrottleTime
> 
>
> Key: KAFKA-5182
> URL: https://issues.apache.org/jira/browse/KAFKA-5182
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Xavier Léauté
>Assignee: Rajini Sivaram
> Fix For: 0.11.0.0
>
>
> Stacktrace
> {code}
> java.util.concurrent.ExecutionException: java.lang.AssertionError: Response 
> not throttled: Client JOIN_GROUP apiKey JOIN_GROUP requests 3 requestTime 
> 0.009982775502048156 throttleTime 0.0
>   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:206)
>   at 
> kafka.server.RequestQuotaTest$$anonfun$waitAndCheckResults$1.apply(RequestQuotaTest.scala:326)
>   at 
> kafka.server.RequestQuotaTest$$anonfun$waitAndCheckResults$1.apply(RequestQuotaTest.scala:324)
>   at scala.collection.immutable.List.foreach(List.scala:392)
>   at 
> scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
>   at scala.collection.mutable.ListBuffer.foreach(ListBuffer.scala:45)
>   at 
> kafka.server.RequestQuotaTest.waitAndCheckResults(RequestQuotaTest.scala:324)
>   at 
> kafka.server.RequestQuotaTest.testResponseThrottleTime(RequestQuotaTest.scala:105)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.runTestClass(JUnitTestClassExecuter.java:114)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.execute(JUnitTestClassExecuter.java:57)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassProcessor.processTestClass(JUnitTestClassProcessor.java:66)
>   at 
> 

[jira] [Comment Edited] (KAFKA-5182) Transient failure: RequestQuotaTest.testResponseThrottleTime

2017-05-15 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16010645#comment-16010645
 ] 

Ismael Juma edited comment on KAFKA-5182 at 5/15/17 2:57 PM:
-

This is still happening:

{code}
java.util.concurrent.ExecutionException: java.lang.AssertionError: Response not 
throttled: Client JOIN_GROUP apiKey JOIN_GROUP requests 3 requestTime 
0.009157322803159052 throttleTime 0.0
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:206)
at 
kafka.server.RequestQuotaTest.$anonfun$waitAndCheckResults$1(RequestQuotaTest.scala:327)
at scala.collection.immutable.List.foreach(List.scala:389)
at 
scala.collection.generic.TraversableForwarder.foreach(TraversableForwarder.scala:35)
at 
scala.collection.generic.TraversableForwarder.foreach$(TraversableForwarder.scala:35)
at scala.collection.mutable.ListBuffer.foreach(ListBuffer.scala:44)
at 
kafka.server.RequestQuotaTest.waitAndCheckResults(RequestQuotaTest.scala:325)
at 
kafka.server.RequestQuotaTest.testResponseThrottleTime(RequestQuotaTest.scala:105)
{code}

https://builds.apache.org/blue/organizations/jenkins/kafka-trunk-jdk8/detail/kafka-trunk-jdk8/1526/tests/


was (Author: ijuma):
This is still happening:

{code}
java.util.concurrent.ExecutionException: java.lang.AssertionError: Response not 
throttled: Client JOIN_GROUP apiKey JOIN_GROUP requests 3 requestTime 
0.009157322803159052 throttleTime 0.0
Stacktrace
java.util.concurrent.ExecutionException: java.lang.AssertionError: Response not 
throttled: Client JOIN_GROUP apiKey JOIN_GROUP requests 3 requestTime 
0.009157322803159052 throttleTime 0.0
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:206)
at 
kafka.server.RequestQuotaTest.$anonfun$waitAndCheckResults$1(RequestQuotaTest.scala:327)
at scala.collection.immutable.List.foreach(List.scala:389)
at 
scala.collection.generic.TraversableForwarder.foreach(TraversableForwarder.scala:35)
at 
scala.collection.generic.TraversableForwarder.foreach$(TraversableForwarder.scala:35)
at scala.collection.mutable.ListBuffer.foreach(ListBuffer.scala:44)
at 
kafka.server.RequestQuotaTest.waitAndCheckResults(RequestQuotaTest.scala:325)
at 
kafka.server.RequestQuotaTest.testResponseThrottleTime(RequestQuotaTest.scala:105)
{code}

https://builds.apache.org/blue/organizations/jenkins/kafka-trunk-jdk8/detail/kafka-trunk-jdk8/1526/tests/

> Transient failure: RequestQuotaTest.testResponseThrottleTime
> 
>
> Key: KAFKA-5182
> URL: https://issues.apache.org/jira/browse/KAFKA-5182
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Xavier Léauté
>Assignee: Rajini Sivaram
> Fix For: 0.11.0.0
>
>
> Stacktrace
> {code}
> java.util.concurrent.ExecutionException: java.lang.AssertionError: Response 
> not throttled: Client JOIN_GROUP apiKey JOIN_GROUP requests 3 requestTime 
> 0.009982775502048156 throttleTime 0.0
>   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:206)
>   at 
> kafka.server.RequestQuotaTest$$anonfun$waitAndCheckResults$1.apply(RequestQuotaTest.scala:326)
>   at 
> kafka.server.RequestQuotaTest$$anonfun$waitAndCheckResults$1.apply(RequestQuotaTest.scala:324)
>   at scala.collection.immutable.List.foreach(List.scala:392)
>   at 
> scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
>   at scala.collection.mutable.ListBuffer.foreach(ListBuffer.scala:45)
>   at 
> kafka.server.RequestQuotaTest.waitAndCheckResults(RequestQuotaTest.scala:324)
>   at 
> kafka.server.RequestQuotaTest.testResponseThrottleTime(RequestQuotaTest.scala:105)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> 

Jenkins build is back to normal : kafka-0.10.2-jdk7 #163

2017-05-15 Thread Apache Jenkins Server
See 



[jira] [Comment Edited] (KAFKA-5242) add max_number _of_retries to exponential backoff strategy

2017-05-15 Thread Lukas Gemela (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16010974#comment-16010974
 ] 

Lukas Gemela edited comment on KAFKA-5242 at 5/15/17 5:48 PM:
--

[~mjsax] by multiple instances do you mean multiple JVMs (nodes) running, or 
multiple instances running within the same JVM process? 

What happened was that by accident we created two instances running within the 
single JVM process, touching the same data on hard drive:
new KafkaStreams(builder, streamsConfig).start(); 

If this is a possible way to run Kafka Streams, then there is definitely a bug 
in the locking mechanism. I've attached log files for this situation (clio_170511), 
unfortunately only with the log level set to INFO.

Regarding the backoff strategy, you can do something similar to how it's done in the Akka 
lib (cap it with a maximal duration): 
http://doc.akka.io/japi/akka/2.4/akka/pattern/Backoff.html 

Thanks!

L.



was (Author: lukas gemela):
[~mjsax] by multiple instances you mean multiple JVMs (nodes) running or 
multiple instances running within the same jvm process? 

What happened was that by accident we created two instances running within the 
single JVM process, touching the same data on hard drive:
new KafkaStreams(builder, streamsConfig).start(); 

If this is possible way how to run kafka streams then there is definitely a bug 
in locking mechanism. I've attached logfiles for this situation, unfortunatelly 
only with debug level set to INFO.

ad backoff strategy,  you can do something similar like how it's done in akka 
lib (cap it with maximal duration): 
http://doc.akka.io/japi/akka/2.4/akka/pattern/Backoff.html 

Thanks!

L.


> add max_number _of_retries to exponential backoff strategy
> --
>
> Key: KAFKA-5242
> URL: https://issues.apache.org/jira/browse/KAFKA-5242
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Lukas Gemela
>Priority: Minor
> Attachments: clio_170511.log
>
>
> From time to time, during rebalance we are getting a lot of exceptions saying 
> {code}
> org.apache.kafka.streams.errors.LockException: task [0_0] Failed to lock the 
> state directory: /app/db/clio/0_0
>   at 
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.(ProcessorStateManager.java:102)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.AbstractTask.(AbstractTask.java:73)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamTask.(StreamTask.java:108)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:834)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:1207)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:1180)
>  [kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:937)
>  [kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.access$500(StreamThread.java:69)
>  [kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread$1.onPartitionsAssigned(StreamThread.java:236)
>  [kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:255)
>  [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:339)
>  [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:303)
>  [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:286)
>  [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1030)
>  [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:995) 
> [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:582)
>  [kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:368)
>  [kafka-streams-0.10.2.0.jar!/:?]
> {code}
> (see attached logfile)
> It was actually problem on our side - we ran 

Re: KafkaStreams reports RUNNING even though all StreamThreads has crashed

2017-05-15 Thread Guozhang Wang
Hi Andreas,

This is not an intended behavior. Could you file a JIRA and describe your
Kafka broker version / Streams API version, the logs / stack traces you saw
in that ticket? I'd like to help investigate and walk you through the
process to contribute a fix if you are interested and have time.


Guozhang


On Fri, May 12, 2017 at 12:24 PM, Andreas Gabrielsson <
andreas.gabriels...@klarna.com> wrote:

> Hi All,
>
> We recently implemented a health check for a Kafka Streams based
> application. The health check is simply checking the state of Kafka Streams
> by calling KafkaStreams.state(). It reports healthy if it’s not in
> PENDING_SHUTDOWN or NOT_RUNNING states.
>
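
A minimal sketch of such a health check, assuming the KafkaStreams#state() API described above (the wrapper class and method names are hypothetical):

{code}
// Sketch only: report healthy unless the top-level state says we are shutting down
// or no longer running.
import org.apache.kafka.streams.KafkaStreams;

class StreamsHealthCheck {
    private final KafkaStreams streams;

    StreamsHealthCheck(KafkaStreams streams) {
        this.streams = streams;
    }

    boolean isHealthy() {
        KafkaStreams.State state = streams.state();
        return state != KafkaStreams.State.PENDING_SHUTDOWN
            && state != KafkaStreams.State.NOT_RUNNING;
    }
}
{code}

As noted below, a check like this can still report healthy after every StreamThread has died, because the client-level state does not transition when the worker threads do.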
> We truly appreciate having the possibility to easily check the state of
> Kafka Streams but to our surprise we noticed that KafkaStreams.state()
> returns RUNNING even though all StreamThreads have crashed and reached
> NOT_RUNNING state. Is this intended behaviour or is it a bug? Semantically
> it seems weird to me that KafkaStreams would say it’s RUNNING when it is in
> fact not consuming anything since all underlying working threads have
> crashed.
>
> If this is intended behaviour I would appreciate an explanation of why
> that is the case. Also in that case, how could I determine if the
> consumption from Kafka hasn’t crashed?
>
> If this is not intended behaviour, how fast could I expect it to be fixed?
> I wouldn’t mind fixing it myself but I’m not sure if this is considered
> trivial or big enough to require a JIRA. Also, if I would implement a fix
> I’d like your input on what would be a reasonable solution. By just
> inspecting the code I have an idea, but I'm not sure I understand all the
> implications, so I'd be happy to hear your thoughts first.
>
> Thanks in advance,
> Andreas Gabrielsson
>
>


-- 
-- Guozhang


[jira] [Assigned] (KAFKA-5154) Kafka Streams throws NPE during rebalance

2017-05-15 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang reassigned KAFKA-5154:


Assignee: Matthias J. Sax

> Kafka Streams throws NPE during rebalance
> -
>
> Key: KAFKA-5154
> URL: https://issues.apache.org/jira/browse/KAFKA-5154
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Lukas Gemela
>Assignee: Matthias J. Sax
> Attachments: clio_reduced.gz, clio.txt.gz
>
>
> please see attached log, Kafka streams throws NullPointerException during 
> rebalance, which is caught by our custom exception handler
> {noformat}
> 2017-04-30T17:44:17,675 INFO  kafka-coordinator-heartbeat-thread | hades 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-04-30T17:44:27,395 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-04-30T17:44:27,941 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare()
>  @393 - Revoking previously assigned partitions [poseidonIncidentFeed-27, 
> poseidonIncidentFeed-29, poseidonIncidentFeed-30, poseidonIncidentFeed-18] 
> for group hades
> 2017-04-30T17:44:27,947 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:48,468 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:44:53,628 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:09,587 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-04-30T17:45:11,961 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @375 - Successfully joined group hades with generation 99
> 2017-04-30T17:45:13,126 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete()
>  @252 - Setting newly assigned partitions [poseidonIncidentFeed-11, 
> poseidonIncidentFeed-27, poseidonIncidentFeed-25, poseidonIncidentFeed-29, 
> poseidonIncidentFeed-19, poseidonIncidentFeed-18] for group hades
> 2017-04-30T17:46:37,254 INFO  kafka-coordinator-heartbeat-thread | hades 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-04-30T18:04:25,993 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-04-30T18:04:29,401 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinPrepare()
>  @393 - Revoking previously assigned partitions [poseidonIncidentFeed-11, 
> poseidonIncidentFeed-27, poseidonIncidentFeed-25, poseidonIncidentFeed-29, 
> poseidonIncidentFeed-19, poseidonIncidentFeed-18] for group hades
> 2017-04-30T18:05:10,877 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.sendJoinGroupRequest()
>  @407 - (Re-)joining group hades
> 2017-05-01T00:01:55,707 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.coordinatorDead()
>  @618 - Marking the coordinator 10.210.200.144:9092 (id: 2147483644 rack: 
> null) dead for group hades
> 2017-05-01T00:01:59,027 INFO  StreamThread-1 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.onSuccess() 
> @573 - Discovered coordinator 10.210.200.144:9092 (id: 2147483644 rack: null) 
> for group hades.
> 2017-05-01T00:01:59,031 ERROR StreamThread-1 
> org.apache.kafka.streams.processor.internals.StreamThread.run() @376 - 
> stream-thread [StreamThread-1] Streams application error during processing:
>  java.lang.NullPointerException
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:619)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:368)
>  [kafka-streams-0.10.2.0.jar!/:?]
> 2017-05-01T00:02:00,038 INFO  StreamThread-1 
> org.apache.kafka.clients.producer.KafkaProducer.close() @689 - Closing the 
> Kafka producer with 

[jira] [Created] (KAFKA-5248) Remove retention time from TxnOffsetCommit RPC

2017-05-15 Thread Jason Gustafson (JIRA)
Jason Gustafson created KAFKA-5248:
--

 Summary: Remove retention time from TxnOffsetCommit RPC
 Key: KAFKA-5248
 URL: https://issues.apache.org/jira/browse/KAFKA-5248
 Project: Kafka
  Issue Type: Sub-task
Reporter: Jason Gustafson
Assignee: Jason Gustafson


We added offset retention time because OffsetCommitRequest had it. However, the 
new consumer has never exposed this and we have no plan of exposing it in the 
producer, so we may as well remove it. If we need it later, we can bump the 
protocol.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5248) Remove retention time from TxnOffsetCommit RPC

2017-05-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16011027#comment-16011027
 ] 

ASF GitHub Bot commented on KAFKA-5248:
---

GitHub user hachikuji opened a pull request:

https://github.com/apache/kafka/pull/3058

KAFKA-5248: Remove unused/unneeded retention time in TxnOffsetCommitRequest



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hachikuji/kafka KAFKA-5248

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3058.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3058


commit 9290f30b60d1264697494aaacda2d485ee6b237c
Author: Jason Gustafson 
Date:   2017-05-15T18:13:11Z

KAFKA-5248: Remove unused/unneeded retention time in TxnOffsetCommitRequest




> Remove retention time from TxnOffsetCommit RPC
> --
>
> Key: KAFKA-5248
> URL: https://issues.apache.org/jira/browse/KAFKA-5248
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core, producer 
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
> Fix For: 0.11.0.0
>
>
> We added offset retention time because OffsetCommitRequest had it. However, 
> the new consumer has never exposed this and we have no plan of exposing it in 
> the producer, so we may as well remove it. If we need it later, we can bump 
> the protocol.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (KAFKA-5206) RocksDBSessionStore doesn't use default aggSerde.

2017-05-15 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang resolved KAFKA-5206.
--
   Resolution: Fixed
Fix Version/s: 0.10.2.2
   0.11.0.0

Issue resolved by pull request 2971
[https://github.com/apache/kafka/pull/2971]

> RocksDBSessionStore doesn't use default aggSerde.
> -
>
> Key: KAFKA-5206
> URL: https://issues.apache.org/jira/browse/KAFKA-5206
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Kyle Winkelman
>Assignee: Kyle Winkelman
> Fix For: 0.11.0.0, 0.10.2.2
>
>
> RocksDBSessionStore wasn't properly using the default aggSerde if no Serde 
> was supplied.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-5249) Transaction index recovery does not snapshot properly

2017-05-15 Thread Jason Gustafson (JIRA)
Jason Gustafson created KAFKA-5249:
--

 Summary: Transaction index recovery does not snapshot properly
 Key: KAFKA-5249
 URL: https://issues.apache.org/jira/browse/KAFKA-5249
 Project: Kafka
  Issue Type: Bug
Reporter: Jason Gustafson
Assignee: Jason Gustafson


When recovering the transaction index, we should take snapshots of the producer 
state after recovering each segment. Currently, the snapshot offset is not 
updated correctly so we will reread the segment multiple times. Additionally, 
it appears that we do not remove snapshots with offsets higher than the log end 
offset in all cases upon truncation.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-5245) KStream builder should capture serdes

2017-05-15 Thread Yeva Byzek (JIRA)
Yeva Byzek created KAFKA-5245:
-

 Summary: KStream builder should capture serdes 
 Key: KAFKA-5245
 URL: https://issues.apache.org/jira/browse/KAFKA-5245
 Project: Kafka
  Issue Type: Improvement
  Components: streams
Affects Versions: 0.10.2.1, 0.10.2.0
Reporter: Yeva Byzek
Priority: Minor


Even if one specifies serdes in `builder.stream`, a later call to 
`groupByKey` may require the serdes again if they differ from the application's 
configured default serdes. The preferred behavior is that if no serdes are 
provided to `groupByKey`, it should use whatever was provided in 
`builder.stream` rather than the application defaults.

From the current docs:
“When to set explicit serdes: Variants of groupByKey exist to override the 
configured default serdes of your application, which you must do if the key 
and/or value types of the resulting KGroupedStream do not match the configured 
default serdes.”
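
As an illustration, a minimal sketch against the 0.10.2 Streams API (topic name and types are made up): the serdes given to `builder.stream` have to be repeated in `groupByKey` whenever they differ from the application defaults.

{code}
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.kstream.KGroupedStream;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KStreamBuilder;

KStreamBuilder builder = new KStreamBuilder();

// Serdes are specified explicitly here...
KStream<String, Long> stream =
    builder.stream(Serdes.String(), Serdes.Long(), "incident-feed");

// ...but groupByKey() with no arguments falls back to the configured app
// defaults, so the serdes must be passed again:
KGroupedStream<String, Long> grouped =
    stream.groupByKey(Serdes.String(), Serdes.Long());
{code}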




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5247) Consumer GroupCoordinator should continue to materialize committed offsets in offset order even for transactional offset commits

2017-05-15 Thread Apurva Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16010959#comment-16010959
 ] 

Apurva Mehta commented on KAFKA-5247:
-

Note that this is only an issue when we mix offset commits from the consumer 
and from the transactional producer, which may conceivably be the case during 
an upgrade. If we don't fix this, it is possible for the materialized offsets to 
change after compaction.

For instance, suppose a transactional offset commit sits at position 100 in the 
log and a consumer offset commit for the same partition sits at position 101. 
After the transaction is committed, the commit at position 100 will be 
materialized. If the coordinator changes, the logic stays the same and the 
commit at position 100 continues to be materialized. However, if compaction 
kicks in, the commit at position 101 will be materialized instead. This bug is 
about fixing that inconsistency so that the same offset commit is always 
materialized in a stable state.
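
A rough sketch of the proposed fix (hypothetical names, not the actual coordinator code): keep the position in the offsets topic next to the cached commit and only replace the cached entry when the new commit was appended later in the log.

{code}
import java.util.Map;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

// Hypothetical cache entry: the committed offset plus where its commit record
// sits in the offsets topic.
class CachedCommit {
    final OffsetAndMetadata offsetAndMetadata;
    final long appendOffset;   // position of the commit record in the offsets topic
    CachedCommit(OffsetAndMetadata offsetAndMetadata, long appendOffset) {
        this.offsetAndMetadata = offsetAndMetadata;
        this.appendOffset = appendOffset;
    }
}

// Offset order: the entry appended later always wins, matching what the log
// cleaner keeps after compaction.
void maybeMaterialize(Map<TopicPartition, CachedCommit> cache,
                      TopicPartition tp, CachedCommit candidate) {
    CachedCommit current = cache.get(tp);
    if (current == null || candidate.appendOffset > current.appendOffset)
        cache.put(tp, candidate);
}
{code}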

> Consumer GroupCoordinator should continue to materialize committed offsets in 
> offset order even for transactional offset commits
> 
>
> Key: KAFKA-5247
> URL: https://issues.apache.org/jira/browse/KAFKA-5247
> Project: Kafka
>  Issue Type: Bug
>Reporter: Apurva Mehta
>  Labels: exactly-once
> Fix For: 0.11.0.0
>
>
> In the TxnOffsetCommit patch, we thought it was ok for the group coordinator 
> to use "transaction order" semantics when updating the cache, but we weren't 
> thinking about the log cleaner.
> The log cleaner uses offset order when cleaning which means that the key with 
> the largest offset always wins. So if we use transaction order when 
> dynamically updating the cache, we will get different results from when we're 
> loading the cache (even if the loading logic also uses transaction order).
> The fix should be straightforward: we need to remember the offset in the 
> offsets topic of the offset that we cache. Then we only update it if the new 
> entry has a higher offset.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5242) add max_number _of_retries to exponential backoff strategy

2017-05-15 Thread Lukas Gemela (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16010974#comment-16010974
 ] 

Lukas Gemela commented on KAFKA-5242:
-

[~mjsax] By multiple instances, do you mean multiple JVMs (nodes) running, or 
multiple instances running within the same JVM process? 

What happened was that by accident we created two instances running within a 
single JVM process, touching the same data on disk:
new KafkaStreams(builder, streamsConfig).start(); 

If this is a supported way to run Kafka Streams, then there is definitely a bug 
in the locking mechanism. I've attached logfiles for this situation, unfortunately 
only with the log level set to INFO.

Regarding the backoff strategy, you could do something similar to what the Akka 
library does and cap it with a maximal duration: 
http://doc.akka.io/japi/akka/2.4/akka/pattern/Backoff.html 

Thanks!

L.
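
For what it's worth, a minimal sketch of the kind of capped-and-bounded retry loop being proposed (illustrative only, not actual Streams code):

{code}
import java.util.concurrent.Callable;

// Exponential backoff capped at maxDelayMs per attempt and at maxRetries
// attempts overall; after that the exception is rethrown so it reaches the
// custom exception handler like any other error.
static <T> T retryWithBackoff(Callable<T> task, long initialDelayMs,
                              long maxDelayMs, int maxRetries) throws Exception {
    long delayMs = initialDelayMs;
    for (int attempt = 0; ; attempt++) {
        try {
            return task.call();
        } catch (Exception e) {
            if (attempt >= maxRetries)
                throw e;                                  // give up
            Thread.sleep(delayMs);
            delayMs = Math.min(delayMs * 2, maxDelayMs);  // cap the delay
        }
    }
}
{code}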


> add max_number _of_retries to exponential backoff strategy
> --
>
> Key: KAFKA-5242
> URL: https://issues.apache.org/jira/browse/KAFKA-5242
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Lukas Gemela
>Priority: Minor
>
> From time to time, during rebalance we are getting a lot of exceptions saying 
> {code}
> org.apache.kafka.streams.errors.LockException: task [0_0] Failed to lock the 
> state directory: /app/db/clio/0_0
>   at 
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.(ProcessorStateManager.java:102)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.AbstractTask.(AbstractTask.java:73)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamTask.(StreamTask.java:108)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:834)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:1207)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:1180)
>  [kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:937)
>  [kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.access$500(StreamThread.java:69)
>  [kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread$1.onPartitionsAssigned(StreamThread.java:236)
>  [kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:255)
>  [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:339)
>  [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:303)
>  [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:286)
>  [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1030)
>  [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:995) 
> [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:582)
>  [kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:368)
>  [kafka-streams-0.10.2.0.jar!/:?]
> {code}
> (see attached logfile)
> It was actually a problem on our side - we ran startStreams() twice and 
> therefore had two threads touching the same folder structure. 
> But what I've noticed is that the backoff strategy in 
> StreamThread$AbstractTaskCreator.retryWithBackoff can run endlessly - after 
> 20 iterations it takes 6 hours until the next attempt to start a task. 
> I've noticed the latest code contains a check for rebalanceTimeoutMs, but that 
> still does not solve the problem, especially when 
> MAX_POLL_INTERVAL_MS_CONFIG is set to Integer.MAX_INT; at this stage Kafka 
> Streams just hangs indefinitely.
> I would personally make that backoff strategy a bit more configurable, with a 
> number of retries such that if it exceeds the configured value, the exception 
> is propagated like any other exception to the custom client exception handler.
> (I can provide a patch)



--
This message was sent by Atlassian JIRA

[jira] [Commented] (KAFKA-5246) Remove backdoor that allows any client to produce to internal topics

2017-05-15 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16010990#comment-16010990
 ] 

Jun Rao commented on KAFKA-5246:


[~BigAndy], thanks for your interest. Just added you to Kafka contributor's 
list.

>  Remove backdoor that allows any client to produce to internal topics
> -
>
> Key: KAFKA-5246
> URL: https://issues.apache.org/jira/browse/KAFKA-5246
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.0.0, 0.10.0.1, 0.10.1.0, 0.10.1.1, 0.10.2.0, 
> 0.10.2.1
>Reporter: Andy Coates
>Priority: Minor
>
> kafka.admin.AdminUtils defines an 'AdminClientId' val, which looks to be 
> unused in the code, with the exception of a single use in KafkaApis.scala in 
> handleProducerRequest, where it looks to allow any client, using the special 
> '__admin_client' client id, to append to internal topics.
> This looks like a security risk to me, as it would allow any client to 
> produce either rogue offsets or even a record containing something other than 
> group/offset info.
> Can we remove this please?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3041: MINOR: Eliminate PID terminology from non test cod...

2017-05-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3041


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5244) Tests which delete singleton metrics break subsequent metrics tests

2017-05-15 Thread Rajini Sivaram (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16010892#comment-16010892
 ] 

Rajini Sivaram commented on KAFKA-5244:
---

[~ijuma] Yes, that makes sense. Thank you.

> Tests which delete singleton metrics break subsequent metrics tests
> ---
>
> Key: KAFKA-5244
> URL: https://issues.apache.org/jira/browse/KAFKA-5244
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
> Fix For: 0.11.0.0
>
>
> Static metrics like {{BrokerTopicStats.ReplicationBytesInPerSec}} are created 
> in a singleton, resulting in one metric being created in a JVM. Some tests 
> like {{MetricsDuringTopicCreationDeletionTest}} delete all metrics from the 
> static metrics registry. The singleton metrics don't get recreated and 
> subsequent tests relying on these metrics may fail.
> Singleton metrics make testing hard - we have no idea what metrics are being 
> tested. Not sure we want to change that though since there is a lot of code 
> that relies on this. But we have to fix tests to ensure that metrics are left 
> in a good state.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-4850) RocksDb cannot use Bloom Filters

2017-05-15 Thread Bharat Viswanadham (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated KAFKA-4850:
--
Status: Patch Available  (was: In Progress)

> RocksDb cannot use Bloom Filters
> 
>
> Key: KAFKA-4850
> URL: https://issues.apache.org/jira/browse/KAFKA-4850
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Eno Thereska
>Assignee: Bharat Viswanadham
> Fix For: 0.11.0.0
>
>
> Bloom Filters would speed up RocksDb lookups. However they currently do not 
> work in RocksDb 5.0.2. This has been fixed in trunk, but we'll have to wait 
> until that is released and tested. 
> Then we can add the line in RocksDbStore.java in openDb:
> tableConfig.setFilter(new BloomFilter(10));
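
For reference, enabling the filter once a RocksDB release with the fix is available would look roughly like this in the RocksDB Java API; this is a sketch, not the actual RocksDbStore change:

{code}
import org.rocksdb.BlockBasedTableConfig;
import org.rocksdb.BloomFilter;
import org.rocksdb.Options;

Options options = new Options();
BlockBasedTableConfig tableConfig = new BlockBasedTableConfig();
tableConfig.setFilter(new BloomFilter(10));   // ~10 bits per key
options.setTableFormatConfig(tableConfig);
// options is then passed to RocksDB.open(...) as usual.
{code}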



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-5203) Percentilles are calculated incorrectly

2017-05-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16010987#comment-16010987
 ] 

ASF GitHub Bot commented on KAFKA-5203:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3002


> Percentilles are calculated incorrectly
> ---
>
> Key: KAFKA-5203
> URL: https://issues.apache.org/jira/browse/KAFKA-5203
> Project: Kafka
>  Issue Type: Bug
>  Components: metrics
>Reporter: Ivan A. Melnikov
>Priority: Minor
> Fix For: 0.11.0.0
>
>
> After the samples are purged a couple of times, the calculated percentile 
> values tend to decrease compared to the expected values.
> Consider the following simple example (sorry, idk if I can make it shorter):
> {code}
> int buckets = 100;
> Metrics metrics = new Metrics(new 
> MetricConfig().eventWindow(buckets/2).samples(2));
> Sensor sensor = metrics.sensor("test");
> sensor.add(new Percentiles(4 * buckets, 100.0, 
> Percentiles.BucketSizing.CONSTANT,
> new Percentile(metrics.metricName("test.p50", "grp1"), 50),
> new Percentile(metrics.metricName("test.p75", "grp1"), 75)));
> Metric p50 = metrics.metrics().get(metrics.metricName("test.p50", 
> "grp1"));
> Metric p75 = metrics.metrics().get(metrics.metricName("test.p75", 
> "grp1"));
> for (int i = 0; i < buckets; i++) sensor.record(i);
> System.out.printf("p50=%.3f p75=%.3f\n", p50.value(), p75.value());
> for (int i = 0; i < buckets; i++) sensor.record(i);
> System.out.printf("p50=%.3f p75=%.3f\n", p50.value(), p75.value());
> for (int i = 0; i < buckets; i++) sensor.record(i);
> System.out.printf("p50=%.3f p75=%.3f\n", p50.value(), p75.value());
> {code}
> The output from this is:
> {noformat}
> p50=50.000 p75=74.490
> p50=24.490 p75=36.735
> p50=15.306 p75=24.490
> {noformat}
> The expected output would, of course, have all three lines similar to the first 
> one.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3059: KAFKA-5244: Refactor BrokerTopicStats and Controll...

2017-05-15 Thread ijuma
GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/3059

KAFKA-5244: Refactor BrokerTopicStats and ControllerStats so that they are 
classes

This removes the need to force object initialisation via hacks to register
the relevant metrics during start-up.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka 
kafka-5244-broker-static-stats-and-controller-stats-as-classes

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3059.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3059


commit 94340b85e4095cc8dad7ed18d9c50124d400d753
Author: Ismael Juma 
Date:   2017-05-05T13:06:08Z

Refactor BrokerTopicStats and ControllerStats so that they are classes

This removes the need to force object initialisation via hacks to register
the relevant metrics during start-up.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Build failed in Jenkins: kafka-trunk-jdk8 #1527

2017-05-15 Thread Apache Jenkins Server
See 


Changes:

[junrao] KAFKA-5203; Metrics: fix resetting of histogram sample

--
[...truncated 860.42 KB...]
kafka.server.KafkaConfigTest > testLogRollTimeMsProvided PASSED

kafka.server.KafkaConfigTest > testUncleanLeaderElectionDefault STARTED

kafka.server.KafkaConfigTest > testUncleanLeaderElectionDefault PASSED

kafka.server.KafkaConfigTest > testInvalidAdvertisedListenersProtocol STARTED

kafka.server.KafkaConfigTest > testInvalidAdvertisedListenersProtocol PASSED

kafka.server.KafkaConfigTest > testUncleanElectionEnabled STARTED

kafka.server.KafkaConfigTest > testUncleanElectionEnabled PASSED

kafka.server.KafkaConfigTest > testAdvertisePortDefault STARTED

kafka.server.KafkaConfigTest > testAdvertisePortDefault PASSED

kafka.server.KafkaConfigTest > testVersionConfiguration STARTED

kafka.server.KafkaConfigTest > testVersionConfiguration PASSED

kafka.server.KafkaConfigTest > testEqualAdvertisedListenersProtocol STARTED

kafka.server.KafkaConfigTest > testEqualAdvertisedListenersProtocol PASSED

kafka.server.IsrExpirationTest > testIsrExpirationForSlowFollowers STARTED

kafka.server.IsrExpirationTest > testIsrExpirationForSlowFollowers PASSED

kafka.server.IsrExpirationTest > testIsrExpirationForStuckFollowers STARTED

kafka.server.IsrExpirationTest > testIsrExpirationForStuckFollowers PASSED

kafka.server.IsrExpirationTest > testIsrExpirationIfNoFetchRequestMade STARTED

kafka.server.IsrExpirationTest > testIsrExpirationIfNoFetchRequestMade PASSED

kafka.server.ReplicationQuotasTest > 
shouldBootstrapTwoBrokersWithLeaderThrottle STARTED

kafka.server.ReplicationQuotasTest > 
shouldBootstrapTwoBrokersWithLeaderThrottle PASSED

kafka.server.ReplicationQuotasTest > shouldThrottleOldSegments STARTED

kafka.server.ReplicationQuotasTest > shouldThrottleOldSegments PASSED

kafka.server.ReplicationQuotasTest > 
shouldBootstrapTwoBrokersWithFollowerThrottle STARTED

kafka.server.ReplicationQuotasTest > 
shouldBootstrapTwoBrokersWithFollowerThrottle PASSED

kafka.server.ServerStartupTest > testBrokerStateRunningAfterZK STARTED

kafka.server.ServerStartupTest > testBrokerStateRunningAfterZK PASSED

kafka.server.ServerStartupTest > testBrokerCreatesZKChroot STARTED

kafka.server.ServerStartupTest > testBrokerCreatesZKChroot PASSED

kafka.server.ServerStartupTest > testConflictBrokerStartupWithSamePort STARTED

kafka.server.ServerStartupTest > testConflictBrokerStartupWithSamePort PASSED

kafka.server.ServerStartupTest > testConflictBrokerRegistration STARTED

kafka.server.ServerStartupTest > testConflictBrokerRegistration PASSED

kafka.server.ServerStartupTest > testBrokerSelfAware STARTED

kafka.server.ServerStartupTest > testBrokerSelfAware PASSED

kafka.server.ProduceRequestTest > testSimpleProduceRequest STARTED

kafka.server.ProduceRequestTest > testSimpleProduceRequest PASSED

kafka.server.ProduceRequestTest > testCorruptLz4ProduceRequest STARTED

kafka.server.ProduceRequestTest > testCorruptLz4ProduceRequest PASSED

kafka.server.ReplicaManagerTest > testHighWaterMarkDirectoryMapping STARTED

kafka.server.ReplicaManagerTest > testHighWaterMarkDirectoryMapping PASSED

kafka.server.ReplicaManagerTest > 
testFetchBeyondHighWatermarkReturnEmptyResponse STARTED

kafka.server.ReplicaManagerTest > 
testFetchBeyondHighWatermarkReturnEmptyResponse PASSED

kafka.server.ReplicaManagerTest > testIllegalRequiredAcks STARTED

kafka.server.ReplicaManagerTest > testIllegalRequiredAcks PASSED

kafka.server.ReplicaManagerTest > testClearPurgatoryOnBecomingFollower STARTED

kafka.server.ReplicaManagerTest > testClearPurgatoryOnBecomingFollower PASSED

kafka.server.ReplicaManagerTest > testHighwaterMarkRelativeDirectoryMapping 
STARTED

kafka.server.ReplicaManagerTest > testHighwaterMarkRelativeDirectoryMapping 
PASSED

kafka.server.ReplicaManagerTest > testReadCommittedFetchLimitedAtLSO STARTED

kafka.server.ReplicaManagerTest > testReadCommittedFetchLimitedAtLSO PASSED

kafka.server.KafkaMetricReporterClusterIdTest > testClusterIdPresent STARTED

kafka.server.KafkaMetricReporterClusterIdTest > testClusterIdPresent PASSED

kafka.server.CreateTopicsRequestWithPolicyTest > testValidCreateTopicsRequests 
STARTED

kafka.server.CreateTopicsRequestWithPolicyTest > testValidCreateTopicsRequests 
PASSED

kafka.server.CreateTopicsRequestWithPolicyTest > testErrorCreateTopicsRequests 
STARTED

kafka.server.CreateTopicsRequestWithPolicyTest > testErrorCreateTopicsRequests 
PASSED

kafka.server.OffsetCommitTest > testUpdateOffsets STARTED

kafka.server.OffsetCommitTest > testUpdateOffsets PASSED

kafka.server.OffsetCommitTest > testLargeMetadataPayload STARTED

kafka.server.OffsetCommitTest > testLargeMetadataPayload PASSED

kafka.server.OffsetCommitTest > testOffsetsDeleteAfterTopicDeletion STARTED

kafka.server.OffsetCommitTest > testOffsetsDeleteAfterTopicDeletion PASSED

kafka.server.OffsetCommitTest > 

Re: Can someone assign, or give me permission to assign myself to a Jira please?

2017-05-15 Thread Guozhang Wang
Andy,

I can see you now in the contributor list. Please feel free to assign JIRAs
to yourself now.


Guozhang

On Mon, May 15, 2017 at 10:49 AM, Andrew Coates 
wrote:

> Hi,
>
> I have a patch for KAFKA-5246 <https://issues.apache.org/jira/browse/KAFKA-5246>, but don’t yet have permissions to assign the
> Jira to myself. Would someone mind either assigning it, or giving me
> permissions to assign it to myself?
>
> Thanks,
>
> Andy




-- 
-- Guozhang


[jira] [Updated] (KAFKA-5247) Consumer GroupCoordinator should continue to materialize committed offsets in offset order even for transactional offset commits

2017-05-15 Thread Apurva Mehta (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apurva Mehta updated KAFKA-5247:

Issue Type: Sub-task  (was: Bug)
Parent: KAFKA-4815

> Consumer GroupCoordinator should continue to materialize committed offsets in 
> offset order even for transactional offset commits
> 
>
> Key: KAFKA-5247
> URL: https://issues.apache.org/jira/browse/KAFKA-5247
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Apurva Mehta
>  Labels: exactly-once
> Fix For: 0.11.0.0
>
>
> In the TxnOffsetCommit patch, we thought it was ok for the group coordinator 
> to use "transaction order" semantics when updating the cache, but we weren't 
> thinking about the log cleaner.
> The log cleaner uses offset order when cleaning which means that the key with 
> the largest offset always wins. So if we use transaction order when 
> dynamically updating the cache, we will get different results from when we're 
> loading the cache (even if the loading logic also uses transaction order).
> The fix should be straightforward: we need to remember the offset in the 
> offsets topic of the offset that we cache. Then we only update it if the new 
> entry has a higher offset.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-5231) TransactinoCoordinator does not bump epoch when aborting open transactions

2017-05-15 Thread Apurva Mehta (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apurva Mehta updated KAFKA-5231:

Issue Type: Sub-task  (was: Bug)
Parent: KAFKA-4815

> TransactinoCoordinator does not bump epoch when aborting open transactions
> --
>
> Key: KAFKA-5231
> URL: https://issues.apache.org/jira/browse/KAFKA-5231
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Apurva Mehta
>Assignee: Guozhang Wang
>Priority: Blocker
>  Labels: exactly-once
> Fix For: 0.11.0.0
>
>
> When the TransactionCoordinator receives an InitPidRequest when there is an 
> open transaction for a transactional id, it should first bump the epoch and 
> then abort the open transaction.
> Currently, it aborts the open transaction with the existing epoch, hence the 
> old producer is never fenced.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3057: KAFKA-5182: Reduce session timeout in request quot...

2017-05-15 Thread rajinisivaram
GitHub user rajinisivaram opened a pull request:

https://github.com/apache/kafka/pull/3057

KAFKA-5182: Reduce session timeout in request quota test

Reduce session timeouts for join requests to trigger throttling in the 
request quota test.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rajinisivaram/kafka KAFKA-5182-quotatest

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3057.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3057


commit 7885027108e9841490181b58b759d5a61cba84fc
Author: Rajini Sivaram 
Date:   2017-05-15T17:10:56Z

KAFKA-5182: Reduce session timeout in request quota test




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-5246) Remove backdoor that allows any client to produce to internal topics

2017-05-15 Thread Andy Coates (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16010948#comment-16010948
 ] 

Andy Coates commented on KAFKA-5246:


I've got a patch waiting for this if someone can assign this Jira to me and 
give me whatever permissions are needed to allow me to raise a PR on GitHub.

Thanks,

Andy

>  Remove backdoor that allows any client to produce to internal topics
> -
>
> Key: KAFKA-5246
> URL: https://issues.apache.org/jira/browse/KAFKA-5246
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.0.0, 0.10.0.1, 0.10.1.0, 0.10.1.1, 0.10.2.0, 
> 0.10.2.1
>Reporter: Andy Coates
>Priority: Minor
>
> kafka.admin.AdminUtils defines an 'AdminClientId' val, which looks to be 
> unused in the code, with the exception of a single use in KafkaApis.scala in 
> handleProducerRequest, where it looks to allow any client, using the special 
> '__admin_client' client id, to append to internal topics.
> This looks like a security risk to me, as it would allow any client to 
> produce either rogue offsets or even a record containing something other than 
> group/offset info.
> Can we remove this please?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #3002: KAFKA-5203: Metrics: fix resetting of histogram sa...

2017-05-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3002


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (KAFKA-5203) Percentilles are calculated incorrectly

2017-05-15 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao resolved KAFKA-5203.

   Resolution: Fixed
Fix Version/s: 0.11.0.0

Issue resolved by pull request 3002
[https://github.com/apache/kafka/pull/3002]

> Percentilles are calculated incorrectly
> ---
>
> Key: KAFKA-5203
> URL: https://issues.apache.org/jira/browse/KAFKA-5203
> Project: Kafka
>  Issue Type: Bug
>  Components: metrics
>Reporter: Ivan A. Melnikov
>Priority: Minor
> Fix For: 0.11.0.0
>
>
> After the samples are purged a couple of times, the calculated percentile 
> values tend to decrease compared to the expected values.
> Consider the following simple example (sorry, idk if I can make it shorter):
> {code}
> int buckets = 100;
> Metrics metrics = new Metrics(new 
> MetricConfig().eventWindow(buckets/2).samples(2));
> Sensor sensor = metrics.sensor("test");
> sensor.add(new Percentiles(4 * buckets, 100.0, 
> Percentiles.BucketSizing.CONSTANT,
> new Percentile(metrics.metricName("test.p50", "grp1"), 50),
> new Percentile(metrics.metricName("test.p75", "grp1"), 75)));
> Metric p50 = metrics.metrics().get(metrics.metricName("test.p50", 
> "grp1"));
> Metric p75 = metrics.metrics().get(metrics.metricName("test.p75", 
> "grp1"));
> for (int i = 0; i < buckets; i++) sensor.record(i);
> System.out.printf("p50=%.3f p75=%.3f\n", p50.value(), p75.value());
> for (int i = 0; i < buckets; i++) sensor.record(i);
> System.out.printf("p50=%.3f p75=%.3f\n", p50.value(), p75.value());
> for (int i = 0; i < buckets; i++) sensor.record(i);
> System.out.printf("p50=%.3f p75=%.3f\n", p50.value(), p75.value());
> {code}
> The output from this is:
> {noformat}
> p50=50.000 p75=74.490
> p50=24.490 p75=36.735
> p50=15.306 p75=24.490
> {noformat}
> The expected output would, of course, have all three lines similar to the first 
> one.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [DISCUSS] KIP-148: Add a connect timeout for client

2017-05-15 Thread Guozhang Wang
Hi David,

I may be a bit confused before, just clarifying a few things:

1. As you mentioned, a client will always try to first establish the
connection with a broker node before it tries to send any request to it.
And after the connection is established, it will either continuously send many
requests (e.g. produce) or just a single request (e.g. metadata) to the
broker, so these two phases are indeed different.

2. In the connected phase, connections.max.idle.ms is used to
auto-disconnect the socket if no requests have been sent or received during
that period of time. In the connecting phase, we always try to create the
socket via "socketChannel.connect" in a non-blocking call and then check
whether the connection has been established; but all the callers of this
function (in either the producer or the consumer) have a timeout parameter,
as in `selector.poll()`, and that timeout is set either from calculations
based on metadata.expiration.time and backoff for producer#sender, or from
the value passed directly to consumer#poll(timeout). So although there is no
config directly controlling this, users can still bound the maximum wait
from their code.

I originally thought your scenario was more about the connected phase, but now
I feel you are talking about the connecting phase. For that case, I still
feel that the timeout value passed to `selector.poll()`, which is
controllable from user code, should be sufficient?
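
(Illustrative sketch only: with the 0.10.x consumer the bound is simply the
timeout the application passes to poll, e.g.

    consumer.subscribe(Collections.singletonList("my-topic"));
    ConsumerRecords<String, String> records = consumer.poll(500); // block at most 500 ms

assuming `consumer` is an already-configured KafkaConsumer<String, String>, so
a single connecting attempt will not block the caller longer than that.)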


Guozhang




On Sun, May 14, 2017 at 2:37 AM, 东方甲乙 <254479...@qq.com> wrote:

> Hi Guozhang,
>
>
> Sorry for the delay, and thanks for the question. They seem like two different
> parameters to me:
> connect.timeout.ms: only applies to the connecting phase; once the connected
> phase is reached this parameter is not used.
> connections.max.idle.ms: currently does not apply in the connecting phase
> (a connection is only added to the expired-connection manager once select
> returns readyKeys > 0); after connecting it checks periodically whether the
> connection is still alive.
>
>
> Even if we change connections.max.idle.ms to also cover the connecting
> phase, we cannot set this parameter to a small value such as 5 seconds,
> because a client that is merely busy sending messages to another node would
> then be disconnected after 5 seconds; that is why the default value of
> connections.max.idle.ms is set to a larger time. We should have two
> parameters, one controlling the connecting-phase behavior and one the
> connected-phase behavior, don't you think?
>
>
> Thanks,
>
>
> David
>
>
>
>
> -- Original Message --
> From: "Guozhang Wang";;
> Sent: Saturday, May 6, 2017, 7:52 AM
> To: "dev@kafka.apache.org";
>
> Subject: Re: [DISCUSS] KIP-148: Add a connect timeout for client
>
>
>
> Hello David,
>
> Thanks for the KIP. For the described issue, I'm wondering if it can be
> resolved by tuning the CONNECTIONS_MAX_IDLE_MS_CONFIG (
> connections.max.idle.ms) on the client side? Default is 9 minutes.
>
>
> Guozhang
>
> On Tue, May 2, 2017 at 8:22 AM, 东方甲乙 <254479...@qq.com> wrote:
>
> > Hi all,
> >
> > Currently in our test environment, we found that after one of the broker
> > nodes crashes (reboot or OS crash), the client may still be connecting to
> > the crashed node to send metadata or other requests, and it can take
> > several minutes for it to notice that the connection has timed out and to
> > try another node for the request. So the client may still not be aware of
> > the metadata change after several minutes.
> >
> >
> > So I want to add a connect timeout on the  client,  please take a look
> at:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 148%3A+Add+a+connect+timeout+for+client
> >
> > Regards,
> >
> > David
>
>
>
>
> --
> -- Guozhang
>



-- 
-- Guozhang


Can someone assign, or give me permission to assign myself to a Jira please?

2017-05-15 Thread Andrew Coates
Hi,

I have a patch for KAFKA-5246 
<https://issues.apache.org/jira/browse/KAFKA-5246>, but don’t yet have 
permissions to assign the Jira to myself. Would someone mind either assigning 
it, or giving me permissions to assign it to myself?

Thanks,

Andy

[jira] [Commented] (KAFKA-5242) add max_number _of_retries to exponential backoff strategy

2017-05-15 Thread Matthias J. Sax (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16011025#comment-16011025
 ] 

Matthias J. Sax commented on KAFKA-5242:


Both should work. From a Streams perspective, it makes no difference whether you 
run multiple threads within one {{KafkaStreams}} instance, multiple 
{{KafkaStreams}} instances within one JVM, or multiple JVMs (each with one 
{{KafkaStreams}} instance) on the same host. This should all work, including 
any combination... How many threads do you use? If you run a single thread 
with a single {{KafkaStreams}} instance, the issue cannot occur, as you need at 
least two threads running -- this would explain why the lock issue is exposed 
by your accidentally starting multiple instances. Anyway, as mentioned above, 
please upgrade to {{0.10.2.1}} -- we fixed a couple of lock issues there.
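
For illustration (configuration values made up, `builder` as in the earlier comment), scaling within one JVM can be done either way:

{code}
import java.util.Properties;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsConfig;

Properties props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "clio");
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");

// Option 1: one KafkaStreams instance running several stream threads.
props.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, 2);
new KafkaStreams(builder, new StreamsConfig(props)).start();

// Option 2: several KafkaStreams instances (in the same JVM or in separate
// JVMs), each created the same way; they coordinate over the shared state
// directory via the locking discussed above.
{code}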





> add max_number _of_retries to exponential backoff strategy
> --
>
> Key: KAFKA-5242
> URL: https://issues.apache.org/jira/browse/KAFKA-5242
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Lukas Gemela
>Priority: Minor
> Attachments: clio_170511.log
>
>
> From time to time, during rebalance we are getting a lot of exceptions saying 
> {code}
> org.apache.kafka.streams.errors.LockException: task [0_0] Failed to lock the 
> state directory: /app/db/clio/0_0
>   at 
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.(ProcessorStateManager.java:102)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.AbstractTask.(AbstractTask.java:73)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamTask.(StreamTask.java:108)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:834)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:1207)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:1180)
>  [kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:937)
>  [kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.access$500(StreamThread.java:69)
>  [kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread$1.onPartitionsAssigned(StreamThread.java:236)
>  [kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:255)
>  [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:339)
>  [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:303)
>  [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:286)
>  [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1030)
>  [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:995) 
> [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:582)
>  [kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:368)
>  [kafka-streams-0.10.2.0.jar!/:?]
> {code}
> (see attached logfile)
> It was actually a problem on our side - we ran startStreams() twice and 
> therefore had two threads touching the same folder structure. 
> But what I've noticed is that the backoff strategy in 
> StreamThread$AbstractTaskCreator.retryWithBackoff can run endlessly - after 
> 20 iterations it takes 6 hours until the next attempt to start a task. 
> I've noticed the latest code contains a check for rebalanceTimeoutMs, but that 
> still does not solve the problem, especially when 
> MAX_POLL_INTERVAL_MS_CONFIG is set to Integer.MAX_INT; at this stage Kafka 
> Streams just hangs indefinitely.
> I would personally make that backoff strategy a bit more configurable, with a 
> number of retries such that if it exceeds the configured value, the exception 
> is propagated like any other exception to the custom client exception handler.
> (I can provide a patch)



--
This message was sent by 

[GitHub] kafka pull request #3058: KAFKA-5248: Remove unused/unneeded retention time ...

2017-05-15 Thread hachikuji
GitHub user hachikuji opened a pull request:

https://github.com/apache/kafka/pull/3058

KAFKA-5248: Remove unused/unneeded retention time in TxnOffsetCommitRequest



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hachikuji/kafka KAFKA-5248

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3058.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3058


commit 9290f30b60d1264697494aaacda2d485ee6b237c
Author: Jason Gustafson 
Date:   2017-05-15T18:13:11Z

KAFKA-5248: Remove unused/unneeded retention time in TxnOffsetCommitRequest




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (KAFKA-5249) Transaction index recovery does not snapshot properly

2017-05-15 Thread Jason Gustafson (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson updated KAFKA-5249:
---
Issue Type: Sub-task  (was: Bug)
Parent: KAFKA-4815

> Transaction index recovery does not snapshot properly
> -
>
> Key: KAFKA-5249
> URL: https://issues.apache.org/jira/browse/KAFKA-5249
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>
> When recovering the transaction index, we should take snapshots of the 
> producer state after recovering each segment. Currently, the snapshot offset 
> is not updated correctly so we will reread the segment multiple times. 
> Additionally, it appears that we do not remove snapshots with offsets higher 
> than the log end offset in all cases upon truncation.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Build failed in Jenkins: kafka-trunk-jdk7 #2196

2017-05-15 Thread Apache Jenkins Server
See 


Changes:

[junrao] KAFKA-5203; Metrics: fix resetting of histogram sample

--
[...truncated 1.66 MB...]
org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldNotAllowOffsetResetSourceWithDuplicateSourceName STARTED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldNotAllowOffsetResetSourceWithDuplicateSourceName PASSED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldNotAllowNullProcessorSupplier STARTED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldNotAllowNullProcessorSupplier PASSED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldNotSetApplicationIdToNull STARTED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldNotSetApplicationIdToNull PASSED

org.apache.kafka.streams.processor.TopologyBuilderTest > testSourceTopics 
STARTED

org.apache.kafka.streams.processor.TopologyBuilderTest > testSourceTopics PASSED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldNotAllowNullNameWhenAddingSink STARTED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldNotAllowNullNameWhenAddingSink PASSED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
testNamedTopicMatchesAlreadyProvidedPattern STARTED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
testNamedTopicMatchesAlreadyProvidedPattern PASSED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldAddInternalTopicConfigWithCompactAndDeleteSetForWindowStores STARTED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldAddInternalTopicConfigWithCompactAndDeleteSetForWindowStores PASSED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldAddInternalTopicConfigWithCompactForNonWindowStores STARTED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldAddInternalTopicConfigWithCompactForNonWindowStores PASSED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldAddTimestampExtractorWithOffsetResetAndPatternPerSource STARTED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldAddTimestampExtractorWithOffsetResetAndPatternPerSource PASSED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
testAddSinkWithSameName STARTED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
testAddSinkWithSameName PASSED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
testAddSinkWithSelfParent STARTED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
testAddSinkWithSelfParent PASSED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
testAddProcessorWithSelfParent STARTED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
testAddProcessorWithSelfParent PASSED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldAssociateStateStoreNameWhenStateStoreSupplierIsInternal STARTED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldAssociateStateStoreNameWhenStateStoreSupplierIsInternal PASSED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
testAddStateStoreWithSink STARTED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
testAddStateStoreWithSink PASSED

org.apache.kafka.streams.processor.TopologyBuilderTest > testTopicGroups STARTED

org.apache.kafka.streams.processor.TopologyBuilderTest > testTopicGroups PASSED

org.apache.kafka.streams.processor.TopologyBuilderTest > testBuild STARTED

org.apache.kafka.streams.processor.TopologyBuilderTest > testBuild PASSED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldNotAllowOffsetResetSourceWithoutTopics STARTED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldNotAllowOffsetResetSourceWithoutTopics PASSED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldNotAddNullStateStoreSupplier STARTED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldNotAddNullStateStoreSupplier PASSED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldNotAllowNullNameWhenAddingSource STARTED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldNotAllowNullNameWhenAddingSource PASSED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldNotAllowNullTopicWhenAddingSink STARTED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldNotAllowNullTopicWhenAddingSink PASSED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldNotAllowToAddGlobalStoreWithSourceNameEqualsProcessorName STARTED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldNotAllowToAddGlobalStoreWithSourceNameEqualsProcessorName PASSED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldAddSourceWithOffsetReset STARTED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
shouldAddSourceWithOffsetReset PASSED

org.apache.kafka.streams.processor.TopologyBuilderTest > 
testAddStateStoreWithSource STARTED


Re: [DISCUSS]: KIP-149: Enabling key access in ValueTransformer, ValueMapper, and ValueJoiner

2017-05-15 Thread Damian Guy
I'd rather not see the use of `ProcessorContext` spread any further than
it currently is. So maybe we need another KIP that is done before this one?
Otherwise I think the scope of this KIP is becoming too large.


On Mon, 15 May 2017 at 18:06 Matthias J. Sax  wrote:

> I agree that that `ProcessorContext` interface is too broad in general
> -- this is even true for transform/process, and it's also reflected in
> the API improvement list we want to do.
>
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Streams+Discussions
>
> So I am wondering, if you question the `RichFunction` approach in
> general? Or if you suggest to either extend the scope of this KIP to
> include this---or maybe better, do another KIP for it and delay this KIP
> until the other one is done?
>
>
> -Matthias
>
> On 5/15/17 2:35 AM, Damian Guy wrote:
> > Thanks for the KIP.
> >
> > I'm not convinced on the `RichFunction` approach. Do we really want to
> give
> > every DSL method access to the `ProcessorContext` ? It has a bunch of
> > methods on it that seem in-appropriate for some of the DSL methods, i.e,
> > `register`, `getStateStore`, `forward`, `schedule` etc. It is far too
> > broad. I think it would be better to have a narrower interface like the
> > `RecordContext`  - remembering it is easier to add methods/interfaces
> later
> > than to remove them
> >
> > On Sat, 13 May 2017 at 22:26 Matthias J. Sax 
> wrote:
> >
> >> Jeyhun,
> >>
> >> I am not an expert on Lambdas. Can you elaborate a little bit? I cannot
> >> follow the explanation in the KIP to see what the problem is.
> >>
> >> For updating the KIP title I don't know -- guess it's up to you. Maybe a
> >> committer can comment on this?
> >>
> >>
> >> Further comments:
> >>
> >>  - The KIP gets a little hard to read -- can you maybe reformat the wiki
> >> page a little bit? I think using `CodeBlock` would help.
> >>
> >>  - What about KStream-KTable joins? You don't have overloads added for
> >> them. Why? (Even if I still hope that we don't need to add any new
> >> overloads)
> >>
> >>  - Why do we need `AbstractRichFunction`?
> >>
> >>  - What about interfaces Initializer, ForeachAction, Merger, Predicate,
> >> Reducer? I don't want to say we should/need to add to all, but we should
> >> discuss all of them and add where it does make sense (e.g.,
> >> RichForachAction does make sense IMHO)
> >>
> >>
> >> Btw: I like the hierarchy `ValueXX` -- `ValueXXWithKey` -- `RichValueXX`
> >> in general -- but why can't we do all this with interfaces only?
> >>
> >>
> >>
> >> -Matthias
> >>
> >>
> >>
> >> On 5/11/17 7:00 AM, Jeyhun Karimov wrote:
> >>> Hi,
> >>>
> >>> Thanks for your comments. I think we cannot extend the two interfaces
> if
> >> we
> >>> want to keep lambdas. I updated the KIP [1]. Maybe I should change the
> >>> title, because now we are not limiting the KIP to only ValueMapper,
> >>> ValueTransformer and ValueJoiner.
> >>> Please feel free to comment.
> >>>
> >>> [1]
> >>>
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-149%3A+Enabling+key+access+in+ValueTransformer%2C+ValueMapper%2C+and+ValueJoiner
> >>>
> >>>
> >>> Cheers,
> >>> Jeyhun
> >>>
> >>> On Tue, May 9, 2017 at 7:36 PM Matthias J. Sax 
> >>> wrote:
> >>>
>  If `ValueMapperWithKey` extends `ValueMapper` we don't need the new
>  overload.
> 
>  And yes, we need to do one check -- but this happens when building the
>  topology. At runtime (I mean after KafkaStream#start() we don't need
> any
>  check as we will always use `ValueMapperWithKey`)
> 
> 
>  -Matthias
> 
> 
>  On 5/9/17 2:55 AM, Jeyhun Karimov wrote:
> > Hi,
> >
> > Thanks for feedback.
> > Then we need to overload method
> >    <VR> KStream<K, VR> mapValues(ValueMapper<? super V, ? extends VR> mapper);
> > with
> >    <VR> KStream<K, VR> mapValues(ValueMapperWithKey<? super K, ? super V, ? extends VR> mapper);
> >
> > and in runtime (inside processor) we still have to check it is
>  ValueMapper
> > or ValueMapperWithKey before wrapping it into the rich function.
> >
> >
> > Please correct me if I am wrong.
> >
> > Cheers,
> > Jeyhun
> >
> >
> >
> >
> > On Tue, May 9, 2017 at 10:56 AM Michal Borowiecki <
> > michal.borowie...@openbet.com> wrote:
> >
> >> +1 :)
> >>
> >>
> >> On 08/05/17 23:52, Matthias J. Sax wrote:
> >>> Hi,
> >>>
> >>> I was reading the updated KIP and I am wondering, if we should do
> the
> >>> design a little different.
> >>>
> >>> Instead of distinguishing between a RichFunction and
> non-RichFunction
>  at
> >>> runtime level, we would use RichFunctions all the time. Thus, on
> the
>  DSL
> >>> entry level, if a user provides a non-RichFunction, we wrap it by a
> >>> RichFunction that is fully implemented by Streams. This
> RichFunction
> >>> would just forward the 
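
For context, a sketch of the interface shapes under discussion in KIP-149
(proposed at the time of this thread, not committed API):

    public interface ValueMapper<V, VR> {
        VR apply(V value);
    }

    public interface ValueMapperWithKey<K, V, VR> {
        VR apply(K readOnlyKey, V value);   // key exposed for read-only access
    }

The "rich" variants discussed above would additionally expose an init/close
lifecycle and access to context such as the ProcessorContext.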

[jira] [Created] (KAFKA-5247) Consumer GroupCoordinator should continue to materialize committed offsets in offset order even for transactional offset commits

2017-05-15 Thread Apurva Mehta (JIRA)
Apurva Mehta created KAFKA-5247:
---

 Summary: Consumer GroupCoordinator should continue to materialize 
committed offsets in offset order even for transactional offset commits
 Key: KAFKA-5247
 URL: https://issues.apache.org/jira/browse/KAFKA-5247
 Project: Kafka
  Issue Type: Bug
Reporter: Apurva Mehta
 Fix For: 0.11.0.0


In the TxnOffsetCommit patch, we thought it was ok for the group coordinator to 
use "transaction order" semantics when updating the cache, but we weren't 
thinking about the log cleaner.

The log cleaner uses offset order when cleaning which means that the key with 
the largest offset always wins. So if we use transaction order when dynamically 
updating the cache, we will get different results from when we're loading the 
cache (even if the loading logic also uses transaction order).

The fix should be straightforward: we need to remember the offset in the 
offsets topic of the offset that we cache. Then we only update it if the new 
entry has a higher offset.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2971: Kafka-5206: Remove use of aggSerde in RocksDBSessi...

2017-05-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2971


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Assigned] (KAFKA-5246) Remove backdoor that allows any client to produce to internal topics

2017-05-15 Thread Andy Coates (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Coates reassigned KAFKA-5246:
--

Assignee: Andy Coates

>  Remove backdoor that allows any client to produce to internal topics
> -
>
> Key: KAFKA-5246
> URL: https://issues.apache.org/jira/browse/KAFKA-5246
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.10.0.0, 0.10.0.1, 0.10.1.0, 0.10.1.1, 0.10.2.0, 
> 0.10.2.1
>Reporter: Andy Coates
>Assignee: Andy Coates
>Priority: Minor
>
> kafka.admin.AdminUtils defines an 'AdminClientId' val, which looks to be 
> unused in the code, with the exception of a single use in KafkaApis.scala in 
> handleProducerRequest, where it looks to allow any client, using the special 
> '__admin_client' client id, to append to internal topics.
> This looks like a security risk to me, as it would allow any client to 
> produce either rogue offsets or even a record containing something other than 
> group/offset info.
> Can we remove this, please?





[GitHub] kafka pull request #2963: Kafka-5205: Removed use of keySerde in CachingSess...

2017-05-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2963




[jira] [Commented] (KAFKA-5249) Transaction index recovery does not snapshot properly

2017-05-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16011162#comment-16011162
 ] 

ASF GitHub Bot commented on KAFKA-5249:
---

GitHub user hachikuji opened a pull request:

https://github.com/apache/kafka/pull/3060

KAFKA-5249: Fix incorrect producer snapshot offsets when recovering segments



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hachikuji/kafka KAFKA-5249

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3060.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3060


commit 5018fcdca321c322850f8d7bddfbc503a7dda8a2
Author: Jason Gustafson 
Date:   2017-05-15T19:19:42Z

KAFKA-5249: Fix incorrect producer snapshot offsets when recovering segments




> Transaction index recovery does not snapshot properly
> -
>
> Key: KAFKA-5249
> URL: https://issues.apache.org/jira/browse/KAFKA-5249
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>
> When recovering the transaction index, we should take snapshots of the 
> producer state after recovering each segment. Currently, the snapshot offset 
> is not updated correctly so we will reread the segment multiple times. 
> Additionally, it appears that we do not remove snapshots with offsets higher 
> than the log end offset in all cases upon truncation.





[jira] [Commented] (KAFKA-5244) Tests which delete singleton metrics break subsequent metrics tests

2017-05-15 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16010820#comment-16010820
 ] 

Ismael Juma commented on KAFKA-5244:


Good find [~rsivaram]. Note that in https://github.com/apache/kafka/pull/2983, 
I changed `BrokerTopicStats` to be a class. The underlying metrics are still a 
singleton, but it means that we would recreate them on start-up of KafkaServer. 
Maybe that fixes this issue?

> Tests which delete singleton metrics break subsequent metrics tests
> ---
>
> Key: KAFKA-5244
> URL: https://issues.apache.org/jira/browse/KAFKA-5244
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
> Fix For: 0.11.0.0
>
>
> Static metrics like {{BrokerTopicStats.ReplicationBytesInPerSec}} are created 
> in a singleton, resulting in one metric being created in a JVM. Some tests 
> like {{MetricsDuringTopicCreationDeletionTest}} delete all metrics from the 
> static metrics registry. The singleton metrics don't get recreated and 
> subsequent tests relying on these metrics may fail.
> Singleton metrics make testing hard - we have no idea what metrics are being 
> tested. Not sure we want to change that though since there is a lot of code 
> that relies on this. But we have to fix tests to ensure that metrics are left 
> in a good state.





[jira] [Created] (KAFKA-5246) Remove backdoor that allows any client to produce to internal topics

2017-05-15 Thread Andy Coates (JIRA)
Andy Coates created KAFKA-5246:
--

 Summary:  Remove backdoor that allows any client to produce to 
internal topics
 Key: KAFKA-5246
 URL: https://issues.apache.org/jira/browse/KAFKA-5246
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.10.2.1, 0.10.2.0, 0.10.1.1, 0.10.1.0, 0.10.0.1, 0.10.0.0
Reporter: Andy Coates
Priority: Minor


kafka.admin.AdminUtils defines an 'AdminClientId' val, which looks to be unused 
in the code, with the exception of a single use in KafkaApis.scala in 
handleProducerRequest, where it looks to allow any client, using the special 
'__admin_client' client id, to append to internal topics.

This looks like a security risk to me, as it would allow any client to produce 
either rogue offsets or even a record containing something other than 
group/offset info.

Can we remove this, please?





[jira] [Resolved] (KAFKA-5205) CachingSessionStore doesn't use the default keySerde.

2017-05-15 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang resolved KAFKA-5205.
--
   Resolution: Fixed
Fix Version/s: 0.10.2.2
   0.11.0.0

> CachingSessionStore doesn't use the default keySerde.
> -
>
> Key: KAFKA-5205
> URL: https://issues.apache.org/jira/browse/KAFKA-5205
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Kyle Winkelman
>Assignee: Kyle Winkelman
> Fix For: 0.11.0.0, 0.10.2.2
>
>
> CachingSessionStore wasn't properly using the default keySerde if no Serde 
> was supplied. I saw the below error in the logs for one of my test cases.
> ERROR stream-thread 
> [cogroup-integration-test-3-5570fe48-d2a3-4271-80b1-81962295553d-StreamThread-6]
>  Streams application error during processing: 
> (org.apache.kafka.streams.processor.internals.StreamThread:335)
> java.lang.NullPointerException
> at 
> org.apache.kafka.streams.state.internals.CachingSessionStore.findSessions(CachingSessionStore.java:93)
> at 
> org.apache.kafka.streams.kstream.internals.KStreamSessionWindowAggregate$KStreamSessionWindowAggregateProcessor.process(KStreamSessionWindowAggregate.java:94)
> at 
> org.apache.kafka.streams.processor.internals.ProcessorNode$1.run(ProcessorNode.java:47)
> at 
> org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:187)
> at 
> org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:133)
> at 
> org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:82)
> at 
> org.apache.kafka.streams.processor.internals.SourceNode.process(SourceNode.java:69)
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.process(StreamTask.java:206)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.processAndPunctuate(StreamThread.java:657)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:728)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:327)





[jira] [Created] (KAFKA-5244) Tests which delete singleton metrics break subsequent metrics tests

2017-05-15 Thread Rajini Sivaram (JIRA)
Rajini Sivaram created KAFKA-5244:
-

 Summary: Tests which delete singleton metrics break subsequent 
metrics tests
 Key: KAFKA-5244
 URL: https://issues.apache.org/jira/browse/KAFKA-5244
 Project: Kafka
  Issue Type: Bug
  Components: core
Reporter: Rajini Sivaram
Assignee: Rajini Sivaram
 Fix For: 0.11.0.0


Static metrics like {{BrokerTopicStats.ReplicationBytesInPerSec}} are created 
in a singleton, resulting in one metric being created in a JVM. Some tests like 
{{MetricsDuringTopicCreationDeletionTest}} delete all metrics from the static 
metrics registry. The singleton metrics don't get recreated and subsequent 
tests relying on these metrics may fail.

Singleton metrics make testing hard - we have no idea what metrics are being 
tested. Not sure we want to change that though since there is a lot of code 
that relies on this. But we have to fix tests to ensure that metrics are left 
in a good state.





[jira] [Commented] (KAFKA-3514) Stream timestamp computation needs some further thoughts

2017-05-15 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16010813#comment-16010813
 ] 

Guozhang Wang commented on KAFKA-3514:
--

Thanks [~mihbor]

> Stream timestamp computation needs some further thoughts
> 
>
> Key: KAFKA-3514
> URL: https://issues.apache.org/jira/browse/KAFKA-3514
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Guozhang Wang
>Assignee: Eno Thereska
>  Labels: architecture
> Fix For: 0.11.0.0
>
>
> Our current stream task's timestamp is used for punctuate function as well as 
> selecting which stream to process next (i.e. best effort stream 
> synchronization). And it is defined as the smallest timestamp over all 
> partitions in the task's partition group. This results in two unintuitive 
> corner cases:
> 1) observing a late arrived record would keep that stream's timestamp low for 
> a period of time, and hence keep being process until that late record. For 
> example take two partitions within the same task annotated by their 
> timestamps:
> {code}
> Stream A: 5, 6, 7, 8, 9, 1, 10
> {code}
> {code}
> Stream B: 2, 3, 4, 5
> {code}
> The late arrived record with timestamp "1" will cause stream A to be selected 
> continuously in the thread loop, i.e. messages with timestamp 5, 6, 7, 8, 9 
> until the record itself is dequeued and processed, then stream B will be 
> selected starting with timestamp 2.
> 2) an empty buffered partition will cause its timestamp to be not advanced, 
> and hence the task timestamp as well since it is the smallest among all 
> partitions. This may not be a severe problem compared with 1) above though.
> *Update*
> There is one more thing to consider (full discussion found here: 
> http://search-hadoop.com/m/Kafka/uyzND1iKZJN1yz0E5?subj=Order+of+punctuate+and+process+in+a+stream+processor)
> {quote}
> Let's assume the following case.
> - a stream processor that uses the Processor API
> - context.schedule(1000) is called in the init()
> - the processor reads only one topic that has one partition
> - using custom timestamp extractor, but that timestamp is just a wall 
> clock time
> Image the following events:
> 1., for 10 seconds I send in 5 messages / second
> 2., does not send any messages for 3 seconds
> 3., starts the 5 messages / second again
> I see that punctuate() is not called during the 3 seconds when I do not 
> send any messages. This is ok according to the documentation, because 
> there is not any new messages to trigger the punctuate() call. When the 
> first few messages arrives after a restart the sending (point 3. above) I 
> see the following sequence of method calls:
> 1., process() on the 1st message
> 2., punctuate() is called 3 times
> 3., process() on the 2nd message
> 4., process() on each following message
> What I would expect instead is that punctuate() is called first and then 
> process() is called on the messages, because the first message's timestamp 
> is already 3 seconds older then the last punctuate() was called, so the 
> first message belongs after the 3 punctuate() calls.
> {quote}
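
To make corner case 1) above concrete, here is a toy model of the "smallest 
timestamp wins" selection rule (an editorial sketch under that assumption, not 
the actual PartitionGroup implementation):

{code}
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Deque;
import java.util.List;

public class TimestampSelectionSketch {

    static final class Partition {
        final String name;
        final Deque<Long> buffered = new ArrayDeque<>();
        Partition(String name, long... timestamps) {
            this.name = name;
            for (long ts : timestamps) buffered.add(ts);
        }
        long timestamp() { // smallest timestamp among buffered records
            return buffered.stream().min(Long::compare).orElse(Long.MAX_VALUE);
        }
    }

    public static void main(String[] args) {
        Partition a = new Partition("A", 5, 6, 7, 8, 9, 1, 10);
        Partition b = new Partition("B", 2, 3, 4, 5);
        List<Partition> task = Arrays.asList(a, b);

        // The partition with the smallest timestamp is picked each time, so the
        // late record (timestamp 1) keeps stream A selected until it is dequeued,
        // and only then does stream B get processed starting with timestamp 2.
        while (task.stream().anyMatch(p -> !p.buffered.isEmpty())) {
            Partition next = task.stream()
                    .filter(p -> !p.buffered.isEmpty())
                    .min(Comparator.comparingLong(Partition::timestamp))
                    .get();
            System.out.println("process " + next.name + "@" + next.buffered.poll());
        }
    }
}
{code}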





[jira] [Commented] (KAFKA-3356) Remove ConsumerOffsetChecker, deprecated in 0.9, in 0.11

2017-05-15 Thread Onur Karaman (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16010823#comment-16010823
 ] 

Onur Karaman commented on KAFKA-3356:
-

I agree with [~jeffwidman].

> Remove ConsumerOffsetChecker, deprecated in 0.9, in 0.11
> 
>
> Key: KAFKA-3356
> URL: https://issues.apache.org/jira/browse/KAFKA-3356
> Project: Kafka
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 0.10.2.0
>Reporter: Ashish Singh
>Assignee: Mickael Maison
>Priority: Blocker
> Fix For: 0.11.0.0
>
>
> ConsumerOffsetChecker is marked deprecated as of 0.9, should be removed in 
> 0.11.





[jira] [Commented] (KAFKA-5244) Tests which delete singleton metrics break subsequent metrics tests

2017-05-15 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16010821#comment-16010821
 ] 

Ismael Juma commented on KAFKA-5244:


If you think the change makes sense, I could create a separate PR for it; it's 
the first commit in the PR:

https://github.com/apache/kafka/pull/2983/commits/9896bf99c66547fabad31ca53d89962d3a6f526a

> Tests which delete singleton metrics break subsequent metrics tests
> ---
>
> Key: KAFKA-5244
> URL: https://issues.apache.org/jira/browse/KAFKA-5244
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
> Fix For: 0.11.0.0
>
>
> Static metrics like {{BrokerTopicStats.ReplicationBytesInPerSec}} are created 
> in a singleton, resulting in one metric being created in a JVM. Some tests 
> like {{MetricsDuringTopicCreationDeletionTest}} delete all metrics from the 
> static metrics registry. The singleton metrics don't get recreated and 
> subsequent tests relying on these metrics may fail.
> Singleton metrics make testing hard - we have no idea what metrics are being 
> tested. Not sure we want to change that though since there is a lot of code 
> that relies on this. But we have to fix tests to ensure that metrics are left 
> in a good state.





Re: [DISCUSS]: KIP-149: Enabling key access in ValueTransformer, ValueMapper, and ValueJoiner

2017-05-15 Thread Jeyhun Karimov
Hi,

Sorry for super late response. Thanks for your comments.

I am not an expert on Lambdas. Can you elaborate a little bit? I cannot
> follow the explanation in the KIP to see what the problem is.


- [1] says "A functional interface is an interface that has just one
abstract method, and thus represents a single function contract".
So basically, once one interface extends another (in our case,
ValueMapperWithKey extending ValueMapper), the sub-interface ends up with two
abstract methods and can no longer be the target of a lambda.
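
A small illustration of that constraint, using simplified stand-ins for the KIP's
interfaces (the names and signatures here are assumptions for the example, not the
final API):

interface ValueMapper<V, VR> {           // single abstract method: lambda-friendly
    VR apply(V value);
}

// If the keyed variant extended ValueMapper, it would inherit apply(V) and add
// apply(K, V): two abstract methods, so it is no longer a functional interface
// and a lambda such as (k, v) -> ... cannot target it.
interface ValueMapperWithKeyExtending<K, V, VR> extends ValueMapper<V, VR> {
    VR apply(K key, V value);
}

// Keeping the two interfaces independent preserves the single-abstract-method
// property, so users can still write a lambda like (key, value) -> ...
interface ValueMapperWithKey<K, V, VR> {
    VR apply(K key, V value);
}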


Further comments:
>  - The KIP get a little hard to read -- can you maybe reformat the wiki
> page a little bit? I think using `CodeBlock` would help.


- I will work on the KIP.

 - What about KStream-KTable joins? You don't have overlaods added for
> them. Why? (Even if I still hope that we don't need to add any new
> overloads)


- Actually, there is more than one Processor and public API to be
changed (KStream-KTable joins is one case). However, all of them have a
similar structure: we overload the *method* with *methodWithKey*,
wrap it into the Rich function, send it to the processor, and inside the
processor call the *init* and *close* methods of the Rich function.
As I wrote in the KIP, I wanted to demonstrate the overall idea with only
*ValueMapper*, as the same approach can be applied to all the changes.
Anyway, I will update the KIP.

 - Why do we need `AbstractRichFunction`?


Instead of overriding the *init(ProcessorContext p)* and *close()* methods
in every Rich function with an empty body, like:

@Override
void init(ProcessorContext context) {}

@Override
void close () {}

I thought that we could override them once in *AbstractRichFunction* and
extend new Rich functions from *AbstractRichFunction*.
Basically this eliminates copy-paste and eases maintenance.
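
For illustration, a minimal sketch of that hierarchy (ProcessorContext is a
placeholder here, not the real org.apache.kafka.streams.processor.ProcessorContext,
and the class names are assumptions taken from this discussion):

interface ProcessorContext { }

interface RichFunction {
    void init(ProcessorContext context);
    void close();
}

// The empty lifecycle hooks live in one place...
abstract class AbstractRichFunction implements RichFunction {
    @Override
    public void init(ProcessorContext context) { }

    @Override
    public void close() { }
}

// ...so a concrete rich function only overrides what it actually needs.
class CountingRichMapper extends AbstractRichFunction {
    private long seen = 0;

    public Long apply(String key, Long value) {
        seen++;
        return value;
    }

    @Override
    public void close() {
        System.out.println("saw " + seen + " records");
    }
}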

 - What about interfaces Initializer, ForeachAction, Merger, Predicate,
> Reducer? I don't want to say we should/need to add to all, but we should
> discuss all of them and add where it does make sense (e.g.,
> RichForachAction does make sense IMHO)


Definitely agree. As I said, the same technique applies to all of these
interfaces; I didn't want to bloat the KIP, just wanted to give the
overall intuition.
However, I will update the KIP as I said.


Btw: I like the hierarchy `ValueXX` -- `ValueXXWithKey` -- `RichValueXX`
> in general -- but why can't we do all this with interfaces only?


Sure we can. However, the main point is that we should not force users to
implement the *init(ProcessorContext)* and *close()* methods every time they
use Rich functions.
If needed, one can override the respective methods. However, I am open
to discussion.


I'd rather not see the use of  `ProcessorContext` spread any further than
> it currently is. So maybe we need another KIP that is done before this?
> Otherwise i think the scope of this KIP is becoming too large.


That is a good point. I wanted to make the *init(ProcessorContext)* method
consistent across the library (which uses ProcessorContext as an input),
therefore I put *ProcessorContext* as an input.
So the important question (as @dguy and @mjsax mentioned) is whether to
continue this KIP without providing users access to *ProcessorContext*
(changing *init(ProcessorContext)* to *init()*) or to
initiate another KIP before this one.

[1]
http://cr.openjdk.java.net/~mr/se/8/java-se-8-pfd-spec/java-se-8-jls-pfd-diffs.pdf


Cheers,
Jeyhun

On Mon, May 15, 2017 at 7:15 PM, Damian Guy  wrote:

> I'd rather not see the use of  `ProcessorContext` spread any further than
> it currently is. So maybe we need another KIP that is done before this?
> Otherwise i think the scope of this KIP is becoming too large.
>
>
> On Mon, 15 May 2017 at 18:06 Matthias J. Sax 
> wrote:
>
> > I agree that that `ProcessorContext` interface is too broad in general
> > -- this is even true for transform/process, and it's also reflected in
> > the API improvement list we want to do.
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/
> Kafka+Streams+Discussions
> >
> > So I am wondering, if you question the `RichFunction` approach in
> > general? Or if you suggest to either extend the scope of this KIP to
> > include this---or maybe better, do another KIP for it and delay this KIP
> > until the other one is done?
> >
> >
> > -Matthias
> >
> > On 5/15/17 2:35 AM, Damian Guy wrote:
> > > Thanks for the KIP.
> > >
> > > I'm not convinced on the `RichFunction` approach. Do we really want to
> > give
> > > every DSL method access to the `ProcessorContext` ? It has a bunch of
> > > methods on it that seem in-appropriate for some of the DSL methods,
> i.e,
> > > `register`, `getStateStore`, `forward`, `schedule` etc. It is far too
> > > broad. I think it would be better to have a narrower interface like the
> > > `RecordContext`  - remembering it is easier to add methods/interfaces
> > later
> > > than to remove them
> > >
> > > On Sat, 13 May 2017 at 22:26 Matthias J. Sax 
> > wrote:
> > >
> > >> 

[GitHub] kafka pull request #3056: HOTFIX: AddOffsetsToTxnResponse using incorrect sc...

2017-05-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3056




[GitHub] kafka pull request #3060: KAFKA-5249: Fix incorrect producer snapshot offset...

2017-05-15 Thread hachikuji
GitHub user hachikuji opened a pull request:

https://github.com/apache/kafka/pull/3060

KAFKA-5249: Fix incorrect producer snapshot offsets when recovering segments



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hachikuji/kafka KAFKA-5249

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3060.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3060


commit 5018fcdca321c322850f8d7bddfbc503a7dda8a2
Author: Jason Gustafson 
Date:   2017-05-15T19:19:42Z

KAFKA-5249: Fix incorrect producer snapshot offsets when recovering segments






[jira] [Comment Edited] (KAFKA-5242) add max_number _of_retries to exponential backoff strategy

2017-05-15 Thread Lukas Gemela (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16010974#comment-16010974
 ] 

Lukas Gemela edited comment on KAFKA-5242 at 5/15/17 5:50 PM:
--

[~mjsax] by multiple instances do you mean multiple JVMs (nodes) running, or 
multiple instances running within the same JVM process? 

What happened was that by accident we created two instances running within a 
single JVM process, touching the same data on the hard drive:
new KafkaStreams(builder, streamsConfig).start(); 

We started to experience this issue when we introduced this bug into our code.

If this is actually a possible way to run Kafka Streams, then there is 
definitely a bug in the locking mechanism. I've attached log files for this 
situation (clio_170511), unfortunately only with the log level set to INFO.

Regarding the backoff strategy, you can do something similar to how it's done in 
the Akka library (cap it with a maximal duration): 
http://doc.akka.io/japi/akka/2.4/akka/pattern/Backoff.html 

Thanks!

L.



was (Author: lukas gemela):
[~mjsax] by multiple instances you mean multiple JVMs (nodes) running or 
multiple instances running within the same jvm process? 

What happened was that by accident we created two instances running within the 
single JVM process, touching the same data on hard drive:
new KafkaStreams(builder, streamsConfig).start(); 

We started to experience this issue when we introduced this bug into our code

If this is possible way how to run kafka streams then there is definitely a bug 
in locking mechanism. I've attached logfiles for this situation (clio_170511), 
unfortunately only with debug level set to INFO.

ad backoff strategy,  you can do something similar like how it's done in akka 
lib (cap it with maximal duration): 
http://doc.akka.io/japi/akka/2.4/akka/pattern/Backoff.html 

Thanks!

L.


> add max_number _of_retries to exponential backoff strategy
> --
>
> Key: KAFKA-5242
> URL: https://issues.apache.org/jira/browse/KAFKA-5242
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Lukas Gemela
>Priority: Minor
> Attachments: clio_170511.log
>
>
> From time to time, during rebalance we are getting a lot of exceptions saying 
> {code}
> org.apache.kafka.streams.errors.LockException: task [0_0] Failed to lock the 
> state directory: /app/db/clio/0_0
>   at 
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.(ProcessorStateManager.java:102)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.AbstractTask.(AbstractTask.java:73)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamTask.(StreamTask.java:108)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:834)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:1207)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:1180)
>  [kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:937)
>  [kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.access$500(StreamThread.java:69)
>  [kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread$1.onPartitionsAssigned(StreamThread.java:236)
>  [kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:255)
>  [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:339)
>  [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:303)
>  [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:286)
>  [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1030)
>  [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:995) 
> [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:582)
>  [kafka-streams-0.10.2.0.jar!/:?]
>   at 
> 

[jira] [Comment Edited] (KAFKA-5242) add max_number _of_retries to exponential backoff strategy

2017-05-15 Thread Lukas Gemela (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16010974#comment-16010974
 ] 

Lukas Gemela edited comment on KAFKA-5242 at 5/15/17 5:50 PM:
--

[~mjsax] by multiple instances you mean multiple JVMs (nodes) running or 
multiple instances running within the same jvm process? 

What happened was that by accident we created two instances running within the 
single JVM process, touching the same data on hard drive:
new KafkaStreams(builder, streamsConfig).start(); 

We started to experience this issue when we introduced this bug into our code

If this is possible way how to run kafka streams then there is definitely a bug 
in locking mechanism. I've attached logfiles for this situation (clio_170511), 
unfortunately only with debug level set to INFO.

ad backoff strategy,  you can do something similar like how it's done in akka 
lib (cap it with maximal duration): 
http://doc.akka.io/japi/akka/2.4/akka/pattern/Backoff.html 

Thanks!

L.



was (Author: lukas gemela):
[~mjsax] by multiple instances you mean multiple JVMs (nodes) running or 
multiple instances running within the same jvm process? 

What happened was that by accident we created two instances running within the 
single JVM process, touching the same data on hard drive:
new KafkaStreams(builder, streamsConfig).start(); 

If this is possible way how to run kafka streams then there is definitely a bug 
in locking mechanism. I've attached logfiles for this situation (clio_170511), 
unfortunately only with debug level set to INFO.

ad backoff strategy,  you can do something similar like how it's done in akka 
lib (cap it with maximal duration): 
http://doc.akka.io/japi/akka/2.4/akka/pattern/Backoff.html 

Thanks!

L.


> add max_number _of_retries to exponential backoff strategy
> --
>
> Key: KAFKA-5242
> URL: https://issues.apache.org/jira/browse/KAFKA-5242
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Lukas Gemela
>Priority: Minor
> Attachments: clio_170511.log
>
>
> From time to time, during rebalance we are getting a lot of exceptions saying 
> {code}
> org.apache.kafka.streams.errors.LockException: task [0_0] Failed to lock the 
> state directory: /app/db/clio/0_0
>   at 
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.(ProcessorStateManager.java:102)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.AbstractTask.(AbstractTask.java:73)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamTask.(StreamTask.java:108)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:834)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:1207)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:1180)
>  [kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:937)
>  [kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.access$500(StreamThread.java:69)
>  [kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread$1.onPartitionsAssigned(StreamThread.java:236)
>  [kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:255)
>  [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:339)
>  [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:303)
>  [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:286)
>  [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1030)
>  [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:995) 
> [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:582)
>  [kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:368)
>  

[jira] [Commented] (KAFKA-5244) Tests which delete singleton metrics break subsequent metrics tests

2017-05-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16011053#comment-16011053
 ] 

ASF GitHub Bot commented on KAFKA-5244:
---

GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/3059

KAFKA-5244: Refactor BrokerTopicStats and ControllerStats so that they are 
classes

This removes the need to force object initialisation via hacks to register
the relevant metrics during start-up.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka 
kafka-5244-broker-static-stats-and-controller-stats-as-classes

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3059.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3059


commit 94340b85e4095cc8dad7ed18d9c50124d400d753
Author: Ismael Juma 
Date:   2017-05-05T13:06:08Z

Refactor BrokerTopicStats and ControllerStats so that they are classes

This removes the need to force object initialisation via hacks to register
the relevant metrics during start-up.




> Tests which delete singleton metrics break subsequent metrics tests
> ---
>
> Key: KAFKA-5244
> URL: https://issues.apache.org/jira/browse/KAFKA-5244
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
> Fix For: 0.11.0.0
>
>
> Static metrics like {{BrokerTopicStats.ReplicationBytesInPerSec}} are created 
> in a singleton, resulting in one metric being created in a JVM. Some tests 
> like {{MetricsDuringTopicCreationDeletionTest}} delete all metrics from the 
> static metrics registry. The singleton metrics don't get recreated and 
> subsequent tests relying on these metrics may fail.
> Singleton metrics make testing hard - we have no idea what metrics are being 
> tested. Not sure we want to change that though since there is a lot of code 
> that relies on this. But we have to fix tests to ensure that metrics are left 
> in a good state.





[jira] [Commented] (KAFKA-3356) Remove ConsumerOffsetChecker, deprecated in 0.9, in 0.11

2017-05-15 Thread Jeff Widman (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16010801#comment-16010801
 ] 

Jeff Widman commented on KAFKA-3356:


Personally, I don't think we need feature parity for the old-consumers since 
those are in the process of being deprecated/removed.

> Remove ConsumerOffsetChecker, deprecated in 0.9, in 0.11
> 
>
> Key: KAFKA-3356
> URL: https://issues.apache.org/jira/browse/KAFKA-3356
> Project: Kafka
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 0.10.2.0
>Reporter: Ashish Singh
>Assignee: Mickael Maison
>Priority: Blocker
> Fix For: 0.11.0.0
>
>
> ConsumerOffsetChecker is marked deprecated as of 0.9, should be removed in 
> 0.11.





Re: [DISCUSS]: KIP-149: Enabling key access in ValueTransformer, ValueMapper, and ValueJoiner

2017-05-15 Thread Matthias J. Sax
I agree that that `ProcessorContext` interface is too broad in general
-- this is even true for transform/process, and it's also reflected in
the API improvement list we want to do.

https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Streams+Discussions

So I am wondering, if you question the `RichFunction` approach in
general? Or if you suggest to either extend the scope of this KIP to
include this---or maybe better, do another KIP for it and delay this KIP
until the other one is done?


-Matthias

On 5/15/17 2:35 AM, Damian Guy wrote:
> Thanks for the KIP.
> 
> I'm not convinced on the `RichFunction` approach. Do we really want to give
> every DSL method access to the `ProcessorContext` ? It has a bunch of
> methods on it that seem in-appropriate for some of the DSL methods, i.e,
> `register`, `getStateStore`, `forward`, `schedule` etc. It is far too
> broad. I think it would be better to have a narrower interface like the
> `RecordContext`  - remembering it is easier to add methods/interfaces later
> than to remove them
> 
> On Sat, 13 May 2017 at 22:26 Matthias J. Sax  wrote:
> 
>> Jeyhun,
>>
>> I am not an expert on Lambdas. Can you elaborate a little bit? I cannot
>> follow the explanation in the KIP to see what the problem is.
>>
>> For updating the KIP title I don't know -- guess it's up to you. Maybe a
>> committer can comment on this?
>>
>>
>> Further comments:
>>
>>  - The KIP get a little hard to read -- can you maybe reformat the wiki
>> page a little bit? I think using `CodeBlock` would help.
>>
>>  - What about KStream-KTable joins? You don't have overlaods added for
>> them. Why? (Even if I still hope that we don't need to add any new
>> overloads)
>>
>>  - Why do we need `AbstractRichFunction`?
>>
>>  - What about interfaces Initializer, ForeachAction, Merger, Predicate,
>> Reducer? I don't want to say we should/need to add to all, but we should
>> discuss all of them and add where it does make sense (e.g.,
>> RichForachAction does make sense IMHO)
>>
>>
>> Btw: I like the hierarchy `ValueXX` -- `ValueXXWithKey` -- `RichValueXX`
>> in general -- but why can't we do all this with interfaces only?
>>
>>
>>
>> -Matthias
>>
>>
>>
>> On 5/11/17 7:00 AM, Jeyhun Karimov wrote:
>>> Hi,
>>>
>>> Thanks for your comments. I think we cannot extend the two interfaces if
>> we
>>> want to keep lambdas. I updated the KIP [1]. Maybe I should change the
>>> title, because now we are not limiting the KIP to only ValueMapper,
>>> ValueTransformer and ValueJoiner.
>>> Please feel free to comment.
>>>
>>> [1]
>>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-149%3A+Enabling+key+access+in+ValueTransformer%2C+ValueMapper%2C+and+ValueJoiner
>>>
>>>
>>> Cheers,
>>> Jeyhun
>>>
>>> On Tue, May 9, 2017 at 7:36 PM Matthias J. Sax 
>>> wrote:
>>>
 If `ValueMapperWithKey` extends `ValueMapper` we don't need the new
 overlaod.

 And yes, we need to do one check -- but this happens when building the
 topology. At runtime (I mean after KafkaStream#start() we don't need any
 check as we will always use `ValueMapperWithKey`)


 -Matthias


 On 5/9/17 2:55 AM, Jeyhun Karimov wrote:
> Hi,
>
> Thanks for feedback.
> Then we need to overload method
>KStream mapValues(ValueMapper
> mapper);
> with
>KStream mapValues(ValueMapperWithKey>>> VR>
> mapper);
>
> and in runtime (inside processor) we still have to check it is
 ValueMapper
> or ValueMapperWithKey before wrapping it into the rich function.
>
>
> Please correct me if I am wrong.
>
> Cheers,
> Jeyhun
>
>
>
>
> On Tue, May 9, 2017 at 10:56 AM Michal Borowiecki <
> michal.borowie...@openbet.com> wrote:
>
>> +1 :)
>>
>>
>> On 08/05/17 23:52, Matthias J. Sax wrote:
>>> Hi,
>>>
>>> I was reading the updated KIP and I am wondering, if we should do the
>>> design a little different.
>>>
>>> Instead of distinguishing between a RichFunction and non-RichFunction
 at
>>> runtime level, we would use RichFunctions all the time. Thus, on the
 DSL
>>> entry level, if a user provides a non-RichFunction, we wrap it by a
>>> RichFunction that is fully implemented by Streams. This RichFunction
>>> would just forward the call omitting the key:
>>>
>>> Just to sketch the idea (incomplete code snippet):
>>>
 public StreamsRichValueMapper implements RichValueMapper() {
private ValueMapper userProvidedMapper; // set by constructor

public VR apply(final K key, final V1 value1, final V2 value2) {
return userProvidedMapper(value1, value2);
}
 }
>>>
>>>  From a performance point of view, I am not sure if the
>>> "if(isRichFunction)" including casts etc would have more overhead
>> than
>>> this approach 
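
(The quoted snippet above is garbled and cut off by the archive; a compilable
rendering of the same wrapping idea, with simplified names and a single-value
mapper assumed for brevity, might look like this:)

interface ValueMapper<V, VR> {
    VR apply(V value);
}

interface RichValueMapper<K, V, VR> {
    VR apply(K key, V value);
}

// Wraps a user-provided, key-agnostic mapper so the runtime can treat every
// function as "rich" and simply drop the key when forwarding the call.
final class StreamsRichValueMapper<K, V, VR> implements RichValueMapper<K, V, VR> {
    private final ValueMapper<V, VR> userProvidedMapper; // set by constructor

    StreamsRichValueMapper(ValueMapper<V, VR> userProvidedMapper) {
        this.userProvidedMapper = userProvidedMapper;
    }

    @Override
    public VR apply(K key, V value) {
        return userProvidedMapper.apply(value); // key is intentionally ignored
    }
}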

[jira] [Commented] (KAFKA-5242) add max_number _of_retries to exponential backoff strategy

2017-05-15 Thread Matthias J. Sax (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16010936#comment-16010936
 ] 

Matthias J. Sax commented on KAFKA-5242:


We fixed a couple of state directory lock issues in {{0.10.2.1}} -- thus I am 
wondering if it is already fixed there?

About the retry logic: the fact that you created multiple {{KafkaStreams}} 
instances and started both should not have any influence on the behavior and 
should not cause the issue (if it hangs forever, it's a bug in lock management 
and should be independent of starting one or two instances -- maybe multiple 
instances expose the bug with higher probability, but it should not be the root 
cause). You should only get more parallel running instances, and load should be 
redistributed over more threads. We do retry infinitely atm, as we know that the 
lock will be released eventually (as long as there is no bug).

We try to keep the number of config values as small as possible. Thus, I am 
wondering if it might be sufficient to hard-code a max retry time or count? We 
might also put an upper bound on the back-off time -- the current strategy seems 
to be too aggressive. WDYT?
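
A sketch of what such a bounded strategy could look like (an illustration of the
idea only, not the actual StreamThread code; the method and parameter names are
assumptions):

{code}
import java.util.function.Supplier;

final class BoundedBackoff {

    // Retries the action with exponentially growing sleeps, capped at
    // maxBackoffMs, and gives up after maxRetries attempts by rethrowing.
    static <T> T retryWithBackoff(Supplier<T> action,
                                  long initialBackoffMs,
                                  long maxBackoffMs,
                                  int maxRetries) throws InterruptedException {
        long backoffMs = initialBackoffMs;
        for (int attempt = 1; ; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                if (attempt >= maxRetries) {
                    throw e; // let the caller's exception handler decide
                }
                Thread.sleep(backoffMs);
                backoffMs = Math.min(backoffMs * 2, maxBackoffMs); // bounded growth
            }
        }
    }
}
{code}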

> add max_number _of_retries to exponential backoff strategy
> --
>
> Key: KAFKA-5242
> URL: https://issues.apache.org/jira/browse/KAFKA-5242
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Lukas Gemela
>Priority: Minor
>
> From time to time, during rebalance we are getting a lot of exceptions saying 
> {code}
> org.apache.kafka.streams.errors.LockException: task [0_0] Failed to lock the 
> state directory: /app/db/clio/0_0
>   at 
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.(ProcessorStateManager.java:102)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.AbstractTask.(AbstractTask.java:73)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamTask.(StreamTask.java:108)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:834)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:1207)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:1180)
>  [kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:937)
>  [kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.access$500(StreamThread.java:69)
>  [kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread$1.onPartitionsAssigned(StreamThread.java:236)
>  [kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:255)
>  [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:339)
>  [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:303)
>  [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:286)
>  [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1030)
>  [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:995) 
> [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:582)
>  [kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:368)
>  [kafka-streams-0.10.2.0.jar!/:?]
> {code}
> (see attached logfile)
> It was actually a problem on our side - we ran startStreams() twice and 
> therefore we had two threads touching the same folder structure. 
> But what I've noticed is that the backoff strategy in 
> StreamThread$AbstractTaskCreator.retryWithBackoff can run endlessly - after 
> 20 iterations it takes 6 hours until the next attempt to start a task. 
> I've noticed the latest code contains a check for rebalanceTimeoutMs, but that 
> still does not solve the problem, especially in case 
> MAX_POLL_INTERVAL_MS_CONFIG is set to Integer.MAX_VALUE; at this stage Kafka 
> Streams just hangs indefinitely.
> I 

[jira] [Commented] (KAFKA-5182) Transient failure: RequestQuotaTest.testResponseThrottleTime

2017-05-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16010950#comment-16010950
 ] 

ASF GitHub Bot commented on KAFKA-5182:
---

GitHub user rajinisivaram opened a pull request:

https://github.com/apache/kafka/pull/3057

KAFKA-5182: Reduce session timeout in request quota test

Reduce session timeouts for join requests to trigger throttling in the 
request quota test.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rajinisivaram/kafka KAFKA-5182-quotatest

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/3057.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3057


commit 7885027108e9841490181b58b759d5a61cba84fc
Author: Rajini Sivaram 
Date:   2017-05-15T17:10:56Z

KAFKA-5182: Reduce session timeout in request quota test




> Transient failure: RequestQuotaTest.testResponseThrottleTime
> 
>
> Key: KAFKA-5182
> URL: https://issues.apache.org/jira/browse/KAFKA-5182
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Xavier Léauté
>Assignee: Rajini Sivaram
> Fix For: 0.11.0.0
>
>
> Stacktrace
> {code}
> java.util.concurrent.ExecutionException: java.lang.AssertionError: Response 
> not throttled: Client JOIN_GROUP apiKey JOIN_GROUP requests 3 requestTime 
> 0.009982775502048156 throttleTime 0.0
>   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:206)
>   at 
> kafka.server.RequestQuotaTest$$anonfun$waitAndCheckResults$1.apply(RequestQuotaTest.scala:326)
>   at 
> kafka.server.RequestQuotaTest$$anonfun$waitAndCheckResults$1.apply(RequestQuotaTest.scala:324)
>   at scala.collection.immutable.List.foreach(List.scala:392)
>   at 
> scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
>   at scala.collection.mutable.ListBuffer.foreach(ListBuffer.scala:45)
>   at 
> kafka.server.RequestQuotaTest.waitAndCheckResults(RequestQuotaTest.scala:324)
>   at 
> kafka.server.RequestQuotaTest.testResponseThrottleTime(RequestQuotaTest.scala:105)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.runTestClass(JUnitTestClassExecuter.java:114)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.execute(JUnitTestClassExecuter.java:57)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassProcessor.processTestClass(JUnitTestClassProcessor.java:66)
>   at 
> org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
>   at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>   at 
> 

[jira] [Updated] (KAFKA-5242) add max_number _of_retries to exponential backoff strategy

2017-05-15 Thread Lukas Gemela (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukas Gemela updated KAFKA-5242:

Attachment: clio_170511.log

> add max_number _of_retries to exponential backoff strategy
> --
>
> Key: KAFKA-5242
> URL: https://issues.apache.org/jira/browse/KAFKA-5242
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Lukas Gemela
>Priority: Minor
> Attachments: clio_170511.log
>
>
> From time to time, during rebalance we are getting a lot of exceptions saying 
> {code}
> org.apache.kafka.streams.errors.LockException: task [0_0] Failed to lock the 
> state directory: /app/db/clio/0_0
>   at 
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.(ProcessorStateManager.java:102)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.AbstractTask.(AbstractTask.java:73)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamTask.(StreamTask.java:108)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:834)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:1207)
>  ~[kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:1180)
>  [kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:937)
>  [kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.access$500(StreamThread.java:69)
>  [kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread$1.onPartitionsAssigned(StreamThread.java:236)
>  [kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:255)
>  [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:339)
>  [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:303)
>  [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:286)
>  [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1030)
>  [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:995) 
> [kafka-clients-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:582)
>  [kafka-streams-0.10.2.0.jar!/:?]
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:368)
>  [kafka-streams-0.10.2.0.jar!/:?]
> {code}
> (see attached logfile)
> It was actually a problem on our side - we ran startStreams() twice and 
> therefore we had two threads touching the same folder structure. 
> But what I've noticed is that the backoff strategy in 
> StreamThread$AbstractTaskCreator.retryWithBackoff can run endlessly - after 
> 20 iterations it takes 6 hours until the next attempt to start a task. 
> I've noticed the latest code contains a check for rebalanceTimeoutMs, but that 
> still does not solve the problem, especially in case 
> MAX_POLL_INTERVAL_MS_CONFIG is set to Integer.MAX_VALUE; at this stage Kafka 
> Streams just hangs indefinitely.
> I would personally make that backoff strategy a bit more configurable, with a 
> number of retries such that, if it exceeds a configured value, it propagates the 
> exception, like any other exception, to the custom client exception handler.
> (I can provide a patch)





[jira] [Comment Edited] (KAFKA-5242) add max_number _of_retries to exponential backoff strategy

2017-05-15 Thread Lukas Gemela (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16011222#comment-16011222
 ] 

Lukas Gemela edited comment on KAFKA-5242 at 5/15/17 7:48 PM:
--

We used two stream threads: like I said, we have "num.stream.threads" = 1, but we 
called 

{code}
new KafkaStreams(builder, streamsConfig).start();
{code}

twice, which resulted in two running threads.

We will definitely update to 0.10.2.1. Anyway, this JIRA is not about the 
locking itself; if we reproduce the issue with 0.10.2.1 I will raise another JIRA 
ticket.


was (Author: lukas gemela):
We used two stream threads, like I said we have "num.stream.threads" = 1 but we 
called 

{code}
new KafkaStreams(builder, streamsConfig).start();
{code}

twice which results into two running threads.

We will definitely update to 0.10.2.1. Anyway, this jira is not about the 
locking itself, if we reproduce the issue with 0.10.2.1 I raise another jira 
ticket.

> add max_number _of_retries to exponential backoff strategy
> --
>
> Key: KAFKA-5242
> URL: https://issues.apache.org/jira/browse/KAFKA-5242
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Lukas Gemela
>Priority: Minor
> Attachments: clio_170511.log
>
>
> From time to time, during rebalance we are getting a lot of exceptions saying 
> {code}
> org.apache.kafka.streams.errors.LockException: task [0_0] Failed to lock the state directory: /app/db/clio/0_0
>   at org.apache.kafka.streams.processor.internals.ProcessorStateManager.<init>(ProcessorStateManager.java:102) ~[kafka-streams-0.10.2.0.jar!/:?]
>   at org.apache.kafka.streams.processor.internals.AbstractTask.<init>(AbstractTask.java:73) ~[kafka-streams-0.10.2.0.jar!/:?]
>   at org.apache.kafka.streams.processor.internals.StreamTask.<init>(StreamTask.java:108) ~[kafka-streams-0.10.2.0.jar!/:?]
>   at org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:834) ~[kafka-streams-0.10.2.0.jar!/:?]
>   at org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:1207) ~[kafka-streams-0.10.2.0.jar!/:?]
>   at org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:1180) [kafka-streams-0.10.2.0.jar!/:?]
>   at org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:937) [kafka-streams-0.10.2.0.jar!/:?]
>   at org.apache.kafka.streams.processor.internals.StreamThread.access$500(StreamThread.java:69) [kafka-streams-0.10.2.0.jar!/:?]
>   at org.apache.kafka.streams.processor.internals.StreamThread$1.onPartitionsAssigned(StreamThread.java:236) [kafka-streams-0.10.2.0.jar!/:?]
>   at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:255) [kafka-clients-0.10.2.0.jar!/:?]
>   at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:339) [kafka-clients-0.10.2.0.jar!/:?]
>   at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:303) [kafka-clients-0.10.2.0.jar!/:?]
>   at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:286) [kafka-clients-0.10.2.0.jar!/:?]
>   at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1030) [kafka-clients-0.10.2.0.jar!/:?]
>   at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:995) [kafka-clients-0.10.2.0.jar!/:?]
>   at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:582) [kafka-streams-0.10.2.0.jar!/:?]
>   at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:368) [kafka-streams-0.10.2.0.jar!/:?]
> {code}
> (see attached logfile)
> It was actually a problem on our side - we ran startStreams() twice and 
> therefore had two threads touching the same folder structure. 
> But what I've noticed is that the backoff strategy in 
> StreamThread$AbstractTaskCreator.retryWithBackoff can run endlessly - after 
> 20 iterations it takes 6 hours until the next attempt to start a task. 
> I've noticed the latest code contains a check for rebalanceTimeoutMs, but that 
> still does not solve the problem, especially when 
> MAX_POLL_INTERVAL_MS_CONFIG is set to Integer.MAX_VALUE. At that point Kafka 
> Streams just hangs indefinitely.
> I would personally make that backoff strategy a bit more configurable, with a 
> number of retries that, once it exceeds a configured value, propagates the 
> exception as 
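> For illustration, a sketch of what a bounded retry could look like (the class and method names, the 50 ms starting backoff, and the retry limit are assumptions for the example, not the actual StreamThread code):
> {code}
> import org.apache.kafka.streams.errors.LockException;
> import org.apache.kafka.streams.errors.StreamsException;
>
> public class BoundedBackoffSketch {
>
>     // Stand-in for the task-creation work that can fail with LockException.
>     interface TaskCreation {
>         void create() throws LockException;
>     }
>
>     // Retry with exponential backoff, but give up after maxRetries attempts
>     // and propagate the failure instead of doubling the wait forever.
>     static void retryWithBoundedBackoff(TaskCreation task, int maxRetries) {
>         long backoffMs = 50L; // illustrative starting backoff
>         for (int attempt = 1; ; attempt++) {
>             try {
>                 task.create();
>                 return;
>             } catch (LockException e) {
>                 if (attempt >= maxRetries) {
>                     // Surface the failure to the application's exception handler
>                     // instead of hanging indefinitely.
>                     throw new StreamsException("Task creation failed after " + maxRetries + " attempts", e);
>                 }
>                 try {
>                     Thread.sleep(backoffMs);
>                 } catch (InterruptedException ie) {
>                     Thread.currentThread().interrupt();
>                     throw new StreamsException("Interrupted while backing off", ie);
>                 }
>                 backoffMs *= 2; // exponential growth, now capped by maxRetries
>             }
>         }
>     }
> }
> {code}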

[jira] [Resolved] (KAFKA-5248) Remove retention time from TxnOffsetCommit RPC

2017-05-15 Thread Jason Gustafson (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson resolved KAFKA-5248.

Resolution: Fixed

Issue resolved by pull request 3058
[https://github.com/apache/kafka/pull/3058]

> Remove retention time from TxnOffsetCommit RPC
> --
>
> Key: KAFKA-5248
> URL: https://issues.apache.org/jira/browse/KAFKA-5248
> Project: Kafka
>  Issue Type: Sub-task
>  Components: clients, core, producer 
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
> Fix For: 0.11.0.0
>
>
> We added offset retention time because OffsetCommitRequest had it. However, 
> the new consumer has never exposed this and we have no plan of exposing it in 
> the producer, so we may as well remove it. If we need it later, we can bump 
> the protocol.





[GitHub] kafka pull request #3058: KAFKA-5248: Remove unused/unneeded retention time ...

2017-05-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3058




Jenkins build is back to normal : kafka-trunk-jdk7 #2197

2017-05-15 Thread Apache Jenkins Server
See 




[jira] [Updated] (KAFKA-5225) StreamsResetter doesn't allow custom Consumer properties

2017-05-15 Thread Bharat Viswanadham (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated KAFKA-5225:
--
Status: Patch Available  (was: In Progress)

> StreamsResetter doesn't allow custom Consumer properties
> 
>
> Key: KAFKA-5225
> URL: https://issues.apache.org/jira/browse/KAFKA-5225
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, tools
>Affects Versions: 0.10.2.1
>Reporter: Dustin Cote
>Assignee: Bharat Viswanadham
>
> The StreamsResetter doesn't let the user pass in any configurations to the 
> embedded consumer. This is a problem in secured environments because you 
> can't configure the embedded consumer to talk to the cluster. The tool should 
> take an approach similar to `kafka.admin.ConsumerGroupCommand` which allows a 
> config file to be passed in the command line for such operations.
> cc [~mjsax]
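> To illustrate the suggested approach, a hypothetical sketch (not the current tool code) that reads extra consumer properties from a file given on the command line and applies them to the embedded consumer, so security settings can be supplied in secured clusters:
> {code}
> import java.io.FileInputStream;
> import java.io.IOException;
> import java.util.Properties;
>
> import org.apache.kafka.clients.consumer.ConsumerConfig;
> import org.apache.kafka.clients.consumer.KafkaConsumer;
> import org.apache.kafka.common.serialization.ByteArrayDeserializer;
>
> // Hypothetical sketch of the proposal; the class name, defaults, and
> // command-line handling are made up for illustration.
> public class ResetToolConsumerConfigSketch {
>     public static void main(String[] args) throws IOException {
>         Properties consumerProps = new Properties();
>         if (args.length > 0) {
>             try (FileInputStream in = new FileInputStream(args[0])) {
>                 consumerProps.load(in); // e.g. security.protocol, SASL/SSL settings
>             }
>         }
>         // Values the tool would normally set itself; the file fills in the rest.
>         consumerProps.putIfAbsent(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
>         consumerProps.put(ConsumerConfig.GROUP_ID_CONFIG, "my-streams-app"); // the application.id
>         consumerProps.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, ByteArrayDeserializer.class.getName());
>         consumerProps.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, ByteArrayDeserializer.class.getName());
>
>         try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(consumerProps)) {
>             // ... the reset tool's topic listing / seek-to-beginning logic would go here ...
>         }
>     }
> }
> {code}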





[jira] [Created] (KAFKA-5251) Producer should drop queued sends when transaction is aborted

2017-05-15 Thread Jason Gustafson (JIRA)
Jason Gustafson created KAFKA-5251:
--

 Summary: Producer should drop queued sends when transaction is 
aborted
 Key: KAFKA-5251
 URL: https://issues.apache.org/jira/browse/KAFKA-5251
 Project: Kafka
  Issue Type: Sub-task
Reporter: Jason Gustafson
Assignee: Apurva Mehta


As an optimization, if a transaction is aborted, we can drop any records which 
have not yet been sent to the brokers. However, to avoid the sequence number 
getting out of sync, we need to continue sending any request which has been 
sent at least once.
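
A hypothetical sketch of that rule (the batch type and its fields are made up for illustration and are not the producer's internal classes):

{code}
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the abort-time decision described above; names do not
// correspond to the actual producer internals.
class AbortedTransactionSketch {

    static class QueuedBatch {
        final boolean sentAtLeastOnce; // if true, the broker may already have seen its sequence numbers
        QueuedBatch(boolean sentAtLeastOnce) { this.sentAtLeastOnce = sentAtLeastOnce; }
    }

    // On abort: drop batches that were never sent, but keep batches that have
    // been sent at least once so sequence numbers stay in sync with the broker.
    static void dropUnsentBatches(List<QueuedBatch> queue) {
        queue.removeIf(batch -> !batch.sentAtLeastOnce);
    }

    public static void main(String[] args) {
        List<QueuedBatch> queue = new ArrayList<>();
        queue.add(new QueuedBatch(true));   // in flight at least once: must still be completed
        queue.add(new QueuedBatch(false));  // never sent: safe to drop on abort
        dropUnsentBatches(queue);
        System.out.println("Batches still to send after abort: " + queue.size()); // prints 1
    }
}
{code}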





[jira] [Updated] (KAFKA-5249) Transaction index recovery does not snapshot properly

2017-05-15 Thread Sriram Subramanian (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sriram Subramanian updated KAFKA-5249:
--
Labels: exactly-once  (was: )

> Transaction index recovery does not snapshot properly
> -
>
> Key: KAFKA-5249
> URL: https://issues.apache.org/jira/browse/KAFKA-5249
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Jason Gustafson
>Assignee: Jason Gustafson
>  Labels: exactly-once
>
> When recovering the transaction index, we should take snapshots of the 
> producer state after recovering each segment. Currently, the snapshot offset 
> is not updated correctly so we will reread the segment multiple times. 
> Additionally, it appears that we do not remove snapshots with offsets higher 
> than the log end offset in all cases upon truncation.




