[jira] [Updated] (KAFKA-4942) Kafka Connect: Offset committing times out before expected

2017-03-22 Thread Stephane Maarek (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephane Maarek updated KAFKA-4942:
---
Affects Version/s: 0.10.2.0
 Priority: Major  (was: Critical)
  Component/s: KafkaConnect

> Kafka Connect: Offset committing times out before expected
> --
>
> Key: KAFKA-4942
> URL: https://issues.apache.org/jira/browse/KAFKA-4942
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.2.0
>Reporter: Stephane Maarek
>
> On Kafka 0.10.2.0
> I run a connector that deals with a lot of data in a Kafka Connect cluster
> When the offsets are getting committed, I get the following:
> {code}
> [2017-03-23 03:56:25,134] INFO WorkerSinkTask{id=MyConnector-1} Committing 
> offsets (org.apache.kafka.connect.runtime.WorkerSinkTask)
> [2017-03-23 03:56:25,135] WARN Commit of WorkerSinkTask{id=MyConnector-1} 
> offsets timed out (org.apache.kafka.connect.runtime.WorkerSinkTask)
> {code}
> If you look at the timestamps, they're 1 ms apart. My settings are the 
> following: 
> {code}
>   offset.flush.interval.ms = 120000
>   offset.flush.timeout.ms = 60000
>   offset.storage.topic = _connect_offsets
> {code}
> It seems the offset flush timeout setting is completely ignored, from the 
> look of the logs. I would expect the timeout message to happen 60 seconds 
> after the commit-offsets INFO message, not 1 millisecond later.
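A 1 ms gap between the INFO and the WARN is the classic signature of a timeout deadline computed from a stale start timestamp rather than from the moment the commit began. A minimal sketch of that failure pattern (hypothetical names, not Kafka's actual WorkerSinkTask code):

```python
def commit_offsets(flush_timeout_ms, commit_started_ms, now_ms):
    """Sketch: the WARN fires immediately when the deadline is derived
    from a stale start timestamp instead of the current commit's start."""
    deadline = commit_started_ms + flush_timeout_ms
    remaining = deadline - now_ms
    if remaining <= 0:
        return "WARN: offsets timed out"   # fires with no wait at all
    return "INFO: committed within {} ms".format(remaining)

# A commit whose recorded start is older than the timeout window
# times out instantly, matching the 1 ms INFO->WARN gap in the logs.
print(commit_offsets(60000, commit_started_ms=0, now_ms=120001))
```

With a fresh start timestamp the full 60-second window applies; with a stale one the deadline is already in the past and the WARN fires immediately.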



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-4942) Kafka Connect: Offset committing times out before expected

2017-03-22 Thread Stephane Maarek (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephane Maarek updated KAFKA-4942:
---
Description: 
On Kafka 0.10.2.0
I run a connector that deals with a lot of data in a Kafka Connect cluster

When the offsets are getting committed, I get the following:
{code}
[2017-03-23 03:56:25,134] INFO WorkerSinkTask{id=MyConnector-1} Committing 
offsets (org.apache.kafka.connect.runtime.WorkerSinkTask)
[2017-03-23 03:56:25,135] WARN Commit of WorkerSinkTask{id=MyConnector-1} 
offsets timed out (org.apache.kafka.connect.runtime.WorkerSinkTask)
{code}

If you look at the timestamps, they're 1 ms apart. My settings are the 
following: 
offset.flush.interval.ms = 120000
offset.flush.timeout.ms = 60000
offset.storage.topic = _connect_offsets

It seems the offset flush timeout setting is completely ignored, from the look 
of the logs. I would expect the timeout message to happen 60 seconds after the 
commit-offsets INFO message, not 1 millisecond later.

  was:
I run a connector that deals with a lot of data in a Kafka Connect cluster

When the offsets are getting committed, I get the following:
[2017-03-23 03:56:25,134] INFO WorkerSinkTask{id=MyConnector-1} Committing 
offsets (org.apache.kafka.connect.runtime.WorkerSinkTask)
[2017-03-23 03:56:25,135] WARN Commit of WorkerSinkTask{id=MyConnector-1} 
offsets timed out (org.apache.kafka.connect.runtime.WorkerSinkTask)

If you look at the timestamps, they're 1 ms apart. My settings are the 
following: 
offset.flush.interval.ms = 120000
offset.flush.timeout.ms = 60000
offset.storage.topic = _connect_offsets

It seems the offset flush timeout setting is completely ignored, from the look 
of the logs. I would expect the timeout message to happen 60 seconds after the 
commit-offsets INFO message, not 1 millisecond later.


> Kafka Connect: Offset committing times out before expected
> --
>
> Key: KAFKA-4942
> URL: https://issues.apache.org/jira/browse/KAFKA-4942
> Project: Kafka
>  Issue Type: Bug
>Reporter: Stephane Maarek
>Priority: Critical
>
> On Kafka 0.10.2.0
> I run a connector that deals with a lot of data in a Kafka Connect cluster
> When the offsets are getting committed, I get the following:
> {code}
> [2017-03-23 03:56:25,134] INFO WorkerSinkTask{id=MyConnector-1} Committing 
> offsets (org.apache.kafka.connect.runtime.WorkerSinkTask)
> [2017-03-23 03:56:25,135] WARN Commit of WorkerSinkTask{id=MyConnector-1} 
> offsets timed out (org.apache.kafka.connect.runtime.WorkerSinkTask)
> {code}
> If you look at the timestamps, they're 1 ms apart. My settings are the 
> following: 
>   offset.flush.interval.ms = 120000
>   offset.flush.timeout.ms = 60000
>   offset.storage.topic = _connect_offsets
> It seems the offset flush timeout setting is completely ignored, from the 
> look of the logs. I would expect the timeout message to happen 60 seconds 
> after the commit-offsets INFO message, not 1 millisecond later.





[jira] [Updated] (KAFKA-4942) Kafka Connect: Offset committing times out before expected

2017-03-22 Thread Stephane Maarek (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephane Maarek updated KAFKA-4942:
---
Description: 
On Kafka 0.10.2.0
I run a connector that deals with a lot of data in a Kafka Connect cluster

When the offsets are getting committed, I get the following:
{code}
[2017-03-23 03:56:25,134] INFO WorkerSinkTask{id=MyConnector-1} Committing 
offsets (org.apache.kafka.connect.runtime.WorkerSinkTask)
[2017-03-23 03:56:25,135] WARN Commit of WorkerSinkTask{id=MyConnector-1} 
offsets timed out (org.apache.kafka.connect.runtime.WorkerSinkTask)
{code}

If you look at the timestamps, they're 1 ms apart. My settings are the 
following: 
{code}
offset.flush.interval.ms = 120000
offset.flush.timeout.ms = 60000
offset.storage.topic = _connect_offsets
{code}

It seems the offset flush timeout setting is completely ignored, from the look 
of the logs. I would expect the timeout message to happen 60 seconds after the 
commit-offsets INFO message, not 1 millisecond later.

  was:
On Kafka 0.10.2.0
I run a connector that deals with a lot of data in a Kafka Connect cluster

When the offsets are getting committed, I get the following:
{code}
[2017-03-23 03:56:25,134] INFO WorkerSinkTask{id=MyConnector-1} Committing 
offsets (org.apache.kafka.connect.runtime.WorkerSinkTask)
[2017-03-23 03:56:25,135] WARN Commit of WorkerSinkTask{id=MyConnector-1} 
offsets timed out (org.apache.kafka.connect.runtime.WorkerSinkTask)
{code}

If you look at the timestamps, they're 1 ms apart. My settings are the 
following: 
offset.flush.interval.ms = 120000
offset.flush.timeout.ms = 60000
offset.storage.topic = _connect_offsets

It seems the offset flush timeout setting is completely ignored, from the look 
of the logs. I would expect the timeout message to happen 60 seconds after the 
commit-offsets INFO message, not 1 millisecond later.


> Kafka Connect: Offset committing times out before expected
> --
>
> Key: KAFKA-4942
> URL: https://issues.apache.org/jira/browse/KAFKA-4942
> Project: Kafka
>  Issue Type: Bug
>Reporter: Stephane Maarek
>Priority: Critical
>
> On Kafka 0.10.2.0
> I run a connector that deals with a lot of data in a Kafka Connect cluster
> When the offsets are getting committed, I get the following:
> {code}
> [2017-03-23 03:56:25,134] INFO WorkerSinkTask{id=MyConnector-1} Committing 
> offsets (org.apache.kafka.connect.runtime.WorkerSinkTask)
> [2017-03-23 03:56:25,135] WARN Commit of WorkerSinkTask{id=MyConnector-1} 
> offsets timed out (org.apache.kafka.connect.runtime.WorkerSinkTask)
> {code}
> If you look at the timestamps, they're 1 ms apart. My settings are the 
> following: 
> {code}
>   offset.flush.interval.ms = 120000
>   offset.flush.timeout.ms = 60000
>   offset.storage.topic = _connect_offsets
> {code}
> It seems the offset flush timeout setting is completely ignored, from the 
> look of the logs. I would expect the timeout message to happen 60 seconds 
> after the commit-offsets INFO message, not 1 millisecond later.





[jira] [Created] (KAFKA-4942) Kafka Connect: Offset committing times out before expected

2017-03-22 Thread Stephane Maarek (JIRA)
Stephane Maarek created KAFKA-4942:
--

 Summary: Kafka Connect: Offset committing times out before expected
 Key: KAFKA-4942
 URL: https://issues.apache.org/jira/browse/KAFKA-4942
 Project: Kafka
  Issue Type: Bug
Reporter: Stephane Maarek
Priority: Critical


I run a connector that deals with a lot of data in a Kafka Connect cluster

When the offsets are getting committed, I get the following:
[2017-03-23 03:56:25,134] INFO WorkerSinkTask{id=MyConnector-1} Committing 
offsets (org.apache.kafka.connect.runtime.WorkerSinkTask)
[2017-03-23 03:56:25,135] WARN Commit of WorkerSinkTask{id=MyConnector-1} 
offsets timed out (org.apache.kafka.connect.runtime.WorkerSinkTask)

If you look at the timestamps, they're 1 ms apart. My settings are the 
following: 
offset.flush.interval.ms = 120000
offset.flush.timeout.ms = 60000
offset.storage.topic = _connect_offsets

It seems the offset flush timeout setting is completely ignored, from the look 
of the logs. I would expect the timeout message to happen 60 seconds after the 
commit-offsets INFO message, not 1 millisecond later.





Re: [DISCUSS] KIP-120: Cleanup Kafka Streams builder API

2017-03-22 Thread Jay Kreps
I don't feel strongly on this, so I'm happy with whatever everyone else
wants.

Michael, I'm not arguing that people don't need to understand topologies, I
just think it is like rocks db, you need to understand it when
debugging/operating but not in the initial coding since the metaphor we're
providing at this layer isn't a topology of processors but rather something
like the collections api. Anyhow it won't hurt people to have it there.

For the original KStreamBuilder thing, I think that came from the naming we
discussed originally:

   1. "kstreams" - the DSL
   2. "processor api" - the lower level callback/topology api
   3. KStream/KTable - entities in the kstreams dsl
   4. "Kafka Streams" - General name for stream processing stuff in Kafka,
   including both kstreams and the processor API plus the underlying
   implementation.

I think referring to the DSL as "kstreams" is cute and mnemonic and not
particularly confusing. Just like referring to the "java collections
library" isn't confusing even though it contains the Iterator interface
which is not actually itself a collection.

So I think KStreamBuilder should technically have been KStreamsBuilder and
is intended not to be a builder of a KStream but rather the builder for the
kstreams DSL. Okay, yes, that *is* slightly confusing. :-)

-Jay

On Wed, Mar 22, 2017 at 11:25 AM, Guozhang Wang  wrote:

> Regarding the naming of `StreamsTopologyBuilder` v.s. `StreamsBuilder` that
> are going to be used in the DSL, I agree both have their arguments:
>
> 1. On one side, people using the DSL layer probably do not need to be aware
> of (or rather, "learn about") the "topology" concept, although this concept
> is a publicly exposed one in Kafka Streams.
>
> 2. On the other side, StreamsBuilder#build() returning a Topology object
> sounds a little weird, at least to me (admittedly subjective matter).
>
>
> Since the second bullet point seems to be more "subjective" and many people
> are not worried about it, I'm OK to go with the other option.
>
>
> Guozhang
>
>
> On Wed, Mar 22, 2017 at 8:58 AM, Michael Noll 
> wrote:
>
> > Forwarding to kafka-user.
> >
> >
> > -- Forwarded message --
> > From: Michael Noll 
> > Date: Wed, Mar 22, 2017 at 8:48 AM
> > Subject: Re: [DISCUSS] KIP-120: Cleanup Kafka Streams builder API
> > To: dev@kafka.apache.org
> >
> >
> > Matthias,
> >
> > > @Michael:
> > >
> > > You seemed to agree with Jay about not exposing the `Topology` concept
> > > in our main entry class (ie, current KStreamBuilder), thus, I
> > > interpreted that you do not want `Topology` in the name either (I am a
> > > little surprised by your last response, that goes the opposite
> > direction).
> >
> > Oh, sorry for not being clear.
> >
> > What I wanted to say in my earlier email was the following:  Yes, I do
> > agree with most of Jay's reasoning, notably about carefully deciding how
> > much and which parts of the API/concept "surface" we expose to users of
> the
> > DSL.  However, and this is perhaps where I wasn't very clear, I disagree
> on
> > the particular opinion about not exposing the topology concept to DSL
> > users.  Instead, I think the concept of a topology is important to
> > understand even for DSL users -- particularly because of the way the DSL
> is
> > currently wiring your processing logic via the builder pattern.  (As I
> > noted, e.g. Akka uses a different approach where you might be able to get
> > away with not exposing the "topology" concept, but even in Akka there's
> the
> > notion of graphs and flows.)
> >
> >
> > > > StreamsBuilder builder = new StreamsBuilder();
> > > >
> > > > // And here you'd define your...well, what actually?
> > > > // Ah right, you are composing a topology here, though you are not
> > > > aware of it.
> > >
> > > Yes. You are not aware of it -- that's the whole point about it -- don't
> > > put the Topology concept in the focus...
> >
> > Let me turn this around, because that was my point: it's confusing to
> have
> > a name "StreamsBuilder" if that thing isn't building streams, and it is
> > not.
> >
> > As I mentioned before, I do think it is a benefit to make it clear to DSL
> > users that there are two aspects at play: (1) defining the logic/plan of
> > your processing, and (2) the execution of that plan.  I have a less
> strong
> > opinion whether or not having "topology" in the names would help to
> > communicate this separation as well as combination of (1) and (2) to make
> > your app work as expected.
> >
> > If we stick with `KafkaStreams` for (2) *and* don't like having
> "topology"
> > in the name, then perhaps we should rename `KStreamBuilder` to
> > `KafkaStreamsBuilder`.  That at least gives some illusion of a combo of
> (1)
> > and (2).  IMHO, `KafkaStreamsBuilder` highlights better that "it is a
> > builder/helper for the Kafka Streams API", rather than "a builder for
> > streams".
> >
> > Also, I 
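The separation discussed above -- (1) composing a processing plan with a builder, (2) executing that plan separately -- can be sketched generically (hypothetical names, not the actual Kafka Streams API):

```python
class Topology:
    """The immutable plan: just a list of named processing steps."""
    def __init__(self, steps):
        self.steps = list(steps)

class StreamsBuilder:
    """Phase 1: compose the plan. Nothing runs yet."""
    def __init__(self):
        self._steps = []
    def add_step(self, name):
        self._steps.append(name)
        return self
    def build(self):
        return Topology(self._steps)

class Runner:
    """Phase 2: execute a previously built plan (KafkaStreams' role)."""
    def __init__(self, topology):
        self.topology = topology
    def run(self, record):
        # apply each step name as a trivial transformation, for illustration
        return [f"{step}({record})" for step in self.topology.steps]

topo = StreamsBuilder().add_step("filter").add_step("map").build()
print(Runner(topo).run("r1"))   # the plan is only executed here
```

The naming debate is exactly about whether users of phase 1 should see that `build()` hands back a Topology, or whether that object should stay behind the scenes.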

Re: [VOTE] KIP-111 Kafka should preserve the Principal generated by the PrincipalBuilder while processing the request received on socket channel, on the broker.

2017-03-22 Thread Mayuresh Gharat
Hi Jun,

Please find the replies inline.

One reason to have KafkaPrincipal in ACL is that we can extend it to
support group in the future. Have you thought about how to support that in
your new proposal?
---> This is a feature of the PrincipalBuilder and Authorizer, which are
pluggable.
The type of Principal should be opaque to core Kafka. If we want to add
support for groups, we can add that to the KafkaPrincipal class and modify the
SimpleAclAuthorizer to add/modify/check the ACLs accordingly.


Another reason that we had KafkaPrincipal is simplicity. It can be
constructed from a simple string and makes matching easier. If we
expose java.security.Principal,then I guess that when an ACL is set, we
have to be able to construct
a java.security.Principal from some string to match the
java.security.Principal generated from the
SSL or SASL library. How do we make sure that the same type of
java.security.Principal
can be created and will match?
---> Again, this will be determined by the plugged-in Authorizer and
PrincipalBuilder. Your PrincipalBuilder can make sure that it creates a
Principal whose name matches the string you specified while creating the
ACL. The Authorizer should make sure that it extracts the String from the
Principal and does the matching.
In our earlier discussions, we discussed having a PrincipalBuilder class
specifier as a command-line argument for kafka-acls.sh to handle this case,
but we decided that it would be overkill at this stage.
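The matching described above -- the Authorizer extracting the name from whatever Principal the PrincipalBuilder produced and comparing it against the string stored in the ACL -- can be sketched with minimal hypothetical types (not Kafka's actual Authorizer interface):

```python
class Principal:
    # stand-in for java.security.Principal: only a name is guaranteed
    def __init__(self, name):
        self._name = name
    def get_name(self):
        return self._name

def allowed_operations(principal, acls):
    """acls maps principal-name strings (as typed into kafka-acls.sh)
    to the set of operations permitted for that principal."""
    return acls.get(principal.get_name(), set())

acls = {"User:alice": {"Read", "Write"}, "User:bob": {"Read"}}
p = Principal("User:alice")
print("Write" in allowed_operations(p, acls))   # True: names match exactly
```

The point of the proposal is that core Kafka never inspects the Principal beyond this opaque name; any richer semantics (groups, certificates) live in the pluggable PrincipalBuilder/Authorizer pair.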

Thanks,

Mayuresh

On Mon, Mar 20, 2017 at 7:42 AM, Jun Rao  wrote:

> Hi, Mayuresh,
>
> One reason to have KafkaPrincipal in ACL is that we can extend it to
> support group in the future. Have you thought about how to support that in
> your new proposal?
>
> Another reason that we had KafkaPrincipal is simplicity. It can be
> constructed from a simple string and makes matching easier. If we
> expose java.security.Principal,
> then I guess that when an ACL is set, we have to be able to construct
> a java.security.Principal
> from some string to match the java.security.Principal generated from the
> SSL or SASL library. How do we make sure that the same type of
> java.security.Principal
> can be created and will match?
>
> Thanks,
>
> Jun
>
>
> On Wed, Mar 15, 2017 at 8:48 PM, Mayuresh Gharat <
> gharatmayures...@gmail.com
> > wrote:
>
> > Hi Jun,
> >
> > Sorry for the delayed reply.
> > I agree that the easiest thing will be to add an additional field in the
> > Session class and we should be OK.
> > But having a KafkaPrincipal and java Principal with in the same class
> looks
> > little weird.
> >
> > So we can do this and slowly deprecate the usage of KafkaPrincipal in
> > public api's.
> >
> > We add new apis and make changes to the existing apis as follows :
> >
> >
> >- Changes to Session class :
> >
> > @Deprecated
> > case class Session(principal: KafkaPrincipal, clientAddress: InetAddress) {
> >   val sanitizedUser = QuotaId.sanitize(principal.getName)
> > }
> >
> >
> > @Deprecated (NEW)
> > case class Session(principal: KafkaPrincipal, clientAddress: InetAddress,
> >     channelPrincipal: java.security.Principal) {
> >   val sanitizedUser = QuotaId.sanitize(principal.getName)
> > }
> >
> > (NEW)
> > case class Session(principal: java.security.Principal,
> >     clientAddress: InetAddress) {
> >   val sanitizedUser = QuotaId.sanitize(principal.getName)
> > }
> >
> >
> >- Changes to Authorizer Interface :
> >
> > @Deprecated
> > def getAcls(principal: KafkaPrincipal): Map[Resource, Set[Acl]]
> >
> > (NEW)
> > def getAcls(principal: java.security.Principal): Map[Resource, Set[Acl]]
> >
> >
> >- Changes to Acl class :
> >
> > @Deprecated
> > case class Acl(principal: KafkaPrincipal, permissionType: PermissionType,
> > host: String, operation: Operation)
> >
> > (NEW)
> > case class Acl(principal: java.security.Principal, permissionType:
> >     PermissionType, host: String, operation: Operation)
> > The ones marked (NEW) are the new APIs. We will remove the deprecated
> > ones eventually, probably in the next major release.
> > We don't want to get rid of the KafkaPrincipal class; it will be used in
> > the same way as it is right now by the out-of-the-box authorizer and
> > command-line tool. We would only be removing its direct usage from public
> > APIs. Doing the above deprecation will help us to support other
> > implementations of java.security.Principal as well, which seems necessary
> > especially since Kafka provides a pluggable Authorizer and PrincipalBuilder.
> >
> > Let me know your thoughts on this.
> >
> > Thanks,
> >
> > Mayuresh
> >
> > On Tue, Feb 28, 2017 at 2:33 PM, Mayuresh Gharat <
> > gharatmayures...@gmail.com
> > > wrote:
> >
> > > Hi Jun,
> > >
> > > Sure.
> > > I had an offline discussion with Joel on how we can deprecate the
> > > KafkaPrincipal from  Session and Authorizer.
> > > I will update the KIP to see if we can address all the concerns here.
> If
> > > not we can 

[jira] [Created] (KAFKA-4941) Better definition/introduction to the term brokers

2017-03-22 Thread Yih Feng Low (JIRA)
Yih Feng Low created KAFKA-4941:
---

 Summary: Better definition/introduction to the term brokers
 Key: KAFKA-4941
 URL: https://issues.apache.org/jira/browse/KAFKA-4941
 Project: Kafka
  Issue Type: Improvement
  Components: documentation
Reporter: Yih Feng Low
Priority: Trivial


Hi,

I just wanted to point out that in the documentation at 
https://kafka.apache.org/documentation/, there are over 500 references to the 
word "broker". However, what a broker actually is remains unclear.

The first mention of a Kafka broker comes from the sentence:

??Alternatively, instead of manually creating topics you can also configure 
your brokers to auto-create topics when a non-existent topic is published to.??

Perhaps there could be a better discussion of what a broker is, similar to the 
discussions on what a consumer/producer/partition etc. is?







Offset data fetch in Kafka

2017-03-22 Thread Aneesh Varghese
Team ,

Can you help me with how to use offsets in bin/kafka-console-consumer.sh
instead of --from-beginning?

Like 01/02/2017 to 02/03/2017
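For context: the console consumer has no date-based flag, but Kafka 0.10.1+ stores a timestamp per message, and the consumer API exposes offsetsForTimes to find the first offset at or after a given timestamp. The dates first need converting to epoch milliseconds; a sketch (the broker-dependent lookup is shown only as a comment):

```python
from datetime import datetime, timezone

def to_epoch_ms(day, month, year):
    """Kafka's timestamp-based APIs (e.g. Consumer#offsetsForTimes,
    available since 0.10.1) take epoch milliseconds (UTC assumed here)."""
    dt = datetime(year, month, day, tzinfo=timezone.utc)
    return int(dt.timestamp() * 1000)

start = to_epoch_ms(1, 2, 2017)   # 01/02/2017, read here as 1 Feb 2017
end = to_epoch_ms(2, 3, 2017)     # 02/03/2017, read here as 2 Mar 2017
# consumer.offsetsForTimes({tp: start}) would then give the first offset
# whose timestamp is >= start; consume until records pass `end`.
print(start, end)
```

A consumer seeked to the `start` offset can simply stop once record timestamps exceed `end`.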


[jira] [Updated] (KAFKA-4940) Cluster partially working if broker blocked with IO

2017-03-22 Thread Victor Garcia (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victor Garcia updated KAFKA-4940:
-
Description: 
A cluster can partially work if there is an IO issue that blocks the broker 
process and leaves it in a state of uninterruptible sleep.

All the threads connected to this bad broker will hang and the cluster ends up 
partially working.

I reproduced it and this is what happened:

Let's say we have brokers 1, 2 and 3, and broker 2 is IO-blocked and
unresponsive; you can't even kill it except with -9.
Let's say we have a topic with replication 3. The partitions with leaders 1
and 3 will see that broker 2 has issues and will take it out of the ISR.
That's fine.

But the partitions whose leader is 2 think the problematic brokers are 1 and
3 and will take those replicas out of the ISR. And this is a problem.

Consumers and producers will only work with the partitions that don't have
broker 2 in their ISR.

This is an example of the output for 2 topics after provoking this:

{code}
./kafka-topics.sh --describe --zookeeper 127.0.0.1:2181 --unavailable-partitions

Topic: agent_ping   Partition: 0Leader: 2   Replicas: 2,1,3 
Isr: 2
Topic: agent_ping   Partition: 1Leader: 3   Replicas: 3,2,1 
Isr: 1,3
Topic: agent_ping   Partition: 2Leader: 1   Replicas: 1,3,2 
Isr: 3,1
Topic: agent_ping   Partition: 3Leader: 2   Replicas: 2,3,1 
Isr: 2
Topic: agent_ping   Partition: 4Leader: 3   Replicas: 3,1,2 
Isr: 1,3
Topic: agent_ping   Partition: 5Leader: 1   Replicas: 1,2,3 
Isr: 3,1
Topic: agent_ping   Partition: 6Leader: 2   Replicas: 2,1,3 
Isr: 2
Topic: agent_ping   Partition: 9Leader: 2   Replicas: 2,3,1 
Isr: 2
Topic: agent_ping   Partition: 12   Leader: 2   Replicas: 2,1,3 
Isr: 2
Topic: agent_ping   Partition: 13   Leader: 3   Replicas: 3,2,1 
Isr: 1,3
Topic: agent_ping   Partition: 14   Leader: 1   Replicas: 1,3,2 
Isr: 3,1
Topic: agent_ping   Partition: 15   Leader: 2   Replicas: 2,3,1 
Isr: 2
Topic: agent_ping   Partition: 16   Leader: 3   Replicas: 3,1,2 
Isr: 1,3
Topic: agent_ping   Partition: 17   Leader: 1   Replicas: 1,2,3 
Isr: 3,1
Topic: agent_ping   Partition: 18   Leader: 2   Replicas: 2,1,3 
Isr: 2
Topic: imback   Partition: 0Leader: 3   Replicas: 3,1,2 Isr: 1,3
Topic: imback   Partition: 1Leader: 1   Replicas: 1,2,3 Isr: 3,1
Topic: imback   Partition: 2Leader: 2   Replicas: 2,3,1 Isr: 2
Topic: imback   Partition: 3Leader: 3   Replicas: 3,2,1 Isr: 1,3
Topic: imback   Partition: 4Leader: 1   Replicas: 1,3,2 Isr: 3,1
{code}
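One way to spot the asymmetry in output like the above -- a broker that is the sole ISR member for every partition it leads, yet absent from every other partition's ISR -- is a quick parsing check (a hypothetical helper, not an existing Kafka tool):

```python
def suspect_brokers(partitions):
    """partitions: list of (leader, isr_set) pairs parsed from
    `kafka-topics.sh --describe` output."""
    leaders = {leader for leader, _ in partitions}
    suspects = set()
    for broker in leaders:
        led = [isr for leader, isr in partitions if leader == broker]
        others = [isr for leader, isr in partitions if leader != broker]
        # sole ISR member where it leads, missing from every other ISR
        if all(isr == {broker} for isr in led) and all(
                broker not in isr for isr in others):
            suspects.add(broker)
    return suspects

# a few agent_ping rows from the bug report above
parts = [(2, {2}), (3, {1, 3}), (1, {3, 1}), (2, {2}), (3, {1, 3}),
         (1, {3, 1}), (2, {2}), (2, {2})]
print(suspect_brokers(parts))   # {2}: broker 2 is the blocked one
```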

Kafka should be able to handle this in a better way: find out which brokers
are problematic and remove their replicas accordingly.
IO problems can be caused by hardware issues, kernel misconfiguration or
other causes, and are not that infrequent.

Kafka is supposed to be highly available, but in this case it is not.

To reproduce this, creating IO that blocks a process is not easy, but the
same symptoms are easily reproducible using NFS.
Create a simple NFS server
(https://help.ubuntu.com/community/SettingUpNFSHowTo), mount an NFS partition
as the broker's log.dirs, and once the cluster is working, stop NFS on the
server (service nfs-kernel-server stop).

This will make the broker hang waiting for IO.



This is the output of a producer; some messages go through but others fail
with this error:

{code}
./kafka-console-producer.sh --topic imback --broker 
broker1:6667,broker2:6667,broker3:6667

text
text
text
[2017-03-22 18:18:42,864] WARN Got error produce response with correlation id 
44 on topic-partition imback-2, retrying (2 attempts left). Error: 
NETWORK_EXCEPTION (org.apache.kafka.clients.producer.internals.Sender)
[2017-03-22 18:18:44,467] WARN Got error produce response with correlation id 
46 on topic-partition imback-2, retrying (1 attempts left). Error: 
NETWORK_EXCEPTION (org.apache.kafka.clients.producer.internals.Sender)
[2017-03-22 18:18:46,075] WARN Got error produce response with correlation id 
48 on topic-partition imback-2, retrying (0 attempts left). Error: 
NETWORK_EXCEPTION (org.apache.kafka.clients.producer.internals.Sender)
[2017-03-22 18:18:47,677] ERROR Error when sending message to topic imback with 
key: null, value: 1 bytes with error: 
(org.apache.kafka.clients.producer.internals.ErrorLoggingCallback)
org.apache.kafka.common.errors.NetworkException: The server disconnected before 
a response was received.
text
text
text
text
text
[2017-03-22 18:20:31,002] WARN Got error produce response with correlation id 
55 on topic-partition imback-2, retrying (2 attempts left). Error: 
NETWORK_EXCEPTION 

[jira] [Created] (KAFKA-4940) Cluster partially working if broker blocked with IO

2017-03-22 Thread Victor Garcia (JIRA)
Victor Garcia created KAFKA-4940:


 Summary: Cluster partially working if broker blocked with IO
 Key: KAFKA-4940
 URL: https://issues.apache.org/jira/browse/KAFKA-4940
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.10.2.0, 0.10.1.1
Reporter: Victor Garcia


A cluster can partially work if there is an IO issue that blocks the broker 
process and leaves it in a state of uninterruptible sleep.

All the threads connected to this bad broker will hang and the cluster ends up 
partially working.

I reproduced it and this is what happened:

Let's say we have brokers 1, 2 and 3, and broker 2 is IO-blocked and
unresponsive; you can't even kill it except with -9.
Let's say we have a topic with replication 3. The partitions with leaders 1
and 3 will see that broker 2 has issues and will take it out of the ISR.
That's fine.

But the partitions whose leader is 2 think the problematic brokers are 1 and
3 and will take those replicas out of the ISR. And this is a problem.

Consumers and producers will only work with the partitions that don't have
broker 2 in their ISR.

This is an example of the output for 2 topics after provoking this:

{code}
./kafka-topics.sh --describe --zookeeper 127.0.0.1:2181 --unavailable-partitions

Topic: agent_ping   Partition: 0Leader: 2   Replicas: 2,1,3 
Isr: 2
Topic: agent_ping   Partition: 1Leader: 3   Replicas: 3,2,1 
Isr: 1,3
Topic: agent_ping   Partition: 2Leader: 1   Replicas: 1,3,2 
Isr: 3,1
Topic: agent_ping   Partition: 3Leader: 2   Replicas: 2,3,1 
Isr: 2
Topic: agent_ping   Partition: 4Leader: 3   Replicas: 3,1,2 
Isr: 1,3
Topic: agent_ping   Partition: 5Leader: 1   Replicas: 1,2,3 
Isr: 3,1
Topic: agent_ping   Partition: 6Leader: 2   Replicas: 2,1,3 
Isr: 2
Topic: agent_ping   Partition: 9Leader: 2   Replicas: 2,3,1 
Isr: 2
Topic: agent_ping   Partition: 12   Leader: 2   Replicas: 2,1,3 
Isr: 2
Topic: agent_ping   Partition: 13   Leader: 3   Replicas: 3,2,1 
Isr: 1,3
Topic: agent_ping   Partition: 14   Leader: 1   Replicas: 1,3,2 
Isr: 3,1
Topic: agent_ping   Partition: 15   Leader: 2   Replicas: 2,3,1 
Isr: 2
Topic: agent_ping   Partition: 16   Leader: 3   Replicas: 3,1,2 
Isr: 1,3
Topic: agent_ping   Partition: 17   Leader: 1   Replicas: 1,2,3 
Isr: 3,1
Topic: agent_ping   Partition: 18   Leader: 2   Replicas: 2,1,3 
Isr: 2
Topic: imback   Partition: 0Leader: 3   Replicas: 3,1,2 Isr: 1,3
Topic: imback   Partition: 1Leader: 1   Replicas: 1,2,3 Isr: 3,1
Topic: imback   Partition: 2Leader: 2   Replicas: 2,3,1 Isr: 2
Topic: imback   Partition: 3Leader: 3   Replicas: 3,2,1 Isr: 1,3
Topic: imback   Partition: 4Leader: 1   Replicas: 1,3,2 Isr: 3,1
{code}

Kafka should be able to handle this in a better way: find out which brokers
are problematic and remove their replicas accordingly.
IO problems can be caused by hardware issues, kernel misconfiguration or
other causes, and are not that infrequent.

Kafka is supposed to be highly available, but in this case it is not.

To reproduce this, creating IO that blocks a process is not easy, but the
same symptoms are easily reproducible using NFS.
Create a simple NFS server
(https://help.ubuntu.com/community/SettingUpNFSHowTo), mount an NFS partition
as the broker's log.dirs, and once the cluster is working, stop NFS on the
server (service nfs-kernel-server stop).

This will make the broker hang waiting for IO.





[jira] [Commented] (KAFKA-2729) Cached zkVersion not equal to that in zookeeper, broker not recovering.

2017-03-22 Thread Stephane Maarek (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15937468#comment-15937468
 ] 

Stephane Maarek commented on KAFKA-2729:


If I may add, this is a pretty bad issue, but it got worse: you not only have
to recover Kafka, but also your Kafka Connect clusters. They got stuck for me
in the following state:

{code}
[2017-03-23 00:06:05,478] INFO Marking the coordinator kafka-1:9092 (id: 
2147483626 rack: null) dead for group connect-MyConnector 
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
[2017-03-23 00:06:05,478] INFO Marking the coordinator kafka-1:9092 (id: 
2147483626 rack: null) dead for group connect-MyConnector 
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
{code}

> Cached zkVersion not equal to that in zookeeper, broker not recovering.
> ---
>
> Key: KAFKA-2729
> URL: https://issues.apache.org/jira/browse/KAFKA-2729
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.1
>Reporter: Danil Serdyuchenko
>
> After a small network wobble where zookeeper nodes couldn't reach each other, 
> we started seeing a large number of undereplicated partitions. The zookeeper 
> cluster recovered, however we continued to see a large number of 
> undereplicated partitions. Two brokers in the kafka cluster were showing this 
> in the logs:
> {code}
> [2015-10-27 11:36:00,888] INFO Partition 
> [__samza_checkpoint_event-creation_1,3] on broker 5: Shrinking ISR for 
> partition [__samza_checkpoint_event-creation_1,3] from 6,5 to 5 
> (kafka.cluster.Partition)
> [2015-10-27 11:36:00,891] INFO Partition 
> [__samza_checkpoint_event-creation_1,3] on broker 5: Cached zkVersion [66] 
> not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
> {code}
> For all of the topics on the effected brokers. Both brokers only recovered 
> after a restart. Our own investigation yielded nothing, I was hoping you 
> could shed some light on this issue. Possibly if it's related to: 
> https://issues.apache.org/jira/browse/KAFKA-1382 , however we're using 
> 0.8.2.1.





[jira] [Updated] (KAFKA-4939) Kafka does not log NoRouteToHostException in ERROR log level

2017-03-22 Thread radha (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

radha updated KAFKA-4939:
-
Description: 
If you have many brokers in your bootstrap.servers list and some cannot be
reached by a specific Kafka client for whatever reason, the client does not
log this as ERROR, and publishing fails with other errors that can never be
resolved by increasing timeouts, metadata settings, or retries.

{noformat}
ERROR pool-3-thread-3 [ProducerDroppedMessageExceptionLogger   ]
  - Exception occured while producing message: Failed to update metadata after 
1000 ms.
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata 
after 1000 ms.

ERROR kafka-producer-network-thread | producer-1 
[ProducerDroppedMessageExceptionLogger   ]  - Exception occured while producing 
message: Expiring 1 record(s) for Q.REST.TOPIC-18 due to 5048 ms has passed 
since batch creation plus linger time
org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for 
Q.REST.TOPIC-18 due to 5048 ms has passed since batch creation plus linger time
{noformat}

When doing netstat you will see connections established to the other Kafka 
brokers, even though these messages fail to be published.

We wasted several hours before raising the log level to TRACE, seeing the 
entries below, and confirming that we could not even ping that specific Kafka 
broker.

Log entries that should be at ERROR level, with the connection retried:
{noformat}
[org.apache.kafka.common.network.Selector]  - Connection with 
some-prd-kafk02/*.*.*.* disconnected
java.net.NoRouteToHostException: No route to host
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
at 
org.apache.kafka.common.network.PlaintextTransportLayer.finishConnect(PlaintextTransportLayer.java:51)
at 
org.apache.kafka.common.network.KafkaChannel.finishConnect(KafkaChannel.java:73)
at 
org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:323)
at org.apache.kafka.common.network.Selector.poll(Selector.java:291)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:260)
at 
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:236)
at 
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:135)
at java.lang.Thread.run(Thread.java:745)
[org.apache.kafka.clients.NetworkClient  ]  - Node 206 disconnected.
{noformat}
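
The reporter's point, surfacing unreachable bootstrap brokers loudly instead 
of burying them at TRACE, can be illustrated with a small stdlib-only probe. 
This is a hedged sketch for diagnosing the situation from the application 
side; `check_bootstrap_servers` is a made-up helper, and the real fix would 
live in the Kafka client's network layer:

```python
import logging
import socket

def check_bootstrap_servers(servers, timeout=0.5):
    """Probe each host:port from bootstrap.servers and log unreachable
    brokers at ERROR level, rather than leaving the failure at TRACE.
    Returns the subset of brokers that accepted a TCP connection."""
    reachable = []
    for hostport in servers:
        host, port = hostport.rsplit(":", 1)
        try:
            with socket.create_connection((host, int(port)), timeout=timeout):
                reachable.append(hostport)
        except OSError as exc:  # covers NoRouteToHost, ConnectionRefused, timeouts
            logging.error("Bootstrap broker %s unreachable: %s", hostport, exc)
    return reachable
```

Running a probe like this at producer startup would have turned the several 
hours of TRACE-level digging described above into a single ERROR line.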


  was:
If you have many brokers in your bootstrap.servers list and some cannot be 
reached by a specific Kafka client for whatever reason, (cannot ping), it does 
not log this as ERROR and fails publishing with other errors that can never be 
resolved by increasing timeouts or metadata or retries.

{noformat}
ERROR pool-3-thread-3 [ProducerDroppedMessageExceptionLogger   ]
  - Exception occured while producing message: Failed to update metadata after 
1000 ms.
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata 
after 1000 ms.

ERROR kafka-producer-network-thread | producer-1 
[ProducerDroppedMessageExceptionLogger   ]  - Exception occured while producing 
message: Expiring 1 record(s) for Q.REST.TOPIC-18 due to 5048 ms has passed 
since batch creation plus linger time
org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for 
Q.REST.TOPIC-18 due to 5048 ms has passed since batch creation plus linger time
{noformat}

You will see connections established to other Kafka brokers when doing netstat, 
even though these messages fail to be published.

We have wasted several hours before increasing log levels to TRACE and seeing 
these and confirming that we cannot even ping that specific Kafka Broker.

Logs that should be in ERROR and also retried:
{noformat}
[org.apache.kafka.common.network.Selector]  - Connection with 
some-prd-kafk02/*.*.*.* disconnected
java.net.NoRouteToHostException: No route to host
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
at 
org.apache.kafka.common.network.PlaintextTransportLayer.finishConnect(PlaintextTransportLayer.java:51)
at 
org.apache.kafka.common.network.KafkaChannel.finishConnect(KafkaChannel.java:73)
at 
org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:323)
at org.apache.kafka.common.network.Selector.poll(Selector.java:291)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:260)
at 
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:236)
at 
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:135)
at java.lang.Thread.run(Thread.java:745)
[org.apache.kafka.clients.NetworkClient  ]  - Node 206 disconnected.
{noformat}



> Kafka does not log NoRouteToHostException in ERROR log level 
> 

[jira] [Created] (KAFKA-4939) Kafka does not log NoRouteToHostException in ERROR log level

2017-03-22 Thread radha (JIRA)
radha created KAFKA-4939:


 Summary: Kafka does not log NoRouteToHostException in ERROR log 
level 
 Key: KAFKA-4939
 URL: https://issues.apache.org/jira/browse/KAFKA-4939
 Project: Kafka
  Issue Type: Bug
  Components: clients
Affects Versions: 0.10.1.1
Reporter: radha
Priority: Minor


If you have many brokers and some cannot be reached by a specific Kafka client 
for whatever reason (cannot even ping them), the client does not log this at 
ERROR level; publishing fails with other errors that can never be resolved.

ERROR pool-3-thread-3 [ProducerDroppedMessageExceptionLogger   ]
  - Exception occured while producing message: Failed to update metadata after 
1000 ms.
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata 
after 1000 ms.

ERROR kafka-producer-network-thread | producer-1 
[ProducerDroppedMessageExceptionLogger   ]  - Exception occured while producing 
message: Expiring 1 record(s) for Q.REST.TOPIC-18 due to 5048 ms has passed 
since batch creation plus linger time
org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for 
Q.REST.TOPIC-18 due to 5048 ms has passed since batch creation plus linger time

You will see connections established to Kafka when doing netstat, even though 
these messages fail to be published.

Log entries that should be at ERROR level, with the connection retried. We 
wasted several hours before raising the log level to TRACE, seeing the entries 
below, and confirming that we could not even ping that specific Kafka broker.
[org.apache.kafka.common.network.Selector]  - Connection with 
some-prd-kafk02/*.*.*.* disconnected
java.net.NoRouteToHostException: No route to host
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
at 
org.apache.kafka.common.network.PlaintextTransportLayer.finishConnect(PlaintextTransportLayer.java:51)
at 
org.apache.kafka.common.network.KafkaChannel.finishConnect(KafkaChannel.java:73)
at 
org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:323)
at org.apache.kafka.common.network.Selector.poll(Selector.java:291)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:260)
at 
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:236)
at 
org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:135)
at java.lang.Thread.run(Thread.java:745)
[org.apache.kafka.clients.NetworkClient  ]  - Node 206 disconnected.





--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Jenkins build is back to normal : kafka-0.10.2-jdk7 #110

2017-03-22 Thread Apache Jenkins Server
See 




[jira] [Issue Comment Deleted] (KAFKA-4392) Failed to lock the state directory due to an unexpected exception

2017-03-22 Thread Elias Levy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elias Levy updated KAFKA-4392:
--
Comment: was deleted

(was: I am still seeing this error in 0.10.2.0 during rebalances.  Reopen or 
create a new issue?

WARN  2017-03-22 19:06:14,423 [StreamThread-20][StreamThread.java:1184] : Could 
not create task 3_346. Will retry.
org.apache.kafka.streams.errors.LockException: task [3_346] Failed to lock the 
state directory: /data/kafka_streams/some_job/3_346
at 
org.apache.kafka.streams.processor.internals.ProcessorStateManager.(ProcessorStateManager.java:102)
at 
org.apache.kafka.streams.processor.internals.AbstractTask.(AbstractTask.java:73)
at 
org.apache.kafka.streams.processor.internals.StreamTask.(StreamTask.java:108)
at 
org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:834)
at 
org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:1207)
at 
org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:1180)
at 
org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:937)
at 
org.apache.kafka.streams.processor.internals.StreamThread.access$500(StreamThread.java:69)
at 
org.apache.kafka.streams.processor.internals.StreamThread$1.onPartitionsAssigned(StreamThread.java:236)
at 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:255)
at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:339)
at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:303)
at 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:286)
at 
org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1030)
at 
org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:995)
at 
org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:582)
at 
org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:368)

)

> Failed to lock the state directory due to an unexpected exception
> -
>
> Key: KAFKA-4392
> URL: https://issues.apache.org/jira/browse/KAFKA-4392
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.1.0
>Reporter: Ara Ebrahimi
>Assignee: Guozhang Wang
> Fix For: 0.10.2.0
>
>
> This happened on streaming startup, on a clean installation, no existing 
> folder. Here I was starting 4 instances of our streaming app on 4 machines 
> and one threw this exception. Seems to me there’s a race condition somewhere 
> when instances discover others, or something like that.
> 2016-11-02 15:43:47 INFO  StreamRunner:59 - Started http server successfully.
> 2016-11-02 15:44:50 ERROR StateDirectory:147 - Failed to lock the state 
> directory due to an unexpected exception
> java.nio.file.NoSuchFileException: 
> /data/1/kafka-streams/myapp-streams/7_21/.lock
>   at 
> sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
>   at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
>   at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
>   at 
> sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:177)
>   at java.nio.channels.FileChannel.open(FileChannel.java:287)
>   at java.nio.channels.FileChannel.open(FileChannel.java:335)
>   at 
> org.apache.kafka.streams.processor.internals.StateDirectory.getOrCreateFileChannel(StateDirectory.java:176)
>   at 
> org.apache.kafka.streams.processor.internals.StateDirectory.lock(StateDirectory.java:90)
>   at 
> org.apache.kafka.streams.processor.internals.StateDirectory.cleanRemovedTasks(StateDirectory.java:140)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.maybeClean(StreamThread.java:552)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:459)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:242)
> ^C
> [arae@a4 ~]$ ls -al /data/1/kafka-streams/myapp-streams/7_21/
> ls: cannot access /data/1/kafka-streams/myapp-streams/7_21/: No such file or 
> directory
> [arae@a4 ~]$ ls -al /data/1/kafka-streams/myapp-streams/
> total 4
> drwxr-xr-x 74 root root 4096 Nov  2 15:44 .
> drwxr-xr-x  3 root root   27 Nov  2 15:43 ..
> drwxr-xr-x  3 root root   32 Nov  2 15:43 0_1
> drwxr-xr-x  3 root root   32 Nov  2 

[jira] [Commented] (KAFKA-4392) Failed to lock the state directory due to an unexpected exception

2017-03-22 Thread Elias Levy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15937388#comment-15937388
 ] 

Elias Levy commented on KAFKA-4392:
---

I am still seeing this error in 0.10.2.0 during rebalances.  Reopen or create a 
new issue?

WARN  2017-03-22 19:06:14,423 [StreamThread-20][StreamThread.java:1184] : Could 
not create task 3_346. Will retry.
org.apache.kafka.streams.errors.LockException: task [3_346] Failed to lock the 
state directory: /data/kafka_streams/some_job/3_346
at 
org.apache.kafka.streams.processor.internals.ProcessorStateManager.(ProcessorStateManager.java:102)
at 
org.apache.kafka.streams.processor.internals.AbstractTask.(AbstractTask.java:73)
at 
org.apache.kafka.streams.processor.internals.StreamTask.(StreamTask.java:108)
at 
org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:834)
at 
org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:1207)
at 
org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:1180)
at 
org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:937)
at 
org.apache.kafka.streams.processor.internals.StreamThread.access$500(StreamThread.java:69)
at 
org.apache.kafka.streams.processor.internals.StreamThread$1.onPartitionsAssigned(StreamThread.java:236)
at 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:255)
at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:339)
at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:303)
at 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:286)
at 
org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1030)
at 
org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:995)
at 
org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:582)
at 
org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:368)



> Failed to lock the state directory due to an unexpected exception
> -
>
> Key: KAFKA-4392
> URL: https://issues.apache.org/jira/browse/KAFKA-4392
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.1.0
>Reporter: Ara Ebrahimi
>Assignee: Guozhang Wang
> Fix For: 0.10.2.0
>
>
> This happened on streaming startup, on a clean installation, no existing 
> folder. Here I was starting 4 instances of our streaming app on 4 machines 
> and one threw this exception. Seems to me there’s a race condition somewhere 
> when instances discover others, or something like that.
> 2016-11-02 15:43:47 INFO  StreamRunner:59 - Started http server successfully.
> 2016-11-02 15:44:50 ERROR StateDirectory:147 - Failed to lock the state 
> directory due to an unexpected exception
> java.nio.file.NoSuchFileException: 
> /data/1/kafka-streams/myapp-streams/7_21/.lock
>   at 
> sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
>   at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
>   at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
>   at 
> sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:177)
>   at java.nio.channels.FileChannel.open(FileChannel.java:287)
>   at java.nio.channels.FileChannel.open(FileChannel.java:335)
>   at 
> org.apache.kafka.streams.processor.internals.StateDirectory.getOrCreateFileChannel(StateDirectory.java:176)
>   at 
> org.apache.kafka.streams.processor.internals.StateDirectory.lock(StateDirectory.java:90)
>   at 
> org.apache.kafka.streams.processor.internals.StateDirectory.cleanRemovedTasks(StateDirectory.java:140)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.maybeClean(StreamThread.java:552)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:459)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:242)
> ^C
> [arae@a4 ~]$ ls -al /data/1/kafka-streams/myapp-streams/7_21/
> ls: cannot access /data/1/kafka-streams/myapp-streams/7_21/: No such file or 
> directory
> [arae@a4 ~]$ ls -al /data/1/kafka-streams/myapp-streams/
> total 4
> drwxr-xr-x 74 root root 4096 Nov  2 15:44 .
> drwxr-xr-x  3 root root   27 Nov  2 15:43 ..
> drwxr-xr-x  3 root root   32 Nov  2 15:43 0_1
> drwxr-xr-x  3 root root   32 
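
The "Will retry" path in the WARN line above goes through a 
retry-with-backoff loop. A minimal stdlib sketch of that pattern (a model 
only, not the actual StreamThread$AbstractTaskCreator code; LockException 
here is a stand-in for the Kafka Streams exception):

```python
import time

class LockException(Exception):
    """Stand-in for org.apache.kafka.streams.errors.LockException."""

def retry_with_backoff(create_task, attempts=5, initial_backoff_s=0.05):
    """Retry task creation while the state-directory lock is still held by
    another thread (common during rebalances), doubling the sleep between
    attempts and re-raising once attempts are exhausted."""
    backoff = initial_backoff_s
    for attempt in range(attempts):
        try:
            return create_task()
        except LockException:
            if attempt == attempts - 1:
                raise
            time.sleep(backoff)
            backoff *= 2
```

If the previous owner releases the lock within the backoff window the task 
comes up cleanly; the repeated WARN lines in the report suggest the window 
was exhausted before the lock was freed.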


Re: [DISCUSS] KIP-135 : Send of null key to a compacted topic should throw non-retriable error back to user

2017-03-22 Thread James Cheng
Mayuresh,

The Compatibility/Migration section says to upgrade the clients first, before 
the brokers. Are you talking about implementation or deployment? Do you mean to 
implement the client changes before the broker changes? That would imply that 
it would take 2 Kafka releases to implement this KIP.

Or, are you saying that when deploying this change, that you would recommend 
upgrading clients before upgrading brokers?

Thanks,
-James


> On Mar 22, 2017, at 3:07 PM, Mayuresh Gharat  
> wrote:
> 
> Hi All,
> 
> We have created KIP-135 to propose that Kafka should return a non-retriable
> error when the producer produces a message with null key to a log compacted
> topic.
> 
> Please find the KIP wiki in the link :
> 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-135+%3A+Send+of+null+key+to+a+compacted+topic+should+throw+non-retriable+error+back+to+user.
> 
> 
> We would love to hear your comments and suggestions.
> 
> 
> Thanks,
> 
> Mayuresh
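
The behavior KIP-135 proposes, failing fast with a non-retriable error when a 
null-keyed record targets a log-compacted topic, can be sketched as follows. 
This is a model of the proposal only, not the eventual Kafka API; 
`InvalidRecordError` and the `cleanup_policy` argument are illustrative:

```python
class InvalidRecordError(Exception):
    """Non-retriable: the record itself can never succeed, so the producer
    should surface the failure to the user instead of retrying."""

def validate_record_key(key, cleanup_policy):
    """Reject null-keyed records destined for a compacted topic, since log
    compaction deduplicates by key and a null key has nothing to compact on."""
    if cleanup_policy == "compact" and key is None:
        raise InvalidRecordError("null key is not allowed on a compacted topic")
```

The design point is the error classification: a retriable error invites the 
producer to resend, but a null key on a compacted topic will fail identically 
every time, so the error must be non-retriable.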



Jenkins build is back to normal : kafka-trunk-jdk8 #1369

2017-03-22 Thread Apache Jenkins Server
See 




Build failed in Jenkins: kafka-trunk-jdk7 #2035

2017-03-22 Thread Apache Jenkins Server
See 


Changes:

[wangguoz] MINOR: Improvements on Streams log4j

--
[...truncated 19.54 KB...]
kafka.log.LogValidatorTest > testCreateTimeUpConversion PASSED

kafka.log.LogValidatorTest > testInvalidCreateTimeNonCompressed STARTED

kafka.log.LogValidatorTest > testInvalidCreateTimeNonCompressed PASSED

kafka.log.LogValidatorTest > testAbsoluteOffsetAssignmentCompressed STARTED

kafka.log.LogValidatorTest > testAbsoluteOffsetAssignmentCompressed PASSED

kafka.log.LogValidatorTest > 
testOffsetAssignmentAfterMessageFormatConversionV0Compressed STARTED

kafka.log.LogValidatorTest > 
testOffsetAssignmentAfterMessageFormatConversionV0Compressed PASSED

kafka.log.LogValidatorTest > testInvalidCreateTimeCompressed STARTED

kafka.log.LogValidatorTest > testInvalidCreateTimeCompressed PASSED

kafka.log.LogValidatorTest > 
testOffsetAssignmentAfterMessageFormatConversionV0NonCompressed STARTED

kafka.log.LogValidatorTest > 
testOffsetAssignmentAfterMessageFormatConversionV0NonCompressed PASSED

kafka.log.LogValidatorTest > testCreateTimeNonCompressed STARTED

kafka.log.LogValidatorTest > testCreateTimeNonCompressed PASSED

kafka.log.LogValidatorTest > testCreateTimeCompressed STARTED

kafka.log.LogValidatorTest > testCreateTimeCompressed PASSED

kafka.log.LogValidatorTest > testInvalidInnerMagicVersion STARTED

kafka.log.LogValidatorTest > testInvalidInnerMagicVersion PASSED

kafka.log.LogValidatorTest > 
testOffsetAssignmentAfterMessageFormatConversionV1Compressed STARTED

kafka.log.LogValidatorTest > 
testOffsetAssignmentAfterMessageFormatConversionV1Compressed PASSED

kafka.log.LogValidatorTest > testAbsoluteOffsetAssignmentNonCompressed STARTED

kafka.log.LogValidatorTest > testAbsoluteOffsetAssignmentNonCompressed PASSED

kafka.log.LogValidatorTest > testRelativeOffsetAssignmentCompressed STARTED

kafka.log.LogValidatorTest > testRelativeOffsetAssignmentCompressed PASSED

kafka.log.LogValidatorTest > testLogAppendTimeWithoutRecompression STARTED

kafka.log.LogValidatorTest > testLogAppendTimeWithoutRecompression PASSED

kafka.log.LogValidatorTest > testLogAppendTimeNonCompressed STARTED

kafka.log.LogValidatorTest > testLogAppendTimeNonCompressed PASSED

kafka.log.LogValidatorTest > testLogAppendTimeWithRecompression STARTED

kafka.log.LogValidatorTest > testLogAppendTimeWithRecompression PASSED

kafka.log.LogValidatorTest > testRelativeOffsetAssignmentNonCompressed STARTED

kafka.log.LogValidatorTest > testRelativeOffsetAssignmentNonCompressed PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[0] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[0] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[1] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[1] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[2] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[2] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[3] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[3] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[4] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[4] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[5] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[5] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[6] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[6] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[7] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[7] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[8] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[8] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[9] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[9] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[10] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[10] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[11] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[11] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[12] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[12] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[13] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[13] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[14] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[14] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[15] STARTED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[15] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[16] STARTED


Build failed in Jenkins: kafka-trunk-jdk7 #2034

2017-03-22 Thread Apache Jenkins Server
See 


Changes:

[cshapi] KAFKA-4929: Transformation Key/Value type references should be to class

--
[...truncated 860.24 KB...]
org.apache.kafka.streams.integration.JoinIntegrationTest > 
testInnerKStreamKTable PASSED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldNotMakeStoreAvailableUntilAllStoresAvailable STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldNotMakeStoreAvailableUntilAllStoresAvailable PASSED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
queryOnRebalance STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
queryOnRebalance PASSED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
concurrentAccesses STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
concurrentAccesses PASSED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldBeAbleToQueryStateWithZeroSizedCache STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldBeAbleToQueryStateWithZeroSizedCache PASSED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldBeAbleToQueryStateWithNonZeroSizedCache STARTED

org.apache.kafka.streams.integration.QueryableStateIntegrationTest > 
shouldBeAbleToQueryStateWithNonZeroSizedCache PASSED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithZeroSizedCache STARTED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithZeroSizedCache PASSED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithNonZeroSizedCache STARTED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithNonZeroSizedCache PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerLeftJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerLeftJoin PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterOuterJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterOuterJoin PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerOuterJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerOuterJoin PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftInnerJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftInnerJoin PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterLeftJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterLeftJoin PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterInnerJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldOuterInnerJoin PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerInnerJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldInnerInnerJoin PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftOuterJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftOuterJoin PASSED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftLeftJoin STARTED

org.apache.kafka.streams.integration.KTableKTableJoinIntegrationTest > 
shouldLeftLeftJoin PASSED

org.apache.kafka.streams.integration.KStreamsFineGrainedAutoResetIntegrationTest
 > shouldThrowExceptionOverlappingPattern STARTED

org.apache.kafka.streams.integration.KStreamsFineGrainedAutoResetIntegrationTest
 > shouldThrowExceptionOverlappingPattern PASSED

org.apache.kafka.streams.integration.KStreamsFineGrainedAutoResetIntegrationTest
 > shouldThrowExceptionOverlappingTopic STARTED

org.apache.kafka.streams.integration.KStreamsFineGrainedAutoResetIntegrationTest
 > shouldThrowExceptionOverlappingTopic PASSED

org.apache.kafka.streams.integration.KStreamsFineGrainedAutoResetIntegrationTest
 > shouldThrowStreamsExceptionNoResetSpecified STARTED

org.apache.kafka.streams.integration.KStreamsFineGrainedAutoResetIntegrationTest
 > shouldThrowStreamsExceptionNoResetSpecified PASSED

org.apache.kafka.streams.integration.KStreamsFineGrainedAutoResetIntegrationTest
 > shouldOnlyReadRecordsWhereEarliestSpecified STARTED

org.apache.kafka.streams.integration.KStreamsFineGrainedAutoResetIntegrationTest
 > shouldOnlyReadRecordsWhereEarliestSpecified PASSED

org.apache.kafka.streams.integration.KStreamAggregationDedupIntegrationTest > 

[DISCUSS] KIP-135 : Send of null key to a compacted topic should throw non-retriable error back to user

2017-03-22 Thread Mayuresh Gharat
Hi All,

We have created KIP-135 to propose that Kafka should return a non-retriable
error when the producer produces a message with a null key to a log-compacted
topic.

Please find the KIP wiki at the link:

https://cwiki.apache.org/confluence/display/KAFKA/KIP-135+%3A+Send+of+null+key+to+a+compacted+topic+should+throw+non-retriable+error+back+to+user.


We would love to hear your comments and suggestions.


Thanks,

Mayuresh
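To make the proposal concrete, here is a minimal, hypothetical sketch of the check being discussed. The class and method names below are invented for illustration — this is not Kafka's actual broker or producer code — but it captures the intent: a null-keyed record bound for a log-compacted topic is rejected with a non-retriable error, because retrying the identical record can never succeed.

```java
// Hypothetical sketch of the KIP-135 idea: reject a null-keyed record
// destined for a log-compacted topic with a non-retriable error instead
// of accepting it. All names here are illustrative, not Kafka's code.
public class CompactedTopicKeyCheck {
    /** Simulates the proposed validation; returns an error string, or null if the record is valid. */
    static String validate(byte[] key, boolean isCompactedTopic) {
        if (isCompactedTopic && key == null) {
            // Non-retriable: resending the same record cannot ever succeed,
            // so the producer should surface this to the caller immediately.
            return "INVALID_RECORD: compacted topic requires a non-null key";
        }
        return null; // record accepted
    }

    public static void main(String[] args) {
        if (validate(null, true) == null) throw new AssertionError("null key on compacted topic must be rejected");
        if (validate(null, false) != null) throw new AssertionError("plain topics still allow null keys");
        if (validate(new byte[]{1}, true) != null) throw new AssertionError("keyed records are fine");
    }
}
```

The key design point is the error classification: today a null key to a compacted topic fails only on the broker's log-append path, and the KIP argues the producer should see it as a terminal, non-retriable condition.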


[jira] [Created] (KAFKA-4938) Creating a connector with missing name parameter throws a NullPointerException

2017-03-22 Thread JIRA
Sönke Liebau created KAFKA-4938:
---

 Summary: Creating a connector with missing name parameter throws a 
NullPointerException
 Key: KAFKA-4938
 URL: https://issues.apache.org/jira/browse/KAFKA-4938
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Affects Versions: 0.10.2.0
Reporter: Sönke Liebau
Priority: Minor


Creating a connector via the REST API runs into a NullPointerException when the 
name parameter is omitted from the request.

{code}
POST 127.0.0.1:8083/connectors
{
  "config": {
"connector.class": "org.apache.kafka.connect.tools.MockSourceConnector",
"tasks.max": "1",
"topics": "test-topic"
  }
}
{code}

This results in a 500 return code, due to a NullPointerException being thrown 
when the name is checked for slashes 
[here|https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/rest/resources/ConnectorsResource.java#L91].
 I believe this was introduced with the fix for 
[KAFKA-4372|https://issues.apache.org/jira/browse/KAFKA-4372].



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
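The failure mode reduces to an unguarded String method call on a missing field. The sketch below is illustrative only — the names are not those of the actual ConnectorsResource code — but it reproduces the NPE and shows the kind of null guard a fix might add before the slash check:

```java
// Illustrative reduction of KAFKA-4938: a contains("/") check on the
// connector name throws NullPointerException when the "name" field was
// omitted from the request body. Names are invented, not Connect's code.
public class ConnectorNameCheck {
    static boolean containsSlashUnsafe(String name) {
        return name.contains("/");                 // NPE when name == null
    }

    static boolean containsSlashSafe(String name) {
        return name != null && name.contains("/"); // guard first, as a fix might do
    }

    public static void main(String[] args) {
        boolean threw = false;
        try {
            containsSlashUnsafe(null);
        } catch (NullPointerException e) {
            threw = true;                          // the 500-causing NPE
        }
        if (!threw) throw new AssertionError("expected NPE on null name");
        if (containsSlashSafe(null)) throw new AssertionError();
        if (!containsSlashSafe("a/b")) throw new AssertionError();
    }
}
```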


[jira] [Commented] (KAFKA-4930) Connect Rest API allows creating connectors with an empty name

2017-03-22 Thread JIRA

[ 
https://issues.apache.org/jira/browse/KAFKA-4930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15937220#comment-15937220
 ] 

Sönke Liebau commented on KAFKA-4930:
-

I've created a potential fix for this issue and pushed to 
https://github.com/soenkeliebau/kafka/tree/KAFKA-4930

However, I am unsure whether this fully addresses all potential issues. All 
tests pass, but I won't pretend that I have fully understood all dependencies 
in the validation model, so some feedback would be very welcome. 
My fundamental worry with this implementation is that I create a validator for 
an option that cannot have a default value, which means the validator won't 
green-light a default config; I then work around that by allowing a null value. 
This does not seem entirely clean. 
An alternative approach would be to add checks 
[here|https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/rest/resources/ConnectorsResource.java#L91],
 but I actually found that the existing check there throws a 
NullPointerException for requests that don't have the _name_ parameter (I will 
create an issue for that), so there are caveats here as well.

Anyway, thoughts?

> Connect Rest API allows creating connectors with an empty name
> --
>
> Key: KAFKA-4930
> URL: https://issues.apache.org/jira/browse/KAFKA-4930
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.2.0
>Reporter: Sönke Liebau
>Priority: Minor
>
> The Connect REST API allows deploying connectors with an empty name field, 
> which then cannot be removed through the API.
> Sending the following request:
> {code}
> {
> "name": "",
> "config": {
> "connector.class": 
> "org.apache.kafka.connect.tools.MockSourceConnector",
> "tasks.max": "1",
> "topics": "test-topic"
> 
> }
> }
> {code}
> Results in a connector being deployed which can be seen in the list of 
> connectors:
> {code}
> [
>   "",
>   "testconnector"
> ]{code}
> But it cannot be removed via a DELETE call, as the API thinks we are trying 
> to delete the /connectors endpoint and declines the request.
> I don't think there is a valid case for the connector name to be empty, so 
> perhaps we should add a check for this. I am happy to work on this.



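A minimal sketch of the validation this issue asks for, under the assumption that a simple blank-check at creation time is acceptable (class and method names are invented; this is not Connect's actual code). It also illustrates why the empty name is undeletable: the DELETE path collapses toward the /connectors collection endpoint itself.

```java
// Hypothetical sketch for KAFKA-4930: reject empty/blank connector names
// at creation time. Illustrative names only, not Kafka Connect code.
public class ConnectorNameValidation {
    static boolean isValidName(String name) {
        return name != null && !name.trim().isEmpty();
    }

    /** The DELETE path a given name would produce, showing the collision. */
    static String deletePath(String name) {
        return "/connectors/" + name;
    }

    public static void main(String[] args) {
        if (isValidName("")) throw new AssertionError("empty name must be rejected");
        if (!isValidName("testconnector")) throw new AssertionError();
        // With an empty name the DELETE URL is just "/connectors/" with a
        // trailing slash, which routing treats as the collection endpoint,
        // so the request is declined rather than deleting the connector.
        if (!deletePath("").equals("/connectors/")) throw new AssertionError();
    }
}
```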


[GitHub] kafka pull request #2702: MINOR: Improvements on log4j

2017-03-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2702


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (KAFKA-4937) Batch resetting offsets in Streams' StoreChangelogReader

2017-03-22 Thread Guozhang Wang (JIRA)
Guozhang Wang created KAFKA-4937:


 Summary: Batch resetting offsets in Streams' StoreChangelogReader
 Key: KAFKA-4937
 URL: https://issues.apache.org/jira/browse/KAFKA-4937
 Project: Kafka
  Issue Type: Bug
  Components: streams
Reporter: Guozhang Wang


Currently in {{StoreChangelogReader}} we call {{consumer.position()}} both when 
logging and when setting the starting offset right after {{seekToBeginning}}, 
which incurs a blocking round trip per offset request. We should consider 
batching these into a single round trip for all partitions that need to seek 
to the beginning (i.e. need to reset their offsets).



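The cost model behind this ticket can be sketched with a toy consumer that just counts round trips — it is not the Kafka consumer API, and the batched `positions` call is a hypothetical shape of the fix, but it shows why one request covering all reset partitions beats one blocking call per partition:

```java
import java.util.*;

// Toy illustration of the KAFKA-4937 idea: a per-partition blocking position
// lookup costs one round trip each, while a batched lookup for all partitions
// needing a reset costs a single round trip. FakeConsumer is invented for
// counting; it is not org.apache.kafka.clients.consumer.KafkaConsumer.
public class BatchedPositionLookup {
    static class FakeConsumer {
        int roundTrips = 0;

        long position(String partition) {
            roundTrips++;                      // one blocking request per call
            return 0L;
        }

        Map<String, Long> positions(List<String> partitions) {
            roundTrips++;                      // one request covers all partitions
            Map<String, Long> out = new HashMap<>();
            for (String p : partitions) out.put(p, 0L);
            return out;
        }
    }

    public static void main(String[] args) {
        List<String> parts = Arrays.asList("changelog-0", "changelog-1", "changelog-2");

        FakeConsumer perPartition = new FakeConsumer();
        for (String p : parts) perPartition.position(p);   // 3 round trips

        FakeConsumer batched = new FakeConsumer();
        batched.positions(parts);                          // 1 round trip

        if (perPartition.roundTrips != 3) throw new AssertionError();
        if (batched.roundTrips != 1) throw new AssertionError();
    }
}
```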


Build failed in Jenkins: kafka-trunk-jdk8 #1368

2017-03-22 Thread Apache Jenkins Server
See 


Changes:

[wangguoz] MINOR: MyProcessor doc example should implement, not extend 
`Processor`

--
[...truncated 862.25 KB...]
org.apache.kafka.streams.kstream.internals.KStreamKStreamJoinTest > testJoin 
PASSED

org.apache.kafka.streams.kstream.internals.KStreamKStreamJoinTest > 
testWindowing STARTED

org.apache.kafka.streams.kstream.internals.KStreamKStreamJoinTest > 
testWindowing PASSED

org.apache.kafka.streams.kstream.internals.KStreamKStreamJoinTest > 
testAsymetricWindowingBefore STARTED

org.apache.kafka.streams.kstream.internals.KStreamKStreamJoinTest > 
testAsymetricWindowingBefore PASSED

org.apache.kafka.streams.kstream.internals.KStreamKStreamJoinTest > 
testAsymetricWindowingAfter STARTED

org.apache.kafka.streams.kstream.internals.KStreamKStreamJoinTest > 
testAsymetricWindowingAfter PASSED

org.apache.kafka.streams.kstream.internals.KStreamKTableJoinTest > testJoin 
STARTED

org.apache.kafka.streams.kstream.internals.KStreamKTableJoinTest > testJoin 
PASSED

org.apache.kafka.streams.kstream.internals.KTableKTableJoinTest > testJoin 
STARTED

org.apache.kafka.streams.kstream.internals.KTableKTableJoinTest > testJoin 
PASSED

org.apache.kafka.streams.kstream.internals.KTableKTableJoinTest > 
testNotSendingOldValues STARTED

org.apache.kafka.streams.kstream.internals.KTableKTableJoinTest > 
testNotSendingOldValues PASSED

org.apache.kafka.streams.kstream.internals.KTableKTableJoinTest > 
testSendingOldValues STARTED

org.apache.kafka.streams.kstream.internals.KTableKTableJoinTest > 
testSendingOldValues PASSED

org.apache.kafka.streams.kstream.internals.KStreamKStreamLeftJoinTest > 
testLeftJoin STARTED

org.apache.kafka.streams.kstream.internals.KStreamKStreamLeftJoinTest > 
testLeftJoin PASSED

org.apache.kafka.streams.kstream.internals.KStreamKStreamLeftJoinTest > 
testWindowing STARTED

org.apache.kafka.streams.kstream.internals.KStreamKStreamLeftJoinTest > 
testWindowing PASSED

org.apache.kafka.streams.kstream.internals.KTableForeachTest > testForeach 
STARTED

org.apache.kafka.streams.kstream.internals.KTableForeachTest > testForeach 
PASSED

org.apache.kafka.streams.kstream.internals.KTableForeachTest > testTypeVariance 
STARTED

org.apache.kafka.streams.kstream.internals.KTableForeachTest > testTypeVariance 
PASSED

org.apache.kafka.streams.kstream.internals.GlobalKTableJoinsTest > 
shouldInnerJoinWithStream STARTED

org.apache.kafka.streams.kstream.internals.GlobalKTableJoinsTest > 
shouldInnerJoinWithStream PASSED

org.apache.kafka.streams.kstream.internals.GlobalKTableJoinsTest > 
shouldLeftJoinWithStream STARTED

org.apache.kafka.streams.kstream.internals.GlobalKTableJoinsTest > 
shouldLeftJoinWithStream PASSED

org.apache.kafka.streams.kstream.internals.KStreamMapValuesTest > 
testFlatMapValues STARTED

org.apache.kafka.streams.kstream.internals.KStreamMapValuesTest > 
testFlatMapValues PASSED

org.apache.kafka.streams.kstream.internals.KStreamFlatMapTest > testFlatMap 
STARTED

org.apache.kafka.streams.kstream.internals.KStreamFlatMapTest > testFlatMap 
PASSED

org.apache.kafka.streams.kstream.internals.WindowedStreamPartitionerTest > 
testCopartitioning STARTED

org.apache.kafka.streams.kstream.internals.WindowedStreamPartitionerTest > 
testCopartitioning PASSED

org.apache.kafka.streams.kstream.internals.WindowedStreamPartitionerTest > 
testWindowedSerializerNoArgConstructors STARTED

org.apache.kafka.streams.kstream.internals.WindowedStreamPartitionerTest > 
testWindowedSerializerNoArgConstructors PASSED

org.apache.kafka.streams.kstream.internals.WindowedStreamPartitionerTest > 
testWindowedDeserializerNoArgConstructors STARTED

org.apache.kafka.streams.kstream.internals.WindowedStreamPartitionerTest > 
testWindowedDeserializerNoArgConstructors PASSED

org.apache.kafka.streams.kstream.internals.KStreamKTableLeftJoinTest > testJoin 
STARTED

org.apache.kafka.streams.kstream.internals.KStreamKTableLeftJoinTest > testJoin 
PASSED

org.apache.kafka.streams.kstream.internals.UnlimitedWindowTest > 
shouldAlwaysOverlap STARTED

org.apache.kafka.streams.kstream.internals.UnlimitedWindowTest > 
shouldAlwaysOverlap PASSED

org.apache.kafka.streams.kstream.internals.UnlimitedWindowTest > 
cannotCompareUnlimitedWindowWithDifferentWindowType STARTED

org.apache.kafka.streams.kstream.internals.UnlimitedWindowTest > 
cannotCompareUnlimitedWindowWithDifferentWindowType PASSED

org.apache.kafka.streams.kstream.internals.KTableImplTest > 
shouldNotAllowNullMapperOnMapValues STARTED

org.apache.kafka.streams.kstream.internals.KTableImplTest > 
shouldNotAllowNullMapperOnMapValues PASSED

org.apache.kafka.streams.kstream.internals.KTableImplTest > 
shouldNotAllowNullPredicateOnFilter STARTED

org.apache.kafka.streams.kstream.internals.KTableImplTest > 
shouldNotAllowNullPredicateOnFilter PASSED

org.apache.kafka.streams.kstream.internals.KTableImplTest > 

[jira] [Commented] (KAFKA-4929) Transformation Key/Value type references should be to class name(), not canonicalName()

2017-03-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15937118#comment-15937118
 ] 

ASF GitHub Bot commented on KAFKA-4929:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2720


> Transformation Key/Value type references should be to class name(), not 
> canonicalName()
> ---
>
> Key: KAFKA-4929
> URL: https://issues.apache.org/jira/browse/KAFKA-4929
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.2.0
>Reporter: bruce szalwinski
>Priority: Minor
> Fix For: 0.11.0.0
>
>
> The docs suggest that referencing the Key/Value transformations is done as 
> follows:
> {code}
> "transforms": "replaceFieldValue",
> "transforms.replaceFieldValue.type":  
> "org.apache.kafka.connect.transforms.ReplaceField.Value"
> {code}
> But that results in a validation failure saying that the class cannot be 
> found.
> {code}
> "value": {
> "errors": [
> "Invalid value 
> org.apache.kafka.connect.transforms.ReplaceField.Value for configuration 
> transforms.replaceFieldValue.type: Class 
> org.apache.kafka.connect.transforms.ReplaceField.Value could not be found.",
> "Invalid value null for configuration 
> transforms.replaceFieldValue.type: Not a Transformation"
> ],
> "name": "transforms.replaceFieldValue.type",
> "recommended_values": [
> "org.apache.kafka.connect.transforms.ExtractField.Key",
> "org.apache.kafka.connect.transforms.ExtractField.Value",
> "org.apache.kafka.connect.transforms.HoistField.Key",
> "org.apache.kafka.connect.transforms.HoistField.Value",
> "org.apache.kafka.connect.transforms.InsertField.Key",
> "org.apache.kafka.connect.transforms.InsertField.Value",
> "org.apache.kafka.connect.transforms.MaskField.Key",
> "org.apache.kafka.connect.transforms.MaskField.Value",
> "org.apache.kafka.connect.transforms.RegexRouter",
> "org.apache.kafka.connect.transforms.ReplaceField.Key",
> "org.apache.kafka.connect.transforms.ReplaceField.Value",
> 
> "org.apache.kafka.connect.transforms.SetSchemaMetadata.Key",
> 
> "org.apache.kafka.connect.transforms.SetSchemaMetadata.Value",
> "org.apache.kafka.connect.transforms.TimestampRouter",
> "org.apache.kafka.connect.transforms.ValueToKey"
> ],
> {code}
> Since the Key / Value transformations are defined as static nested classes, 
> the proper notation is
> {code}
> "transforms": "replaceFieldValue",
> "transforms.replaceFieldValue.type":  
> "org.apache.kafka.connect.transforms.ReplaceField$Value"
> {code}



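The underlying Java behavior is easy to demonstrate: for a static nested class, `Class.getName()` returns the binary name with `$`, which `Class.forName()` can resolve, while `getCanonicalName()` returns the dotted form, which it cannot. (`NestedNameDemo` below is an invented stand-in for the transform classes, not Kafka code.)

```java
// Demonstrates the root cause of KAFKA-4929: config references must use the
// binary class name ('$' separator for nested classes, per getName()), not
// the canonical name ('.' separator), because class loading goes through
// Class.forName(), which only accepts binary names.
public class NestedNameDemo {
    public static class Value { }   // stands in for e.g. ReplaceField.Value

    public static void main(String[] args) throws Exception {
        String binary = Value.class.getName();             // "NestedNameDemo$Value"
        String canonical = Value.class.getCanonicalName(); // "NestedNameDemo.Value"
        if (!binary.equals("NestedNameDemo$Value")) throw new AssertionError(binary);
        if (!canonical.equals("NestedNameDemo.Value")) throw new AssertionError(canonical);

        Class.forName(binary);                 // binary name resolves fine
        try {
            Class.forName(canonical);          // dotted form cannot be found
            throw new AssertionError("expected ClassNotFoundException");
        } catch (ClassNotFoundException expected) {
            // mirrors the "Class ... could not be found" validation error above
        }
    }
}
```

This is why the fix has the recommended values (and docs) use the `$` notation, e.g. `org.apache.kafka.connect.transforms.ReplaceField$Value`.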


[GitHub] kafka pull request #2720: KAFKA-4929: Transformation Key/Value type referenc...

2017-03-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2720




[jira] [Updated] (KAFKA-4929) Transformation Key/Value type references should be to class name(), not canonicalName()

2017-03-22 Thread Gwen Shapira (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gwen Shapira updated KAFKA-4929:

   Resolution: Fixed
Fix Version/s: 0.11.0.0
   Status: Resolved  (was: Patch Available)

Issue resolved by pull request 2720
[https://github.com/apache/kafka/pull/2720]

> Transformation Key/Value type references should be to class name(), not 
> canonicalName()
> ---
>
> Key: KAFKA-4929
> URL: https://issues.apache.org/jira/browse/KAFKA-4929
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.2.0
>Reporter: bruce szalwinski
>Priority: Minor
> Fix For: 0.11.0.0
>
>
> The docs suggest that referencing the Key/Value transformations is done as 
> follows:
> {code}
> "transforms": "replaceFieldValue",
> "transforms.replaceFieldValue.type":  
> "org.apache.kafka.connect.transforms.ReplaceField.Value"
> {code}
> But that results in a validation failure saying that the class cannot be 
> found.
> {code}
> "value": {
> "errors": [
> "Invalid value 
> org.apache.kafka.connect.transforms.ReplaceField.Value for configuration 
> transforms.replaceFieldValue.type: Class 
> org.apache.kafka.connect.transforms.ReplaceField.Value could not be found.",
> "Invalid value null for configuration 
> transforms.replaceFieldValue.type: Not a Transformation"
> ],
> "name": "transforms.replaceFieldValue.type",
> "recommended_values": [
> "org.apache.kafka.connect.transforms.ExtractField.Key",
> "org.apache.kafka.connect.transforms.ExtractField.Value",
> "org.apache.kafka.connect.transforms.HoistField.Key",
> "org.apache.kafka.connect.transforms.HoistField.Value",
> "org.apache.kafka.connect.transforms.InsertField.Key",
> "org.apache.kafka.connect.transforms.InsertField.Value",
> "org.apache.kafka.connect.transforms.MaskField.Key",
> "org.apache.kafka.connect.transforms.MaskField.Value",
> "org.apache.kafka.connect.transforms.RegexRouter",
> "org.apache.kafka.connect.transforms.ReplaceField.Key",
> "org.apache.kafka.connect.transforms.ReplaceField.Value",
> 
> "org.apache.kafka.connect.transforms.SetSchemaMetadata.Key",
> 
> "org.apache.kafka.connect.transforms.SetSchemaMetadata.Value",
> "org.apache.kafka.connect.transforms.TimestampRouter",
> "org.apache.kafka.connect.transforms.ValueToKey"
> ],
> {code}
> Since the Key / Value transformations are defined as static nested classes, 
> the proper notation is
> {code}
> "transforms": "replaceFieldValue",
> "transforms.replaceFieldValue.type":  
> "org.apache.kafka.connect.transforms.ReplaceField$Value"
> {code}





Build failed in Jenkins: kafka-0.10.2-jdk7 #109

2017-03-22 Thread Apache Jenkins Server
See 


Changes:

[wangguoz] MINOR: MyProcessor doc example should implement, not extend 
`Processor`

--
[...truncated 618.30 KB...]

org.apache.kafka.clients.producer.KafkaProducerTest > testSerializerClose PASSED

org.apache.kafka.clients.producer.KafkaProducerTest > 
testInterceptorConstructClose STARTED

org.apache.kafka.clients.producer.KafkaProducerTest > 
testInterceptorConstructClose PASSED

org.apache.kafka.clients.producer.KafkaProducerTest > testPartitionerClose 
STARTED

org.apache.kafka.clients.producer.KafkaProducerTest > testPartitionerClose 
PASSED

org.apache.kafka.clients.producer.KafkaProducerTest > testMetadataFetch STARTED

org.apache.kafka.clients.producer.KafkaProducerTest > testMetadataFetch PASSED

org.apache.kafka.clients.producer.KafkaProducerTest > 
testMetadataFetchOnStaleMetadata STARTED

org.apache.kafka.clients.producer.KafkaProducerTest > 
testMetadataFetchOnStaleMetadata PASSED

org.apache.kafka.clients.MetadataTest > testListenerCanUnregister STARTED

org.apache.kafka.clients.MetadataTest > testListenerCanUnregister PASSED

org.apache.kafka.clients.MetadataTest > testTopicExpiry STARTED

org.apache.kafka.clients.MetadataTest > testTopicExpiry PASSED

org.apache.kafka.clients.MetadataTest > testFailedUpdate STARTED

org.apache.kafka.clients.MetadataTest > testFailedUpdate PASSED

org.apache.kafka.clients.MetadataTest > testMetadataUpdateWaitTime STARTED

org.apache.kafka.clients.MetadataTest > testMetadataUpdateWaitTime PASSED

org.apache.kafka.clients.MetadataTest > testUpdateWithNeedMetadataForAllTopics 
STARTED

org.apache.kafka.clients.MetadataTest > testUpdateWithNeedMetadataForAllTopics 
PASSED

org.apache.kafka.clients.MetadataTest > testClusterListenerGetsNotifiedOfUpdate 
STARTED

org.apache.kafka.clients.MetadataTest > testClusterListenerGetsNotifiedOfUpdate 
PASSED

org.apache.kafka.clients.MetadataTest > testTimeToNextUpdate_RetryBackoff 
STARTED

org.apache.kafka.clients.MetadataTest > testTimeToNextUpdate_RetryBackoff PASSED

org.apache.kafka.clients.MetadataTest > testMetadata STARTED

org.apache.kafka.clients.MetadataTest > testMetadata PASSED

org.apache.kafka.clients.MetadataTest > testTimeToNextUpdate_OverwriteBackoff 
STARTED

org.apache.kafka.clients.MetadataTest > testTimeToNextUpdate_OverwriteBackoff 
PASSED

org.apache.kafka.clients.MetadataTest > testTimeToNextUpdate STARTED

org.apache.kafka.clients.MetadataTest > testTimeToNextUpdate PASSED

org.apache.kafka.clients.MetadataTest > testListenerGetsNotifiedOfUpdate STARTED

org.apache.kafka.clients.MetadataTest > testListenerGetsNotifiedOfUpdate PASSED

org.apache.kafka.clients.MetadataTest > testNonExpiringMetadata STARTED

org.apache.kafka.clients.MetadataTest > testNonExpiringMetadata PASSED

org.apache.kafka.clients.ClientUtilsTest > testOnlyBadHostname STARTED

org.apache.kafka.clients.ClientUtilsTest > testOnlyBadHostname PASSED

org.apache.kafka.clients.ClientUtilsTest > testParseAndValidateAddresses STARTED

org.apache.kafka.clients.ClientUtilsTest > testParseAndValidateAddresses PASSED

org.apache.kafka.clients.ClientUtilsTest > testNoPort STARTED

org.apache.kafka.clients.ClientUtilsTest > testNoPort PASSED

org.apache.kafka.clients.NodeApiVersionsTest > 
testUsableVersionCalculationNoKnownVersions STARTED

org.apache.kafka.clients.NodeApiVersionsTest > 
testUsableVersionCalculationNoKnownVersions PASSED

org.apache.kafka.clients.NodeApiVersionsTest > testVersionsToString STARTED

org.apache.kafka.clients.NodeApiVersionsTest > testVersionsToString PASSED

org.apache.kafka.clients.NodeApiVersionsTest > testUnsupportedVersionsToString 
STARTED

org.apache.kafka.clients.NodeApiVersionsTest > testUnsupportedVersionsToString 
PASSED

org.apache.kafka.clients.NodeApiVersionsTest > testUnknownApiVersionsToString 
STARTED

org.apache.kafka.clients.NodeApiVersionsTest > testUnknownApiVersionsToString 
PASSED

org.apache.kafka.clients.NodeApiVersionsTest > testUsableVersionCalculation 
STARTED

org.apache.kafka.clients.NodeApiVersionsTest > testUsableVersionCalculation 
PASSED

org.apache.kafka.clients.NodeApiVersionsTest > testUsableVersionLatestVersions 
STARTED

org.apache.kafka.clients.NodeApiVersionsTest > testUsableVersionLatestVersions 
PASSED
:clients:determineCommitId UP-TO-DATE
:clients:createVersionFile
:clients:jar UP-TO-DATE
:core:compileJava UP-TO-DATE
:core:compileScala
:79:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.

org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP
 ^
:36:

[GitHub] kafka pull request #2724: MINOR: log state store restore offsets

2017-03-22 Thread dguy
Github user dguy closed the pull request at:

https://github.com/apache/kafka/pull/2724




Re: [VOTE] KIP-123: Allow per stream/table timestamp extractor

2017-03-22 Thread Guozhang Wang
Jeyhun,

Could you update the status of this KIP since it has been some time since
the last vote?

I'm +1 besides the minor comments mentioned above.


Guozhang


On Mon, Mar 6, 2017 at 3:14 PM, Jeyhun Karimov  wrote:

> Sorry I was late. I will update javadocs in related methods to emphasize
> that TimestampExtractor is stateless.
>
> Cheers,
> Jeyhun
>
> On Mon, Mar 6, 2017 at 8:17 PM Guozhang Wang  wrote:
>
> > 1) Sounds good.
> >
> > 2) Yeah what I meant is to emphasize that TimestampExtractor to be
> > stateless in the docs somewhere.
> >
> >
> > Guozhang
> >
> >
> > On Sun, Mar 5, 2017 at 4:27 PM, Matthias J. Sax 
> > wrote:
> >
> > > Guozhang,
> > >
> > > about renaming the config parameters. I like this idea, but want to
> > > point out, that this change should be done in a backward compatible
> way.
> > > Thus, we need to keep (and only deprecate) the current parameter names.
> > >
> > > I am not sure about (2)? What do you worry about? Using a "stateful
> > > extractor"? -- this would be an antipattern IMHO. We can clarify that a
> > > TimestampExtrator should be stateless though (even if this should be
> > > clear).
> > >
> > >
> > > -Matthias
> > >
> > >
> > > On 3/4/17 6:36 PM, Guozhang Wang wrote:
> > > > Jeyhun,
> > > >
> > > > Thanks for proposing this KIP! And sorry for getting late in the
> > > discussion.
> > > >
> > > > I have a general suggestion not directly related to this KIP and a
> > couple
> > > > of comments for this KIP here:
> > > >
> > > > I agree with Mathieu's observation, partly because we are now having
> > lots
> > > > of overloaded functions both in the DSL and in PAPI, and it would be
> > > quite
> > > > confusing to users. As Matthias mentioned we do have some plans to
> > > refactor
> > > > this API, but just wanted to point it out that this KIP may likely
> urge
> > > us
> > > > to do the API refactoring sooner than planned. My personal preference
> > > would
> > > > be doing that the next release (i.e. 0.11.0.0 in June).
> > > >
> > > >
> > > > Now some detailed comments:
> > > >
> > > > 1. I'd suggest change TIMESTAMP_EXTRACTOR_CLASS_CONFIG to
> > > > "default.timestamp.extractor" or "global.timestamp.extractor" (also
> the
> > > > Java variable name can be changed accordingly) along with this
> change.
> > In
> > > > addition, maybe we can piggy-backing this to also rename
> > > > KEY_SERDE_CLASS_CONFIG/VALUE_SERDE_CLASS_CONFIG to "default.key.."
> etc
> > > in
> > > > this KIP.
> > > >
> > > > 2. Another thing we should consider, is that since now we could
> > > potentially
> > > > use multiple timestamp extractor instances than a single one, this
> may
> > be
> > > > breaking if user's customization did some global bookkeeping based on
> > the
> > > > previous assumption (maybe a wild thought but e.g. keeping track some
> > > > global counts in the extractor as a local variable). We need to
> clarify
> > > > this change in the javadoc and also potentially in the upgrade web
> doc
> > > > sections.
> > > >
> > > >
> > > >
> > > > Guozhang
> > > >
> > > >
> > > > On Wed, Mar 1, 2017 at 6:09 AM, Michael Noll 
> > > wrote:
> > > >
> > > >> +1 (non-binding)
> > > >>
> > > >> Thanks for the KIP!
> > > >>
> > > >> On Wed, Mar 1, 2017 at 1:49 PM, Bill Bejeck 
> > wrote:
> > > >>
> > > >>> +1
> > > >>>
> > > >>> Thanks
> > > >>> Bill
> > > >>>
> > > >>> On Wed, Mar 1, 2017 at 5:06 AM, Eno Thereska <
> eno.there...@gmail.com
> > >
> > > >>> wrote:
> > > >>>
> > >  +1 (non binding).
> > > 
> > >  Thanks
> > >  Eno
> > > > On 28 Feb 2017, at 17:22, Matthias J. Sax  >
> > > >>> wrote:
> > > >
> > > > +1
> > > >
> > > > Thanks a lot for the KIP!
> > > >
> > > > -Matthias
> > > >
> > > >
> > > > On 2/28/17 1:35 AM, Damian Guy wrote:
> > > >> Thanks for the KIP Jeyhun!
> > > >>
> > > >> +1
> > > >>
> > > >> On Tue, 28 Feb 2017 at 08:59 Jeyhun Karimov <
> je.kari...@gmail.com
> > >
> > >  wrote:
> > > >>
> > > >>> Dear community,
> > > >>>
> > > >>> I'd like to start the vote for KIP-123:
> > > >>> https://cwiki.apache.org/confluence/pages/viewpage.
> > >  action?pageId=68714788
> > > >>>
> > > >>>
> > > >>> Cheers,
> > > >>> Jeyhun
> > > >>> --
> > > >>> -Cheers
> > > >>>
> > > >>> Jeyhun
> > > >>>
> > > >>
> > > >
> > > 
> > > 
> > > >>>
> > > >>
> > > >
> > > >
> > > >
> > >
> > >
> >
> >
> > --
> > -- Guozhang
> >
> --
> -Cheers
>
> Jeyhun
>



-- 
-- Guozhang
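
A recurring point in the KIP-123 thread above is that a TimestampExtractor must be stateless: no mutable fields and no global bookkeeping, because multiple extractor instances may be created (one per stream/table). Below is a minimal, self-contained sketch of that shape. Note the `Record` class and `Extractor` interface here are simplified stand-ins invented for illustration, not Kafka classes; the real `TimestampExtractor#extract` takes a `ConsumerRecord` and a previous timestamp.

```java
public class StatelessExtractorSketch {

    // Minimal record stand-in for the example (hypothetical, not a Kafka class).
    static final class Record {
        final long payloadTimestamp;
        Record(long ts) { this.payloadTimestamp = ts; }
    }

    // Mirrors the shape of TimestampExtractor#extract: derive the timestamp
    // from the record alone, falling back to the previous timestamp for bad
    // values. No state is read or written -- the lambda below is stateless.
    interface Extractor {
        long extract(Record record, long previousTimestamp);
    }

    static final Extractor PAYLOAD_TIME =
        (record, previous) -> record.payloadTimestamp > 0 ? record.payloadTimestamp : previous;

    public static void main(String[] args) {
        long t1 = PAYLOAD_TIME.extract(new Record(1490L), 100L);
        long t2 = PAYLOAD_TIME.extract(new Record(-1L), 100L);  // invalid -> fall back
        if (t1 != 1490L || t2 != 100L) throw new AssertionError();
        System.out.println("stateless extractor sketch ok");
    }
}
```

A stateful variant (e.g. keeping a running count in a field, as Guozhang warns against) would silently break once several instances are created.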


[jira] [Resolved] (KAFKA-4732) Unstable test: KStreamKTableJoinIntegrationTest.shouldCountClicksPerRegion[1]

2017-03-22 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang resolved KAFKA-4732.
--
Resolution: Fixed

As part of https://issues.apache.org/jira/browse/KAFKA-4859.

> Unstable test: KStreamKTableJoinIntegrationTest.shouldCountClicksPerRegion[1]
> -
>
> Key: KAFKA-4732
> URL: https://issues.apache.org/jira/browse/KAFKA-4732
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, unit tests
>Reporter: Matthias J. Sax
>Assignee: Guozhang Wang
>
> {noformat}
> java.lang.AssertionError: Condition not met within timeout 3. Expecting 3 
> records from topic output-topic-2 while only received 0: []
>   at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:259)
>   at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:221)
>   at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:190)
>   at 
> org.apache.kafka.streams.integration.KStreamKTableJoinIntegrationTest.shouldCountClicksPerRegion(KStreamKTableJoinIntegrationTest.java:295)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (KAFKA-4732) Unstable test: KStreamKTableJoinIntegrationTest.shouldCountClicksPerRegion[1]

2017-03-22 Thread Matthias J. Sax (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax reassigned KAFKA-4732:
--

Assignee: Guozhang Wang  (was: Matthias J. Sax)

> Unstable test: KStreamKTableJoinIntegrationTest.shouldCountClicksPerRegion[1]
> -
>
> Key: KAFKA-4732
> URL: https://issues.apache.org/jira/browse/KAFKA-4732
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, unit tests
>Reporter: Matthias J. Sax
>Assignee: Guozhang Wang
>
> {noformat}
> java.lang.AssertionError: Condition not met within timeout 3. Expecting 3 
> records from topic output-topic-2 while only received 0: []
>   at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:259)
>   at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:221)
>   at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:190)
>   at 
> org.apache.kafka.streams.integration.KStreamKTableJoinIntegrationTest.shouldCountClicksPerRegion(KStreamKTableJoinIntegrationTest.java:295)
> {noformat}





Re: [DISCUSS] KIP 130: Expose states of active tasks to KafkaStreams public API

2017-03-22 Thread Guozhang Wang
Thanks for the updated KIP, and sorry for the late replies!

I think a little bit more about KIP-130, and I feel that if we are going to
deprecate the `toString` function (it is not explicitly said in the KIP, so
I'm not sure if you plan to still keep the `KafkaStreams#toString` as is or
are going to replace it with the proposed APIs) with the proposed ones, it
may be okay. More specifically, after both KIP-120 and KIP-130:

1. users can use `#describe` function to check the generated topology
before calling `KafkaStreams#start`, which is static information.
2. users can use the `StreamsMetadata -> ThreadMetadata -> TaskMetadata`
programmatically after called `KafkaStreams#start` to get the dynamically
changeable information.

One thing I'm still not sure though, is that in `TaskMetadata` we only have
the TaskId and assigned partitions, whereas in "TopologyDescription"
introduced in KIP-120, it will simply describe the whole topology possibly
composed of multiple sub-topologies. So it is hard for users to tell which
sub-topology is executed under which task on-the-fly.

Hence I'm thinking if we can expose the "sub-topology-id" (named as
topicsGroupId internally) in TopologyDescription#Subtopology, and then from
the task id which is essentially "sub-topology-id DASH partition-group-id"
users can make the link, though it is still not that straight-forward.

Thoughts?

Guozhang



On Wed, Mar 15, 2017 at 3:16 PM, Florian Hussonnois 
wrote:

> Thanks Guozhang for pointing me to the KIP-120.
>
> I've made some modifications to the KIP. I also proposed a new PR (there is
> still some tests to make).
> https://cwiki.apache.org/confluence/display/KAFKA/KIP+
> 130%3A+Expose+states+of+active+tasks+to+KafkaStreams+public+API
>
> Exposing consumed offsets through JMX is sufficient for debugging purpose.
> But I think this could be part to another JIRA as there is no impact to
> public API.
>
> Thanks
>
> 2017-03-10 22:35 GMT+01:00 Guozhang Wang :
>
> > Hello Florian,
> >
> > As for programmatically discover monitoring data by piping metrics into a
> > dedicated topic. I think you can actually use a KafkaMetricsReporter
> which
> > pipes the polled metric values into a pre-defined topic (note that in
> Kafka
> > the MetricsReporter is simply an interface and users can build their own
> > impl in addition to the JMXReporter), for example :
> >
> > https://github.com/krux/kafka-metrics-reporter
> >
> > As for the "static task-level assignment", what I meant is that the
> mapping
> > from source-topic-partitions -> tasks are static, via the
> > "PartitionGrouper", and a task won't switch from an active task to a
> > standby task, it is actually that an active task could be migrated, as a
> > whole along with all its assigned partitions, to another thread / process
> > and a new standby task will be created on the host that this active task is
> > migrating from. So for the SAME task, its
> > taskMetadata.assignedPartitions() will always return you the same
> > partitions.
> >
> > As for the `toString` function that what we have today, I feel it has
> some
> > correlations with KIP-120 so I'm trying to coordinate some discussions
> here
> > (cc'ing Matthias as the owner of KIP-120). My understand is that:
> >
> > 1. In KIP-120, the `toString` function of `KafkaStreams` will be removed
> > and instead the `Topology#describe` function will be introduced for users
> > to debug the topology BEFORE start running their instance with the
> > topology. And hence the description won't contain any task information as
> > they are not formed yet.
> > 2. In KIP-130, we want to add the task-level information for monitoring
> > purposes, which is not static and can only be captured AFTER the instance
> > has started running. Again I'm wondering for KIP-130 alone if adding
> those
> > metrics mentioned in my previous email would suffice even for the use
> case
> > that you have mentioned.
> >
> >
> > Guozhang
> >
> > On Wed, Mar 8, 2017 at 3:18 PM, Florian Hussonnois <
> fhussonn...@gmail.com>
> > wrote:
> >
> > > Hi Guozhang
> > >
> > > Thank you for your feedback. I've started to look more deeply into the
> > > code. As you mention, it would be more clever to use the current
> > > StreamMetadata API to expose these information.
> > >
> > > I think exposing metrics through JMX is great for building monitoring
> > > dashboards using some tools like jmxtrans and grafana.
> > > But for our use case we would like to expose the states directely from
> > the
> > > application embedding the kstreams topologies.
> > > So we expect to be able to retrieve states in a programmatic way.
> > >
> > > For instance, we could imagin to produce those states into a dedicated
> > > topic. In that way a third application could automatically discover all
> > > kafka-streams applications which could be monitored.
> > > In production environment, that can be clearly a solution to have a
> > > complete overview of 
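
On the link Guozhang proposes above between `TopologyDescription#Subtopology` and running tasks: a task id embeds the sub-topology (topicsGroup) id and the partition-group id, so a monitoring tool could map one to the other by parsing it. A sketch, assuming the underscore separator seen in the `taskId=1_225` log lines quoted elsewhere in this digest (the helper class and method names are made up):

```java
public class TaskIdSketch {
    // Splits a task id like "1_225" into its two components.
    static int subTopologyId(String taskId) {
        return Integer.parseInt(taskId.substring(0, taskId.indexOf('_')));
    }

    static int partitionGroupId(String taskId) {
        return Integer.parseInt(taskId.substring(taskId.indexOf('_') + 1));
    }

    public static void main(String[] args) {
        // With a TopologyDescription exposing sub-topology ids (as proposed),
        // this mapping tells which sub-topology a running task executes.
        if (subTopologyId("1_225") != 1 || partitionGroupId("1_225") != 225)
            throw new AssertionError();
        System.out.println("task 1_225 -> sub-topology 1, partition group 225");
    }
}
```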

[jira] [Commented] (KAFKA-4912) Add check for topic name length

2017-03-22 Thread Matthias J. Sax (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15936975#comment-15936975
 ] 

Matthias J. Sax commented on KAFKA-4912:


I am actually not 100% sure -- I guess, it depends how KAFKA-4893 is going to 
be solved. Maybe we don't need to do anything. I created the Jira to make sure 
we don't miss anything. \cc [~ijuma]

> Add check for topic name length
> ---
>
> Key: KAFKA-4912
> URL: https://issues.apache.org/jira/browse/KAFKA-4912
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Sharad
>Priority: Minor
>  Labels: newbie
>
> We should check topic name length (for internal topics, and maybe for source 
> topics? -> in case {{topic.auto.create}} is enabled, this might prevent 
> problems), and raise an exception if they are too long. Cf. KAFKA-4893
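
A sketch of the kind of check proposed above, assuming Kafka's topic-name rules at the time of writing: at most 249 characters, drawn from ASCII alphanumerics plus '.', '_' and '-'. Internal topic names (e.g. changelog topics of the form "appId-storeName-changelog") would be validated the same way before auto-creation. The class and method names are made up for illustration.

```java
public class TopicNameCheck {
    static final int MAX_NAME_LENGTH = 249;  // assumed limit, see above

    static void validate(String topic) {
        if (topic.isEmpty() || topic.length() > MAX_NAME_LENGTH)
            throw new IllegalArgumentException(
                "Topic name length must be 1.." + MAX_NAME_LENGTH + ": " + topic);
        if (!topic.matches("[a-zA-Z0-9._-]+"))
            throw new IllegalArgumentException("Illegal characters in topic name: " + topic);
    }

    public static void main(String[] args) {
        validate("my-app-clicks-store-changelog");  // passes
        StringBuilder tooLong = new StringBuilder();
        for (int i = 0; i < 250; i++) tooLong.append('a');
        try {
            validate(tooLong.toString());
            throw new AssertionError("expected rejection of 250-char name");
        } catch (IllegalArgumentException expected) {
            System.out.println("overlong topic name rejected");
        }
    }
}
```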





[GitHub] kafka pull request #2723: MINOR: MyProcessor doc example should implement, n...

2017-03-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2723




[jira] [Created] (KAFKA-4936) Allow dynamic routing of output records

2017-03-22 Thread Matthias J. Sax (JIRA)
Matthias J. Sax created KAFKA-4936:
--

 Summary: Allow dynamic routing of output records
 Key: KAFKA-4936
 URL: https://issues.apache.org/jira/browse/KAFKA-4936
 Project: Kafka
  Issue Type: Improvement
  Components: streams
Reporter: Matthias J. Sax


Currently, all used output topics must be known beforehand, and thus, it's not 
possible to send output records to topics in a dynamic fashion.

There have been a couple of requests for this feature and we should consider 
adding it. There are many open questions, however, with regard to topic creation 
and configuration (replication factor, number of partitions) etc.
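
To make the request concrete: dynamic routing would mean choosing the destination topic per record at processing time, e.g. from the record's key. The sketch below is purely hypothetical -- no such Kafka Streams API exists as of this discussion, and the topic names and chooser function are invented for illustration.

```java
import java.util.function.BiFunction;

public class DynamicRoutingSketch {
    // A user-supplied function picks the destination topic per (key, value).
    // Here: keys starting with "audit-" go to a dedicated topic; everything
    // else goes to a default topic. Both topic names are made up.
    static final BiFunction<String, String, String> TOPIC_CHOOSER =
        (key, value) -> key.startsWith("audit-") ? "audit-events" : "main-events";

    public static void main(String[] args) {
        if (!TOPIC_CHOOSER.apply("audit-42", "x").equals("audit-events"))
            throw new AssertionError();
        if (!TOPIC_CHOOSER.apply("user-7", "y").equals("main-events"))
            throw new AssertionError();
        System.out.println("routing sketch ok");
    }
}
```

The open questions in the ticket apply directly here: if "audit-events" does not exist yet, who creates it, and with what replication factor and partition count?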





[jira] [Commented] (KAFKA-4919) Document that stores must not be closed when Processors are closed

2017-03-22 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15936891#comment-15936891
 ] 

Guozhang Wang commented on KAFKA-4919:
--

[~elevy] There is a PR open for this ticket but not yet linked to it 
automatically: https://github.com/apache/kafka/pull/2725

Please feel free to share your opinions on that PR if you want.

> Document that stores must not be closed when Processors are closed
> --
>
> Key: KAFKA-4919
> URL: https://issues.apache.org/jira/browse/KAFKA-4919
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Elias Levy
>
> I have a streams job, that previously worked, that consumes and writes to a 
> large number of topics with many partitions and that uses many threads.  I 
> upgraded the job to 0.10.2.0.  The job now fails after a short time running, 
> seemingly after a rebalance.
> {quote}
> WARN  2017-03-19 18:03:20,432 [StreamThread-10][StreamThread.java:160] : 
> Unexpected state transition from RUNNING to NOT_RUNNING
> {quote}
> The first observation is that Streams is no longer outputting exceptions and 
> backtraces.  I had to add code to get this information.
> The exception:
> {quote}
> Exception: org.apache.kafka.streams.errors.StreamsException: Exception caught 
> in process. taskId=1_225, processor=KSTREAM-SOURCE-03, 
> topic=some_topic, partition=225, offset=266411
> org.apache.kafka.streams.errors.StreamsException: Exception caught in 
> process. taskId=1_225, processor=KSTREAM-SOURCE-03, topic=some_topic, 
> partition=225, offset=266411
>   at 
> org.apache.kafka.streams.processor.internals.StreamTask.process(StreamTask.java:216)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:641)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:368)
> Caused by: org.apache.kafka.streams.errors.InvalidStateStoreException: Store 
> someStore-201701060400 is currently closed
>   at 
> org.apache.kafka.streams.state.internals.RocksDBStore.validateStoreOpen(RocksDBStore.java:205)
>   at 
> org.apache.kafka.streams.state.internals.RocksDBStore.put(RocksDBStore.java:221)
>   at 
> org.apache.kafka.streams.state.internals.RocksDBSegmentedBytesStore.put(RocksDBSegmentedBytesStore.java:74)
>   at 
> org.apache.kafka.streams.state.internals.ChangeLoggingSegmentedBytesStore.put(ChangeLoggingSegmentedBytesStore.java:54)
>   at 
> org.apache.kafka.streams.state.internals.MeteredSegmentedBytesStore.put(MeteredSegmentedBytesStore.java:101)
>   at 
> org.apache.kafka.streams.state.internals.RocksDBWindowStore.put(RocksDBWindowStore.java:109)
>   ... X more
> {quote}
> The error occurs for many partitions.
> This was preceded by (for this partition):
> {quote}
> INFO  2017-03-19 18:03:16,403 [StreamThread-10][ConsumerCoordinator.java:393] 
> : Revoking previously assigned partitions [some_topic-225] for group some_job
> INFO  2017-03-19 18:03:16,403 [StreamThread-10][StreamThread.java:254] : 
> stream-thread [StreamThread-10] partitions [[some_topic-225]] revoked at the 
> beginning of consumer rebalance.
> INFO  2017-03-19 18:03:16,403 [StreamThread-10][StreamThread.java:1056] : 
> stream-thread [StreamThread-10] Closing a task's topology 1_225
> INFO  2017-03-19 18:03:17,887 [StreamThread-10][StreamThread.java:544] : 
> stream-thread [StreamThread-10] Flushing state stores of task 1_225
> INFO  2017-03-19 18:03:17,887 [StreamThread-10][StreamThread.java:534] : 
> stream-thread [StreamThread-10] Committing consumer offsets of task 1_225
> INFO  2017-03-19 18:03:17,891 [StreamThread-10][StreamThread.java:1012] : 
> stream-thread [StreamThread-10] Updating suspended tasks to contain active 
> tasks [[1_225, 0_445, 0_30]]
> INFO  2017-03-19 18:03:17,891 [StreamThread-10][StreamThread.java:1019] : 
> stream-thread [StreamThread-10] Removing all active tasks [[1_225, 0_445, 
> 0_30]]
> INFO  2017-03-19 18:03:19,925 [StreamThread-10][ConsumerCoordinator.java:252] 
> : Setting newly assigned partitions [some_tpoic-225] for group some_job
> INFO  2017-03-19 18:03:19,927 [StreamThread-10][StreamThread.java:228] : 
> stream-thread [StreamThread-10] New partitions [[some_topic-225]] assigned at 
> the end of consumer rebalance.
> INFO  2017-03-19 18:03:19,929 [StreamThread-10][StreamTask.java:333] : task 
> [1_225] Initializing processor nodes of the topology
> Something happens.  What ???
> INFO  2017-03-19 18:03:20,135 [StreamThread-10][StreamThread.java:1045] : 
> stream-thread [StreamThread-10] Closing a task 1_225
> INFO  2017-03-19 18:03:20,355 [StreamThread-10][StreamThread.java:544] : 
> stream-thread [StreamThread-10] Flushing state stores of task 1_225
> INFO  2017-03-19 

Re: [DISCUSS] KIP-120: Cleanup Kafka Streams builder API

2017-03-22 Thread Guozhang Wang
Regarding the naming of `StreamsTopologyBuilder` v.s. `StreamsBuilder` that
are going to be used in DSL, I agree both has their arguments:

1. On one side, people using the DSL layer probably do not need to be aware
(or rather, "learn about") of the "topology" concept, although this concept
is a publicly exposed one in Kafka Streams.

2. On the other side, StreamsBuilder#build() returning a Topology object
sounds a little weird, at least to me (admittedly subjective matter).


Since the second bullet point seems to be more "subjective" and many people
are not worried about it, I'm OK to go with the other option.


Guozhang


On Wed, Mar 22, 2017 at 8:58 AM, Michael Noll  wrote:

> Forwarding to kafka-user.
>
>
> -- Forwarded message --
> From: Michael Noll 
> Date: Wed, Mar 22, 2017 at 8:48 AM
> Subject: Re: [DISCUSS] KIP-120: Cleanup Kafka Streams builder API
> To: dev@kafka.apache.org
>
>
> Matthias,
>
> > @Michael:
> >
> > You seemed to agree with Jay about not exposing the `Topology` concept
> > in our main entry class (ie, current KStreamBuilder), thus, I
> > interpreted that you do not want `Topology` in the name either (I am a
> > little surprised by your last response, that goes the opposite
> direction).
>
> Oh, sorry for not being clear.
>
> What I wanted to say in my earlier email was the following:  Yes, I do
> agree with most of Jay's reasoning, notably about carefully deciding how
> much and which parts of the API/concept "surface" we expose to users of the
> DSL.  However, and this is perhaps where I wasn't very clear, I disagree on
> the particular opinion about not exposing the topology concept to DSL
> users.  Instead, I think the concept of a topology is important to
> understand even for DSL users -- particularly because of the way the DSL is
> currently wiring your processing logic via the builder pattern.  (As I
> noted, e.g. Akka uses a different approach where you might be able to get
> away with not exposing the "topology" concept, but even in Akka there's the
> notion of graphs and flows.)
>
>
> > > StreamsBuilder builder = new StreamsBuilder();
> > >
> > > // And here you'd define your...well, what actually?
> > > // Ah right, you are composing a topology here, though you are not
> > > aware of it.
> >
> > Yes. You are not aware of if -- that's the whole point about it -- don't
> > put the Topology concept in the focus...
>
> Let me turn this around, because that was my point: it's confusing to have
> a name "StreamsBuilder" if that thing isn't building streams, and it is
> not.
>
> As I mentioned before, I do think it is a benefit to make it clear to DSL
> users that there are two aspects at play: (1) defining the logic/plan of
> your processing, and (2) the execution of that plan.  I have a less strong
> opinion whether or not having "topology" in the names would help to
> communicate this separation as well as combination of (1) and (2) to make
> your app work as expected.
>
> If we stick with `KafkaStreams` for (2) *and* don't like having "topology"
> in the name, then perhaps we should rename `KStreamBuilder` to
> `KafkaStreamsBuilder`.  That at least gives some illusion of a combo of (1)
> and (2).  IMHO, `KafkaStreamsBuilder` highlights better that "it is a
> builder/helper for the Kafka Streams API", rather than "a builder for
> streams".
>
> Also, I think some of the naming challenges we're discussing here are
> caused by having this builder pattern in the first place.  If the Streams
> API was implemented in Scala, for example, we could use implicits for
> helping us to "stitch streams/tables together to build the full topology",
> thus using a different (better?) approach to composing your topologies that
> through a builder pattern.  So: perhaps there's a better way then the
> builder, and that way would also be clearer on terminology?  That said,
> this might take this KIP off-scope.
>
> -Michael
>
>
>
>
> On Wed, Mar 22, 2017 at 12:33 AM, Matthias J. Sax 
> wrote:
>
> > @Guozhang:
> >
> > I recognized that you want to have `Topology` in the name. But it seems
> > that more people preferred to not have it (Jay, Ram, Michael [?],
> myself).
> >
> > @Michael:
> >
> > You seemed to agree with Jay about not exposing the `Topology` concept
> > in our main entry class (ie, current KStreamBuilder), thus, I
> > interpreted that you do not want `Topology` in the name either (I am a
> > little surprised by your last response, that goes the opposite
> direction).
> >
> > > StreamsBuilder builder = new StreamsBuilder();
> > >
> > > // And here you'd define your...well, what actually?
> > > // Ah right, you are composing a topology here, though you are not
> > > aware of it.
> >
> > Yes. You are not aware of if -- that's the whole point about it -- don't
> > put the Topology concept in the focus...
> >
> > Furthermore,
> >
> > >>> So what are you building here with 

Re: [DISCUSS] KIP-126 - Allow KafkaProducer to batch based on uncompressed size

2017-03-22 Thread Dong Lin
Never mind about my second comment. I misunderstood the semantics of
producer's batch.size.

On Wed, Mar 22, 2017 at 10:20 AM, Dong Lin  wrote:

> Hey Becket,
>
> In addition to the batch-split-rate, should we also add batch-split-ratio
> sensor to gauge the probability that we have to split batch?
>
> Also, in the case that the batch size configured for the producer is
> smaller than the max message size configured for the broker, why can't we
> just split the batch if its size exceeds the configured batch size? The
> benefit of this approach is that the semantics of producer is
> straightforward because we enforce the batch size that user has configured.
> The implementation would also be simpler because we don't have to reply on
> KIP-4 to fetch the max message size from broker. I guess you are worrying
> about the overhead of "unnecessary" split if a batch size is between
> user-configured batch size and broker's max message size. But is overhead
> really a concern? If overhead is too large because user has configured a
> very low batch size for producer, shouldn't user adjust produce config?
>
> Thanks,
> Dong
>
> On Wed, Mar 15, 2017 at 2:50 PM, Becket Qin  wrote:
>
>> I see, then we are thinking about the same thing :)
>>
>> On Wed, Mar 15, 2017 at 2:26 PM, Ismael Juma  wrote:
>>
>> > I meant finishing what's described in the following section and then
>> > starting a discussion followed by a vote:
>> >
>> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>> > 4+-+Command+line+and+centralized+administrative+operations#KIP-4-
>> > Commandlineandcentralizedadministrativeoperations-DescribeCo
>> nfigsRequest
>> >
>> > We have only voted on KIP-4 Metadata, KIP-4 Create Topics, KIP-4 Delete
>> > Topics so far.
>> >
>> > Ismael
>> >
>> > On Wed, Mar 15, 2017 at 8:58 PM, Becket Qin 
>> wrote:
>> >
>> > > Hi Ismael,
>> > >
>> > > KIP-4 is also the one that I was thinking about. We have introduced a
>> > > DescribeConfigRequest there so the producer can easily get the
>> > > configurations. By "another KIP" do you mean a new (or maybe extended)
>> > > protocol or using that protocol in clients?
>> > >
>> > > Thanks,
>> > >
>> > > Jiangjie (Becket) Qin
>> > >
>> > > On Wed, Mar 15, 2017 at 1:21 PM, Ismael Juma 
>> wrote:
>> > >
>> > > > Hi Becket,
>> > > >
>> > > > How were you thinking of retrieving the configuration items you
>> > > mentioned?
>> > > > I am asking because I was planning to post a KIP for Describe
>> Configs
>> > > (one
>> > > > of the protocols in KIP-4), which would expose such information. But
>> > > maybe
>> > > > you are thinking of extending Metadata request?
>> > > >
>> > > > Ismael
>> > > >
>> > > > On Wed, Mar 15, 2017 at 7:33 PM, Becket Qin 
>> > > wrote:
>> > > >
>> > > > > Hi Jason,
>> > > > >
>> > > > > Good point. I was thinking about that, too. I was not sure if
>> that is
>> > > the
>> > > > > right thing to do by default.
>> > > > >
>> > > > > If we assume people always set the batch size to max message size,
>> > > > > splitting the oversized batch makes a lot of sense. But it seems
>> > > possible
>> > > > > that users want to control the memory footprint so they would set
>> the
>> > > > batch
>> > > > > size to smaller than the max message size so the producer can have
>> > hold
>> > > > > batches for more partitions. In this case, splitting the batch
>> might
>> > > not
>> > > > be
>> > > > > the desired behavior.
>> > > > >
>> > > > > I think the most intuitive approach to this is allow the producer
>> to
>> > > get
>> > > > > the max message size configuration (as well as some other
>> > > configurations
>> > > > > such as timestamp type)  from the broker side and use that to
>> decide
>> > > > > whether a batch should be split or not. I probably should add
>> this to
>> > > the
>> > > > > KIP wiki.
>> > > > >
>> > > > > Thanks,
>> > > > >
>> > > > > Jiangjie (Becket) Qin
>> > > > >
>> > > > > On Wed, Mar 15, 2017 at 9:47 AM, Jason Gustafson <
>> ja...@confluent.io
>> > >
>> > > > > wrote:
>> > > > >
>> > > > > > Hey Becket,
>> > > > > >
>> > > > > > Thanks for the KIP! The approach seems reasonable. One
>> > clarification:
>> > > > is
>> > > > > > the intent to do the splitting after the broker rejects the
>> request
>> > > > with
>> > > > > > MESSAGE_TOO_LARGE, or prior to sending if the configured batch
>> size
>> > > is
>> > > > > > exceeded?
>> > > > > >
>> > > > > > -Jason
>> > > > > >
>> > > > > > On Mon, Mar 13, 2017 at 8:10 PM, Becket Qin <
>> becket@gmail.com>
>> > > > > wrote:
>> > > > > >
>> > > > > > > Bump up the thread for further comments. If there is no more
>> > > comments
>> > > > > on
>> > > > > > > the KIP I will start the voting thread on Wed.
>> > > > > > >
>> > > > > > > Thanks,
>> > > > > > >
>> > > > > > > Jiangjie (Becket) Qin
>> > > > > > >
>> > > > > > > On Tue, Mar 7, 2017 at 9:48 AM, Becket Qin <
>> 
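
The core mechanic debated in the KIP-126 thread above is re-packing an oversized batch into smaller ones, whether triggered by a MESSAGE_TOO_LARGE rejection or by a size limit known up front. The sketch below shows only the packing arithmetic; it is illustrative, not the producer's actual implementation, which would split serialized batches of records rather than lists of integer sizes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BatchSplitSketch {
    // Greedily re-packs record sizes (bytes) into batches, each under the
    // limit. A record larger than the limit still ends up alone in a batch,
    // mirroring the fact that splitting cannot shrink a single record.
    static List<List<Integer>> split(List<Integer> recordSizes, int maxBatchBytes) {
        List<List<Integer>> batches = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        int currentBytes = 0;
        for (int size : recordSizes) {
            if (currentBytes + size > maxBatchBytes && !current.isEmpty()) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(size);
            currentBytes += size;
        }
        if (!current.isEmpty()) batches.add(current);
        return batches;
    }

    public static void main(String[] args) {
        // 300+400 fits under 1000; adding 500 would exceed it, so a new batch starts.
        List<List<Integer>> out = split(Arrays.asList(300, 400, 500, 200), 1000);
        System.out.println(out);  // prints [[300, 400], [500, 200]]
    }
}
```

Dong's question above then becomes: should the threshold be the user's `batch.size`, or the broker's max message size fetched via a DescribeConfigs-style request?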

[jira] [Updated] (KAFKA-4935) Disable record level crc checks on the consumer with the KIP-98 message format

2017-03-22 Thread Apurva Mehta (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apurva Mehta updated KAFKA-4935:

Issue Type: Improvement  (was: Bug)

> Disable record level crc checks on the consumer with the KIP-98 message format
> --
>
> Key: KAFKA-4935
> URL: https://issues.apache.org/jira/browse/KAFKA-4935
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Apurva Mehta
>
> With the new message format proposed in KIP-98, the record level CRC has been 
> moved to the batch header. The consumer record API still allows retrieval 
> of CRC per record, and in this case we compute the CRC on the fly. This 
> degrades performance, and also such a computation becomes unnecessary with 
> the batch level CRC.
> We can address this by deprecating the record level CRC api. We can also work 
> around this by disabling record level checks when the check crc config is set 
> to false.
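
The workaround mentioned in the ticket maps to the existing `check.crcs` consumer setting. A minimal configuration sketch (the bootstrap address is a placeholder; the class name is made up):

```java
import java.util.Properties;

public class CrcConfigSketch {
    // Builds a consumer config with record-level CRC checking disabled via
    // "check.crcs", trading integrity verification for less CPU per record.
    static Properties consumerConfig() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");  // placeholder address
        props.put("check.crcs", "false");
        return props;
    }

    public static void main(String[] args) {
        // These props would then be passed to new KafkaConsumer<>(props).
        System.out.println("check.crcs=" + consumerConfig().getProperty("check.crcs"));
    }
}
```

With the KIP-98 batch-level CRC, the per-record check this disables would be redundant anyway, which is the ticket's point.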





[jira] [Commented] (KAFKA-4159) Allow overriding producer & consumer properties at the connector level

2017-03-22 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15936835#comment-15936835
 ] 

Ewen Cheslack-Postava commented on KAFKA-4159:
--

[~sjdurfey] Allowing configs like this is a public API change, so it'd need a 
KIP. You can find the instructions for a KIP here: 
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals 
It might sound like quite a bit of overhead, but for simple changes (especially 
additions with no compatibility concerns) it's usually very straightforward. 
Once that's through a PR could be merged.

That said, I'm curious about the use case. Why would multiple connectors need 
to share the same consumer group? Why couldn't you just change the number of 
tasks for the connector? This seems like quite an unusual use case, so I'm not 
sure it's something that makes sense in the framework. More generally, I 
personally tend to view the producer and consumer (and their configs) as 
implementation details -- I think exposing all configs might be masking other 
issues that could be better addressed by the framework. To me, the motivation 
section of a KIP would be important in this case. (But that's just my 
viewpoint, I know other people have asked about this in the past.)

> Allow overriding producer & consumer properties at the connector level
> --
>
> Key: KAFKA-4159
> URL: https://issues.apache.org/jira/browse/KAFKA-4159
> Project: Kafka
>  Issue Type: New Feature
>  Components: KafkaConnect
>Reporter: Shikhar Bhushan
>Assignee: Stephen Durfey
>
> As an example use case: overriding a sink connector's consumer's partition 
> assignment strategy.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-4935) Disable record level crc checks on the consumer with the KIP-98 message format

2017-03-22 Thread Apurva Mehta (JIRA)
Apurva Mehta created KAFKA-4935:
---

 Summary: Disable record level crc checks on the consumer with the 
KIP-98 message format
 Key: KAFKA-4935
 URL: https://issues.apache.org/jira/browse/KAFKA-4935
 Project: Kafka
  Issue Type: Bug
Reporter: Apurva Mehta


With the new message format proposed in KIP-98, the record level CRC has been 
moved to the batch header. The consumer record API still allows retrieval 
of CRC per record, and in this case we compute the CRC on the fly. This 
degrades performance, and also such a computation becomes unnecessary with the 
batch level CRC.

We can address this by deprecating the record level CRC api. We can also work 
around this by disabling record level checks when the check crc config is set 
to false.
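For reference, `check.crcs` is the existing consumer configuration referenced above; a minimal sketch of turning record-level CRC validation off (the broker address and group id are placeholders):

```java
import java.util.Properties;

public class NoCrcCheckConfig {
    // check.crcs=false disables record-level CRC validation on the
    // consumer, avoiding the on-the-fly computation described above.
    public static Properties build() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder
        props.put("group.id", "example-group");           // placeholder
        props.put("check.crcs", "false");
        return props;
    }
}
```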



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-4934) Add streams test with RocksDb failures

2017-03-22 Thread Eno Thereska (JIRA)
Eno Thereska created KAFKA-4934:
---

 Summary: Add streams test with RocksDb failures
 Key: KAFKA-4934
 URL: https://issues.apache.org/jira/browse/KAFKA-4934
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 0.10.2.0
Reporter: Eno Thereska
Priority: Critical
 Fix For: 0.11.0.0


We need to add either integration or system tests with RocksDb failing 
underneath and fix any problems that occur, including deadlocks, wrong 
exceptions, etc.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [VOTE] KIP-117: Add a public AdminClient API for Kafka admin operations

2017-03-22 Thread Colin McCabe
On Fri, Mar 17, 2017, at 10:50, Jun Rao wrote:
> Hi, Colin,
> 
> Thanks for the KIP. Looks good overall. A few comments below.
> 
> 1. Sometimes we return
> CompletableFuture<Map<String, T>>
> and some other times we return
> Map<String, CompletableFuture<T>>
> , which doesn't seem consistent. Is that intentional?

Yes, this is intentional.  We got feedback from some people that they
wanted a single future that would fail if anything failed.  Other people
wanted to be able to detect failures on individual elements of a batch. 
This API lets us have both (you just choose which future you want to
wait on).
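A sketch of the pattern Colin describes — per-element futures for fine-grained handling, plus one combined future that fails if anything fails — using plain `CompletableFuture`. Class and method names here are illustrative, not the actual AdminClient API:

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;

// Illustrative sketch: a batch result that supports both styles of
// failure handling Colin mentions. Not the real AdminClient classes.
public class BatchResult {
    private final Map<String, CompletableFuture<Void>> perTopic;

    public BatchResult(Map<String, CompletableFuture<Void>> perTopic) {
        this.perTopic = perTopic;
    }

    // Fine-grained: detect failures on individual elements of the batch.
    public Map<String, CompletableFuture<Void>> values() {
        return perTopic;
    }

    // Coarse-grained: a single future that fails if any element failed.
    public CompletableFuture<Void> all() {
        return CompletableFuture.allOf(
                perTopic.values().toArray(new CompletableFuture[0]));
    }
}
```

The caller just chooses which future to wait on, which is how one API can satisfy both camps.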

> 
> 2. We support batching in CreateTopic/DeleteTopic/ListTopic, but not in
> DescribeTopic. Should we add batching in DescribeTopic to make it
> consistent?

Good idea.  Let's add batching to DescribeTopic(s).

> Also, both ListTopic and DescribeTopic seem to return
> TopicDescription. Could we just consolidate the two by just keeping
> DescribeTopic?

Sorry, that was a typo.  ListTopics is supposed to return TopicListing,
which tells you only the name of the topic and whether it is internal. 
The idea is that later we will add another RPC which allows us to fetch
just this information, and not the other topic fields. That way, we can
be more efficient.  The idea is that ListTopics is like readdir()/ls and
DescribeTopics is like stat().  Getting detailed information about
1,000s of topics could be quite a resource hog for cluster management
systems in a big cluster.
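Under that ls-vs-stat split, the listing type would carry only the cheap fields the text names (topic name, internal flag). A hypothetical sketch — the eventual API classes may differ:

```java
// Hypothetical shape of the lightweight listing entry described above;
// DescribeTopics would return a full description (partitions, replicas,
// configs) instead.
public class TopicListing {
    public final String name;
    public final boolean internal;

    public TopicListing(String name, boolean internal) {
        this.name = name;
        this.internal = internal;
    }
}
```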

> 
> 3. listNodes: At the request protocol level, we can get things like
> clusterId and controller broker id. Both are useful info from an admin
> perspective, but are not exposed through the api. Perhaps we can
> generalize
> listNodes to something like describeCluster so that we can return that
> additional info as well?

Yeah, let's change listNodes -> describeCluster.

> 
> 4. Configurations: To support security, we will need to include all
> properties related to SSL and SASL.

Yeah

best,
Colin

> 
> Thanks,
> 
> Jun
> 
> 
> On Thu, Mar 16, 2017 at 11:59 PM, Colin McCabe 
> wrote:
> 
> > Hi all,
> >
> > It seems like people agree with the basic direction of the proposal and
> > the API, including the operations that are included, the async and
> > batching support, and the mechanisms for extending it in the future.  If
> > there's no more votes, I'd like to close the vote and start progress on
> > this.
> >
> > I think the API should be unstable for a while (at least until the 0.11
> > release is made), so we can consider ways to improve it.  A few have
> > been suggested here: removing or adding functions, renaming things a
> > bit, or using request objects instead of options objects.  I think once
> > people try out the API a bit, it will be easier to evaluate these.
> >
> > best,
> > Colin
> >
> >
> > On Tue, Mar 14, 2017, at 10:12, Dong Lin wrote:
> > > +1
> > >
> > > On Tue, Mar 14, 2017 at 8:50 AM, Grant Henke 
> > wrote:
> > >
> > > > +1
> > > >
> > > > On Tue, Mar 14, 2017 at 2:44 AM, Sriram Subramanian 
> > > > wrote:
> > > >
> > > > > +1 (binding)
> > > > >
> > > > > Nice work in driving this.
> > > > >
> > > > > On Mon, Mar 13, 2017 at 10:31 PM, Gwen Shapira 
> > > > wrote:
> > > > >
> > > > > > +1 (binding)
> > > > > >
> > > > > > I expressed few concerns in the discussion thread, but in general
> > this
> > > > is
> > > > > > super important to get done.
> > > > > >
> > > > > > On Fri, Mar 10, 2017 at 10:38 AM, Colin McCabe  > >
> > > > > wrote:
> > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > I'd like to start voting on KIP-117
> > > > > > > (https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > > > > 117%3A+Add+a+public+AdminClient+API+for+Kafka+admin+operations
> > > > > > > ).
> > > > > > >
> > > > > > > The discussion thread can be found here:
> > > > > > > https://www.mail-archive.com/dev@kafka.apache.org/msg65697.html
> > > > > > >
> > > > > > > best,
> > > > > > > Colin
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > *Gwen Shapira*
> > > > > > Product Manager | Confluent
> > > > > > 650.450.2760 | @gwenshap
> > > > > > Follow us: Twitter  | blog
> > > > > > 
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Grant Henke
> > > > Software Engineer | Cloudera
> > > > gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
> > > >
> >


Re: [DISCUSS] KIP-126 - Allow KafkaProducer to batch based on uncompressed size

2017-03-22 Thread Dong Lin
Hey Becket,

In addition to the batch-split-rate, should we also add a batch-split-ratio
sensor to gauge the probability that we have to split a batch?

Also, in the case that the batch size configured for the producer is
smaller than the max message size configured for the broker, why can't we
just split the batch if its size exceeds the configured batch size? The
benefit of this approach is that the semantics of producer is
straightforward because we enforce the batch size that user has configured.
The implementation would also be simpler because we don't have to rely on
KIP-4 to fetch the max message size from the broker. I guess you are worrying
about the overhead of "unnecessary" split if a batch size is between
user-configured batch size and broker's max message size. But is overhead
really a concern? If overhead is too large because user has configured a
very low batch size for the producer, shouldn't the user adjust the producer config?
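As a sketch of the splitting Dong proposes (illustrative only — not Kafka's actual RecordAccumulator logic), oversized batches can be greedily repacked so no batch exceeds the configured limit; records are represented by their serialized sizes here:

```java
import java.util.ArrayList;
import java.util.List;

public class BatchSplitter {
    // Greedily repack record sizes into batches whose total stays within
    // maxBatchBytes. A single record larger than the limit still gets its
    // own batch; only the broker can ultimately reject it.
    public static List<List<Integer>> split(List<Integer> recordSizes,
                                            int maxBatchBytes) {
        List<List<Integer>> batches = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        int currentBytes = 0;
        for (int size : recordSizes) {
            if (!current.isEmpty() && currentBytes + size > maxBatchBytes) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(size);
            currentBytes += size;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }
}
```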

Thanks,
Dong

On Wed, Mar 15, 2017 at 2:50 PM, Becket Qin  wrote:

> I see, then we are thinking about the same thing :)
>
> On Wed, Mar 15, 2017 at 2:26 PM, Ismael Juma  wrote:
>
> > I meant finishing what's described in the following section and then
> > starting a discussion followed by a vote:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 4+-+Command+line+and+centralized+administrative+operations#KIP-4-
> > Commandlineandcentralizedadministrativeoperations-DescribeConfigsRequest
> >
> > We have only voted on KIP-4 Metadata, KIP-4 Create Topics, KIP-4 Delete
> > Topics so far.
> >
> > Ismael
> >
> > On Wed, Mar 15, 2017 at 8:58 PM, Becket Qin 
> wrote:
> >
> > > Hi Ismael,
> > >
> > > KIP-4 is also the one that I was thinking about. We have introduced a
> > > DescribeConfigRequest there so the producer can easily get the
> > > configurations. By "another KIP" do you mean a new (or maybe extended)
> > > protocol or using that protocol in clients?
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> > > On Wed, Mar 15, 2017 at 1:21 PM, Ismael Juma 
> wrote:
> > >
> > > > Hi Becket,
> > > >
> > > > How were you thinking of retrieving the configuration items you
> > > mentioned?
> > > > I am asking because I was planning to post a KIP for Describe Configs
> > > (one
> > > > of the protocols in KIP-4), which would expose such information. But
> > > maybe
> > > > you are thinking of extending Metadata request?
> > > >
> > > > Ismael
> > > >
> > > > On Wed, Mar 15, 2017 at 7:33 PM, Becket Qin 
> > > wrote:
> > > >
> > > > > Hi Jason,
> > > > >
> > > > > Good point. I was thinking about that, too. I was not sure if that
> is
> > > the
> > > > > right thing to do by default.
> > > > >
> > > > > If we assume people always set the batch size to max message size,
> > > > > splitting the oversized batch makes a lot of sense. But it seems
> > > possible
> > > > > that users want to control the memory footprint so they would set
> the
> > > > batch
> > > > > size to smaller than the max message size so the producer can have
> > hold
> > > > > batches for more partitions. In this case, splitting the batch
> might
> > > not
> > > > be
> > > > > the desired behavior.
> > > > >
> > > > > I think the most intuitive approach to this is allow the producer
> to
> > > get
> > > > > the max message size configuration (as well as some other
> > > configurations
> > > > > such as timestamp type)  from the broker side and use that to
> decide
> > > > > whether a batch should be split or not. I probably should add this
> to
> > > the
> > > > > KIP wiki.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jiangjie (Becket) Qin
> > > > >
> > > > > On Wed, Mar 15, 2017 at 9:47 AM, Jason Gustafson <
> ja...@confluent.io
> > >
> > > > > wrote:
> > > > >
> > > > > > Hey Becket,
> > > > > >
> > > > > > Thanks for the KIP! The approach seems reasonable. One
> > clarification:
> > > > is
> > > > > > the intent to do the splitting after the broker rejects the
> request
> > > > with
> > > > > > MESSAGE_TOO_LARGE, or prior to sending if the configured batch
> size
> > > is
> > > > > > exceeded?
> > > > > >
> > > > > > -Jason
> > > > > >
> > > > > > On Mon, Mar 13, 2017 at 8:10 PM, Becket Qin <
> becket@gmail.com>
> > > > > wrote:
> > > > > >
> > > > > > > Bump up the thread for further comments. If there is no more
> > > comments
> > > > > on
> > > > > > > the KIP I will start the voting thread on Wed.
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Jiangjie (Becket) Qin
> > > > > > >
> > > > > > > On Tue, Mar 7, 2017 at 9:48 AM, Becket Qin <
> becket@gmail.com
> > >
> > > > > wrote:
> > > > > > >
> > > > > > > > Hi Dong,
> > > > > > > >
> > > > > > > > Thanks for the comments.
> > > > > > > >
> > > > > > > > The patch is mostly for proof of concept in case there is any
> > > > concern
> > > > > > > > about the implementation which is indeed a little tricky.
> > > > > > 

[jira] [Updated] (KAFKA-4919) Document that stores must not be closed when Processors are closed

2017-03-22 Thread Elias Levy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elias Levy updated KAFKA-4919:
--
Summary: Document that stores must not be closed when Processors are closed 
 (was: Streams job fails with InvalidStateStoreException: Store is currently 
closed)

> Document that stores must not be closed when Processors are closed
> --
>
> Key: KAFKA-4919
> URL: https://issues.apache.org/jira/browse/KAFKA-4919
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Elias Levy
>
> I have a streams job, that previously worked, that consumes and writes to a 
> large number of topics with many partitions and that uses many threads.  I 
> upgraded the job to 0.10.2.0.  The job now fails after a short time running, 
> seemingly after a rebalance.
> {quote}
> WARN  2017-03-19 18:03:20,432 [StreamThread-10][StreamThread.java:160] : 
> Unexpected state transition from RUNNING to NOT_RUNNING
> {quote}
> The first observation is that Streams is no longer outputting exceptions and 
> backtraces.  I had to add code to get this information.
> The exception:
> {quote}
> Exception: org.apache.kafka.streams.errors.StreamsException: Exception caught 
> in process. taskId=1_225, processor=KSTREAM-SOURCE-03, 
> topic=some_topic, partition=225, offset=266411
> org.apache.kafka.streams.errors.StreamsException: Exception caught in 
> process. taskId=1_225, processor=KSTREAM-SOURCE-03, topic=some_topic, 
> partition=225, offset=266411
>   at 
> org.apache.kafka.streams.processor.internals.StreamTask.process(StreamTask.java:216)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:641)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:368)
> Caused by: org.apache.kafka.streams.errors.InvalidStateStoreException: Store 
> someStore-201701060400 is currently closed
>   at 
> org.apache.kafka.streams.state.internals.RocksDBStore.validateStoreOpen(RocksDBStore.java:205)
>   at 
> org.apache.kafka.streams.state.internals.RocksDBStore.put(RocksDBStore.java:221)
>   at 
> org.apache.kafka.streams.state.internals.RocksDBSegmentedBytesStore.put(RocksDBSegmentedBytesStore.java:74)
>   at 
> org.apache.kafka.streams.state.internals.ChangeLoggingSegmentedBytesStore.put(ChangeLoggingSegmentedBytesStore.java:54)
>   at 
> org.apache.kafka.streams.state.internals.MeteredSegmentedBytesStore.put(MeteredSegmentedBytesStore.java:101)
>   at 
> org.apache.kafka.streams.state.internals.RocksDBWindowStore.put(RocksDBWindowStore.java:109)
>   ... X more
> {quote}
> The error occurs for many partitions.
> This was preceded by (for this partition):
> {quote}
> INFO  2017-03-19 18:03:16,403 [StreamThread-10][ConsumerCoordinator.java:393] 
> : Revoking previously assigned partitions [some_topic-225] for group some_job
> INFO  2017-03-19 18:03:16,403 [StreamThread-10][StreamThread.java:254] : 
> stream-thread [StreamThread-10] partitions [[some_topic-225]] revoked at the 
> beginning of consumer rebalance.
> INFO  2017-03-19 18:03:16,403 [StreamThread-10][StreamThread.java:1056] : 
> stream-thread [StreamThread-10] Closing a task's topology 1_225
> INFO  2017-03-19 18:03:17,887 [StreamThread-10][StreamThread.java:544] : 
> stream-thread [StreamThread-10] Flushing state stores of task 1_225
> INFO  2017-03-19 18:03:17,887 [StreamThread-10][StreamThread.java:534] : 
> stream-thread [StreamThread-10] Committing consumer offsets of task 1_225
> INFO  2017-03-19 18:03:17,891 [StreamThread-10][StreamThread.java:1012] : 
> stream-thread [StreamThread-10] Updating suspended tasks to contain active 
> tasks [[1_225, 0_445, 0_30]]
> INFO  2017-03-19 18:03:17,891 [StreamThread-10][StreamThread.java:1019] : 
> stream-thread [StreamThread-10] Removing all active tasks [[1_225, 0_445, 
> 0_30]]
> INFO  2017-03-19 18:03:19,925 [StreamThread-10][ConsumerCoordinator.java:252] 
> : Setting newly assigned partitions [some_topic-225] for group some_job
> INFO  2017-03-19 18:03:19,927 [StreamThread-10][StreamThread.java:228] : 
> stream-thread [StreamThread-10] New partitions [[some_topic-225]] assigned at 
> the end of consumer rebalance.
> INFO  2017-03-19 18:03:19,929 [StreamThread-10][StreamTask.java:333] : task 
> [1_225] Initializing processor nodes of the topology
> Something happens.  What ???
> INFO  2017-03-19 18:03:20,135 [StreamThread-10][StreamThread.java:1045] : 
> stream-thread [StreamThread-10] Closing a task 1_225
> INFO  2017-03-19 18:03:20,355 [StreamThread-10][StreamThread.java:544] : 
> stream-thread [StreamThread-10] Flushing state stores of task 1_225
> INFO  2017-03-19 18:03:20,355 [StreamThread-10][StreamThread.java:523] : 
> stream-thread 

[jira] [Commented] (KAFKA-4930) Connect Rest API allows creating connectors with an empty name

2017-03-22 Thread Nacho Munoz (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15936720#comment-15936720
 ] 

Nacho Munoz commented on KAFKA-4930:


Actually, ignore my comment. I have just realised that validation is now part 
of the creation process in the latest release :) 

> Connect Rest API allows creating connectors with an empty name
> --
>
> Key: KAFKA-4930
> URL: https://issues.apache.org/jira/browse/KAFKA-4930
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.2.0
>Reporter: Sönke Liebau
>Priority: Minor
>
> The Connect Rest API allows deploying connectors with an empty name field, 
> which then cannot be removed through the api.
> Sending the following request:
> {code}
> {
> "name": "",
> "config": {
> "connector.class": 
> "org.apache.kafka.connect.tools.MockSourceConnector",
> "tasks.max": "1",
> "topics": "test-topic"
> 
> }
> }
> {code}
> Results in a connector being deployed which can be seen in the list of 
> connectors:
> {code}
> [
>   "",
>   "testconnector"
> ]{code}
> But cannot be removed via a DELETE call, as the api thinks we are trying to 
> delete the /connectors endpoint and declines the request.
> I don't think there is a valid case for the connector name to be empty so 
> perhaps we should add a check for this. I am happy to work on this.
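The check proposed above could be as simple as rejecting empty or whitespace-only names before creation, so every connector remains addressable at `/connectors/{name}`. A hypothetical sketch, not the actual Connect herder code:

```java
public class ConnectorNameValidator {
    // Reject null, empty, and whitespace-only connector names so that a
    // connector can always be deleted via DELETE /connectors/{name}.
    public static boolean isValidName(String name) {
        return name != null && !name.trim().isEmpty();
    }
}
```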



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (KAFKA-4919) Streams job fails with InvalidStateStoreException: Store is currently closed

2017-03-22 Thread Elias Levy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15936714#comment-15936714
 ] 

Elias Levy commented on KAFKA-4919:
---

Thanks for the information.  It is something that should probably be mentioned 
in the release notes.

> Streams job fails with InvalidStateStoreException: Store is currently closed
> 
>
> Key: KAFKA-4919
> URL: https://issues.apache.org/jira/browse/KAFKA-4919
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Elias Levy
>
> I have a streams job, that previously worked, that consumes and writes to a 
> large number of topics with many partitions and that uses many threads.  I 
> upgraded the job to 0.10.2.0.  The job now fails after a short time running, 
> seemingly after a rebalance.
> {quote}
> WARN  2017-03-19 18:03:20,432 [StreamThread-10][StreamThread.java:160] : 
> Unexpected state transition from RUNNING to NOT_RUNNING
> {quote}
> The first observation is that Streams is no longer outputting exceptions and 
> backtraces.  I had to add code to get this information.
> The exception:
> {quote}
> Exception: org.apache.kafka.streams.errors.StreamsException: Exception caught 
> in process. taskId=1_225, processor=KSTREAM-SOURCE-03, 
> topic=some_topic, partition=225, offset=266411
> org.apache.kafka.streams.errors.StreamsException: Exception caught in 
> process. taskId=1_225, processor=KSTREAM-SOURCE-03, topic=some_topic, 
> partition=225, offset=266411
>   at 
> org.apache.kafka.streams.processor.internals.StreamTask.process(StreamTask.java:216)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:641)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:368)
> Caused by: org.apache.kafka.streams.errors.InvalidStateStoreException: Store 
> someStore-201701060400 is currently closed
>   at 
> org.apache.kafka.streams.state.internals.RocksDBStore.validateStoreOpen(RocksDBStore.java:205)
>   at 
> org.apache.kafka.streams.state.internals.RocksDBStore.put(RocksDBStore.java:221)
>   at 
> org.apache.kafka.streams.state.internals.RocksDBSegmentedBytesStore.put(RocksDBSegmentedBytesStore.java:74)
>   at 
> org.apache.kafka.streams.state.internals.ChangeLoggingSegmentedBytesStore.put(ChangeLoggingSegmentedBytesStore.java:54)
>   at 
> org.apache.kafka.streams.state.internals.MeteredSegmentedBytesStore.put(MeteredSegmentedBytesStore.java:101)
>   at 
> org.apache.kafka.streams.state.internals.RocksDBWindowStore.put(RocksDBWindowStore.java:109)
>   ... X more
> {quote}
> The error occurs for many partitions.
> This was preceded by (for this partition):
> {quote}
> INFO  2017-03-19 18:03:16,403 [StreamThread-10][ConsumerCoordinator.java:393] 
> : Revoking previously assigned partitions [some_topic-225] for group some_job
> INFO  2017-03-19 18:03:16,403 [StreamThread-10][StreamThread.java:254] : 
> stream-thread [StreamThread-10] partitions [[some_topic-225]] revoked at the 
> beginning of consumer rebalance.
> INFO  2017-03-19 18:03:16,403 [StreamThread-10][StreamThread.java:1056] : 
> stream-thread [StreamThread-10] Closing a task's topology 1_225
> INFO  2017-03-19 18:03:17,887 [StreamThread-10][StreamThread.java:544] : 
> stream-thread [StreamThread-10] Flushing state stores of task 1_225
> INFO  2017-03-19 18:03:17,887 [StreamThread-10][StreamThread.java:534] : 
> stream-thread [StreamThread-10] Committing consumer offsets of task 1_225
> INFO  2017-03-19 18:03:17,891 [StreamThread-10][StreamThread.java:1012] : 
> stream-thread [StreamThread-10] Updating suspended tasks to contain active 
> tasks [[1_225, 0_445, 0_30]]
> INFO  2017-03-19 18:03:17,891 [StreamThread-10][StreamThread.java:1019] : 
> stream-thread [StreamThread-10] Removing all active tasks [[1_225, 0_445, 
> 0_30]]
> INFO  2017-03-19 18:03:19,925 [StreamThread-10][ConsumerCoordinator.java:252] 
> : Setting newly assigned partitions [some_topic-225] for group some_job
> INFO  2017-03-19 18:03:19,927 [StreamThread-10][StreamThread.java:228] : 
> stream-thread [StreamThread-10] New partitions [[some_topic-225]] assigned at 
> the end of consumer rebalance.
> INFO  2017-03-19 18:03:19,929 [StreamThread-10][StreamTask.java:333] : task 
> [1_225] Initializing processor nodes of the topology
> Something happens.  What ???
> INFO  2017-03-19 18:03:20,135 [StreamThread-10][StreamThread.java:1045] : 
> stream-thread [StreamThread-10] Closing a task 1_225
> INFO  2017-03-19 18:03:20,355 [StreamThread-10][StreamThread.java:544] : 
> stream-thread [StreamThread-10] Flushing state stores of task 1_225
> INFO  2017-03-19 18:03:20,355 [StreamThread-10][StreamThread.java:523] : 
> stream-thread 

[jira] [Commented] (KAFKA-4930) Connect Rest API allows creating connectors with an empty name

2017-03-22 Thread Nacho Munoz (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15936679#comment-15936679
 ] 

Nacho Munoz commented on KAFKA-4930:


[~gwenshap] is there any reason why there's no validation when creating a 
connector? I'm aware of the validating endpoint to test the connector 
configuration but, would it make sense to validate the configuration as part of 
the creation process? 

> Connect Rest API allows creating connectors with an empty name
> --
>
> Key: KAFKA-4930
> URL: https://issues.apache.org/jira/browse/KAFKA-4930
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.2.0
>Reporter: Sönke Liebau
>Priority: Minor
>
> The Connect Rest API allows deploying connectors with an empty name field, 
> which then cannot be removed through the api.
> Sending the following request:
> {code}
> {
> "name": "",
> "config": {
> "connector.class": 
> "org.apache.kafka.connect.tools.MockSourceConnector",
> "tasks.max": "1",
> "topics": "test-topic"
> 
> }
> }
> {code}
> Results in a connector being deployed which can be seen in the list of 
> connectors:
> {code}
> [
>   "",
>   "testconnector"
> ]{code}
> But cannot be removed via a DELETE call, as the api thinks we are trying to 
> delete the /connectors endpoint and declines the request.
> I don't think there is a valid case for the connector name to be empty so 
> perhaps we should add a check for this. I am happy to work on this.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [VOTE] KIP-82 Add Record Headers

2017-03-22 Thread Joel Koshy
+1

On Tue, Mar 21, 2017 at 5:01 PM, Jason Gustafson  wrote:

> Thanks for the KIP! +1 (binding) from me. Just one nit: can we change
> `Headers.header(key)` to be `Headers.lastHeader(key)`? It's not a
> deal-breaker, but I think it's better to let the name reflect the actual
> behavior as clearly as possible.
>
> -Jason
>
> On Wed, Feb 15, 2017 at 6:10 AM, Jeroen van Disseldorp 
> wrote:
>
> > +1 on introducing the concept of headers, neutral on specific
> > implementation.
> >
> >
> >
> > On 14/02/2017 22:34, Jay Kreps wrote:
> >
> >> Couple of things I think we still need to work out:
> >>
> >> 1. I think we agree about the key, but I think we haven't talked
> >> about
> >> the value yet. I think if our goal is an open ecosystem of these
> >> headers
> >> spread across many plugins from many systems we should consider
> >> making this
> >> a string as well so it can be printed, set via a UI, set in config,
> >> etc.
> >> Basically encouraging pluggable serialization formats here will lead
> >> to a
> >> bit of a tower of babel.
> >> 2. This proposal still includes a pretty big change to our
> >> serialization
> >> and protocol definition layer. Essentially it is introducing an
> >> optional
> >> type, where the format is data dependent. I think this is actually a
> >> big
> >> change though it doesn't seem like it. It means you can no longer
> >> specify
> >> this type with our type definition DSL, and likewise it requires
> >> custom
> >> handling in client libs. This isn't a huge thing, since the Record
> >> definition is custom anyway, but I think this kind of protocol
> >> inconsistency is very non-desirable and ties you to hand-coding
> >> things. I
> >> think the type should instead be [Key Value] in our BNF, where key
> >> and
> >> value are both short strings as used elsewhere. This brings it in
> >> line with
> >> the rest of the protocol.
> >> 3. Could we get more specific about the exact Java API change to
> >> ProducerRecord, ConsumerRecord, Record, etc?
> >>
> >> -Jay
> >>
> >> On Tue, Feb 14, 2017 at 9:42 AM, Michael Pearce 
> >> wrote:
> >>
> >> Hi all,
> >>>
> >>> We would like to start the voting process for KIP-82 – Add record
> >>> headers.
> >>> The KIP can be found
> >>> at
> >>>
> >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> >>> 82+-+Add+Record+Headers
> >>>
> >>> Discussion thread(s) can be found here:
> >>>
> >>> http://search-hadoop.com/m/Kafka/uyzND1nSTOHTvj81?subj=
> >>> Re+DISCUSS+KIP+82+Add+Record+Headers
> >>> http://search-hadoop.com/m/Kafka/uyzND1Arxt22Tvj81?subj=
> >>> Re+DISCUSS+KIP+82+Add+Record+Headers
> >>> http://search-hadoop.com/?project=Kafka=KIP-82
> >>>
> >>>
> >>>
> >>> Thanks,
> >>> Mike
> >>>
> >>>
> >
>


Re: [DISCUSS] KIP-129: Kafka Streams Exactly-Once Semantics

2017-03-22 Thread Sriram Subramanian
This is a completely new feature which is controlled by a config. It would
be a regression if you upgraded streams and saw a different behavior. That
would not happen in this case. This is similar to how we are introducing
idempotent producer in core kafka with a config to turn it on. There is no
guarantee that the performance of the producer with the config turned on
will be the same, although eventually we would like to get there.

On Wed, Mar 22, 2017 at 7:12 AM, Michael Noll  wrote:

> I second Eno's concern regarding the impact of Streams EOS on state stores.
>
> >  We do a full recovery today and the EOS proposal will not make this any
> worse.
>
> Yes, today we do a full state store recovery under certain failures.
> However, I think the point (or perhaps: open question) is that, with the
> EOS design, there's now an *increased likelihood* of such failures that
> trigger full state store recovery.  If this increase is significant, then I
> would consider this to be a regression that we should address.
>
> As Eno said:
>
> > currently we pay the recovery price for a Kafka Streams instance failure.
> > Now we might pay it for a transaction failure. Will transaction failures
> be
> > more or less common than the previous types of failures?
>
> Damian voiced similar concerns at the very beginning of this discussion,
> not sure what his current opinion is here.
>
> -Michael
>
>
>
>
>
> On Wed, Mar 22, 2017 at 1:04 AM, Sriram Subramanian 
> wrote:
>
> > To add to this discussion, I do think we should think about this in
> > increments. We do a full recovery today and the EOS proposal will not
> make
> > this any worse. Using store snapshot is a good option to avoid store
> > recovery in the future but as Eno points out, all pluggable stores would
> > need to have this ability. W.r.t transaction failures, this should not be
> > an issue. We should be simply retrying. There is one optimization we can
> do
> > for clean shutdowns. We could store a clean shutdown file that contains
> the
> > input offsets. This file gets written when you close the streams
> instance.
> > On start, you can check the offsets from the shutdown file and
> > compare them with the offsets we get from the consumer and ensure they
> match.
> > If they do, you could use the same store instead of recovering. However,
> if
> > we go with the snapshot approach, this will not be required. My vote
> would
> > be to implement V1 and solve the bootstrap problem which exist today in
> the
> > future versions.
> >
> > On Tue, Mar 21, 2017 at 4:43 PM, Matthias J. Sax 
> > wrote:
> >
> > > Thanks for your feedback Eno.
> > >
> > > For now, I still think that the KIP itself does not need to talk about
> > > this in more detail, because we apply the same strategy for EoS as for
> > > non-EoS (as of 0.10.2).
> > >
> > > Thus, in case of a clean shutdown, we write the checkpoint file for a
> > > store and thus know we can reuse the store. In case of failure, we need
> > > to recreate the store from the changelog.
> > >
> > > > Will a V1 design that relies on plain store recovery from Kafka for
> > > > each transaction abort be good enough, or usable?
> > >
> > > Why should it not be usable? It's the same strategy as used in 0.10.2
> > > and it runs in production in many companies already.
> > >
> > > > however it seems to me we might have a regression of sorts
> > > > Now we might pay it for a transaction failure.
> > >
> > > I would assume transaction failures to be quite rare. Maybe the core
> EoS
> > > folks can comment here, too.
> > >
> > >
> > >
> > > -Matthias
> > >
> > >
> > >
> > > On 3/20/17 3:16 PM, Eno Thereska wrote:
> > > > Hi Matthias,
> > > >
> > > > I'd like to see some more info on how you propose to handle
> > transactions
> > > that involve state stores in the KIP itself. The design doc has info
> > about
> > > various optimisations like RocksDb snapshots and transactions and such,
> > but
> > > will there be a user-visible interface that indicates whether a store
> has
> > > snapshot and/or transactional capabilities? If a user plugs in another
> > > store, what guarantees are they expected to get?
> > > >
> > > > Will a V1 design that relies on plain store recovery from Kafka for
> > each
> > > transaction abort be good enough, or usable? If your dataset is large
> > > (e.g., 200GB) the recovery time might be so large as to effectively
> > render
> > > that Kafka Streams instance unavailable for tens of minutes. You
> mention
> > > that is not a regression to what we currently have, however it seems to
> > me
> > > we might have a regression of sorts: currently we pay the recovery
> price
> > > for a Kafka Streams instance failure. Now we might pay it for a
> > transaction
> > > failure. Will transaction failures be more or less common than the
> > previous
> > > types of failures? I'd like to see this addressed.
> > > >
> > > > Thanks
> > > > Eno
> > > >
> > > 
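[Editor's note] The clean-shutdown-file optimization described in the quoted message can be sketched roughly as follows. This is a minimal Python illustration of the idea only; the file name, offset format, and function names are assumptions for the sketch, not part of any actual Kafka Streams API.

```python
import json
import os

SHUTDOWN_FILE = ".clean-shutdown"  # illustrative name, not a real Kafka Streams file

def write_clean_shutdown(state_dir, input_offsets):
    """On a clean close, record the task's input offsets next to the local store."""
    with open(os.path.join(state_dir, SHUTDOWN_FILE), "w") as f:
        json.dump(input_offsets, f)

def can_reuse_store(state_dir, committed_offsets):
    """On start, reuse the local store only if a clean-shutdown file exists
    and its recorded offsets match the committed offsets reported by the
    consumer; otherwise fall back to recovering from the changelog."""
    path = os.path.join(state_dir, SHUTDOWN_FILE)
    if not os.path.exists(path):
        return False  # unclean shutdown: must recover from the changelog
    with open(path) as f:
        recorded = json.load(f)
    os.remove(path)  # the file only vouches for a single restart
    return recorded == committed_offsets
```

As the message notes, a snapshot-based approach would make this check unnecessary.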

Re: [DISCUSS] KIP-120: Cleanup Kafka Streams builder API

2017-03-22 Thread Michael Noll
Forwarding to kafka-user.


-- Forwarded message --
From: Michael Noll 
Date: Wed, Mar 22, 2017 at 8:48 AM
Subject: Re: [DISCUSS] KIP-120: Cleanup Kafka Streams builder API
To: dev@kafka.apache.org


Matthias,

> @Michael:
>
> You seemed to agree with Jay about not exposing the `Topology` concept
> in our main entry class (ie, current KStreamBuilder), thus, I
> interpreted that you do not want `Topology` in the name either (I am a
> little surprised by your last response, that goes the opposite direction).

Oh, sorry for not being clear.

What I wanted to say in my earlier email was the following:  Yes, I do
agree with most of Jay's reasoning, notably about carefully deciding how
much and which parts of the API/concept "surface" we expose to users of the
DSL.  However, and this is perhaps where I wasn't very clear, I disagree on
the particular opinion about not exposing the topology concept to DSL
users.  Instead, I think the concept of a topology is important to
understand even for DSL users -- particularly because of the way the DSL is
currently wiring your processing logic via the builder pattern.  (As I
noted, e.g. Akka uses a different approach where you might be able to get
away with not exposing the "topology" concept, but even in Akka there's the
notion of graphs and flows.)


> > StreamsBuilder builder = new StreamsBuilder();
> >
> > // And here you'd define your...well, what actually?
> > // Ah right, you are composing a topology here, though you are not
> > aware of it.
>
> Yes. You are not aware of it -- that's the whole point about it -- don't
> put the Topology concept in the focus...

Let me turn this around, because that was my point: it's confusing to have
the name "StreamsBuilder" when the thing isn't actually building streams --
and it is not.

As I mentioned before, I do think it is a benefit to make it clear to DSL
users that there are two aspects at play: (1) defining the logic/plan of
your processing, and (2) the execution of that plan.  I have a less strong
opinion whether or not having "topology" in the names would help to
communicate this separation as well as combination of (1) and (2) to make
your app work as expected.

If we stick with `KafkaStreams` for (2) *and* don't like having "topology"
in the name, then perhaps we should rename `KStreamBuilder` to
`KafkaStreamsBuilder`.  That at least gives some illusion of a combo of (1)
and (2).  IMHO, `KafkaStreamsBuilder` highlights better that "it is a
builder/helper for the Kafka Streams API", rather than "a builder for
streams".

Also, I think some of the naming challenges we're discussing here are
caused by having this builder pattern in the first place.  If the Streams
API was implemented in Scala, for example, we could use implicits for
helping us to "stitch streams/tables together to build the full topology",
thus using a different (better?) approach to composing your topologies than
through a builder pattern.  So: perhaps there's a better way than the
builder, and that way would also be clearer on terminology?  That said,
this might take this KIP out of scope.

-Michael




On Wed, Mar 22, 2017 at 12:33 AM, Matthias J. Sax 
wrote:

> @Guozhang:
>
> I recognized that you want to have `Topology` in the name. But it seems
> that more people preferred to not have it (Jay, Ram, Michael [?], myself).
>
> @Michael:
>
> You seemed to agree with Jay about not exposing the `Topology` concept
> in our main entry class (ie, current KStreamBuilder), thus, I
> interpreted that you do not want `Topology` in the name either (I am a
> little surprised by your last response, that goes the opposite direction).
>
> > StreamsBuilder builder = new StreamsBuilder();
> >
> > // And here you'd define your...well, what actually?
> > // Ah right, you are composing a topology here, though you are not
> > aware of it.
>
> Yes. You are not aware of it -- that's the whole point about it -- don't
> put the Topology concept in the focus...
>
> Furthermore,
>
> >>> So what are you building here with StreamsBuilder?  Streams (hint: No)?
> >>> And what about tables -- is there a TableBuilder (hint: No)?
>
> I am not sure if this is too much of a concern. In contrast to
> `KStreamBuilder` (singular), which contains `KStream` and thus puts the
> KStream concept in focus and degrades `KTable`, `StreamsBuilder`
> (plural) focuses on the "Streams API". IMHO, it does not put the focus on
> KStream. It's just a builder from the Streams API -- you don't need to
> worry about what you are building -- and you don't need to think about the
> `Topology` concept (of course, you see that .build() returns a Topology).
>
>
> Personally, I see pros and cons for both `StreamsBuilder` and
> `StreamsTopologyBuilder` and thus, I am fine either way. Maybe Jay and
> Ram can follow up and share their thoughts?
>
> It would also help a lot if other people cast their vote for a name, too.
>
>
>
> -Matthias
>
>
>
> On 3/21/17 2:11 PM, 

[jira] [Commented] (KAFKA-4892) kafka 0.8.2.1 getOffsetsBefore API returning correct offset given timestamp in unix epoch format

2017-03-22 Thread Colin P. McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15936553#comment-15936553
 ] 

Colin P. McCabe commented on KAFKA-4892:


Can you be a little bit more descriptive about what the bug is here?

> kafka 0.8.2.1 getOffsetsBefore API returning correct offset given timestamp 
> in unix epoch format
> 
>
> Key: KAFKA-4892
> URL: https://issues.apache.org/jira/browse/KAFKA-4892
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, offset manager
>Affects Versions: 0.8.2.1
> Environment: ubuntu 16.04
>Reporter: Chris Bedford
>
> I am seeing unexpected behavior in the getOffsetsBefore method of the client 
> API.
> I understand the granularity of 'start-from-offset' via kafka spout is based 
> on how many log segments you have.  
> I have created a demo program that reproduces this on my 
> GitHub account [ 
> https://github.com/buildlackey/kafkaOffsetBug/blob/master/README.md ], 
> and I have also posted this same question to Stack Overflow with 
> a detailed set of steps for how to reproduce this issue 
> (using my test program and scripts).  
> See: 
> http://stackoverflow.com/questions/42775128/kafka-0-8-2-1-getoffsetsbefore-api-returning-correct-offset-given-timestamp-in-u
> Thanks in advance for your help!
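[Editor's note] For context on why getOffsetsBefore behaves this way: in 0.8.x the broker answers offset-for-timestamp queries from per-log-segment metadata rather than per message, so the returned offset is only as fine-grained as the segment boundaries. The following is a simplified Python model of that lookup (an illustration of the granularity, not the broker's actual code):

```python
def offset_before(segments, target_ts):
    """Return the base offset of the newest log segment whose last-modified
    time is at or before target_ts, mimicking the 0.8.x per-segment lookup.
    `segments` is a list of (base_offset, last_modified_ts) pairs sorted by
    base_offset. Returns None when no segment qualifies."""
    candidates = [base for base, mtime in segments if mtime <= target_ts]
    return max(candidates) if candidates else None

# Two segments: every timestamp falling inside the first segment resolves
# to offset 0, no matter where within the segment it actually lies.
segments = [(0, 100), (1000, 200)]
```

This is why, with few segments, timestamps that differ by hours can still map to the same offset.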



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-4933) Allow specifying keystore and truststore from byte array

2017-03-22 Thread Devon Meunier (JIRA)
Devon Meunier created KAFKA-4933:


 Summary: Allow specifying keystore and truststore from byte array
 Key: KAFKA-4933
 URL: https://issues.apache.org/jira/browse/KAFKA-4933
 Project: Kafka
  Issue Type: Improvement
  Components: config
Reporter: Devon Meunier
Priority: Minor


When using KafkaIO in Google Cloud Dataflow, you have unreliable access to the 
underlying filesystem, and it would be beneficial to be able to supply 
keystores and truststores as binary data rather than having to stage them on 
the filesystem.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: Kafka design query

2017-03-22 Thread Colin McCabe
Hi Amit,

In your example, computer A would be running the Kafka producer (which
is a software library) and computer B would be running the Kafka broker
(which is a long-running program, or daemon).  Hope that helps.

best,
Colin

On Fri, Mar 10, 2017, at 01:27, amit tomar wrote:
> Hi Experts,
> 
> I am new to Kafka and to the programming world. I just want to know: if I
> want to pull messages from another computer into a Kafka cluster -- for
> example, if computer A is the source and computer B has the Kafka setup --
> then, to pull the messages from computer A, where do I need to run the
> Kafka Producer API program: on computer A or on computer B?
> 
> Please answer.
> 
> Thanks,
> Amit Tomar


[jira] [Updated] (KAFKA-4932) Add UUID Serde

2017-03-22 Thread Jeff Klukas (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Klukas updated KAFKA-4932:
---
Description: 
I propose adding serializers and deserializers for the java.util.UUID class.

I have many use cases where I want to set the key of a Kafka message to be a 
UUID. Currently, I need to turn UUIDs into strings or byte arrays and use their 
associated Serdes, but it would be more convenient to serialize and deserialize 
UUIDs directly.

I'd propose that the serializer and deserializer use the 36-byte string 
representation, calling UUID.toString and UUID.fromString. We would also wrap 
these in a Serde and modify the streams Serdes class to include this in the 
list of supported types.

Optionally, we could have the deserializer support a 16-byte representation and 
it would check the size of the input byte array to determine whether it's a 
binary or string representation of the UUID. It's not well defined whether the 
most significant bits or least significant go first, so this deserializer would 
have to support only one or the other.

Similarly, if the deserializer supported a 16-byte representation, there could 
be two variants of the serializer, a UUIDStringSerializer and a 
UUIDBytesSerializer.

I would be willing to write this PR, but am looking for feedback about whether 
there are significant concerns here around ambiguity of what the byte 
representation of a UUID should be, or if there's desire to keep the list of 
built-in Serdes minimal such that a PR would be unlikely to be accepted.

  was:
I propose adding serializers and deserializers for the java.util.UUID class.

I have many use cases where I want to set the key of a Kafka message to be a 
UUID. Currently, I need to turn UUIDs into strings or byte arrays and use the 
associated Serdes, but it would be more convenient to serialize and deserialize 
UUIDs directly.

I'd propose that the serializer and deserializer use the 36-byte string 
representation, calling UUID.toString and UUID.fromString

Optionally, we could also have the deserializer support a 16-byte representation 
and it would check the size of the input byte array to determine whether it's a 
binary or string representation of the UUID. It's not well defined whether the 
most significant bits or least significant go first, so this deserializer would 
have to support only one or the other.

Optionally, there could be two variants of the serializer, a 
UUIDStringSerializer and a UUIDBytesSerializer.

We would also wrap these in a Serde and modify the Serdes class to include this 
in the list of supported types.

I would be willing to write this PR, but am looking for feedback about whether 
there are significant concerns here around ambiguity of what the byte 
representation of a UUID should be, or if there's desire to keep the list of 
built-in Serdes minimal such that a PR would be unlikely to be accepted.


> Add UUID Serde
> --
>
> Key: KAFKA-4932
> URL: https://issues.apache.org/jira/browse/KAFKA-4932
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, streams
>Reporter: Jeff Klukas
>Priority: Minor
>
> I propose adding serializers and deserializers for the java.util.UUID class.
> I have many use cases where I want to set the key of a Kafka message to be a 
> UUID. Currently, I need to turn UUIDs into strings or byte arrays and use 
> their associated Serdes, but it would be more convenient to serialize and 
> deserialize UUIDs directly.
> I'd propose that the serializer and deserializer use the 36-byte string 
> representation, calling UUID.toString and UUID.fromString. We would also wrap 
> these in a Serde and modify the streams Serdes class to include this in the 
> list of supported types.
> Optionally, we could have the deserializer support a 16-byte representation 
> and it would check the size of the input byte array to determine whether it's 
> a binary or string representation of the UUID. It's not well defined whether 
> the most significant bits or least significant go first, so this deserializer 
> would have to support only one or the other.
> Similarly, if the deserializer supported a 16-byte representation, there could 
> be two variants of the serializer, a UUIDStringSerializer and a 
> UUIDBytesSerializer.
> I would be willing to write this PR, but am looking for feedback about 
> whether there are significant concerns here around ambiguity of what the byte 
> representation of a UUID should be, or if there's desire to keep the list of 
> built-in Serdes minimal such that a PR would be unlikely to be accepted.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (KAFKA-4932) Add UUID Serde

2017-03-22 Thread Jeff Klukas (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Klukas updated KAFKA-4932:
---
Summary: Add UUID Serde  (was: Add UUID Serdes)

> Add UUID Serde
> --
>
> Key: KAFKA-4932
> URL: https://issues.apache.org/jira/browse/KAFKA-4932
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients, streams
>Reporter: Jeff Klukas
>Priority: Minor
>
> I propose adding serializers and deserializers for the java.util.UUID class.
> I have many use cases where I want to set the key of a Kafka message to be a 
> UUID. Currently, I need to turn UUIDs into strings or byte arrays and use the 
> associated Serdes, but it would be more convenient to serialize and 
> deserialize UUIDs directly.
> I'd propose that the serializer and deserializer use the 36-byte string 
> representation, calling UUID.toString and UUID.fromString
> Optionally, we could also have the deserializer support a 16-byte 
> representation and it would check the size of the input byte array to determine 
> whether it's a binary or string representation of the UUID. It's not well 
> defined whether the most significant bits or least significant go first, so 
> this deserializer would have to support only one or the other.
> Optionally, there could be two variants of the serializer, a 
> UUIDStringSerializer and a UUIDBytesSerializer.
> We would also wrap these in a Serde and modify the Serdes class to include 
> this in the list of supported types.
> I would be willing to write this PR, but am looking for feedback about 
> whether there are significant concerns here around ambiguity of what the byte 
> representation of a UUID should be, or if there's desire to keep the list of 
> built-in Serdes minimal such that a PR would be unlikely to be accepted.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (KAFKA-4932) Add UUID Serdes

2017-03-22 Thread Jeff Klukas (JIRA)
Jeff Klukas created KAFKA-4932:
--

 Summary: Add UUID Serdes
 Key: KAFKA-4932
 URL: https://issues.apache.org/jira/browse/KAFKA-4932
 Project: Kafka
  Issue Type: Improvement
  Components: clients, streams
Reporter: Jeff Klukas
Priority: Minor


I propose adding serializers and deserializers for the java.util.UUID class.

I have many use cases where I want to set the key of a Kafka message to be a 
UUID. Currently, I need to turn UUIDs into strings or byte arrays and use the 
associated Serdes, but it would be more convenient to serialize and deserialize 
UUIDs directly.

I'd propose that the serializer and deserializer use the 36-byte string 
representation, calling UUID.toString and UUID.fromString

Optionally, we could also have the deserializer support a 16-byte representation 
and it would check the size of the input byte array to determine whether it's a 
binary or string representation of the UUID. It's not well defined whether the 
most significant bits or least significant go first, so this deserializer would 
have to support only one or the other.

Optionally, there could be two variants of the serializer, a 
UUIDStringSerializer and a UUIDBytesSerializer.

We would also wrap these in a Serde and modify the Serdes class to include this 
in the list of supported types.

I would be willing to write this PR, but am looking for feedback about whether 
there are significant concerns here around ambiguity of what the byte 
representation of a UUID should be, or if there's desire to keep the list of 
built-in Serdes minimal such that a PR would be unlikely to be accepted.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (KAFKA-4918) Continuous fetch requests for offset storage topic in kafka connect

2017-03-22 Thread Gwen Shapira (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gwen Shapira resolved KAFKA-4918.
-
Resolution: Not A Problem

> Continuous fetch requests for offset storage topic in kafka connect
> ---
>
> Key: KAFKA-4918
> URL: https://issues.apache.org/jira/browse/KAFKA-4918
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Affects Versions: 0.10.1.0, 0.10.1.1, 0.10.2.0
> Environment: unix, osx
>Reporter: Liju
>  Labels: enhancement, performance
>
> The Kafka consumer in the KafkaOffsetBackingStore polls continuously with a 
> timeout hardcoded to 0 ms, which leads to a high fetch-request load on the 
> Kafka server. For sink connectors (e.g. kafka-connect-hdfs) that don't use 
> the offset storage topic for offset tracking, this is redundant: the consumer 
> continuously sends fetch requests even though there is no data in the topic.
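[Editor's note] A back-of-envelope model of why the hardcoded 0 ms timeout matters: with a zero timeout the poll loop spins and issues a fetch request essentially every iteration, whereas a nonzero timeout lets each request block for data, spacing requests out. This sketch is an illustrative simplification, not Connect's actual polling code:

```python
def fetch_requests_per_second(poll_timeout_ms, handling_ms=0):
    """Rough model: each loop iteration issues one fetch request and then
    waits up to poll_timeout_ms for data (assuming an idle topic, so the
    full timeout elapses). A 0 ms timeout degenerates to a busy loop,
    modeled here as one request per millisecond at minimum."""
    cycle_ms = max(poll_timeout_ms + handling_ms, 1)
    return 1000 // cycle_ms
```

Under this model, moving from a 0 ms to a 100 ms timeout cuts the fetch-request rate on an idle topic by roughly two orders of magnitude.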



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (KAFKA-4918) Continuous fetch requests for offset storage topic in kafka connect

2017-03-22 Thread Gwen Shapira (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gwen Shapira reassigned KAFKA-4918:
---

Assignee: Gwen Shapira

> Continuous fetch requests for offset storage topic in kafka connect
> ---
>
> Key: KAFKA-4918
> URL: https://issues.apache.org/jira/browse/KAFKA-4918
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect
>Affects Versions: 0.10.1.0, 0.10.1.1, 0.10.2.0
> Environment: unix, osx
>Reporter: Liju
>Assignee: Gwen Shapira
>  Labels: enhancement, performance
>
> The Kafka consumer in the KafkaOffsetBackingStore polls continuously with a 
> timeout hardcoded to 0 ms, which leads to a high fetch-request load on the 
> Kafka server. For sink connectors (e.g. kafka-connect-hdfs) that don't use 
> the offset storage topic for offset tracking, this is redundant: the consumer 
> continuously sends fetch requests even though there is no data in the topic.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [DISCUSS] KIP-129: Kafka Streams Exactly-Once Semantics

2017-03-22 Thread Michael Noll
Separate reply because I'm switching topics (no pun intended). :-)

One impact of the Streams EOS design is how we handle failures in Streams.
In the EOS design we have effectively three main failure categories, as far
as I understand:

1. Transient failures (which we now e.g. handle with infinite retries on
the side of producers)
2. Fatal errors aka "stop the world" (nothing we can do about those -- in
the worst case, these might bring down the full app)
3. Producer fence errors (happens when tasks have been migrated out of a
processing thread)

My question is:  how are failures handled that are caused by corrupt
messages, think: "poison pills"?  These typically manifest themselves as
exceptions that are thrown by the serdes.  Into which of the three failure
"categories" above would such "poison pill" exceptions fall?

For the record, today users need to take extra steps to handle such poison
pills (see e.g. [1]), but that isn't very elegant or convenient.  Since we
have already begun discussing how to improve the status quo, I am
interested in understanding the impact of this type of failure on/in the
EOS design.

-Michael


[1]
http://docs.confluent.io/current/streams/faq.html#handling-corrupted-records-and-deserialization-errors-poison-pill-messages
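[Editor's note] The extra steps referenced in [1] typically amount to wrapping the serde so that a poison pill surfaces as a sentinel value the topology can filter out, instead of an exception that kills the thread. A minimal sketch of that pattern (Python pseudocode of the idea; Kafka Streams' actual Deserializer interface is Java, and the function names here are illustrative):

```python
def skipping_deserializer(deserialize):
    """Wrap a deserializer function so a corrupt record ("poison pill")
    yields None instead of raising; downstream logic then filters out the
    Nones rather than crashing."""
    def wrapped(data):
        try:
            return deserialize(data)
        except Exception:
            # A real application would also log and/or count the bad record.
            return None
    return wrapped

# Example: an int deserializer that survives non-numeric payloads.
parse_int = skipping_deserializer(lambda b: int(b.decode("utf-8")))
```

The open question in this thread is whether the EOS design can absorb this category of failure natively rather than leaving it to user-side wrappers.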




On Wed, Mar 22, 2017 at 3:12 PM, Michael Noll  wrote:

> I second Eno's concern regarding the impact of Streams EOS on state stores.
>
> >  We do a full recovery today and the EOS proposal will not make this
> any worse.
>
> Yes, today we do a full state store recovery under certain failures.
> However, I think the point (or perhaps: open question) is that, with the
> EOS design, there's now an *increased likelihood* of such failures that
> trigger full state store recovery.  If this increase is significant, then I
> would consider this to be a regression that we should address.
>
> As Eno said:
>
> > currently we pay the recovery price for a Kafka Streams instance
> failure.
> > Now we might pay it for a transaction failure. Will transaction failures
> be
> > more or less common than the previous types of failures?
>
> Damian voiced similar concerns at the very beginning of this discussion,
> not sure what his current opinion is here.
>
> -Michael
>
>
>
>
>
> On Wed, Mar 22, 2017 at 1:04 AM, Sriram Subramanian 
> wrote:
>
>> To add to this discussion, I do think we should think about this in
>> increments. We do a full recovery today and the EOS proposal will not make
>> this any worse. Using store snapshot is a good option to avoid store
>> recovery in the future but as Eno points out, all pluggable stores would
>> need to have this ability. W.r.t transaction failures, this should not be
>> an issue. We should be simply retrying. There is one optimization we can
>> do
>> for clean shutdowns. We could store a clean shutdown file that contains
>> the
>> input offsets. This file gets written when you close the streams instance.
>> On start, you can check the offsets from the shutdown file and
>> compare it with the offsets we get from the consumer and ensure they
>> match.
>> If they do, you could use the same store instead of recovering. However,
>> if
>> we go with the snapshot approach, this will not be required. My vote would
>> be to implement V1 and solve the bootstrap problem which exists today in
>> the
>> future versions.
>>
>> On Tue, Mar 21, 2017 at 4:43 PM, Matthias J. Sax 
>> wrote:
>>
>> > Thanks for your feedback Eno.
>> >
>> > For now, I still think that the KIP itself does not need to talk about
>> > this in more detail, because we apply the same strategy for EoS as for
>> > non-EoS (as of 0.10.2).
>> >
>> > Thus, in case of a clean shutdown, we write the checkpoint file for a
>> > store and thus know we can reuse the store. In case of failure, we need
>> > to recreate the store from the changelog.
>> >
>> > > Will a V1 design that relies on plain store recovery from Kafka for
>> > > each transaction abort be good enough, or usable?
>> >
>> > Why should it not be usable? It's the same strategy as used in 0.10.2
>> > and it runs in production in many companies already.
>> >
>> > > however it seems to me we might have a regression of sorts
>> > > Now we might pay it for a transaction failure.
>> >
>> > I would assume transaction failures to be quite rare. Maybe the core EoS
>> > folks can comment here, too.
>> >
>> >
>> >
>> > -Matthias
>> >
>> >
>> >
>> > On 3/20/17 3:16 PM, Eno Thereska wrote:
>> > > Hi Matthias,
>> > >
>> > > I'd like to see some more info on how you propose to handle
>> transactions
>> > that involve state stores in the KIP itself. The design doc has info
>> about
>> > various optimisations like RocksDb snapshots and transactions and such,
>> but
>> > will there be a user-visible interface that indicates whether a store
>> has
>> > snapshot and/or transactional capabilities? If a user plugs in another
>> > store, what guarantees are they expected to 

Re: [DISCUSS] KIP-129: Kafka Streams Exactly-Once Semantics

2017-03-22 Thread Michael Noll
I second Eno's concern regarding the impact of Streams EOS on state stores.

>  We do a full recovery today and the EOS proposal will not make this any
worse.

Yes, today we do a full state store recovery under certain failures.
However, I think the point (or perhaps: open question) is that, with the
EOS design, there's now an *increased likelihood* of such failures that
trigger full state store recovery.  If this increase is significant, then I
would consider this to be a regression that we should address.

As Eno said:

> currently we pay the recovery price for a Kafka Streams instance failure.
> Now we might pay it for a transaction failure. Will transaction failures
be
> more or less common than the previous types of failures?

Damian voiced similar concerns at the very beginning of this discussion,
not sure what his current opinion is here.

-Michael





On Wed, Mar 22, 2017 at 1:04 AM, Sriram Subramanian 
wrote:

> To add to this discussion, I do think we should think about this in
> increments. We do a full recovery today and the EOS proposal will not make
> this any worse. Using store snapshot is a good option to avoid store
> recovery in the future but as Eno points out, all pluggable stores would
> need to have this ability. W.r.t transaction failures, this should not be
> an issue. We should be simply retrying. There is one optimization we can do
> for clean shutdowns. We could store a clean shutdown file that contains the
> input offsets. This file gets written when you close the streams instance.
> On start, you can check the offsets from the shutdown file and
> compare it with the offsets we get from the consumer and ensure they match.
> If they do, you could use the same store instead of recovering. However, if
> we go with the snapshot approach, this will not be required. My vote would
> be to implement V1 and solve the bootstrap problem which exists today in the
> future versions.
>
> On Tue, Mar 21, 2017 at 4:43 PM, Matthias J. Sax 
> wrote:
>
> > Thanks for your feedback Eno.
> >
> > For now, I still think that the KIP itself does not need to talk about
> > this in more detail, because we apply the same strategy for EoS as for
> > non-EoS (as of 0.10.2).
> >
> > Thus, in case of a clean shutdown, we write the checkpoint file for a
> > store and thus know we can reuse the store. In case of failure, we need
> > to recreate the store from the changelog.
> >
> > > Will a V1 design that relies on plain store recovery from Kafka for
> > > each transaction abort be good enough, or usable?
> >
> > Why should it not be usable? It's the same strategy as used in 0.10.2
> > and it runs in production in many companies already.
> >
> > > however it seems to me we might have a regression of sorts
> > > Now we might pay it for a transaction failure.
> >
> > I would assume transaction failures to be quite rare. Maybe the core EoS
> > folks can comment here, too.
> >
> >
> >
> > -Matthias
> >
> >
> >
> > On 3/20/17 3:16 PM, Eno Thereska wrote:
> > > Hi Matthias,
> > >
> > > I'd like to see some more info on how you propose to handle
> transactions
> > that involve state stores in the KIP itself. The design doc has info
> about
> > various optimisations like RocksDb snapshots and transactions and such,
> but
> > will there be a user-visible interface that indicates whether a store has
> > snapshot and/or transactional capabilities? If a user plugs in another
> > store, what guarantees are they expected to get?
> > >
> > > Will a V1 design that relies on plain store recovery from Kafka for
> each
> > transaction abort be good enough, or usable? If your dataset is large
> > (e.g., 200GB) the recovery time might be so large as to effectively
> render
> > that Kafka Streams instance unavailable for tens of minutes. You mention
> > that is not a regression to what we currently have, however it seems to
> me
> > we might have a regression of sorts: currently we pay the recovery price
> > for a Kafka Streams instance failure. Now we might pay it for a
> transaction
> > failure. Will transaction failures be more or less common than the
> previous
> > types of failures? I'd like to see this addressed.
> > >
> > > Thanks
> > > Eno
> > >
> > >
> > >
> > >> On 15 Mar 2017, at 22:09, Matthias J. Sax 
> > wrote:
> > >>
> > >> Just a quick follow up:
> > >>
> > >> Our overall proposal is, to implement KIP-129 as is as a “Stream EoS
> > >> 1.0” version. The raised concerns are all valid, but hard to quantify
> at
> > >> the moment. Implementing KIP-129, that provides a clean design, allows
> > >> us to gain more insight in the performance implications. This enables
> > >> us, to make an educated decision, if the “producer per task” model
> > >> perform wells or not, and if a switch to a “producer per thread” model
> > >> is mandatory.
> > >>
> > >> We also want to point out, that we can move incrementally from
> "producer
> > >> per task" to "producer 

[jira] [Commented] (KAFKA-4930) Connect Rest API allows creating connectors with an empty name

2017-03-22 Thread Gwen Shapira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15936374#comment-15936374
 ] 

Gwen Shapira commented on KAFKA-4930:
-

Good catch! If you fix this, we'll gladly take the PR. Let me know if you need 
help or pointers.

> Connect Rest API allows creating connectors with an empty name
> --
>
> Key: KAFKA-4930
> URL: https://issues.apache.org/jira/browse/KAFKA-4930
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.2.0
>Reporter: Sönke Liebau
>Priority: Minor
>
> The Connect REST API allows deploying connectors with an empty name field; 
> such connectors then cannot be removed through the API.
> Sending the following request:
> {code}
> {
> "name": "",
> "config": {
> "connector.class": 
> "org.apache.kafka.connect.tools.MockSourceConnector",
> "tasks.max": "1",
> "topics": "test-topic"
> 
> }
> }
> {code}
> Results in a connector being deployed which can be seen in the list of 
> connectors:
> {code}
> [
>   "",
>   "testconnector"
> ]{code}
> But it cannot be removed via a DELETE call, because the API interprets the 
> request as an attempt to delete the /connectors endpoint and declines it.
> I don't think there is a valid case for the connector name to be empty so 
> perhaps we should add a check for this. I am happy to work on this.
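[Editor's note] The proposed check could be as simple as rejecting blank names before the connector config is accepted. This is illustrative logic only; the actual fix would live in Connect's Java REST resource/validation code:

```python
def validate_connector_name(name):
    """Reject missing, empty, or whitespace-only connector names, which
    would otherwise create a connector that a DELETE call cannot reach."""
    if name is None or name.strip() == "":
        raise ValueError("connector name must be non-empty")
    return name
```

Applied to the request body above, the `""` name would be rejected with a 4xx response instead of creating an undeletable connector.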



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka pull request #2725: MINOR: add note to Processor javadoc about not clo...

2017-03-22 Thread dguy
GitHub user dguy opened a pull request:

https://github.com/apache/kafka/pull/2725

MINOR: add note to Processor javadoc about not closing state stores



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dguy/kafka minor-processor-java-doc

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2725.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2725


commit aeedf9fcd3614f5afd09ad9064f945c9e753e405
Author: Damian Guy 
Date:   2017-03-22T13:46:07Z

add note to Processor javadoc about not closing state stores




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-4919) Streams job fails with InvalidStateStoreException: Store is currently closed

2017-03-22 Thread Damian Guy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15936328#comment-15936328
 ] 

Damian Guy commented on KAFKA-4919:
---

Hi [~elevy] Yes, that would be the issue. Because we now re-use the stores 
when we can during a rebalance, they should not be closed by the Processor. We 
need to update the docs and examples.
Thanks,
Damian

> Streams job fails with InvalidStateStoreException: Store is currently closed
> 
>
> Key: KAFKA-4919
> URL: https://issues.apache.org/jira/browse/KAFKA-4919
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.10.2.0
>Reporter: Elias Levy
>
> I have a streams job, that previously worked, that consumes and writes to a 
> large number of topics with many partitions and that uses many threads.  I 
> upgraded the job to 0.10.2.0.  The job now fails after a short time running, 
> seemingly after a rebalance.
> {quote}
> WARN  2017-03-19 18:03:20,432 [StreamThread-10][StreamThread.java:160] : 
> Unexpected state transition from RUNNING to NOT_RUNNING
> {quote}
> The first observation is that Streams is no longer outputting exceptions and 
> backtraces.  I had to add code to get this information.
> The exception:
> {quote}
> Exception: org.apache.kafka.streams.errors.StreamsException: Exception caught 
> in process. taskId=1_225, processor=KSTREAM-SOURCE-03, 
> topic=some_topic, partition=225, offset=266411
> org.apache.kafka.streams.errors.StreamsException: Exception caught in 
> process. taskId=1_225, processor=KSTREAM-SOURCE-03, topic=some_topic, 
> partition=225, offset=266411
>   at 
> org.apache.kafka.streams.processor.internals.StreamTask.process(StreamTask.java:216)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:641)
>   at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:368)
> Caused by: org.apache.kafka.streams.errors.InvalidStateStoreException: Store 
> someStore-201701060400 is currently closed
>   at 
> org.apache.kafka.streams.state.internals.RocksDBStore.validateStoreOpen(RocksDBStore.java:205)
>   at 
> org.apache.kafka.streams.state.internals.RocksDBStore.put(RocksDBStore.java:221)
>   at 
> org.apache.kafka.streams.state.internals.RocksDBSegmentedBytesStore.put(RocksDBSegmentedBytesStore.java:74)
>   at 
> org.apache.kafka.streams.state.internals.ChangeLoggingSegmentedBytesStore.put(ChangeLoggingSegmentedBytesStore.java:54)
>   at 
> org.apache.kafka.streams.state.internals.MeteredSegmentedBytesStore.put(MeteredSegmentedBytesStore.java:101)
>   at 
> org.apache.kafka.streams.state.internals.RocksDBWindowStore.put(RocksDBWindowStore.java:109)
>   ... X more
> {quote}
> The error occurs for many partitions.
> This was preceded by (for this partition):
> {quote}
> INFO  2017-03-19 18:03:16,403 [StreamThread-10][ConsumerCoordinator.java:393] 
> : Revoking previously assigned partitions [some_topic-225] for group some_job
> INFO  2017-03-19 18:03:16,403 [StreamThread-10][StreamThread.java:254] : 
> stream-thread [StreamThread-10] partitions [[some_topic-225]] revoked at the 
> beginning of consumer rebalance.
> INFO  2017-03-19 18:03:16,403 [StreamThread-10][StreamThread.java:1056] : 
> stream-thread [StreamThread-10] Closing a task's topology 1_225
> INFO  2017-03-19 18:03:17,887 [StreamThread-10][StreamThread.java:544] : 
> stream-thread [StreamThread-10] Flushing state stores of task 1_225
> INFO  2017-03-19 18:03:17,887 [StreamThread-10][StreamThread.java:534] : 
> stream-thread [StreamThread-10] Committing consumer offsets of task 1_225
> INFO  2017-03-19 18:03:17,891 [StreamThread-10][StreamThread.java:1012] : 
> stream-thread [StreamThread-10] Updating suspended tasks to contain active 
> tasks [[1_225, 0_445, 0_30]]
> INFO  2017-03-19 18:03:17,891 [StreamThread-10][StreamThread.java:1019] : 
> stream-thread [StreamThread-10] Removing all active tasks [[1_225, 0_445, 
> 0_30]]
> INFO  2017-03-19 18:03:19,925 [StreamThread-10][ConsumerCoordinator.java:252] 
> : Setting newly assigned partitions [some_topic-225] for group some_job
> INFO  2017-03-19 18:03:19,927 [StreamThread-10][StreamThread.java:228] : 
> stream-thread [StreamThread-10] New partitions [[some_topic-225]] assigned at 
> the end of consumer rebalance.
> INFO  2017-03-19 18:03:19,929 [StreamThread-10][StreamTask.java:333] : task 
> [1_225] Initializing processor nodes of the topology
> Something happens.  What ???
> INFO  2017-03-19 18:03:20,135 [StreamThread-10][StreamThread.java:1045] : 
> stream-thread [StreamThread-10] Closing a task 1_225
> INFO  2017-03-19 18:03:20,355 [StreamThread-10][StreamThread.java:544] : 
> stream-thread [StreamThread-10] Flushing state stores of task 

[GitHub] kafka pull request #2724: MINOR: log state store restore offsets

2017-03-22 Thread dguy
GitHub user dguy opened a pull request:

https://github.com/apache/kafka/pull/2724

MINOR: log state store restore offsets

Debug logging of the start and end offsets used during state store 
restoration

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dguy/kafka log-restore-offsets-0.10.2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2724.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2724


commit 3ad4c9938f9377807895f15d202f932558027511
Author: Damian Guy 
Date:   2017-03-22T12:31:14Z

log restore offsets




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (KAFKA-4931) stop script fails due to 4096 ps output limit

2017-03-22 Thread Amit Jain (JIRA)
Amit Jain created KAFKA-4931:


 Summary: stop script fails due to 4096 ps output limit
 Key: KAFKA-4931
 URL: https://issues.apache.org/jira/browse/KAFKA-4931
 Project: Kafka
  Issue Type: Bug
  Components: tools
Affects Versions: 0.10.2.0
Reporter: Amit Jain
Priority: Minor


When run, the script bin/zookeeper-server-stop.sh fails to stop the ZooKeeper 
server process if the ps output exceeds the 4096-character limit of Linux. I 
think that instead of ps we could use ${JAVA_HOME}/bin/jps -vl | grep 
QuorumPeerMain, which would correctly stop the ZooKeeper process. Currently we 
find the PIDs to kill with:

PIDS=$(ps ax | grep java | grep -i QuorumPeerMain | grep -v grep | awk '{print 
$1}')
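
A sketch of the proposed jps-based alternative (assuming a JDK is installed so 
jps is on the PATH); jps prints the full main class without the COMMAND 
truncation that ps applies, and the PID extraction can be exercised against 
sample output:

```shell
# Proposed replacement (sketch, assumes $JAVA_HOME points at a JDK):
#   PIDS=$("${JAVA_HOME}/bin/jps" -l | awk '/QuorumPeerMain/ {print $1}')
#
# The awk extraction itself, checked against sample `jps -l` output:
sample='12345 org.apache.zookeeper.server.quorum.QuorumPeerMain
23456 kafka.Kafka'
echo "$sample" | awk '/QuorumPeerMain/ {print $1}'
# prints 12345
```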




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] kafka-site issue #51: Add missing semicolon in JAAS config

2017-03-22 Thread stromnet
Github user stromnet commented on the issue:

https://github.com/apache/kafka-site/pull/51
  
Also added a note regarding authorizer.class.name, which is empty by default.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [VOTE] KIP-122: Add Reset Consumer Group Offsets tooling

2017-03-22 Thread Jorge Esteban Quilcate Otoya
@Jason, thanks for your feedback!
You're right, we are not considering the old consumer, since we rely on the
KafkaConsumer#seek operations. I'm happy to update the KIP to make this
explicit.
Regarding the second comment: I suppose it would work, but I would like to
add it to the test cases first. Do you know if this scenario has been
tested in other clients?

Jorge


On Wed, Mar 22, 2017 at 5:23 AM, Dong Lin () wrote:

> Thanks for the KIP!
>
> +1 (non-binding)
>
> On Tue, Mar 21, 2017 at 6:24 PM, Becket Qin  wrote:
>
> > +1
> >
> > Thanks for the KIP. The tool is very useful.
> >
> > On Tue, Mar 21, 2017 at 4:46 PM, Jason Gustafson 
> > wrote:
> >
> > > +1 This looks super useful! Might be worth mentioning somewhere
> > > compatibility with the old consumer. It looks like offsets in zk are
> not
> > > covered, which seems fine, but probably should be explicitly noted.
> Maybe
> > > you can also add a note saying that the tool can be used for old
> > consumers
> > > which have offsets stored in Kafka, but it will not protect against an
> > > active consumer group in that case?
> > >
> > > Thanks,
> > > Jason
> > >
> > > On Tue, Mar 14, 2017 at 10:13 AM, Dong Lin 
> wrote:
> > >
> > > > +1 (non-binding)
> > > >
> > > > On Tue, Mar 14, 2017 at 8:53 AM, Bill Bejeck 
> > wrote:
> > > >
> > > > > +1
> > > > >
> > > > > On Tue, Mar 14, 2017 at 11:50 AM, Grant Henke  >
> > > > wrote:
> > > > >
> > > > > > +1. Agreed. This is a great tool to have.
> > > > > >
> > > > > > On Tue, Mar 14, 2017 at 12:33 AM, Gwen Shapira <
> g...@confluent.io>
> > > > > wrote:
> > > > > >
> > > > > > > +1 (binding)
> > > > > > >
> > > > > > > Nice job - this is going to be super useful.
> > > > > > >
> > > > > > > On Thu, Feb 23, 2017 at 4:46 PM, Jorge Esteban Quilcate Otoya <
> > > > > > > quilcate.jo...@gmail.com> wrote:
> > > > > > >
> > > > > > > > Hi All,
> > > > > > > >
> > > > > > > > It seems that there is no further concern with the KIP-122.
> > > > > > > > At this point we would like to start the voting process.
> > > > > > > >
> > > > > > > > The KIP can be found here:
> > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > > > > > 122%3A+Add+Reset+Consumer+Group+Offsets+tooling
> > > > > > > >
> > > > > > > >
> > > > > > > > Thanks!
> > > > > > > >
> > > > > > > > Jorge.
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > *Gwen Shapira*
> > > > > > > Product Manager | Confluent
> > > > > > > 650.450.2760 | @gwenshap
> > > > > > > Follow us: Twitter  | blog
> > > > > > > 
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Grant Henke
> > > > > > Software Engineer | Cloudera
> > > > > > gr...@cloudera.com | twitter.com/gchenke |
> > > linkedin.com/in/granthenke
> > > > > >
> > > > >
> > > >
> > >
> >
>


[GitHub] kafka-site pull request #51: Add missing semicolon in JAAS config

2017-03-22 Thread stromnet
GitHub user stromnet opened a pull request:

https://github.com/apache/kafka-site/pull/51

Add missing semicolon in JAAS config

For future Googlers' reference, this solves the following error:
```
[2017-03-22 11:19:59,516] FATAL Fatal error during KafkaServerStartable 
startup. Prepare to shutdown (kafka.server.KafkaServerStartable)
org.apache.kafka.common.KafkaException: Exception while loading Zookeeper 
JAAS login context 'Client'
at 
org.apache.kafka.common.security.JaasUtils.isZkSecurityEnabled(JaasUtils.java:154)
at kafka.server.KafkaServer.initZk(KafkaServer.scala:310)
at kafka.server.KafkaServer.startup(KafkaServer.scala:187)
at 
kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:39)
at kafka.Kafka$.main(Kafka.scala:67)
at kafka.Kafka.main(Kafka.scala)
Caused by: java.lang.SecurityException: java.io.IOException: Configuration 
Error:
Line 5: expected [option key]
at sun.security.provider.ConfigFile$Spi.<init>(ConfigFile.java:137)
at sun.security.provider.ConfigFile.<init>(ConfigFile.java:102)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at java.lang.Class.newInstance(Class.java:442)
at javax.security.auth.login.Configuration$2.run(Configuration.java:255)
at javax.security.auth.login.Configuration$2.run(Configuration.java:247)
at java.security.AccessController.doPrivileged(Native Method)
at 
javax.security.auth.login.Configuration.getConfiguration(Configuration.java:246)
at 
org.apache.kafka.common.security.JaasUtils.isZkSecurityEnabled(JaasUtils.java:151)
... 5 more
Caused by: java.io.IOException: Configuration Error:
Line 5: expected [option key]
at sun.security.provider.ConfigFile$Spi.ioException(ConfigFile.java:666)
at sun.security.provider.ConfigFile$Spi.match(ConfigFile.java:562)
at 
sun.security.provider.ConfigFile$Spi.parseLoginEntry(ConfigFile.java:477)
at sun.security.provider.ConfigFile$Spi.readConfig(ConfigFile.java:427)
at sun.security.provider.ConfigFile$Spi.init(ConfigFile.java:329)
at sun.security.provider.ConfigFile$Spi.init(ConfigFile.java:271)
at sun.security.provider.ConfigFile$Spi.<init>(ConfigFile.java:135)
... 16 more
[2017-03-22 11:19:59,518] INFO shutting down (kafka.server.KafkaServer)
```
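
For reference, a minimal Client section that parses cleanly (the login module 
and the username/password values are placeholders); the key point is the 
semicolon terminating the last option of the entry:

```
Client {
    org.apache.zookeeper.server.auth.DigestLoginModule required
    username="kafka"
    password="kafka-secret";
};
```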

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/stromnet/kafka-site fix-jaas-semicolon

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka-site/pull/51.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #51


commit 3982c6cf2bc67ecb17d7954d28b34c04cc1f6cfe
Author: Johan Ström 
Date:   2017-03-22T10:27:40Z

Add missing semicolon in JAAS config




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request #2723: MyProcessor doc example should implement, not exte...

2017-03-22 Thread miguno
GitHub user miguno opened a pull request:

https://github.com/apache/kafka/pull/2723

MyProcessor doc example should implement, not extend `Processor`



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/miguno/kafka trunk-streams-docs-typo-fix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2723.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2723


commit 61d010427486edac8933e5051ba103d5df10ed12
Author: Michael G. Noll 
Date:   2017-03-22T09:58:03Z

MyProcessor doc example should implement, not extend `Processor`




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Build failed in Jenkins: kafka-trunk-jdk8 #1367

2017-03-22 Thread Apache Jenkins Server
See 


Changes:

[me] MINOR: Pluggable verifiable clients

--
[...truncated 1.68 MB...]
org.apache.kafka.connect.runtime.distributed.WorkerCoordinatorTest > 
testJoinLeaderCannotAssign PASSED

org.apache.kafka.connect.runtime.distributed.WorkerCoordinatorTest > 
testRejoinGroup STARTED

org.apache.kafka.connect.runtime.distributed.WorkerCoordinatorTest > 
testRejoinGroup PASSED

org.apache.kafka.connect.runtime.distributed.WorkerCoordinatorTest > 
testNormalJoinGroupLeader STARTED

org.apache.kafka.connect.runtime.distributed.WorkerCoordinatorTest > 
testNormalJoinGroupLeader PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testCreateConnector STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testCreateConnector PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testPutConnectorConfig STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testPutConnectorConfig PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRestartConnector STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRestartConnector PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testCreateConnectorFailedBasicValidation STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testCreateConnectorFailedBasicValidation PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRestartTask STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRestartTask PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testAccessors STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testAccessors PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testCreateConnectorFailedCustomValidation STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testCreateConnectorFailedCustomValidation PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testCreateConnectorAlreadyExists STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testCreateConnectorAlreadyExists PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testDestroyConnector STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testDestroyConnector PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testJoinAssignment STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testJoinAssignment PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRebalance STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRebalance PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRebalanceFailedConnector STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRebalanceFailedConnector PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testHaltCleansUpWorker STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testHaltCleansUpWorker PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testConnectorNameConflictsWithWorkerGroupId STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testConnectorNameConflictsWithWorkerGroupId PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRestartUnknownConnector STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRestartUnknownConnector PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRestartConnectorRedirectToLeader STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRestartConnectorRedirectToLeader PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRestartConnectorRedirectToOwner STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRestartConnectorRedirectToOwner PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRestartUnknownTask STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRestartUnknownTask PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRequestProcessingOrder STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRequestProcessingOrder PASSED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRestartTaskRedirectToLeader STARTED

org.apache.kafka.connect.runtime.distributed.DistributedHerderTest > 
testRestartTaskRedirectToLeader PASSED


[GitHub] kafka pull request #2722: KAFKA-4878: Improved Invalid Connect Config Error ...

2017-03-22 Thread original-brownbear
GitHub user original-brownbear opened a pull request:

https://github.com/apache/kafka/pull/2722

KAFKA-4878: Improved Invalid Connect Config Error Message

Addresses https://issues.apache.org/jira/browse/KAFKA-4878

* Adjusted the error message to explicitly state errors and their number
* Dried up the logic for generating the message between standalone and 
distributed

Example

Messed up two config keys in the file source config:

```
namse=local-file-source
connector.class=FileStreamSource
tasks.max=1
fisle=test.txt
topic=connect-test
```

Produces:

```
[2017-03-22 08:57:11,896] ERROR Stopping after connector error 
(org.apache.kafka.connect.cli.ConnectStandalone:99)
java.util.concurrent.ExecutionException: 
org.apache.kafka.connect.runtime.rest.errors.BadRequestException: Connector 
configuration is invalid and contains the following 2 error(s):
Missing required configuration "file" which has no default value.
Missing required configuration "name" which has no default value.
You can also find the above list of errors at the endpoint 
`/{connectorType}/config/validate`
```

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/original-brownbear/kafka KAFKA-4878

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2722.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2722


commit 0135ed6a3f1399ecefd2f477c9c82c7830313adc
Author: Armin Braun 
Date:   2017-03-22T07:20:11Z

KAFKA-4878: Improved Error Messages for Broken Connect Config

commit 7790532b8c43048197256b92e9970270e4fe1152
Author: Armin Braun 
Date:   2017-03-22T07:58:49Z

KAFKA-4878: Improved Error Message Text




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---