Re: [DISCUSS] KIP-58 - Make Log Compaction Point Configurable

2016-05-16 Thread James Cheng
We would find this KIP very useful. Our particular use case falls into the 
"application mistakes" portion of the KIP.

We are storing source of truth data in log compacted topics, similar to the 
Confluent Schema Registry. One situation we had recently was a misbehaving 
application. It sent data into its log-compacted source-of-truth topic, with 
the same keys but with corrupted message bodies. It took several days for us to 
notice this, and by that time log compaction had occurred. We lost data.

If we had been able to specify that log compaction would not occur for, say, 14 
days, we would have had two weeks to notice and recover from this situation. We 
could not have prevented the application from appending bad data after the good 
data, but at least the good data would still be present in the 
not-yet-compacted portion of the log, and we could write some manual recovery 
process to extract the good data out and do with it whatever we needed.

Because we do not have this facility, we currently have a rather complicated 
"backup" procedure whereby we use mirrormaker to mirror from our log compacted 
topic to a time-retention based topic. We have 14 day retention on the 
time-based topic, thus allowing us 14 days to recover from any data corruption 
on our log compacted topic. There is also an (imho) clever thing we do where we 
restart mirroring from the beginning of the log-compacted topic every 7 days. 
This allows us to make sure that even for very old non-changing keys, we always 
have a copy of them in the time-based topic.

Gwen helped us come up with that backup plan many months ago (Thanks Gwen!) but 
having a configurable "do not compact the last X days" setting would likely be 
enough to satisfy our backup needs.
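
A rough sketch of one way to wire this up (assuming the backup topic lives on 
a second cluster, since mirrormaker preserves topic names; all names and 
values here are illustrative):

    # backup cluster: same topic name, but time retention instead of compaction
    kafka-topics.sh --create --zookeeper backup-zk:2181 --topic source-of-truth \
        --partitions 8 --replication-factor 3 \
        --config cleanup.policy=delete --config retention.ms=1209600000  # 14 days

    # continuously mirror the compacted topic into the backup cluster
    kafka-mirror-maker.sh --consumer.config source-consumer.properties \
        --producer.config backup-producer.properties --whitelist 'source-of-truth'

    # every 7 days, restart the mirror from the beginning of the source topic,
    # e.g. by using a fresh group.id with auto.offset.reset=earliest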

-James

> On May 16, 2016, at 9:20 PM, Jay Kreps  wrote:
> 
> Yeah I think I gave a scenario but that is not the same as a concrete use
> case. I think the question you have is how common is it that people care
> about this and what concrete things would you build where you had this
> requirement? I think that would be good to figure out.
> 
> I think the issue with the current state is that it really gives no SLA at
> all: the last write to a segment is potentially compacted immediately, so
> even a few seconds of lag (if your segment size is small) could let the
> compactor get ahead of the consumer.
> 
> -Jay
> 
> On Mon, May 16, 2016 at 9:05 PM, Gwen Shapira  wrote:
> 
>> I agree that log.cleaner.min.compaction.lag.ms gives slightly more
>> flexibility for potentially-lagging consumers than tuning
>> segment.roll.ms for the exact same scenario.
>> 
>> If more people think that the use-case of "consumer which must see
>> every single record, is running on a compacted topic, and is lagging
>> enough that tuning segment.roll.ms won't help" is important enough
>> that we need to address it, I won't object to proceeding with the KIP
>> (i.e. I'm probably -0 on this). It is easy to come up with a scenario
>> in which a feature is helpful (heck, I do it all the time); I'm just
>> not sure there is a real problem that cannot be addressed using
>> Kafka's existing behavior.
>> 
>> I do think it would be an excellent idea to revisit the log
>> compaction configurations and see whether they make sense to users.
>> For example, if "log.cleaner.min.compaction.lag.X" can replace
>> "log.cleaner.min.cleanable.ratio" as an easier-to-tune alternative,
>> I'll be more excited about the replacement, even without a strong
>> use-case for a specific compaction lag.
>> 
>> Gwen
>> 
>> On Mon, May 16, 2016 at 7:46 PM, Jay Kreps  wrote:
>>> I think it would be good to hammer out some of the practical use cases--I
>>> definitely share your disdain for adding more configs. Here is my sort of
>>> theoretical understanding of why you might want this.
>>> 
>>> As you say, a consumer bootstrapping itself in the compacted part of the log
>>> isn't actually traversing through valid states globally, i.e. if you have
>>> written the following:
>>>  offset, key, value
>>>  0, k0, v0
>>>  1, k1, v1
>>>  2, k0, v2
>>> it could be compacted to
>>>  1, k1, v1
>>>  2, k0, v2
>>> Thus at offset 1 in the compacted log, you would have applied k1, but not
>>> k0. So even though k0 and k1 both have valid values, they get applied out of
>>> order. This is totally normal; there is obviously no way to both compact
>>> and retain every valid state.
>>> 
>>> For many things this is a non-issue since they treat items only on a
>>> per-key basis without any global notion of consistency.
>>> 
>>> But let's say you want to guarantee you only traverse valid states in a
>>> caught-up real-time consumer: how can you do this? It's actually a bit
>>> tough. Generally speaking, since we don't compact the active segment, a
>>> real-time consumer should have this property, but this doesn't really give a
>>> hard SLA. With a small segment size and a lagging consumer you could
>>> imagine the compactor potentially getting ahead of the consumer.

[jira] [Commented] (KAFKA-3715) Higher granularity streams metrics

2016-05-16 Thread aarti gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15286084#comment-15286084
 ] 

aarti gupta commented on KAFKA-3715:


@jklukas, Thanks for the clear description. I would like to take a stab at 
this, so I am assigning it to myself.

> Higher granularity streams metrics 
> ---
>
> Key: KAFKA-3715
> URL: https://issues.apache.org/jira/browse/KAFKA-3715
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Jeff Klukas
>Assignee: aarti gupta
>Priority: Minor
>  Labels: api, newbie
> Fix For: 0.10.1.0
>
>
> Originally proposed by [~guozhang] in 
> https://github.com/apache/kafka/pull/1362#issuecomment-218326690
> We can consider adding metrics for process / punctuate / commit rate at the 
> granularity of each processor node in addition to the global rate mentioned 
> above. This is very helpful in debugging.
> We can consider adding rate / total cumulated metrics for context.forward 
> indicating how many records were forwarded downstream from this processor 
> node as well. This is helpful in debugging.
> We can consider adding metrics for each stream partition's timestamp. This is 
> helpful in debugging.
> Besides the latency metrics, we can also add throughput metrics in terms of 
> source records consumed.
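
A sketch of what one such per-node rate sensor might look like with the common 
metrics classes (metric and tag names here are illustrative, not a settled 
design):

{noformat}
// Illustrative only: registering a "process-rate" sensor for one processor node.
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.common.MetricName;
import org.apache.kafka.common.metrics.Metrics;
import org.apache.kafka.common.metrics.Sensor;
import org.apache.kafka.common.metrics.stats.Rate;

public class NodeMetricsSketch {
    public static void main(String[] args) {
        Metrics metrics = new Metrics();
        Map<String, String> tags = new HashMap<>();
        tags.put("processor-node-id", "KSTREAM-MAP-01");   // hypothetical node id
        Sensor processSensor = metrics.sensor("KSTREAM-MAP-01.process");
        processSensor.add(
            new MetricName("process-rate", "stream-processor-node-metrics",
                "rate of process() calls on this node", tags),
            new Rate());
        processSensor.record();   // would be called once per process() invocation
    }
}
{noformat}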



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (KAFKA-3715) Higher granularity streams metrics

2016-05-16 Thread aarti gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

aarti gupta reassigned KAFKA-3715:
--

Assignee: aarti gupta

> Higher granularity streams metrics 
> ---
>
> Key: KAFKA-3715
> URL: https://issues.apache.org/jira/browse/KAFKA-3715
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Jeff Klukas
>Assignee: aarti gupta
>Priority: Minor
>  Labels: api, newbie
> Fix For: 0.10.1.0
>
>
> Originally proposed by [~guozhang] in 
> https://github.com/apache/kafka/pull/1362#issuecomment-218326690
> We can consider adding metrics for process / punctuate / commit rate at the 
> granularity of each processor node in addition to the global rate mentioned 
> above. This is very helpful in debugging.
> We can consider adding rate / total cumulated metrics for context.forward 
> indicating how many records were forwarded downstream from this processor 
> node as well. This is helpful in debugging.
> We can consider adding metrics for each stream partition's timestamp. This is 
> helpful in debugging.
> Besides the latency metrics, we can also add throughput metrics in terms of 
> source records consumed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2082) Kafka Replication ends up in a bad state

2016-05-16 Thread Raghavendra Nandagopal (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15286045#comment-15286045
 ] 

Raghavendra Nandagopal commented on KAFKA-2082:
---

Hi,
  We are also seeing this issue in our production environment.  Is there any 
workaround?

Thanks

> Kafka Replication ends up in a bad state
> 
>
> Key: KAFKA-2082
> URL: https://issues.apache.org/jira/browse/KAFKA-2082
> Project: Kafka
>  Issue Type: Bug
>  Components: replication
>Affects Versions: 0.8.2.1
>Reporter: Evan Huus
>Assignee: Sriharsha Chintalapani
>Priority: Critical
>  Labels: zkclient-problems
> Attachments: KAFKA-2082.patch
>
>
> While running integration tests for Sarama (the go client) we came across a 
> pattern of connection losses that reliably puts kafka into a bad state: 
> several of the brokers start spinning, chewing ~30% CPU and spamming the logs 
> with hundreds of thousands of lines like:
> {noformat}
> [2015-04-01 13:08:40,070] WARN [Replica Manager on Broker 9093]: Fetch 
> request with correlation id 111094 from client ReplicaFetcherThread-0-9093 on 
> partition [many_partition,1] failed due to Leader not local for partition 
> [many_partition,1] on broker 9093 (kafka.server.ReplicaManager)
> [2015-04-01 13:08:40,070] WARN [Replica Manager on Broker 9093]: Fetch 
> request with correlation id 111094 from client ReplicaFetcherThread-0-9093 on 
> partition [many_partition,6] failed due to Leader not local for partition 
> [many_partition,6] on broker 9093 (kafka.server.ReplicaManager)
> [2015-04-01 13:08:40,070] WARN [Replica Manager on Broker 9093]: Fetch 
> request with correlation id 111095 from client ReplicaFetcherThread-0-9093 on 
> partition [many_partition,21] failed due to Leader not local for partition 
> [many_partition,21] on broker 9093 (kafka.server.ReplicaManager)
> [2015-04-01 13:08:40,071] WARN [Replica Manager on Broker 9093]: Fetch 
> request with correlation id 111095 from client ReplicaFetcherThread-0-9093 on 
> partition [many_partition,26] failed due to Leader not local for partition 
> [many_partition,26] on broker 9093 (kafka.server.ReplicaManager)
> [2015-04-01 13:08:40,071] WARN [Replica Manager on Broker 9093]: Fetch 
> request with correlation id 111095 from client ReplicaFetcherThread-0-9093 on 
> partition [many_partition,1] failed due to Leader not local for partition 
> [many_partition,1] on broker 9093 (kafka.server.ReplicaManager)
> [2015-04-01 13:08:40,071] WARN [Replica Manager on Broker 9093]: Fetch 
> request with correlation id 111095 from client ReplicaFetcherThread-0-9093 on 
> partition [many_partition,6] failed due to Leader not local for partition 
> [many_partition,6] on broker 9093 (kafka.server.ReplicaManager)
> [2015-04-01 13:08:40,072] WARN [Replica Manager on Broker 9093]: Fetch 
> request with correlation id 111096 from client ReplicaFetcherThread-0-9093 on 
> partition [many_partition,21] failed due to Leader not local for partition 
> [many_partition,21] on broker 9093 (kafka.server.ReplicaManager)
> [2015-04-01 13:08:40,072] WARN [Replica Manager on Broker 9093]: Fetch 
> request with correlation id 111096 from client ReplicaFetcherThread-0-9093 on 
> partition [many_partition,26] failed due to Leader not local for partition 
> [many_partition,26] on broker 9093 (kafka.server.ReplicaManager)
> {noformat}
> This can be easily and reliably reproduced using the {{toxiproxy-final}} 
> branch of https://github.com/Shopify/sarama which includes a vagrant script 
> for provisioning the appropriate cluster: 
> - {{git clone https://github.com/Shopify/sarama.git}}
> - {{git checkout test-jira-kafka-2082}}
> - {{vagrant up}}
> - {{TEST_SEED=1427917826425719059 DEBUG=true go test -v}}
> After the test finishes (it fails because the cluster ends up in a bad 
> state), you can log into the cluster machine with {{vagrant ssh}} and inspect 
> the bad nodes. The vagrant script provisions five zookeepers and five brokers 
> in {{/opt/kafka-9091/}} through {{/opt/kafka-9095/}}.
> Additional context: the test produces continually to the cluster while 
> randomly cutting and restoring zookeeper connections (all connections to 
> zookeeper are run through a simple proxy on the same vm to make this easy). 
> The majority of the time this works very well and does a good job exercising 
> our producer's retry and failover code. However, under certain patterns of 
> connection loss (the {{TEST_SEED}} in the instructions is important), kafka 
> gets confused. The test never cuts more than two connections at a time, so 
> zookeeper should always have quorum, and the topic (with three replicas) 
> should always be writable.
> Completely restarting the cluster via {{vagrant reload}} seems to put it back 
> into a sane state.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Re: [DISCUSS] KIP-58 - Make Log Compaction Point Configurable

2016-05-16 Thread Gwen Shapira
I see what you mean, Eric.

I was unclear on the specifics of your architecture. It sounds like
you have a table somewhere that maps checkpoints to lists of
(topic, partition, offset) entries.
In that case it is indeed useful to know that if the checkpoint was
written N ms ago, you will be able to find the exact offsets by
looking at the log.

Reading ahead won't really help in that case, since it sounds like the
state is too large to maintain in memory while reading ahead to a
future checkpoint.
(Different from Jay's abstract case in that regard).

Gwen


On Mon, May 16, 2016 at 9:21 PM, Eric Wasserman
 wrote:
> Gwen,
>
> For simplicity, the example I gave in the gist is for a single table with a 
> single partition. The salient point is that even for a single topic with one 
> partition there is no guarantee without the feature that one will be able to 
> restore some particular checkpoint, as the offset indicated by that checkpoint 
> may have been compacted away.
>
> The practical reality is we are trying to restore the state of a database 
> with nearly 1000 tables each of which has 8 partitions. In this real case 
> there are 8000 offsets indicated in each checkpoint. If even a single one of 
> those 8000 is compacted the checkpointed state cannot be reconstructed.
>
> Additionally, we don't really intend to have the consumers of the table 
> topics try to keep current. Rather they will occasionally (say at 1AM each 
> day) try to build the state of the database at a recent checkpoint (say from 
> midnight). Suppose it takes a bit of time (tens of minutes to hours) to 
> read all the partitions of all the table topics, each up to its target offset 
> indicated in the midnight checkpoint. By the time all the consumers have 
> arrived at the designated offset, perhaps one of them will have had its target 
> offset compacted away. We would then need to select a new target checkpoint 
> with its offsets for each topic and partition that is a bit later. How much 
> later? It might well be around the tens of minutes to hours it took to read 
> through to the offsets of the original target checkpoint, as the compaction 
> that foiled us may have occurred just before we reached the goal.
>
> Really the issue is that without the feature, while we could eventually 
> restore _some_ consistent state, we couldn't be assured of being able to 
> restore any particular (recent) one. My comment about never being assured of 
> the process terminating is just acknowledging the perhaps small but nonetheless 
> finite possibility that the process of chasing checkpoints, looking for one for 
> which no partition has yet had its target offset compacted away, could continue 
> indefinitely. There is really no condition in which one could be absolutely 
> guaranteed this process would terminate.
>
> The feature addresses this by providing a guarantee that _any_ checkpoint can 
> be reconstructed as long as it is within the compaction lag. I would love to 
> be convinced that I am in error but short of that I frankly would never turn 
> on compaction for a CDC use case without it.
>
> As to reducing the number of parameters: I personally see only 
> min.compaction.lag.ms as truly essential. Even the existing ratio 
> setting is secondary in my mind.
>
> Eric
>
>> On May 16, 2016, at 6:42 PM, Gwen Shapira  wrote:
>>
>> Hi Eric,
>>
>> Thank you for submitting this improvement suggestion.
>>
>> Do you mind clarifying the use-case for me?
>>
>> Looking at your gist: https://gist.github.com/ewasserman/f8c892c2e7a9cf26ee46
>>
>> If my consumer started reading all the CDC topics from the very
>> beginning, when they were created, without ever stopping, it is
>> obviously guaranteed to see every single consistent state of the
>> database.
>> If my consumer joined late (let's say after Tq got clobbered by Tr) it
>> will get a mixed state, but if it continues listening on those
>> topics, always following the logs to their end, it is guaranteed to
>> see a consistent state as soon as a new transaction commits. Am I missing
>> anything?
>>
>> Basically, I do not understand why you claim: "However, to recover all
>> the tables at the same checkpoint, with each independently compacting,
>> one may need to move to an even more recent checkpoint when a
>> different table had the same read issue with the new checkpoint. Thus
>> one could never be assured of this process terminating."
>>
>> I mean, it is true that you need to continuously read forward in order
>> to get to a consistent state, but why can't you be assured of getting
>> there?
>>
>> We are doing something very similar in KafkaConnect, where we need a
>> consistent view of our configuration. We make sure that if the current
>> state is inconsistent (i.e. there is data that is not "committed"
>> yet), we continue reading to the log end until we get to a consistent
>> state.
>>
>> I am not convinced the new functionality is necessary, or even helpful.

Re: [DISCUSS] KIP-58 - Make Log Compaction Point Configurable

2016-05-16 Thread Eric Wasserman
Gwen,

For simplicity, the example I gave in the gist is for a single table with a 
single partition. The salient point is that even for a single topic with one 
partition there is no guarantee without the feature that one will be able to 
restore some particular checkpoint, as the offset indicated by that checkpoint 
may have been compacted away.

The practical reality is we are trying to restore the state of a database with 
nearly 1000 tables each of which has 8 partitions. In this real case there are 
8000 offsets indicated in each checkpoint. If even a single one of those 8000 
is compacted the checkpointed state cannot be reconstructed.

Additionally, we don't really intend to have the consumers of the table topics 
try to keep current. Rather they will occasionally (say at 1AM each day) try to 
build the state of the database at a recent checkpoint (say from midnight). 
Suppose it takes a bit of time (tens of minutes to hours) to read all the 
partitions of all the table topics, each up to its target offset indicated in 
the midnight checkpoint. By the time all the consumers have arrived at the 
designated offset, perhaps one of them will have had its target offset compacted 
away. We would then need to select a new target checkpoint with its offsets for 
each topic and partition that is a bit later. How much later? It might well be 
around the tens of minutes to hours it took to read through to the offsets of 
the original target checkpoint, as the compaction that foiled us may have 
occurred just before we reached the goal.

Really the issue is that without the feature, while we could eventually 
restore _some_ consistent state, we couldn't be assured of being able to restore 
any particular (recent) one. My comment about never being assured of the process 
terminating is just acknowledging the perhaps small but nonetheless finite 
possibility that the process of chasing checkpoints, looking for one for which 
no partition has yet had its target offset compacted away, could continue 
indefinitely. There is really no condition in which one could be absolutely 
guaranteed this process would terminate.
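
To make that concrete, here is a minimal sketch of the restore loop (Java; 
Checkpoint, targetsFor(), laterCheckpointThan() and apply() are hypothetical 
helpers, the usual java.util.* and org.apache.kafka.clients.consumer.* imports 
are assumed, and the consumer uses auto.offset.reset=earliest):

    void restoreFrom(Checkpoint cp, Properties props) {
        restore:
        while (true) {
            Map<TopicPartition, Long> targets = targetsFor(cp);  // ~8000 entries
            try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
                consumer.assign(new ArrayList<>(targets.keySet()));
                consumer.seekToBeginning(targets.keySet());      // read from the start
                Set<TopicPartition> done = new HashSet<>();
                while (done.size() < targets.size()) {
                    for (ConsumerRecord<byte[], byte[]> r : consumer.poll(1000)) {
                        TopicPartition tp = new TopicPartition(r.topic(), r.partition());
                        if (done.contains(tp))
                            continue;
                        if (r.offset() > targets.get(tp)) {      // target was compacted away
                            cp = laterCheckpointThan(cp);        // chase a newer checkpoint...
                            continue restore;                    // ...and start the read over
                        }
                        apply(r);                                // rebuild table state
                        if (r.offset() == targets.get(tp))
                            done.add(tp);
                    }
                }
                return;                                          // consistent snapshot rebuilt
            }
        }
    }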

The feature addresses this by providing a guarantee that _any_ checkpoint can 
be reconstructed as long as it is within the compaction lag. I would love to be 
convinced that I am in error but short of that I frankly would never turn on 
compaction for a CDC use case without it.

As to reducing the number of parameters: I personally see only 
min.compaction.lag.ms as truly essential. Even the existing ratio setting 
is secondary in my mind.

Eric

> On May 16, 2016, at 6:42 PM, Gwen Shapira  wrote:
> 
> Hi Eric,
> 
> Thank you for submitting this improvement suggestion.
> 
> Do you mind clarifying the use-case for me?
> 
> Looking at your gist: https://gist.github.com/ewasserman/f8c892c2e7a9cf26ee46
> 
> If my consumer started reading all the CDC topics from the very
> beginning, when they were created, without ever stopping, it is
> obviously guaranteed to see every single consistent state of the
> database.
> If my consumer joined late (let's say after Tq got clobbered by Tr) it
> will get a mixed state, but if it continues listening on those
> topics, always following the logs to their end, it is guaranteed to
> see a consistent state as soon as a new transaction commits. Am I missing
> anything?
> 
> Basically, I do not understand why you claim: "However, to recover all
> the tables at the same checkpoint, with each independently compacting,
> one may need to move to an even more recent checkpoint when a
> different table had the same read issue with the new checkpoint. Thus
> one could never be assured of this process terminating."
> 
> I mean, it is true that you need to continuously read forward in order
> to get to a consistent state, but why can't you be assured of getting
> there?
> 
> We are doing something very similar in KafkaConnect, where we need a
> consistent view of our configuration. We make sure that if the current
> state is inconsistent (i.e. there is data that is not "committed"
> yet), we continue reading to the log end until we get to a consistent
> state.
> 
> I am not convinced the new functionality is necessary, or even helpful.
> 
> Gwen
> 
> On Mon, May 16, 2016 at 4:07 PM, Eric Wasserman
>  wrote:
>> I would like to begin discussion on KIP-58
>> 
>> The KIP is here:
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-58+-+Make+Log+Compaction+Point+Configurable
>> 
>> Jira: https://issues.apache.org/jira/browse/KAFKA-1981
>> 
>> Pull Request: https://github.com/apache/kafka/pull/1168
>> 
>> Thanks,
>> 
>> Eric



Re: [DISCUSS] KIP-58 - Make Log Compaction Point Configurable

2016-05-16 Thread Jay Kreps
Yeah I think I gave a scenario but that is not the same as a concrete use
case. I think the question you have is how common is it that people care
about this and what concrete things would you build where you had this
requirement? I think that would be good to figure out.

I think the issue with the current state is that it really gives no SLA at
all: the last write to a segment is potentially compacted immediately, so
even a few seconds of lag (if your segment size is small) could let the
compactor get ahead of the consumer.

-Jay

On Mon, May 16, 2016 at 9:05 PM, Gwen Shapira  wrote:

> I agree that log.cleaner.min.compaction.lag.ms gives slightly more
> flexibility for potentially-lagging consumers than tuning
> segment.roll.ms for the exact same scenario.
>
> If more people think that the use-case of "consumer which must see
> every single record, is running on a compacted topic, and is lagging
> enough that tuning segment.roll.ms won't help" is important enough
> that we need to address it, I won't object to proceeding with the KIP
> (i.e. I'm probably -0 on this). It is easy to come up with a scenario
> in which a feature is helpful (heck, I do it all the time); I'm just
> not sure there is a real problem that cannot be addressed using
> Kafka's existing behavior.
>
> I do think it would be an excellent idea to revisit the log
> compaction configurations and see whether they make sense to users.
> For example, if "log.cleaner.min.compaction.lag.X" can replace
> "log.cleaner.min.cleanable.ratio" as an easier-to-tune alternative,
> I'll be more excited about the replacement, even without a strong
> use-case for a specific compaction lag.
>
> Gwen
>
> On Mon, May 16, 2016 at 7:46 PM, Jay Kreps  wrote:
> > I think it would be good to hammer out some of the practical use cases--I
> > definitely share your disdain for adding more configs. Here is my sort of
> > theoretical understanding of why you might want this.
> >
> > As you say, a consumer bootstrapping itself in the compacted part of the log
> > isn't actually traversing through valid states globally, i.e. if you have
> > written the following:
> >   offset, key, value
> >   0, k0, v0
> >   1, k1, v1
> >   2, k0, v2
> > it could be compacted to
> >   1, k1, v1
> >   2, k0, v2
> > Thus at offset 1 in the compacted log, you would have applied k1, but not
> > k0. So even though k0 and k1 both have valid values, they get applied out of
> > order. This is totally normal; there is obviously no way to both compact
> > and retain every valid state.
> >
> > For many things this is a non-issue since they treat items only on a
> > per-key basis without any global notion of consistency.
> >
> > But let's say you want to guarantee you only traverse valid states in a
> > caught-up real-time consumer: how can you do this? It's actually a bit
> > tough. Generally speaking, since we don't compact the active segment, a
> > real-time consumer should have this property, but this doesn't really give a
> > hard SLA. With a small segment size and a lagging consumer you could
> > imagine the compactor potentially getting ahead of the consumer.
> >
> > So effectively what this config would establish is a guarantee that as long
> > as you consume all messages within log.cleaner.min.compaction.lag.ms you will
> > get every single produced record.
> >
> > -Jay
> >
> >
> >
> >
> >
> > On Mon, May 16, 2016 at 6:42 PM, Gwen Shapira  wrote:
> >
> >> Hi Eric,
> >>
> >> Thank you for submitting this improvement suggestion.
> >>
> >> Do you mind clarifying the use-case for me?
> >>
> >> Looking at your gist:
> >> https://gist.github.com/ewasserman/f8c892c2e7a9cf26ee46
> >>
> >> If my consumer started reading all the CDC topics from the very
> >> beginning, when they were created, without ever stopping, it is
> >> obviously guaranteed to see every single consistent state of the
> >> database.
> >> If my consumer joined late (let's say after Tq got clobbered by Tr) it
> >> will get a mixed state, but if it continues listening on those
> >> topics, always following the logs to their end, it is guaranteed to
> >> see a consistent state as soon as a new transaction commits. Am I missing
> >> anything?
> >>
> >> Basically, I do not understand why you claim: "However, to recover all
> >> the tables at the same checkpoint, with each independently compacting,
> >> one may need to move to an even more recent checkpoint when a
> >> different table had the same read issue with the new checkpoint. Thus
> >> one could never be assured of this process terminating."
> >>
> >> I mean, it is true that you need to continuously read forward in order
> >> to get to a consistent state, but why can't you be assured of getting
> >> there?
> >>
> >> We are doing something very similar in KafkaConnect, where we need a
> >> consistent view of our configuration. We make sure that if the current
> >> state is inconsistent (i.e. there is data that is not "committed"
> >> yet), we continue reading to the log end until we get to a consistent state.

Re: [DISCUSS] KIP-58 - Make Log Compaction Point Configurable

2016-05-16 Thread Gwen Shapira
I agree that log.cleaner.min.compaction.lag.ms gives slightly more
flexibility for potentially-lagging consumers than tuning
segment.roll.ms for the exact same scenario.

If more people think that the use-case of "consumer which must see
every single record, is running on a compacted topic, and is lagging
enough that tuning segment.roll.ms won't help" is important enough
that we need to address it, I won't object to proceeding with the KIP
(i.e. I'm probably -0 on this). It is easy to come up with a scenario
in which a feature is helpful (heck, I do it all the time); I'm just
not sure there is a real problem that cannot be addressed using
Kafka's existing behavior.

I do think it would be an excellent idea to revisit the log
compaction configurations and see whether they make sense to users.
For example, if "log.cleaner.min.compaction.lag.X" can replace
"log.cleaner.min.cleanable.ratio" as an easier-to-tune alternative,
I'll be more excited about the replacement, even without a strong
use-case for a specific compaction lag.

Gwen

On Mon, May 16, 2016 at 7:46 PM, Jay Kreps  wrote:
> I think it would be good to hammer out some of the practical use cases--I
> definitely share your disdain for adding more configs. Here is my sort of
> theoretical understanding of why you might want this.
>
> As you say, a consumer bootstrapping itself in the compacted part of the log
> isn't actually traversing through valid states globally, i.e. if you have
> written the following:
>   offset, key, value
>   0, k0, v0
>   1, k1, v1
>   2, k0, v2
> it could be compacted to
>   1, k1, v1
>   2, k0, v2
> Thus at offset 1 in the compacted log, you would have applied k1, but not
> k0. So even though k0 and k1 both have valid values, they get applied out of
> order. This is totally normal; there is obviously no way to both compact
> and retain every valid state.
>
> For many things this is a non-issue since they treat items only on a
> per-key basis without any global notion of consistency.
>
> But let's say you want to guarantee you only traverse valid states in a
> caught-up real-time consumer: how can you do this? It's actually a bit
> tough. Generally speaking, since we don't compact the active segment, a
> real-time consumer should have this property, but this doesn't really give a
> hard SLA. With a small segment size and a lagging consumer you could
> imagine the compactor potentially getting ahead of the consumer.
>
> So effectively what this config would establish is a guarantee that as long
> as you consume all messages within log.cleaner.min.compaction.lag.ms you will
> get every single produced record.
>
> -Jay
>
>
>
>
>
> On Mon, May 16, 2016 at 6:42 PM, Gwen Shapira  wrote:
>
>> Hi Eric,
>>
>> Thank you for submitting this improvement suggestion.
>>
>> Do you mind clarifying the use-case for me?
>>
>> Looking at your gist:
>> https://gist.github.com/ewasserman/f8c892c2e7a9cf26ee46
>>
>> If my consumer started reading all the CDC topics from the very
>> beginning, when they were created, without ever stopping, it is
>> obviously guaranteed to see every single consistent state of the
>> database.
>> If my consumer joined late (let's say after Tq got clobbered by Tr) it
>> will get a mixed state, but if it continues listening on those
>> topics, always following the logs to their end, it is guaranteed to
>> see a consistent state as soon as a new transaction commits. Am I missing
>> anything?
>>
>> Basically, I do not understand why you claim: "However, to recover all
>> the tables at the same checkpoint, with each independently compacting,
>> one may need to move to an even more recent checkpoint when a
>> different table had the same read issue with the new checkpoint. Thus
>> one could never be assured of this process terminating."
>>
>> I mean, it is true that you need to continuously read forward in order
>> to get to a consistent state, but why can't you be assured of getting
>> there?
>>
>> We are doing something very similar in KafkaConnect, where we need a
>> consistent view of our configuration. We make sure that if the current
>> state is inconsistent (i.e. there is data that is not "committed"
>> yet), we continue reading to the log end until we get to a consistent
>> state.
>>
>> I am not convinced the new functionality is necessary, or even helpful.
>>
>> Gwen
>>
>> On Mon, May 16, 2016 at 4:07 PM, Eric Wasserman
>>  wrote:
>> > I would like to begin discussion on KIP-58
>> >
>> > The KIP is here:
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-58+-+Make+Log+Compaction+Point+Configurable
>> >
>> > Jira: https://issues.apache.org/jira/browse/KAFKA-1981
>> >
>> > Pull Request: https://github.com/apache/kafka/pull/1168
>> >
>> > Thanks,
>> >
>> > Eric
>>


Re: [DISCUSS] KIP-58 - Make Log Compaction Point Configurable

2016-05-16 Thread Jay Kreps
I think it would be good to hammer out some of the practical use cases--I
definitely share your disdain for adding more configs. Here is my sort of
theoretical understanding of why you might want this.

As you say, a consumer bootstrapping itself in the compacted part of the log
isn't actually traversing through valid states globally, i.e. if you have
written the following:
  offset, key, value
  0, k0, v0
  1, k1, v1
  2, k0, v2
it could be compacted to
  1, k1, v1
  2, k0, v2
Thus at offset 1 in the compacted log, you would have applied k1, but not
k0. So even though k0 and k1 both have valid values, they get applied out of
order. This is totally normal; there is obviously no way to both compact
and retain every valid state.

For many things this is a non-issue since they treat items only on a
per-key basis without any global notion of consistency.

But let's say you want to guarantee you only traverse valid states in a
caught-up real-time consumer: how can you do this? It's actually a bit
tough. Generally speaking, since we don't compact the active segment, a
real-time consumer should have this property, but this doesn't really give a
hard SLA. With a small segment size and a lagging consumer you could
imagine the compactor potentially getting ahead of the consumer.

So effectively what this config would establish is a guarantee that as long
as you consume all messages within log.cleaner.min.compaction.lag.ms you will
get every single produced record.
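
For illustration, assuming the broker-level name above, a one-hour guarantee
would be a single setting in server.properties:

    # records produced within the last hour are never compacted
    log.cleaner.min.compaction.lag.ms=3600000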

-Jay





On Mon, May 16, 2016 at 6:42 PM, Gwen Shapira  wrote:

> Hi Eric,
>
> Thank you for submitting this improvement suggestion.
>
> Do you mind clarifying the use-case for me?
>
> Looking at your gist:
> https://gist.github.com/ewasserman/f8c892c2e7a9cf26ee46
>
> If my consumer started reading all the CDC topics from the very
> beginning, when they were created, without ever stopping, it is
> obviously guaranteed to see every single consistent state of the
> database.
> If my consumer joined late (let's say after Tq got clobbered by Tr) it
> will get a mixed state, but if it continues listening on those
> topics, always following the logs to their end, it is guaranteed to
> see a consistent state as soon as a new transaction commits. Am I missing
> anything?
>
> Basically, I do not understand why you claim: "However, to recover all
> the tables at the same checkpoint, with each independently compacting,
> one may need to move to an even more recent checkpoint when a
> different table had the same read issue with the new checkpoint. Thus
> one could never be assured of this process terminating."
>
> I mean, it is true that you need to continuously read forward in order
> to get to a consistent state, but why can't you be assured of getting
> there?
>
> We are doing something very similar in KafkaConnect, where we need a
> consistent view of our configuration. We make sure that if the current
> state is inconsistent (i.e. there is data that is not "committed"
> yet), we continue reading to the log end until we get to a consistent
> state.
>
> I am not convinced the new functionality is necessary, or even helpful.
>
> Gwen
>
> On Mon, May 16, 2016 at 4:07 PM, Eric Wasserman
>  wrote:
> > I would like to begin discussion on KIP-58
> >
> > The KIP is here:
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-58+-+Make+Log+Compaction+Point+Configurable
> >
> > Jira: https://issues.apache.org/jira/browse/KAFKA-1981
> >
> > Pull Request: https://github.com/apache/kafka/pull/1168
> >
> > Thanks,
> >
> > Eric
>


[jira] [Updated] (KAFKA-3717) Support building aggregate javadoc for all project modules

2016-05-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KAFKA-3717:
---
Summary: Support building aggregate javadoc for all project modules  (was: 
On 0.10.0 branch, building javadoc results in very small subset of expected 
javadocs)

> Support building aggregate javadoc for all project modules
> --
>
> Key: KAFKA-3717
> URL: https://issues.apache.org/jira/browse/KAFKA-3717
> Project: Kafka
>  Issue Type: Bug
>Reporter: Gwen Shapira
>Assignee: Grant Henke
>
> If you run "./gradlew javadoc", you will only get JavaDoc for the High Level 
> Consumer. All the new clients are missing.
> See here: http://home.apache.org/~gwenshap/0.10.0.0-rc5/javadoc/
> I suggest fixing in 0.10.0 branch and in trunk, not rolling a new release 
> candidate, but updating our docs site.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3717) On 0.10.0 branch, building javadoc results in very small subset of expected javadocs

2016-05-16 Thread Grant Henke (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15285897#comment-15285897
 ] 

Grant Henke commented on KAFKA-3717:


Currently the build places a javadoc directory and jar in each module's build 
directory. This means you need to manually grab or merge all of them. I 
confirmed with [~gwenshap] that this was good enough for the 0.10 release. 

Going forward it would be nice to aggregate the docs output for all 
sub-modules. This is related to the work tracked by KAFKA-3405.  

Since manually collecting the javadocs works for now, I will update the title 
to track aggregating javadocs. 

> On 0.10.0 branch, building javadoc results in very small subset of expected 
> javadocs
> 
>
> Key: KAFKA-3717
> URL: https://issues.apache.org/jira/browse/KAFKA-3717
> Project: Kafka
>  Issue Type: Bug
>Reporter: Gwen Shapira
>Assignee: Grant Henke
>
> If you run "./gradlew javadoc", you will only get JavaDoc for the High Level 
> Consumer. All the new clients are missing.
> See here: http://home.apache.org/~gwenshap/0.10.0.0-rc5/javadoc/
> I suggest fixing in 0.10.0 branch and in trunk, not rolling a new release 
> candidate, but updating our docs site.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSS] KIP-58 - Make Log Compaction Point Configurable

2016-05-16 Thread Gwen Shapira
Hi Eric,

Thank you for submitting this improvement suggestion.

Do you mind clarifying the use-case for me?

Looking at your gist: https://gist.github.com/ewasserman/f8c892c2e7a9cf26ee46

If my consumer started reading all the CDC topics from the very
beginning, when they were created, without ever stopping, it is
obviously guaranteed to see every single consistent state of the
database.
If my consumer joined late (let's say after Tq got clobbered by Tr) it
will get a mixed state, but if it continues listening on those
topics, always following the logs to their end, it is guaranteed to
see a consistent state as soon as a new transaction commits. Am I missing
anything?

Basically, I do not understand why you claim: "However, to recover all
the tables at the same checkpoint, with each independently compacting,
one may need to move to an even more recent checkpoint when a
different table had the same read issue with the new checkpoint. Thus
one could never be assured of this process terminating."

I mean, it is true that you need to continuously read forward in order
to get to a consistent state, but why can't you be assured of getting
there?

We are doing something very similar in KafkaConnect, where we need a
consistent view of our configuration. We make sure that if the current
state is inconsistent (i.e. there is data that is not "committed"
yet), we continue reading to the log end until we get to a consistent
state.
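
A minimal sketch of that read-to-the-end pattern with the new-consumer API
(apply() is a hypothetical state update; the usual java.util.* and
org.apache.kafka.clients.consumer.* imports are assumed; seekToEnd and
seekToBeginning take a Collection as of 0.10.0):

    consumer.assign(partitions);
    consumer.seekToEnd(partitions);
    Map<TopicPartition, Long> end = new HashMap<>();
    for (TopicPartition tp : partitions)
        end.put(tp, consumer.position(tp));          // capture the log-end offsets
    consumer.seekToBeginning(partitions);
    boolean caughtUp = false;
    while (!caughtUp) {
        for (ConsumerRecord<byte[], byte[]> r : consumer.poll(100))
            apply(r);                                // hypothetical state update
        caughtUp = true;
        for (TopicPartition tp : partitions)
            if (consumer.position(tp) < end.get(tp))
                caughtUp = false;
    }
    // state now reflects at least everything written before we started reading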

I am not convinced the new functionality is necessary, or even helpful.

Gwen

On Mon, May 16, 2016 at 4:07 PM, Eric Wasserman
 wrote:
> I would like to begin discussion on KIP-58
>
> The KIP is here:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-58+-+Make+Log+Compaction+Point+Configurable
>
> Jira: https://issues.apache.org/jira/browse/KAFKA-1981
>
> Pull Request: https://github.com/apache/kafka/pull/1168
>
> Thanks,
>
> Eric


[jira] [Assigned] (KAFKA-3717) On 0.10.0 branch, building javadoc results in very small subset of expected javadocs

2016-05-16 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke reassigned KAFKA-3717:
--

Assignee: Grant Henke

> On 0.10.0 branch, building javadoc results in very small subset of expected 
> javadocs
> 
>
> Key: KAFKA-3717
> URL: https://issues.apache.org/jira/browse/KAFKA-3717
> Project: Kafka
>  Issue Type: Bug
>Reporter: Gwen Shapira
>Assignee: Grant Henke
>
> If you run "./gradlew javadoc", you will only get JavaDoc for the High Level 
> Consumer. All the new clients are missing.
> See here: http://home.apache.org/~gwenshap/0.10.0.0-rc5/javadoc/
> I suggest fixing in 0.10.0 branch and in trunk, not rolling a new release 
> candidate, but updating our docs site.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Jenkins build is back to normal : kafka-0.10.0-jdk7 #96

2016-05-16 Thread Apache Jenkins Server
See 



Help wanted fixing Javadoc build issue

2016-05-16 Thread Gwen Shapira
Hi Team Kafka,

If anyone feels comfortable with Gradle and has some spare time, maybe
help bring back our missing javadocs:
https://issues.apache.org/jira/browse/KAFKA-3717

Thanks!

Gwen


[jira] [Created] (KAFKA-3717) On 0.10.0 branch, building javadoc results in very small subset of expected javadocs

2016-05-16 Thread Gwen Shapira (JIRA)
Gwen Shapira created KAFKA-3717:
---

 Summary: On 0.10.0 branch, building javadoc results in very small 
subset of expected javadocs
 Key: KAFKA-3717
 URL: https://issues.apache.org/jira/browse/KAFKA-3717
 Project: Kafka
  Issue Type: Bug
Reporter: Gwen Shapira


If you run "./gradlew javadoc", you will only get JavaDoc for the High Level 
Consumer. All the new clients are missing.

See here: http://home.apache.org/~gwenshap/0.10.0.0-rc5/javadoc/

I suggest fixing in 0.10.0 branch and in trunk, not rolling a new release 
candidate, but updating our docs site.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1981) Make log compaction point configurable

2016-05-16 Thread Eric Wasserman (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15285617#comment-15285617
 ] 

Eric Wasserman commented on KAFKA-1981:
---

Thanks, I added your use case. I would like to start discussion of the KIP. 
I'll attempt to start a discussion thread on dev@kafka.apache.org, which from 
looking at the archives seems to be the right venue (though I may not have 
permissions for starting a thread).

> Make log compaction point configurable
> --
>
> Key: KAFKA-1981
> URL: https://issues.apache.org/jira/browse/KAFKA-1981
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 0.8.2.0
>Reporter: Jay Kreps
>  Labels: newbie++
> Attachments: KIP for Kafka Compaction Patch.md
>
>
> Currently if you enable log compaction the compactor will kick in whenever 
> you hit a certain "dirty ratio", i.e. when 50% of your data is uncompacted. 
> Other than this we don't give you fine-grained control over when compaction 
> occurs. In addition we never compact the active segment (since it is still 
> being written to).
> The result is that you can't really guarantee that a consumer 
> will get every update to a compacted topic--if the consumer falls behind a 
> bit it might just get the compacted version.
> This is usually fine, but it would be nice to make this more configurable so 
> you could set either a # messages, size, or time bound for compaction.
> This would let you say, for example, "any consumer that is no more than 1 
> hour behind will get every message."
> This should be relatively easy to implement since it just impacts the 
> end-point the compactor considers available for compaction. I think we 
> already have that concept, so this would just be some other overrides to add 
> in when calculating that.
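
A sketch of how those three bounds might look as broker settings (the names
follow the pattern later proposed for this work; treat them as hypothetical
here):
{noformat}
# Do not compact records that fall within any of these bounds of the log head.
log.cleaner.min.compaction.lag.ms=3600000        # the "1 hour behind" example
log.cleaner.min.compaction.lag.bytes=1073741824  # size bound (1 GiB)
log.cleaner.min.compaction.lag.messages=1000000  # message-count bound
{noformat}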



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[DISCUSS] KIP-58 - Make Log Compaction Point Configurable

2016-05-16 Thread Eric Wasserman
I would like to begin discussion on KIP-58

The KIP is here:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-58+-+Make+Log+Compaction+Point+Configurable

Jira: https://issues.apache.org/jira/browse/KAFKA-1981

Pull Request: https://github.com/apache/kafka/pull/1168

Thanks,

Eric


Build failed in Jenkins: kafka-trunk-jdk8 #633

2016-05-16 Thread Apache Jenkins Server
See 

Changes:

[wangguoz] MINOR: Add INFO logging if ZK config is not specified

--
[...truncated 3934 lines...]

kafka.log.BrokerCompressionTest > testBrokerSideCompression[5] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[6] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[7] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[8] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[9] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[10] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[11] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[12] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[13] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[14] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[15] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[16] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[17] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[18] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[19] PASSED

kafka.log.OffsetMapTest > testClear PASSED

kafka.log.OffsetMapTest > testBasicValidation PASSED

kafka.log.LogManagerTest > testCleanupSegmentsToMaintainSize PASSED

kafka.log.LogManagerTest > testRecoveryDirectoryMappingWithRelativeDirectory 
PASSED

kafka.log.LogManagerTest > testGetNonExistentLog PASSED

kafka.log.LogManagerTest > testTwoLogManagersUsingSameDirFails PASSED

kafka.log.LogManagerTest > testLeastLoadedAssignment PASSED

kafka.log.LogManagerTest > testCleanupExpiredSegments PASSED

kafka.log.LogManagerTest > testCheckpointRecoveryPoints PASSED

kafka.log.LogManagerTest > testTimeBasedFlush PASSED

kafka.log.LogManagerTest > testCreateLog PASSED

kafka.log.LogManagerTest > testRecoveryDirectoryMappingWithTrailingSlash PASSED

kafka.log.LogSegmentTest > testRecoveryWithCorruptMessage PASSED

kafka.log.LogSegmentTest > testRecoveryFixesCorruptIndex PASSED

kafka.log.LogSegmentTest > testReadFromGap PASSED

kafka.log.LogSegmentTest > testTruncate PASSED

kafka.log.LogSegmentTest > testReadBeforeFirstOffset PASSED

kafka.log.LogSegmentTest > testCreateWithInitFileSizeAppendMessage PASSED

kafka.log.LogSegmentTest > testChangeFileSuffixes PASSED

kafka.log.LogSegmentTest > testMaxOffset PASSED

kafka.log.LogSegmentTest > testNextOffsetCalculation PASSED

kafka.log.LogSegmentTest > testReadOnEmptySegment PASSED

kafka.log.LogSegmentTest > testReadAfterLast PASSED

kafka.log.LogSegmentTest > testCreateWithInitFileSizeClearShutdown PASSED

kafka.log.LogSegmentTest > testTruncateFull PASSED

kafka.log.FileMessageSetTest > testTruncate PASSED

kafka.log.FileMessageSetTest > testIterationOverPartialAndTruncation PASSED

kafka.log.FileMessageSetTest > testRead PASSED

kafka.log.FileMessageSetTest > testFileSize PASSED

kafka.log.FileMessageSetTest > testIteratorWithLimits PASSED

kafka.log.FileMessageSetTest > testPreallocateTrue PASSED

kafka.log.FileMessageSetTest > testIteratorIsConsistent PASSED

kafka.log.FileMessageSetTest > testFormatConversionWithPartialMessage PASSED

kafka.log.FileMessageSetTest > testIterationDoesntChangePosition PASSED

kafka.log.FileMessageSetTest > testWrittenEqualsRead PASSED

kafka.log.FileMessageSetTest > testWriteTo PASSED

kafka.log.FileMessageSetTest > testPreallocateFalse PASSED

kafka.log.FileMessageSetTest > testPreallocateClearShutdown PASSED

kafka.log.FileMessageSetTest > testMessageFormatConversion PASSED

kafka.log.FileMessageSetTest > testSearch PASSED

kafka.log.FileMessageSetTest > testSizeInBytes PASSED

kafka.log.LogCleanerIntegrationTest > cleanerTest[0] PASSED

kafka.log.LogCleanerIntegrationTest > cleanerTest[1] PASSED

kafka.log.LogCleanerIntegrationTest > cleanerTest[2] PASSED

kafka.log.LogCleanerIntegrationTest > cleanerTest[3] PASSED

kafka.log.LogTest > testParseTopicPartitionNameForMissingTopic PASSED

kafka.log.LogTest > testIndexRebuild PASSED

kafka.log.LogTest > testLogRolls PASSED

kafka.log.LogTest > testMessageSizeCheck PASSED

kafka.log.LogTest > testAsyncDelete PASSED

kafka.log.LogTest > testReadOutOfRange PASSED

kafka.log.LogTest > testAppendWithOutOfOrderOffsetsThrowsException PASSED

kafka.log.LogTest > testReadAtLogGap PASSED

kafka.log.LogTest > testTimeBasedLogRoll PASSED

kafka.log.LogTest > testLoadEmptyLog PASSED

kafka.log.LogTest > testMessageSetSizeCheck PASSED

kafka.log.LogTest > testIndexResizingAtTruncation PASSED

kafka.log.LogTest > testCompactedTopicConstraints PASSED

kafka.log.LogTest > testThatGarbageCollectingSegmentsDoesntChangeOffset PASSED

kafka.log.LogTest > testAppendAndReadWithSequentialOffsets PASSED

kafka.log.LogTest > testDeleteOldSegmentsMethod PASSED

kafka.log.LogTest > testParseTopicPartitionNameForNull PASSED

kafka.log.LogTest 

[jira] [Updated] (KAFKA-3714) Allow users greater access to register custom streams metrics

2016-05-16 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-3714:
-
Labels: api  (was: )

> Allow users greater access to register custom streams metrics
> -
>
> Key: KAFKA-3714
> URL: https://issues.apache.org/jira/browse/KAFKA-3714
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Jeff Klukas
>Assignee: Guozhang Wang
>Priority: Minor
>  Labels: api
> Fix For: 0.10.1.0
>
>
> Copying in some discussion that originally appeared in 
> https://github.com/apache/kafka/pull/1362#issuecomment-219064302
> Kafka Streams is largely a higher-level abstraction on top of producers and 
> consumers, and it seems sensible to match the KafkaStreams interface to that 
> of KafkaProducer and KafkaConsumer where possible. For producers and 
> consumers, the metric registry is internal and metrics are only exposed as an 
> unmodifiable map. This allows users to access client metric values for use in 
> application health checks, etc., but doesn't allow them to register new 
> metrics.
> That approach seems reasonable if we assume that a user interested in 
> defining custom metrics is already going to be using a separate metrics 
> library. In such a case, users will likely find it easier to define metrics 
> using whatever library they're familiar with rather than learning the API for 
> Kafka's Metrics class. Is this a reasonable assumption?
> If we want to expose the Metrics instance so that users can define arbitrary 
> metrics, I'd argue that there's need for documentation updates. In 
> particular, I find the notion of metric tags confusing. Tags can be defined 
> in a MetricConfig when the Metrics instance is constructed, 
> StreamsMetricsImpl is maintaining its own set of tags, and users can set tag 
> overrides.
> If a user were to get access to the Metrics instance, they would be missing 
> the tags defined in StreamsMetricsImpl. I'm imagining that users would want 
> their custom metrics to sit alongside the predefined metrics with the same 
> tags, and users shouldn't be expected to manage those additional tags 
> themselves.
> So, why are we allowing users to define their own metrics via the 
> StreamsMetrics interface in the first place? Is it that we'd like to be able 
> to provide a built-in latency metric, but the definition depends on the 
> details of the use case so there's no generic solution? That would be 
> sufficient motivation for this special case of addLatencySensor. If we want 
> to continue down that path and give users access to define a wider range of 
> custom metrics, I'd prefer to extend the StreamsMetrics interface so that 
> users can call methods on that object, automatically getting the tags 
> appropriate for that instance rather than interacting with the raw Metrics 
> instance.
> ---
> Guozhang had the following comments:
> 1) For the producer/consumer cases, all internal metrics are provided and 
> abstracted from users, and they just need to read the documentation to poll 
> whatever provided metrics that they are interested; and if they want to 
> define more metrics, they are likely to be outside the clients themselves and 
> they can use whatever methods they like, so Metrics do not need to be exposed 
> to users.
> 2) For streams, things are a bit different: users define the computational 
> logic, which becomes part of the "Streams Client" processing and may be of 
> interests to be monitored by user themselves; think of a customized processor 
> that sends an email to some address based on a condition, and users want to 
> monitor the average rate of emails sent. Hence it is worth considering 
> whether or not they should be able to access the Metrics instance to define 
> their own along side the pre-defined metrics provided by the library.
> 3) Now, since the Metrics class was not previously designed for public usage, 
> it is not very user-friendly for defining sensors, especially around 
> the semantic differences between name / scope / tags. StreamsMetrics tries 
> to hide some of this semantic confusion from users, but it still exposes 
> tags and hence is not perfect in doing so. We need to think of a better 
> approach so that: 1) user defined metrics will be "aligned" (i.e. with the 
> same name prefix within a single application, with similar scope hierarchy 
> definition, etc) with library provided metrics, 2) natural APIs to do so.
> I do not have concrete ideas about 3) above off the top of my head; comments are 
> more than welcome.
> ---
> I'm not sure that I agree that 1) and 2) are truly different situations. A 
> user might choose to send email messages within a bare consumer rather than a 
> streams application, and 
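
To ground the discussion, a rough sketch of a processor registering a custom
sensor through the 0.10.0-era StreamsMetrics interface (the email-alert example
above; class and sensor names are made up):

{noformat}
// Illustrative sketch only, not a settled design.
import org.apache.kafka.common.metrics.Sensor;
import org.apache.kafka.streams.StreamsMetrics;
import org.apache.kafka.streams.processor.Processor;
import org.apache.kafka.streams.processor.ProcessorContext;

public class EmailAlertProcessor implements Processor<String, String> {
    private StreamsMetrics metrics;
    private Sensor emailSensor;

    @Override
    public void init(ProcessorContext context) {
        metrics = context.metrics();
        // StreamsMetrics supplies the instance-level tags for us
        emailSensor = metrics.addLatencySensor("processor", "email-alerts", "send");
    }

    @Override
    public void process(String key, String value) {
        long startNs = System.nanoTime();
        // ... decide whether to send an email ...
        metrics.recordLatency(emailSensor, startNs, System.nanoTime());
    }

    @Override
    public void punctuate(long timestamp) {}

    @Override
    public void close() {}
}
{noformat}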

[jira] [Updated] (KAFKA-3715) Higher granularity streams metrics

2016-05-16 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-3715:
-
Assignee: (was: Guozhang Wang)

> Higher granularity streams metrics 
> ---
>
> Key: KAFKA-3715
> URL: https://issues.apache.org/jira/browse/KAFKA-3715
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Jeff Klukas
>Priority: Minor
>  Labels: api, newbie
> Fix For: 0.10.1.0
>
>
> Originally proposed by [~guozhang] in 
> https://github.com/apache/kafka/pull/1362#issuecomment-218326690
> We can consider adding metrics for process / punctuate / commit rate at the 
> granularity of each processor node in addition to the global rate mentioned 
> above. This is very helpful in debugging.
> We can consider adding rate / total cumulated metrics for context.forward 
> indicating how many records were forwarded downstream from this processor 
> node as well. This is helpful in debugging.
> We can consider adding metrics for each stream partition's timestamp. This is 
> helpful in debugging.
> Besides the latency metrics, we can also add throughput metrics in terms of 
> source records consumed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3715) Higher granularity streams metrics

2016-05-16 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-3715:
-
Labels: api newbie  (was: api)

> Higher granularity streams metrics 
> ---
>
> Key: KAFKA-3715
> URL: https://issues.apache.org/jira/browse/KAFKA-3715
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Jeff Klukas
>Assignee: Guozhang Wang
>Priority: Minor
>  Labels: api, newbie
> Fix For: 0.10.1.0
>
>
> Originally proposed by [~guozhang] in 
> https://github.com/apache/kafka/pull/1362#issuecomment-218326690
> We can consider adding metrics for process / punctuate / commit rate at the 
> granularity of each processor node, in addition to the global rate mentioned 
> above. This is very helpful in debugging.
> We can consider adding rate / cumulative total metrics for context.forward, 
> indicating how many records were forwarded downstream from each processor 
> node. This is helpful in debugging.
> We can consider adding metrics for each stream partition's timestamp. This is 
> helpful in debugging.
> Besides the latency metrics, we can also add throughput metrics in terms of 
> source records consumed.





[jira] [Updated] (KAFKA-3715) Higher granularity streams metrics

2016-05-16 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-3715:
-
Labels: api  (was: )

> Higher granularity streams metrics 
> ---
>
> Key: KAFKA-3715
> URL: https://issues.apache.org/jira/browse/KAFKA-3715
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Jeff Klukas
>Assignee: Guozhang Wang
>Priority: Minor
>  Labels: api
> Fix For: 0.10.1.0
>
>
> Originally proposed by [~guozhang] in 
> https://github.com/apache/kafka/pull/1362#issuecomment-218326690
> We can consider adding metrics for process / punctuate / commit rate at the 
> granularity of each processor node, in addition to the global rate mentioned 
> above. This is very helpful in debugging.
> We can consider adding rate / cumulative total metrics for context.forward, 
> indicating how many records were forwarded downstream from each processor 
> node. This is helpful in debugging.
> We can consider adding metrics for each stream partition's timestamp. This is 
> helpful in debugging.
> Besides the latency metrics, we can also add throughput metrics in terms of 
> source records consumed.





[jira] [Commented] (KAFKA-3716) Check against negative timestamps

2016-05-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15285551#comment-15285551
 ] 

ASF GitHub Bot commented on KAFKA-3716:
---

GitHub user guozhangwang opened a pull request:

https://github.com/apache/kafka/pull/1393

KAFKA-3716: Validate all timestamps are not negative



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/guozhangwang/kafka 
K3716-check-non-negative-timestamps

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1393.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1393


commit 750dfc96d5a73ed1e252b0cd6b1ba1c2a37e5f32
Author: Guozhang Wang 
Date:   2016-05-16T22:26:59Z

validate all timestamps are not negative




> Check against negative timestamps
> -
>
> Key: KAFKA-3716
> URL: https://issues.apache.org/jira/browse/KAFKA-3716
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
>  Labels: architecture, user-experience
>
> Although currently we do not enforce any semantic meaning on the {{Long}} 
> typed timestamps, we are actually assuming them to be non-negative when 
> storing timestamps in a windowed store. For example, in 
> {{RocksDBWindowStore}} we store the timestamp as part of the key and rely 
> on RocksDB's default lexicographic byte array comparator; hence a negative 
> long value stored in RocksDB will cause the range search ordering to be 
> messed up.





[VOTE] 0.10.0.0 RC5

2016-05-16 Thread Gwen Shapira
Hello Kafka users, developers and client-developers,

This is the sixth (!) candidate for release of Apache Kafka 0.10.0.0.
This is a major release that includes: (1) New message format
including timestamps (2) client interceptor API (3) Kafka Streams.
Since this is a major release, we will give people more time to try it
out and give feedback.

Release notes for the 0.10.0.0 release:
http://home.apache.org/~gwenshap/0.10.0.0-rc5/RELEASE_NOTES.html

Special thanks to Liquan Pei and Tom Crayford for testing the
previous release candidate and reporting back issues.

Note that this is the sixth RC. I hope we are done with blockers,
because I'm tired of RCs :)

*** Please download, test and vote by Friday, May 20, 4pm PT

Kafka's KEYS file containing PGP keys we use to sign the release:
http://kafka.apache.org/KEYS

* Release artifacts to be voted upon (source and binary):
http://home.apache.org/~gwenshap/0.10.0.0-rc5/

* Maven artifacts to be voted upon:
https://repository.apache.org/content/groups/staging/org/apache/kafka

* scala-doc
http://home.apache.org/~gwenshap/0.10.0.0-rc5/scaladoc

* java-doc
http://home.apache.org/~gwenshap/0.10.0.0-rc5/javadoc/

* tag to be voted upon (off 0.10.0 branch) is the 0.10.0.0 tag:
https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=f68d6218478960b6cc6a01a18542825cc2fe8b80

* Documentation:
http://kafka.apache.org/0100/documentation.html

* Protocol:
http://kafka.apache.org/0100/protocol.html

/**

Thanks,

Gwen


[GitHub] kafka pull request: KAFKA-3716: Validate all timestamps are not ne...

2016-05-16 Thread guozhangwang
GitHub user guozhangwang opened a pull request:

https://github.com/apache/kafka/pull/1393

KAFKA-3716: Validate all timestamps are not negative



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/guozhangwang/kafka 
K3716-check-non-negative-timestamps

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1393.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1393


commit 750dfc96d5a73ed1e252b0cd6b1ba1c2a37e5f32
Author: Guozhang Wang 
Date:   2016-05-16T22:26:59Z

validate all timestamps are not negative






[jira] [Created] (KAFKA-3716) Check against negative timestamps

2016-05-16 Thread Guozhang Wang (JIRA)
Guozhang Wang created KAFKA-3716:


 Summary: Check against negative timestamps
 Key: KAFKA-3716
 URL: https://issues.apache.org/jira/browse/KAFKA-3716
 Project: Kafka
  Issue Type: Bug
  Components: streams
Reporter: Guozhang Wang
Assignee: Guozhang Wang


Although currently we do not enforce any semantic meaning on the {{Long}} typed 
timestamps, we are actually assuming them to be non-negative when storing 
timestamps in a windowed store. For example, in {{RocksDBWindowStore}} we store 
the timestamp as part of the key and rely on RocksDB's default lexicographic 
byte array comparator; hence a negative long value stored in RocksDB will cause 
the range search ordering to be messed up.
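To make the ordering problem concrete: in big-endian two's-complement encoding,
-1 is 0xFF..FF while 0 is 0x00..00, so an unsigned lexicographic comparator
sorts -1 after every non-negative timestamp. A small standalone illustration of
that comparison (this mirrors what RocksDB's default comparator does; it is not
Kafka code):

    import java.nio.ByteBuffer;

    public class LexicographicOrderSketch {
        // Unsigned, byte-wise comparison, as in RocksDB's default comparator.
        static int compareUnsigned(byte[] a, byte[] b) {
            for (int i = 0; i < Math.min(a.length, b.length); i++) {
                int cmp = (a[i] & 0xFF) - (b[i] & 0xFF);
                if (cmp != 0) return cmp;
            }
            return a.length - b.length;
        }

        static byte[] encode(long ts) {
            return ByteBuffer.allocate(8).putLong(ts).array(); // big-endian
        }

        public static void main(String[] args) {
            // Prints a positive number: -1 sorts AFTER 0, so range scans over
            // keys that embed the timestamp return results in the wrong order.
            System.out.println(compareUnsigned(encode(-1L), encode(0L)));
        }
    }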





[GitHub] kafka pull request: MINOR: Add INFO logging if ZK config is not sp...

2016-05-16 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1392




[jira] [Updated] (KAFKA-3444) Figure out when to bump the version on release-candidate artifacts

2016-05-16 Thread Gwen Shapira (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gwen Shapira updated KAFKA-3444:

Fix Version/s: (was: 0.10.1.0)

> Figure out when to bump the version on release-candidate artifacts
> --
>
> Key: KAFKA-3444
> URL: https://issues.apache.org/jira/browse/KAFKA-3444
> Project: Kafka
>  Issue Type: Bug
>Reporter: Gwen Shapira
>Priority: Blocker
>
> Currently we remove the "-SNAPSHOT" marker immediately upon branching, which 
> means that our release-candidate artifacts are all released to maven repos 
> with the same version. This is apparently challenging for projects that 
> depend on Kafka and want to test with the release artifacts.
> We need to revisit, discuss and maybe improve the process for the next 
> release.





[jira] [Commented] (KAFKA-3444) Figure out when to bump the version on release-candidate artifacts

2016-05-16 Thread Gwen Shapira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15285297#comment-15285297
 ] 

Gwen Shapira commented on KAFKA-3444:
-

Resolved. The build process was updated to include tags without -SNAPSHOT but 
keep the -SNAPSHOT in the branch until after the release.

https://cwiki.apache.org/confluence/display/KAFKA/Release+Process

> Figure out when to bump the version on release-candidate artifacts
> --
>
> Key: KAFKA-3444
> URL: https://issues.apache.org/jira/browse/KAFKA-3444
> Project: Kafka
>  Issue Type: Bug
>Reporter: Gwen Shapira
>Priority: Blocker
>
> Currently we remove the "-SNAPSHOT" marker immediately upon branching, which 
> means that our release-candidate artifacts are all released to maven repos 
> with the same version. This is apparently challenging for projects that 
> depend on Kafka and want to test with the release artifacts.
> We need to revisit, discuss and maybe improve the process for the next 
> release.





[jira] [Resolved] (KAFKA-3444) Figure out when to bump the version on release-candidate artifacts

2016-05-16 Thread Gwen Shapira (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gwen Shapira resolved KAFKA-3444.
-
Resolution: Fixed
  Assignee: Gwen Shapira

> Figure out when to bump the version on release-candidate artifacts
> --
>
> Key: KAFKA-3444
> URL: https://issues.apache.org/jira/browse/KAFKA-3444
> Project: Kafka
>  Issue Type: Bug
>Reporter: Gwen Shapira
>Assignee: Gwen Shapira
>Priority: Blocker
>
> Currently we remove the "-SNAPSHOT" marker immediately upon branching, which 
> means that our release-candidate artifacts are all released to maven repos 
> with the same version. This is apparently challenging for projects that 
> depend on Kafka and want to test with the release artifacts.
> We need to revisit, discuss and maybe improve the process for the next 
> release.





Jenkins build is back to normal : kafka-trunk-jdk7 #1297

2016-05-16 Thread Apache Jenkins Server
See 



[jira] [Commented] (KAFKA-3699) Update protocol page on website to explain how KIP-35 should be used

2016-05-16 Thread Ashish K Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15285135#comment-15285135
 ] 

Ashish K Singh commented on KAFKA-3699:
---

I will have a patch ready hopefully by EOD, apologies for the delay.

> Update protocol page on website to explain how KIP-35 should be used
> 
>
> Key: KAFKA-3699
> URL: https://issues.apache.org/jira/browse/KAFKA-3699
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Ismael Juma
>Assignee: Ashish K Singh
> Fix For: 0.10.0.1
>
>
> The following page should be updated:
> http://kafka.apache.org/protocol.html
> [~edenhill], [~singhashish] any of you would like to tackle this?





Build failed in Jenkins: kafka-0.10.0-jdk7 #95

2016-05-16 Thread Apache Jenkins Server
See 

Changes:

[cshapi] KAFKA-3704; Revert "Remove hard-coded block size in KafkaProducer"

--
[...truncated 39 lines...]
:core:clean
:examples:clean
:log4j-appender:clean
:streams:clean
:tools:clean
:connect:api:clean
:connect:file:clean
:connect:json:clean
:connect:runtime:clean
:streams:examples:clean
:jar_core_2_10
Building project 'core' with Scala version 2.10.6
:kafka-0.10.0-jdk7:clients:compileJavaNote: Some input files use unchecked or 
unsafe operations.
Note: Recompile with -Xlint:unchecked for details.

:kafka-0.10.0-jdk7:clients:processResources UP-TO-DATE
:kafka-0.10.0-jdk7:clients:classes
:kafka-0.10.0-jdk7:clients:determineCommitId UP-TO-DATE
:kafka-0.10.0-jdk7:clients:createVersionFile
:kafka-0.10.0-jdk7:clients:jar
:kafka-0.10.0-jdk7:core:compileJava UP-TO-DATE
:kafka-0.10.0-jdk7:core:compileScala
:79: value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.
  org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP
:36: value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.
  commitTimestamp: Long = org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP,
:37: value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.
  expireTimestamp: Long = org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP) {
:401: value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.
  if (value.expireTimestamp == org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP)
:301: value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.
  if (partitionData.timestamp == OffsetCommitRequest.DEFAULT_TIMESTAMP)
:93: class ProducerConfig in package producer is deprecated: This class has been 
deprecated and will be removed in a future release. Please use 
org.apache.kafka.clients.producer.ProducerConfig instead.
  val producerConfig = new ProducerConfig(props)
:94: method fetchTopicMetadata in object ClientUtils is deprecated: This method 
has been deprecated and will be removed in a future release.
  fetchTopicMetadata(topics, brokers, producerConfig, correlationId)
:396: constructor UpdateMetadataRequest in class UpdateMetadataRequest is 
deprecated: see corresponding Javadoc for more information.
  new UpdateMetadataRequest(controllerId, controllerEpoch, liveBrokers.asJava, partitionStates.asJava)
:191: object ProducerRequestStatsRegistry in package producer is deprecated: This 
object has been deprecated and will be removed in a future release.
  ProducerRequestStatsRegistry.removeProducerRequestStats(clientId)
:129: method readFromReadableChannel in class NetworkReceive is deprecated: see 
corresponding Javadoc for more information.
  response.readFromReadableChannel(channel)
:301: value timestamp in class PartitionData 

[GitHub] kafka pull request: MINOR: Add INFO logging if ZK config is not sp...

2016-05-16 Thread guozhangwang
GitHub user guozhangwang opened a pull request:

https://github.com/apache/kafka/pull/1392

MINOR: Add INFO logging if ZK config is not specified



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/guozhangwang/kafka Kminor-warn-no-zk-config

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1392.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1392


commit 4255b452a80580718d0c865e88d9f3a0f2df246f
Author: Guozhang Wang 
Date:   2016-05-16T19:22:25Z

add info logging if ZK config is not specified






[jira] [Commented] (KAFKA-3699) Update protocol page on website to explain how KIP-35 should be used

2016-05-16 Thread Gwen Shapira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15285120#comment-15285120
 ] 

Gwen Shapira commented on KAFKA-3699:
-

I moved this off the upcoming release. I would have loved having it, but it's 
not a blocker for the release.

> Update protocol page on website to explain how KIP-35 should be used
> 
>
> Key: KAFKA-3699
> URL: https://issues.apache.org/jira/browse/KAFKA-3699
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Ismael Juma
>Assignee: Ashish K Singh
> Fix For: 0.10.0.1
>
>
> The following page should be updated:
> http://kafka.apache.org/protocol.html
> [~edenhill], [~singhashish] any of you would like to tackle this?





Build failed in Jenkins: kafka-trunk-jdk8 #632

2016-05-16 Thread Apache Jenkins Server
See 

Changes:

[cshapi] KAFKA-3704; Revert "Remove hard-coded block size in KafkaProducer"

--
Started by an SCM change
[EnvInject] - Loading node environment variables.
Building remotely on ubuntu3 (Ubuntu ubuntu legacy-ubuntu) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url 
 > https://git-wip-us.apache.org/repos/asf/kafka.git # timeout=10
Fetching upstream changes from https://git-wip-us.apache.org/repos/asf/kafka.git
 > git --version # timeout=10
 > git -c core.askpass=true fetch --tags --progress 
 > https://git-wip-us.apache.org/repos/asf/kafka.git 
 > +refs/heads/*:refs/remotes/origin/*
 > git rev-parse refs/remotes/origin/trunk^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/trunk^{commit} # timeout=10
Checking out Revision 9a3fcf41350ddda9a41db18cde718f892b95177c 
(refs/remotes/origin/trunk)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 9a3fcf41350ddda9a41db18cde718f892b95177c
 > git rev-list 61f4c8b0923baa17bfc6160082fc2c8ca5a2c44d # timeout=10
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
Setting 
JDK1_8_0_66_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk1.8.0_66
[kafka-trunk-jdk8] $ /bin/bash -xe /tmp/hudson925431803888401970.sh
+ 
/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2/bin/gradle
To honour the JVM settings for this build a new JVM will be forked. Please 
consider using the daemon: 
http://gradle.org/docs/2.4-rc-2/userguide/gradle_daemon.html.
Building project 'core' with Scala version 2.10.6
:downloadWrapper

BUILD SUCCESSFUL

Total time: 12.572 secs
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
Setting 
JDK1_8_0_66_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk1.8.0_66
[kafka-trunk-jdk8] $ /bin/bash -xe /tmp/hudson5806033465652933728.sh
+ export GRADLE_OPTS=-Xmx1024m
+ GRADLE_OPTS=-Xmx1024m
+ ./gradlew -Dorg.gradle.project.maxParallelForks=1 clean jarAll testAll
To honour the JVM settings for this build a new JVM will be forked. Please 
consider using the daemon: 
https://docs.gradle.org/2.13/userguide/gradle_daemon.html.
Building project 'core' with Scala version 2.10.6
Build file '/x1/jenkins/jenkins-slave/workspace/kafka-trunk-jdk8/build.gradle': 
line 230
useAnt has been deprecated and is scheduled to be removed in Gradle 3.0. The 
Ant-Based Scala compiler is deprecated, please see 
https://docs.gradle.org/current/userguide/scala_plugin.html.
:clean UP-TO-DATE
:clients:clean
:connect:clean UP-TO-DATE
:core:clean
:examples:clean UP-TO-DATE
:log4j-appender:clean UP-TO-DATE
:streams:clean UP-TO-DATE
:tools:clean UP-TO-DATE
:connect:api:clean UP-TO-DATE
:connect:file:clean UP-TO-DATE
:connect:json:clean UP-TO-DATE
:connect:runtime:clean UP-TO-DATE
:streams:examples:clean UP-TO-DATE
:jar_core_2_10
Building project 'core' with Scala version 2.10.6
:kafka-trunk-jdk8:clients:compileJavawarning: [options] bootstrap class path 
not set in conjunction with -source 1.7
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
1 warning

:kafka-trunk-jdk8:clients:processResources UP-TO-DATE
:kafka-trunk-jdk8:clients:classes
:kafka-trunk-jdk8:clients:determineCommitId UP-TO-DATE
:kafka-trunk-jdk8:clients:createVersionFile
:kafka-trunk-jdk8:clients:jar
:jar_core_2_10 FAILED

FAILURE: Build failed with an exception.

* What went wrong:
org.gradle.api.internal.changedetection.state.FileCollectionSnapshotImpl cannot 
be cast to 
org.gradle.api.internal.changedetection.state.OutputFilesCollectionSnapshotter$OutputFilesSnapshot

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug 
option to get more log output.

BUILD FAILED

Total time: 29.434 secs
Build step 'Execute shell' marked build as failure
Recording test results
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
Setting 
JDK1_8_0_66_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk1.8.0_66
ERROR: Step 'Publish JUnit test result report' failed: No test report files 
were found. Configuration error?
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
Setting 
JDK1_8_0_66_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk1.8.0_66


[jira] [Updated] (KAFKA-3699) Update protocol page on website to explain how KIP-35 should be used

2016-05-16 Thread Gwen Shapira (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gwen Shapira updated KAFKA-3699:

Fix Version/s: (was: 0.10.0.0)
   0.10.0.1

> Update protocol page on website to explain how KIP-35 should be used
> 
>
> Key: KAFKA-3699
> URL: https://issues.apache.org/jira/browse/KAFKA-3699
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Ismael Juma
>Assignee: Ashish K Singh
> Fix For: 0.10.0.1
>
>
> The following page should be updated:
> http://kafka.apache.org/protocol.html
> [~edenhill], [~singhashish] any of you would like to tackle this?





[jira] [Resolved] (KAFKA-3633) Kafka Consumer API breaking backward compatibility

2016-05-16 Thread Gwen Shapira (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gwen Shapira resolved KAFKA-3633.
-
Resolution: Won't Fix

It looks like this won't get fixed (per community discussion here: 
https://s.apache.org/pKdO).


> Kafka Consumer API breaking backward compatibility
> --
>
> Key: KAFKA-3633
> URL: https://issues.apache.org/jira/browse/KAFKA-3633
> Project: Kafka
>  Issue Type: Bug
>Reporter: Sriharsha Chintalapani
>Assignee: Sriharsha Chintalapani
>Priority: Blocker
> Fix For: 0.10.0.0
>
>
> KAFKA-2991 and KAFKA-3006 broke backward compatibility. In Storm we are 
> already using the 0.9.0.1 consumer API for the KafkaSpout. We should at least 
> have kept the older methods and shouldn't be breaking backward compatibility.





[jira] [Commented] (KAFKA-3704) Improve mechanism for compression stream block size selection in KafkaProducer

2016-05-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15285091#comment-15285091
 ] 

ASF GitHub Bot commented on KAFKA-3704:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1391


> Improve mechanism for compression stream block size selection in KafkaProducer
> --
>
> Key: KAFKA-3704
> URL: https://issues.apache.org/jira/browse/KAFKA-3704
> Project: Kafka
>  Issue Type: Bug
>Reporter: Guozhang Wang
>Assignee: Ismael Juma
> Fix For: 0.10.1.0, 0.10.0.0
>
>
> As discovered in https://issues.apache.org/jira/browse/KAFKA-3565, the 
> current default block size (1K) used in Snappy and GZIP may cause a 
> sub-optimal compression ratio for Snappy, and hence reduce throughput. 
> Because we no longer recompress data in the broker, it also impacts what gets 
> stored on disk.
> A solution might be to use the default block size, which is 64K in LZ4, 32K 
> in Snappy and 0.5K in GZIP. The downside is that this solution will require 
> more memory allocated outside of the buffer pool and hence users may need to 
> bump up their JVM heap size, especially for MirrorMakers. Using Snappy as an 
> example, it's an additional 2x32k per batch (as Snappy uses two buffers) and 
> one would expect at least one batch per partition. However, the number of 
> batches per partition can be much higher if the broker is slow to acknowledge 
> producer requests (depending on `buffer.memory`, `batch.size`, message size, 
> etc.).
> Given the above, it seems like a configuration may be needed, as there is 
> no one-size-fits-all. An alternative to a new config is to allocate buffers 
> from the buffer pool and pass them to the compression library. This is 
> possible with Snappy and we could adapt our LZ4 code. It's not possible with 
> GZIP, but it uses a very small buffer by default.
> Note that we decided that this change was too risky for 0.10.0.0 and reverted 
> the original attempt.
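To put rough numbers on the Snappy case above: two 32 KiB buffers per batch 
and, say, 1,000 partitions with one in-flight batch each works out to 
2 x 32 KiB x 1,000 = 64,000 KiB, i.e. about 62.5 MiB of extra memory outside 
the buffer pool. The 1,000-partition figure is only an illustration, and the 
total grows proportionally if a slow broker lets several batches queue per 
partition.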





[GitHub] kafka pull request: KAFKA-3704; Revert "Remove hard-coded block si...

2016-05-16 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1391




[jira] [Resolved] (KAFKA-3704) Improve mechanism for compression stream block size selection in KafkaProducer

2016-05-16 Thread Gwen Shapira (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gwen Shapira resolved KAFKA-3704.
-
   Resolution: Fixed
Fix Version/s: 0.10.0.0

Issue resolved by pull request 1391
[https://github.com/apache/kafka/pull/1391]

> Improve mechanism for compression stream block size selection in KafkaProducer
> --
>
> Key: KAFKA-3704
> URL: https://issues.apache.org/jira/browse/KAFKA-3704
> Project: Kafka
>  Issue Type: Bug
>Reporter: Guozhang Wang
>Assignee: Ismael Juma
> Fix For: 0.10.1.0, 0.10.0.0
>
>
> As discovered in https://issues.apache.org/jira/browse/KAFKA-3565, the 
> current default block size (1K) used in Snappy and GZIP may cause a 
> sub-optimal compression ratio for Snappy, and hence reduce throughput. 
> Because we no longer recompress data in the broker, it also impacts what gets 
> stored on disk.
> A solution might be to use the default block size, which is 64K in LZ4, 32K 
> in Snappy and 0.5K in GZIP. The downside is that this solution will require 
> more memory allocated outside of the buffer pool and hence users may need to 
> bump up their JVM heap size, especially for MirrorMakers. Using Snappy as an 
> example, it's an additional 2x32k per batch (as Snappy uses two buffers) and 
> one would expect at least one batch per partition. However, the number of 
> batches per partition can be much higher if the broker is slow to acknowledge 
> producer requests (depending on `buffer.memory`, `batch.size`, message size, 
> etc.).
> Given the above, it seems like a configuration may be needed, as there is 
> no one-size-fits-all. An alternative to a new config is to allocate buffers 
> from the buffer pool and pass them to the compression library. This is 
> possible with Snappy and we could adapt our LZ4 code. It's not possible with 
> GZIP, but it uses a very small buffer by default.
> Note that we decided that this change was too risky for 0.10.0.0 and reverted 
> the original attempt.





[GitHub] kafka pull request: MINOR: Inline `Record.write` into `Compressor....

2016-05-16 Thread ijuma
Github user ijuma closed the pull request at:

https://github.com/apache/kafka/pull/1388




Jenkins build is back to normal : kafka-0.10.0-jdk7 #94

2016-05-16 Thread Apache Jenkins Server
See 



Build failed in Jenkins: kafka-trunk-jdk7 #1296

2016-05-16 Thread Apache Jenkins Server
See 

Changes:

[ismael] KAFKA-3698; Update the message format section.

--
[...truncated 1646 lines...]

kafka.log.LogTest > testAsyncDelete PASSED

kafka.log.LogTest > testReadOutOfRange PASSED

kafka.log.LogTest > testAppendWithOutOfOrderOffsetsThrowsException PASSED

kafka.log.LogTest > testReadAtLogGap PASSED

kafka.log.LogTest > testTimeBasedLogRoll PASSED

kafka.log.LogTest > testLoadEmptyLog PASSED

kafka.log.LogTest > testMessageSetSizeCheck PASSED

kafka.log.LogTest > testIndexResizingAtTruncation PASSED

kafka.log.LogTest > testCompactedTopicConstraints PASSED

kafka.log.LogTest > testThatGarbageCollectingSegmentsDoesntChangeOffset PASSED

kafka.log.LogTest > testAppendAndReadWithSequentialOffsets PASSED

kafka.log.LogTest > testDeleteOldSegmentsMethod PASSED

kafka.log.LogTest > testParseTopicPartitionNameForNull PASSED

kafka.log.LogTest > testAppendAndReadWithNonSequentialOffsets PASSED

kafka.log.LogTest > testParseTopicPartitionNameForMissingSeparator PASSED

kafka.log.LogTest > testCorruptIndexRebuild PASSED

kafka.log.LogTest > testBogusIndexSegmentsAreRemoved PASSED

kafka.log.LogTest > testCompressedMessages PASSED

kafka.log.LogTest > testAppendMessageWithNullPayload PASSED

kafka.log.LogTest > testCorruptLog PASSED

kafka.log.LogTest > testLogRecoversToCorrectOffset PASSED

kafka.log.LogTest > testReopenThenTruncate PASSED

kafka.log.LogTest > testParseTopicPartitionNameForMissingPartition PASSED

kafka.log.LogTest > testParseTopicPartitionNameForEmptyName PASSED

kafka.log.LogTest > testOpenDeletesObsoleteFiles PASSED

kafka.log.LogTest > testSizeBasedLogRoll PASSED

kafka.log.LogTest > testTimeBasedLogRollJitter PASSED

kafka.log.LogTest > testParseTopicPartitionName PASSED

kafka.log.LogTest > testTruncateTo PASSED

kafka.log.LogTest > testCleanShutdownFile PASSED

kafka.log.LogCleanerIntegrationTest > cleanerTest[0] PASSED

kafka.log.LogCleanerIntegrationTest > cleanerTest[1] PASSED

kafka.log.LogCleanerIntegrationTest > cleanerTest[2] PASSED

kafka.log.LogCleanerIntegrationTest > cleanerTest[3] PASSED

kafka.log.CleanerTest > testBuildOffsetMap PASSED

kafka.log.CleanerTest > testBuildOffsetMapFakeLarge PASSED

kafka.log.CleanerTest > testSegmentGrouping PASSED

kafka.log.CleanerTest > testCleanSegmentsWithAbort PASSED

kafka.log.CleanerTest > testSegmentGroupingWithSparseOffsets PASSED

kafka.log.CleanerTest > testRecoveryAfterCrash PASSED

kafka.log.CleanerTest > testLogToClean PASSED

kafka.log.CleanerTest > testCleaningWithDeletes PASSED

kafka.log.CleanerTest > testCleanSegments PASSED

kafka.log.CleanerTest > testCleaningWithUnkeyedMessages PASSED

kafka.controller.ControllerFailoverTest > testMetadataUpdate PASSED

kafka.javaapi.consumer.ZookeeperConsumerConnectorTest > testBasic PASSED

kafka.javaapi.message.ByteBufferMessageSetTest > testSizeInBytesWithCompression 
PASSED

kafka.javaapi.message.ByteBufferMessageSetTest > 
testIteratorIsConsistentWithCompression PASSED

kafka.javaapi.message.ByteBufferMessageSetTest > testIteratorIsConsistent PASSED

kafka.javaapi.message.ByteBufferMessageSetTest > testEqualsWithCompression 
PASSED

kafka.javaapi.message.ByteBufferMessageSetTest > testWrittenEqualsRead PASSED

kafka.javaapi.message.ByteBufferMessageSetTest > testEquals PASSED

kafka.javaapi.message.ByteBufferMessageSetTest > testSizeInBytes PASSED

kafka.network.SocketServerTest > testClientDisconnectionUpdatesRequestMetrics 
PASSED

kafka.network.SocketServerTest > testMaxConnectionsPerIp PASSED

kafka.network.SocketServerTest > 
testBrokerSendAfterChannelClosedUpdatesRequestMetrics PASSED

kafka.network.SocketServerTest > simpleRequest PASSED

kafka.network.SocketServerTest > testSessionPrincipal PASSED

kafka.network.SocketServerTest > testMaxConnectionsPerIpOverrides PASSED

kafka.network.SocketServerTest > testSocketsCloseOnShutdown PASSED

kafka.network.SocketServerTest > testSslSocketServer PASSED

kafka.network.SocketServerTest > tooBigRequestIsRejected PASSED

kafka.integration.SaslSslTopicMetadataTest > 
testIsrAfterBrokerShutDownAndJoinsBack PASSED

kafka.integration.SaslSslTopicMetadataTest > testAutoCreateTopicWithCollision 
PASSED

kafka.integration.SaslSslTopicMetadataTest > testAliveBrokerListWithNoTopics 
PASSED

kafka.integration.SaslSslTopicMetadataTest > testAutoCreateTopic PASSED

kafka.integration.SaslSslTopicMetadataTest > testGetAllTopicMetadata PASSED

kafka.integration.SaslSslTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterNewBrokerStartup PASSED

kafka.integration.SaslSslTopicMetadataTest > testBasicTopicMetadata PASSED

kafka.integration.SaslSslTopicMetadataTest > 
testAliveBrokersListWithNoTopicsAfterABrokerShutdown PASSED

kafka.integration.PrimitiveApiTest > testMultiProduce PASSED

kafka.integration.PrimitiveApiTest > testDefaultEncoderProducerAndFetch PASSED

kafka.integration.PrimitiveApiTest > 

Build failed in Jenkins: kafka-trunk-jdk8 #631

2016-05-16 Thread Apache Jenkins Server
See 

Changes:

[ismael] KAFKA-3698; Update the message format section.

--
Started by an SCM change
[EnvInject] - Loading node environment variables.
Building remotely on ubuntu3 (Ubuntu ubuntu legacy-ubuntu) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url 
 > https://git-wip-us.apache.org/repos/asf/kafka.git # timeout=10
Fetching upstream changes from https://git-wip-us.apache.org/repos/asf/kafka.git
 > git --version # timeout=10
 > git -c core.askpass=true fetch --tags --progress 
 > https://git-wip-us.apache.org/repos/asf/kafka.git 
 > +refs/heads/*:refs/remotes/origin/*
 > git rev-parse refs/remotes/origin/trunk^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/trunk^{commit} # timeout=10
Checking out Revision 61f4c8b0923baa17bfc6160082fc2c8ca5a2c44d 
(refs/remotes/origin/trunk)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 61f4c8b0923baa17bfc6160082fc2c8ca5a2c44d
 > git rev-list 6a5d3e65c6ba9faa0f6cf4ec2871a83beb28c6c4 # timeout=10
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
Setting 
JDK1_8_0_66_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk1.8.0_66
[kafka-trunk-jdk8] $ /bin/bash -xe /tmp/hudson5186873600899710976.sh
+ 
/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2/bin/gradle
To honour the JVM settings for this build a new JVM will be forked. Please 
consider using the daemon: 
http://gradle.org/docs/2.4-rc-2/userguide/gradle_daemon.html.
Building project 'core' with Scala version 2.10.6
:downloadWrapper

BUILD SUCCESSFUL

Total time: 17.047 secs
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
Setting 
JDK1_8_0_66_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk1.8.0_66
[kafka-trunk-jdk8] $ /bin/bash -xe /tmp/hudson6435732189702113944.sh
+ export GRADLE_OPTS=-Xmx1024m
+ GRADLE_OPTS=-Xmx1024m
+ ./gradlew -Dorg.gradle.project.maxParallelForks=1 clean jarAll testAll
To honour the JVM settings for this build a new JVM will be forked. Please 
consider using the daemon: 
https://docs.gradle.org/2.13/userguide/gradle_daemon.html.
Building project 'core' with Scala version 2.10.6
Build file '/x1/jenkins/jenkins-slave/workspace/kafka-trunk-jdk8/build.gradle': 
line 230
useAnt has been deprecated and is scheduled to be removed in Gradle 3.0. The 
Ant-Based Scala compiler is deprecated, please see 
https://docs.gradle.org/current/userguide/scala_plugin.html.
:clean UP-TO-DATE
:clients:clean
:connect:clean UP-TO-DATE
:core:clean
:examples:clean UP-TO-DATE
:log4j-appender:clean UP-TO-DATE
:streams:clean UP-TO-DATE
:tools:clean UP-TO-DATE
:connect:api:clean UP-TO-DATE
:connect:file:clean UP-TO-DATE
:connect:json:clean UP-TO-DATE
:connect:runtime:clean UP-TO-DATE
:streams:examples:clean UP-TO-DATE
:jar_core_2_10
Building project 'core' with Scala version 2.10.6
:kafka-trunk-jdk8:clients:compileJavawarning: [options] bootstrap class path 
not set in conjunction with -source 1.7
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
1 warning

:kafka-trunk-jdk8:clients:processResources UP-TO-DATE
:kafka-trunk-jdk8:clients:classes
:kafka-trunk-jdk8:clients:determineCommitId UP-TO-DATE
:kafka-trunk-jdk8:clients:createVersionFile
:kafka-trunk-jdk8:clients:jar
:jar_core_2_10 FAILED

FAILURE: Build failed with an exception.

* What went wrong:
org.gradle.api.internal.changedetection.state.FileCollectionSnapshotImpl cannot 
be cast to 
org.gradle.api.internal.changedetection.state.OutputFilesCollectionSnapshotter$OutputFilesSnapshot

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug 
option to get more log output.

BUILD FAILED

Total time: 16.247 secs
Build step 'Execute shell' marked build as failure
Recording test results
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
Setting 
JDK1_8_0_66_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk1.8.0_66
ERROR: Step 'Publish JUnit test result report' failed: No test report files 
were found. Configuration error?
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
Setting 
JDK1_8_0_66_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk1.8.0_66


[GitHub] kafka pull request: KAFKA-3701: Expose KafkaStreams Metrics in pub...

2016-05-16 Thread jklukas
Github user jklukas closed the pull request at:

https://github.com/apache/kafka/pull/1362




[jira] [Updated] (KAFKA-3698) Update website documentation when it comes to the message format

2016-05-16 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-3698:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Issue resolved by pull request 1375
[https://github.com/apache/kafka/pull/1375]

> Update website documentation when it comes to the message format
> 
>
> Key: KAFKA-3698
> URL: https://issues.apache.org/jira/browse/KAFKA-3698
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Ismael Juma
>Assignee: Jiangjie Qin
> Fix For: 0.10.0.0
>
>
> Sections 5.3, 5.4 and 5.5 talk about the message format:
> http://kafka.apache.org/documentation.html#messages
> We should make sure they are up to date given recent changes.
> [~becket_qin], do you think you could do this?





[GitHub] kafka pull request: KAFKA-3704; Revert "Remove hard-coded block si...

2016-05-16 Thread ijuma
GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/1391

KAFKA-3704; Revert "Remove hard-coded block size in KafkaProducer"

This is not an exact revert as the code changed a bit since the
original commit. We also include a note in `upgrade.html`.

The original commit is 1182d61deb23b5cd86cbe462471f7df583a796e1.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka kafka-3704-revert

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1391.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1391


commit 1673cd097f7986e903d422ab7dc99d81d333ca18
Author: Ismael Juma 
Date:   2016-05-16T15:49:37Z

Revert "KAFKA-3704: Remove hard-coded block size in KafkaProducer"

This is not an exact revert as the code changed a bit since the
original commit. We also include a note in `upgrade.html`.

The original commit is 1182d61deb23b5cd86cbe462471f7df583a796e1.






[GitHub] kafka pull request: KAFKA-3698: Update the message format section.

2016-05-16 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1375




Jenkins build is back to normal : kafka-trunk-jdk8 #630

2016-05-16 Thread Apache Jenkins Server
See 



[jira] [Updated] (KAFKA-3704) Improve mechanism for compression stream block size selection in KafkaProducer

2016-05-16 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-3704:
---
Summary: Improve mechanism for compression stream block size selection in 
KafkaProducer  (was: Use default block size in KafkaProducer)

> Improve mechanism for compression stream block size selection in KafkaProducer
> --
>
> Key: KAFKA-3704
> URL: https://issues.apache.org/jira/browse/KAFKA-3704
> Project: Kafka
>  Issue Type: Bug
>Reporter: Guozhang Wang
>Assignee: Ismael Juma
> Fix For: 0.10.1.0
>
>
> As discovered in https://issues.apache.org/jira/browse/KAFKA-3565, the 
> current default block size (1K) used in Snappy and GZIP may cause a 
> sub-optimal compression ratio for Snappy, and hence reduce throughput. 
> Because we no longer recompress data in the broker, it also impacts what gets 
> stored on disk.
> A solution might be to use the default block size, which is 64K in LZ4, 32K 
> in Snappy and 0.5K in GZIP. The downside is that this solution will require 
> more memory allocated outside of the buffer pool and hence users may need to 
> bump up their JVM heap size, especially for MirrorMakers. Using Snappy as an 
> example, it's an additional 2x32k per batch (as Snappy uses two buffers) and 
> one would expect at least one batch per partition. However, the number of 
> batches per partition can be much higher if the broker is slow to acknowledge 
> producer requests (depending on `buffer.memory`, `batch.size`, message size, 
> etc.).
> Given the above, it seems like a configuration may be needed, as there is 
> no one-size-fits-all. An alternative to a new config is to allocate buffers 
> from the buffer pool and pass them to the compression library. This is 
> possible with Snappy and we could adapt our LZ4 code. It's not possible with 
> GZIP, but it uses a very small buffer by default.
> Note that we decided that this change was too risky for 0.10.0.0 and reverted 
> the original attempt.





[jira] [Reopened] (KAFKA-3704) Use default block size in KafkaProducer

2016-05-16 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma reopened KAFKA-3704:

  Assignee: Ismael Juma  (was: Guozhang Wang)

> Use default block size in KafkaProducer
> ---
>
> Key: KAFKA-3704
> URL: https://issues.apache.org/jira/browse/KAFKA-3704
> Project: Kafka
>  Issue Type: Bug
>Reporter: Guozhang Wang
>Assignee: Ismael Juma
> Fix For: 0.10.1.0
>
>
> As discovered in https://issues.apache.org/jira/browse/KAFKA-3565, the 
> current default block size (1K) used in Snappy and GZIP may cause a 
> sub-optimal compression ratio for Snappy, and hence reduce throughput. 
> Because we no longer recompress data in the broker, it also impacts what gets 
> stored on disk.
> A solution might be to use the default block size, which is 64K in LZ4, 32K 
> in Snappy and 0.5K in GZIP. The downside is that this solution will require 
> more memory allocated outside of the buffer pool and hence users may need to 
> bump up their JVM heap size, especially for MirrorMakers. Using Snappy as an 
> example, it's an additional 2x32k per batch (as Snappy uses two buffers) and 
> one would expect at least one batch per partition. However, the number of 
> batches per partition can be much higher if the broker is slow to acknowledge 
> producer requests (depending on `buffer.memory`, `batch.size`, message size, 
> etc.).
> Given the above, it seems like a configuration may be needed, as there is 
> no one-size-fits-all. An alternative to a new config is to allocate buffers 
> from the buffer pool and pass them to the compression library. This is 
> possible with Snappy and we could adapt our LZ4 code. It's not possible with 
> GZIP, but it uses a very small buffer by default.
> Note that we decided that this change was too risky for 0.10.0.0 and reverted 
> the original attempt.





[jira] [Updated] (KAFKA-3704) Use default block size in KafkaProducer

2016-05-16 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-3704:
---
Fix Version/s: (was: 0.10.0.0)
   0.10.1.0

> Use default block size in KafkaProducer
> ---
>
> Key: KAFKA-3704
> URL: https://issues.apache.org/jira/browse/KAFKA-3704
> Project: Kafka
>  Issue Type: Bug
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
> Fix For: 0.10.1.0
>
>
> As discovered in https://issues.apache.org/jira/browse/KAFKA-3565, the 
> current default block size (1K) used in Snappy and GZIP may cause a 
> sub-optimal compression ratio for Snappy, and hence reduce throughput. 
> Because we no longer recompress data in the broker, it also impacts what gets 
> stored on disk.
> A solution might be to use the default block size, which is 64K in LZ4, 32K 
> in Snappy and 0.5K in GZIP. The downside is that this solution will require 
> more memory allocated outside of the buffer pool and hence users may need to 
> bump up their JVM heap size, especially for MirrorMakers. Using Snappy as an 
> example, it's an additional 2x32k per batch (as Snappy uses two buffers) and 
> one would expect at least one batch per partition. However, the number of 
> batches per partition can be much higher if the broker is slow to acknowledge 
> producer requests (depending on `buffer.memory`, `batch.size`, message size, 
> etc.).
> Given the above, it seems like a configuration may be needed, as there is 
> no one-size-fits-all. An alternative to a new config is to allocate buffers 
> from the buffer pool and pass them to the compression library. This is 
> possible with Snappy and we could adapt our LZ4 code. It's not possible with 
> GZIP, but it uses a very small buffer by default.
> Note that we decided that this change was too risky for 0.10.0.0 and reverted 
> the original attempt.





[jira] [Updated] (KAFKA-3704) Use default block size in KafkaProducer

2016-05-16 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-3704:
---
Description: 
As discovered in https://issues.apache.org/jira/browse/KAFKA-3565, the current 
default block size (1K) used in Snappy and GZIP may cause a sub-optimal 
compression ratio for Snappy, and hence reduce throughput. Because we no longer 
recompress data in the broker, it also impacts what gets stored on disk.

A solution might be to use the default block size, which is 64K in LZ4, 32K in 
Snappy and 0.5K in GZIP. The downside is that this solution will require more 
memory allocated outside of the buffer pool and hence users may need to bump up 
their JVM heap size, especially for MirrorMakers. Using Snappy as an example, 
it's an additional 2x32k per batch (as Snappy uses two buffers) and one would 
expect at least one batch per partition. However, the number of batches per 
partition can be much higher if the broker is slow to acknowledge producer 
requests (depending on `buffer.memory`, `batch.size`, message size, etc.).

Given the above, it seems like a configuration may be needed, as there is no 
one-size-fits-all. An alternative to a new config is to allocate buffers from 
the buffer pool and pass them to the compression library. This is possible with 
Snappy and we could adapt our LZ4 code. It's not possible with GZIP, but it 
uses a very small buffer by default.

Note that we decided that this change was too risky for 0.10.0.0 and reverted 
the original attempt.

  was:
As discovered in https://issues.apache.org/jira/browse/KAFKA-3565, the current 
default block size (1K) used in Snappy and GZIP may cause sub-optimal 
compression ratio for Snappy, and hence reduce throughput.

A better solution would be using the default block size, which is 32K in Snappy 
and 0.5K in GZIP. A notable side-effect is that with Snappy, this solution will 
require extra memory allocated outside of the buffer pool, by {{(32 - 1)K * 
num.total.partitions}}, and hence users may need to bump up their JVM heap 
size, especially for MirrorMakers.


> Use default block size in KafkaProducer
> ---
>
> Key: KAFKA-3704
> URL: https://issues.apache.org/jira/browse/KAFKA-3704
> Project: Kafka
>  Issue Type: Bug
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
> Fix For: 0.10.1.0
>
>
> As discovered in https://issues.apache.org/jira/browse/KAFKA-3565, the 
> current default block size (1K) used in Snappy and GZIP may cause a 
> sub-optimal compression ratio for Snappy, and hence reduce throughput. 
> Because we no longer recompress data in the broker, it also impacts what gets 
> stored on disk.
> A solution might be to use the default block size, which is 64K in LZ4, 32K 
> in Snappy and 0.5K in GZIP. The downside is that this solution will require 
> more memory allocated outside of the buffer pool and hence users may need to 
> bump up their JVM heap size, especially for MirrorMakers. Using Snappy as an 
> example, it's an additional 2x32k per batch (as Snappy uses two buffers) and 
> one would expect at least one batch per partition. However, the number of 
> batches per partition can be much higher if the broker is slow to acknowledge 
> producer requests (depending on `buffer.memory`, `batch.size`, message size, 
> etc.).
> Given the above, it seems like a configuration may be needed, as there is 
> no one-size-fits-all. An alternative to a new config is to allocate buffers 
> from the buffer pool and pass them to the compression library. This is 
> possible with Snappy and we could adapt our LZ4 code. It's not possible with 
> GZIP, but it uses a very small buffer by default.
> Note that we decided that this change was too risky for 0.10.0.0 and reverted 
> the original attempt.





Re: reading the consumer offsets topic

2016-05-16 Thread Cliff Rhyne
Hi Tao,

Sorry for the delay.  Thanks for pointing out that property.  That was the
fix.

On Mon, May 9, 2016 at 6:00 PM, tao xiao  wrote:

> You need to enable internal topic in the consumer.properties
>
> exclude.internal.topics=false
>
> On Mon, 9 May 2016 at 12:42 Cliff Rhyne  wrote:
>
> > Thanks Todd and Tao.  I've tried those tricks but no luck.
> >
> > Just to add more info, this is the __consumer_offsets specific config
> that
> > is shown by the topic describe command:
> >
> >
> >
> Configs:segment.bytes=104857600,cleanup.policy=compact,compression.type=uncompressed
> >
> > On Mon, May 9, 2016 at 1:16 PM, tao xiao  wrote:
> >
> > > You can try this
> > >
> > > bin/kafka-console-consumer.sh --consumer.config
> > > config/consumer.properties --from-beginning
> > > --topic __consumer_offsets --zookeeper localhost:2181 --formatter
> > > "kafka.coordinator.GroupMetadataManager\$OffsetsMessageFormatter"
> > >
> > > On Mon, 9 May 2016 at 09:40 Todd Palino  wrote:
> > >
> > > > The GroupMetadataManager one should be working for you with 0.9. I
> > don’t
> > > > have a 0.9 KCC set up at the moment, so I’m using an 0.8 version
> where
> > > it’s
> > > > different (it’s the other class for me). The only thing I can offer
> now
> > > is
> > > > did you put quotes around the arg to --formatter so you don’t get
> weird
> > > > shell interference?
> > > >
> > > > -Todd
> > > >
> > > >
> > > > On Mon, May 9, 2016 at 8:18 AM, Cliff Rhyne 
> wrote:
> > > >
> > > > > Thanks, Todd.  It's still not working unfortunately.
> > > > >
> > > > > This results in nothing getting printed to the console and requires
> > > kill
> > > > -9
> > > > > in another window to stop (ctrl-c doesn't work):
> > > > >
> > > > > /kafka-console-consumer.sh --bootstrap-server localhost:9092
> > > --zookeeper
> > > > >  --topic __consumer_offsets --formatter
> > > > > kafka.coordinator.GroupMetadataManager\$OffsetsMessageFormatter
> > > > >
> > > > > This results in a stack trace because it can't find the class:
> > > > >
> > > > > ./kafka-console-consumer.sh --bootstrap-server localhost:9092
> > > --zookeeper
> > > > >  --topic __consumer_offsets --formatter
> > > > > kafka.server.OffsetManager\$OffsetsMessageFormatter
> > > > >
> > > > > Exception in thread "main" java.lang.ClassNotFoundException:
> > > > > kafka.server.OffsetManager$OffsetsMessageFormatter
> > > > >
> > > > >
> > > > > I'm on 0.9.0.1. "broker-list" is invalid and zookeeper is required
> > > > > regardless of the bootstrap-server parameter.
> > > > >
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Cliff
> > > > >
> > > > > On Sun, May 8, 2016 at 7:35 PM, Todd Palino 
> > wrote:
> > > > >
> > > > > > It looks like you’re just missing the proper message formatter.
> Of
> > > > > course,
> > > > > > that largely depends on your version of the broker. Try:
> > > > > >
> > > > > > ./kafka-console-consumer.sh --broker-list localhost:9092 --topic
> > > > > > __consumer_offsets
> > > > > > --formatter
> > > > > kafka.coordinator.GroupMetadataManager\$OffsetsMessageFormatter
> > > > > >
> > > > > >
> > > > > > If for some reason that doesn’t work, you can try
> > > > > > "kafka.server.OffsetManager\$OffsetsMessageFormatter” instead.
> > > > > >
> > > > > > -Todd
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Sun, May 8, 2016 at 1:26 PM, Cliff Rhyne 
> > > wrote:
> > > > > >
> > > > > > > I'm having difficulty reading the consumer offsets topic from
> the
> > > > > command
> > > > > > > line.  I try the following but it doesn't seem to work (along
> > with
> > > a
> > > > > few
> > > > > > > related variants including specifying the zookeeper hosts):
> > > > > > >
> > > > > > > ./kafka-console-consumer.sh --broker-list localhost:9092
> --topic
> > > > > > > __consumer_offsets
> > > > > > >
> > > > > > > Is there a special way to read the consumer offsets topic?
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Cliff
> > > > > > >
> > > > > > > --
> > > > > > > Cliff Rhyne
> > > > > > > Software Engineering Manager
> > > > > > > e: crh...@signal.co
> > > > > > > signal.co
> > > > > > > 
> > > > > > >
> > > > > > > Cut Through the Noise
> > > > > > >
> > > > > > > This e-mail and any files transmitted with it are for the sole
> > use
> > > of
> > > > > the
> > > > > > > intended recipient(s) and may contain confidential and
> privileged
> > > > > > > information. Any unauthorized use of this email is strictly
> > > > prohibited.
> > > > > > > ©2016 Signal. All rights reserved.
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > *—-*
> > > > > > *Todd Palino*
> > > > > > Staff Site Reliability Engineer
> > > > > > Data Infrastructure Streaming
> > > > > >
> > > > > >
> > > > > >
> > > > > > linkedin.com/in/toddpalino
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Cliff 
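
For anyone trying this later: putting the pieces of this thread together, a
working invocation on 0.9.0.1 should look roughly like the sketch below (the
ZooKeeper address and the temporary properties file are illustrative). The
consumer.config file matters because the old consumer skips internal topics
unless exclude.internal.topics is set to false, and the quotes around the
formatter class avoid the shell interference Todd mentions:

    echo "exclude.internal.topics=false" > /tmp/consumer.properties
    ./kafka-console-consumer.sh --zookeeper localhost:2181 \
      --topic __consumer_offsets --from-beginning \
      --consumer.config /tmp/consumer.properties \
      --formatter "kafka.coordinator.GroupMetadataManager\$OffsetsMessageFormatter"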

[jira] [Updated] (KAFKA-3714) Allow users greater access to register custom streams metrics

2016-05-16 Thread Jeff Klukas (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Klukas updated KAFKA-3714:
---
Issue Type: Improvement  (was: Bug)

> Allow users greater access to register custom streams metrics
> -
>
> Key: KAFKA-3714
> URL: https://issues.apache.org/jira/browse/KAFKA-3714
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Jeff Klukas
>Assignee: Guozhang Wang
>Priority: Minor
> Fix For: 0.10.1.0
>
>
> Copying in some discussion that originally appeared in 
> https://github.com/apache/kafka/pull/1362#issuecomment-219064302
> Kafka Streams is largely a higher-level abstraction on top of producers and 
> consumers, and it seems sensible to match the KafkaStreams interface to that 
> of KafkaProducer and KafkaConsumer where possible. For producers and 
> consumers, the metric registry is internal and metrics are only exposed as an 
> unmodifiable map. This allows users to access client metric values for use in 
> application health checks, etc., but doesn't allow them to register new 
> metrics.
> That approach seems reasonable if we assume that a user interested in 
> defining custom metrics is already going to be using a separate metrics 
> library. In such a case, users will likely find it easier to define metrics 
> using whatever library they're familiar with rather than learning the API for 
> Kafka's Metrics class. Is this a reasonable assumption?
> If we want to expose the Metrics instance so that users can define arbitrary 
> metrics, I'd argue that there's a need for documentation updates. In 
> particular, I find the notion of metric tags confusing. Tags can be defined 
> in a MetricConfig when the Metrics instance is constructed, 
> StreamsMetricsImpl is maintaining its own set of tags, and users can set tag 
> overrides.
> If a user were to get access to the Metrics instance, they would be missing 
> the tags defined in StreamsMetricsImpl. I'm imagining that users would want 
> their custom metrics to sit alongside the predefined metrics with the same 
> tags, and users shouldn't be expected to manage those additional tags 
> themselves.
> So, why are we allowing users to define their own metrics via the 
> StreamsMetrics interface in the first place? Is it that we'd like to be able 
> to provide a built-in latency metric, but the definition depends on the 
> details of the use case so there's no generic solution? That would be 
> sufficient motivation for this special case of addLatencySensor. If we want 
> to continue down that path and give users access to define a wider range of 
> custom metrics, I'd prefer to extend the StreamsMetrics interface so that 
> users can call methods on that object, automatically getting the tags 
> appropriate for that instance rather than interacting with the raw Metrics 
> instance.
> ---
> Guozhang had the following comments:
> 1) For the producer/consumer cases, all internal metrics are provided and 
> abstracted from users, who just need to read the documentation to poll 
> whatever provided metrics they are interested in; if they want to define 
> more metrics, those are likely to live outside the clients themselves and 
> they can use whatever methods they like, so Metrics does not need to be 
> exposed to users.
> 2) For streams, things are a bit different: users define the computational 
> logic, which becomes part of the "Streams Client" processing and may be of 
> interest to be monitored by users themselves; think of a customized processor 
> that sends an email to some address based on a condition, where users want to 
> monitor the average rate of emails sent. Hence it is worth considering 
> whether or not they should be able to access the Metrics instance to define 
> their own alongside the pre-defined metrics provided by the library.
> 3) Now, since the Metrics class was not previously designed for public usage, 
> it is not designed to be very user-friendly for defining sensors, especially 
> the semantic differences between name / scope / tags. StreamsMetrics tries 
> to hide some of this semantic confusion from users, but it still exposes 
> tags and hence is not perfect in doing so. We need to think of a better 
> approach so that: 1) user-defined metrics are "aligned" (i.e. with the same 
> name prefix within a single application, with similar scope hierarchy 
> definition, etc.) with library-provided metrics, and 2) there are natural 
> APIs to do so.
> I do not have concrete ideas about 3) off the top of my head; comments are 
> more than welcome.
> ---
> I'm not sure that I agree that 1) and 2) are truly different situations. A 
> user might choose to send email messages within a bare consumer rather than a 
> streams application, and still want to maintain a metric of sent emails. In 
> this bare consumer case, we'd expect the user to define that email-sent 
> metric outside of Kafka's metrics machinery.

[jira] [Commented] (KAFKA-3715) Higher granularity streams metrics

2016-05-16 Thread Jeff Klukas (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15284706#comment-15284706
 ] 

Jeff Klukas commented on KAFKA-3715:


It would be interesting to work on this, but I won't likely have time in the 
near future, so others should feel free to implement these ideas.

> Higher granularity streams metrics 
> ---
>
> Key: KAFKA-3715
> URL: https://issues.apache.org/jira/browse/KAFKA-3715
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Jeff Klukas
>Assignee: Guozhang Wang
>Priority: Minor
> Fix For: 0.10.1.0
>
>
> Originally proposed by [~guozhang] in 
> https://github.com/apache/kafka/pull/1362#issuecomment-218326690
> We can consider adding metrics for process / punctuate / commit rate at the 
> granularity of each processor node in addition to the global rate mentioned 
> above. This is very helpful in debugging.
> We can consider adding rate / cumulative total metrics for context.forward 
> indicating how many records were forwarded downstream from this processor 
> node as well. This is helpful in debugging.
> We can consider adding metrics for each stream partition's timestamp. This is 
> helpful in debugging.
> Besides the latency metrics, we can also add throughput metrics in terms of 
> source records consumed.





[jira] [Created] (KAFKA-3715) Higher granularity streams metrics

2016-05-16 Thread Jeff Klukas (JIRA)
Jeff Klukas created KAFKA-3715:
--

 Summary: Higher granularity streams metrics 
 Key: KAFKA-3715
 URL: https://issues.apache.org/jira/browse/KAFKA-3715
 Project: Kafka
  Issue Type: Improvement
  Components: streams
Reporter: Jeff Klukas
Assignee: Guozhang Wang
Priority: Minor
 Fix For: 0.10.1.0


Originally proposed by [~guozhang] in 
https://github.com/apache/kafka/pull/1362#issuecomment-218326690

We can consider adding metrics for process / punctuate / commit rate at the 
granularity of each processor node in addition to the global rate mentioned 
above. This is very helpful in debugging.

We can consider adding rate / cumulative total metrics for context.forward 
indicating how many records were forwarded downstream from this processor node 
as well. This is helpful in debugging.

We can consider adding metrics for each stream partition's timestamp. This is 
helpful in debugging.

Besides the latency metrics, we can also add throughput metrics in terms of 
source records consumed.
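
As a rough illustration of the context.forward idea, here is a sketch built
directly on the kafka-clients metrics classes; the sensor, the metric names,
and the "streams-node-metrics" group are hypothetical, not an existing
Streams API:

    import java.util.Collections;
    import org.apache.kafka.common.MetricName;
    import org.apache.kafka.common.metrics.Metrics;
    import org.apache.kafka.common.metrics.Sensor;
    import org.apache.kafka.common.metrics.stats.Rate;
    import org.apache.kafka.common.metrics.stats.Total;

    public class NodeForwardMetricsSketch {
        public static void main(String[] args) {
            Metrics metrics = new Metrics();
            Sensor forwards = metrics.sensor("processor-node-forwards");
            // records forwarded downstream per second from this node
            forwards.add(new MetricName("forward-rate", "streams-node-metrics",
                    "records forwarded per second",
                    Collections.<String, String>emptyMap()), new Rate());
            // cumulative total of records forwarded from this node
            forwards.add(new MetricName("forward-total", "streams-node-metrics",
                    "total records forwarded",
                    Collections.<String, String>emptyMap()), new Total());

            // a processor node would call this once per context.forward(...)
            forwards.record();
            metrics.close();
        }
    }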







[jira] [Created] (KAFKA-3714) Allow users greater access to register custom streams metrics

2016-05-16 Thread Jeff Klukas (JIRA)
Jeff Klukas created KAFKA-3714:
--

 Summary: Allow users greater access to register custom streams 
metrics
 Key: KAFKA-3714
 URL: https://issues.apache.org/jira/browse/KAFKA-3714
 Project: Kafka
  Issue Type: Bug
  Components: streams
Reporter: Jeff Klukas
Assignee: Guozhang Wang
Priority: Minor
 Fix For: 0.10.1.0


Copying in some discussion that originally appeared in 
https://github.com/apache/kafka/pull/1362#issuecomment-219064302

Kafka Streams is largely a higher-level abstraction on top of producers and 
consumers, and it seems sensible to match the KafkaStreams interface to that of 
KafkaProducer and KafkaConsumer where possible. For producers and consumers, 
the metric registry is internal and metrics are only exposed as an unmodifiable 
map. This allows users to access client metric values for use in application 
health checks, etc., but doesn't allow them to register new metrics.

That approach seems reasonable if we assume that a user interested in defining 
custom metrics is already going to be using a separate metrics library. In such 
a case, users will likely find it easier to define metrics using whatever 
library they're familiar with rather than learning the API for Kafka's Metrics 
class. Is this a reasonable assumption?

If we want to expose the Metrics instance so that users can define arbitrary 
metrics, I'd argue that there's a need for documentation updates. In particular, 
I find the notion of metric tags confusing. Tags can be defined in a 
MetricConfig when the Metrics instance is constructed, StreamsMetricsImpl is 
maintaining its own set of tags, and users can set tag overrides.

If a user were to get access to the Metrics instance, they would be missing the 
tags defined in StreamsMetricsImpl. I'm imagining that users would want their 
custom metrics to sit alongside the predefined metrics with the same tags, and 
users shouldn't be expected to manage those additional tags themselves.

So, why are we allowing users to define their own metrics via the 
StreamsMetrics interface in the first place? Is it that we'd like to be able to 
provide a built-in latency metric, but the definition depends on the details of 
the use case so there's no generic solution? That would be sufficient 
motivation for this special case of addLatencySensor. If we want to continue 
down that path and give users access to define a wider range of custom metrics, 
I'd prefer to extend the StreamsMetrics interface so that users can call 
methods on that object, automatically getting the tags appropriate for that 
instance rather than interacting with the raw Metrics instance.

---

Guozhang had the following comments:

1) For the producer/consumer cases, all internal metrics are provided and 
abstracted from users, who just need to read the documentation to poll 
whatever provided metrics they are interested in; if they want to define more 
metrics, those are likely to live outside the clients themselves and they can 
use whatever methods they like, so Metrics does not need to be exposed to users.

2) For streams, things are a bit different: users define the computational 
logic, which becomes part of the "Streams Client" processing and may be of 
interest to be monitored by users themselves; think of a customized processor 
that sends an email to some address based on a condition, where users want to 
monitor the average rate of emails sent. Hence it is worth considering whether 
or not they should be able to access the Metrics instance to define their own 
alongside the pre-defined metrics provided by the library.

3) Now, since the Metrics class was not previously designed for public usage, 
it is not designed to be very user-friendly for defining sensors, especially 
the semantic differences between name / scope / tags. StreamsMetrics tries to 
hide some of this semantic confusion from users, but it still exposes tags and 
hence is not perfect in doing so. We need to think of a better approach so 
that: 1) user-defined metrics are "aligned" (i.e. with the same name prefix 
within a single application, with similar scope hierarchy definition, etc.) 
with library-provided metrics, and 2) there are natural APIs to do so.

I do not have concrete ideas about 3) off the top of my head; comments are 
more than welcome.

---

I'm not sure that I agree that 1) and 2) are truly different situations. A user 
might choose to send email messages within a bare consumer rather than a 
streams application, and still want to maintain a metric of sent emails. In 
this bare consumer case, we'd expect the user to define that email-sent metric 
outside of Kafka's metrics machinery.
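
To make this discussion concrete, a minimal sketch of Guozhang's email-sending
example using the existing addLatencySensor hook might look like the following
(assuming the 0.10.x Processor and StreamsMetrics interfaces; the class name,
sensor names, and sendEmail are hypothetical):

    import org.apache.kafka.common.metrics.Sensor;
    import org.apache.kafka.streams.processor.Processor;
    import org.apache.kafka.streams.processor.ProcessorContext;

    public class EmailAlertProcessor implements Processor<String, String> {
        private ProcessorContext context;
        private Sensor emailLatency;

        @Override
        public void init(ProcessorContext context) {
            this.context = context;
            // scope / entity / operation names are illustrative; the
            // StreamsMetrics implementation supplies its own tags
            this.emailLatency =
                context.metrics().addLatencySensor("custom", "email-alerts", "send");
        }

        @Override
        public void process(String key, String value) {
            long startNs = System.nanoTime();
            sendEmail(value); // user-defined side effect (hypothetical)
            context.metrics().recordLatency(emailLatency, startNs, System.nanoTime());
        }

        @Override
        public void punctuate(long timestamp) {}

        @Override
        public void close() {}

        private void sendEmail(String body) { /* hypothetical */ }
    }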





[jira] [Commented] (KAFKA-3699) Update protocol page on website to explain how KIP-35 should be used

2016-05-16 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15284508#comment-15284508
 ] 

Ismael Juma commented on KAFKA-3699:


Any progress on this [~singhashish]?

> Update protocol page on website to explain how KIP-35 should be used
> 
>
> Key: KAFKA-3699
> URL: https://issues.apache.org/jira/browse/KAFKA-3699
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Ismael Juma
>Assignee: Ashish K Singh
> Fix For: 0.10.0.0
>
>
> The following page should be updated:
> http://kafka.apache.org/protocol.html
> [~edenhill], [~singhashish] any of you would like to tackle this?





[jira] [Updated] (KAFKA-3258) BrokerTopicMetrics of deleted topics are never deleted

2016-05-16 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-3258:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Issue resolved by pull request 944
[https://github.com/apache/kafka/pull/944]

> BrokerTopicMetrics of deleted topics are never deleted
> --
>
> Key: KAFKA-3258
> URL: https://issues.apache.org/jira/browse/KAFKA-3258
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.9.0.1
>Reporter: Rajini Sivaram
>Assignee: Rajini Sivaram
> Fix For: 0.10.1.0
>
>
> Per-topic BrokerTopicMetrics generated by brokers are not deleted even when 
> the topic is deleted. This shows misleading metrics in metrics reporters long 
> after a topic is deleted and is also a resource leak.





[GitHub] kafka pull request: KAFKA-3258: Delete broker topic metrics of del...

2016-05-16 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/944




Build failed in Jenkins: kafka-0.10.0-jdk7 #93

2016-05-16 Thread Apache Jenkins Server
See 

Changes:

[ismael] MINOR: document increased network bandwidth of 0.10 under replication

--
[...truncated 83 lines...]
new UpdateMetadataRequest(controllerId, controllerEpoch, 
liveBrokers.asJava, partitionStates.asJava)
^
/x1/jenkins/jenkins-slave/workspace/kafka-0.10.0-jdk7/core/src/main/scala/kafka/metrics/KafkaMetricsGroup.scala:191:
 object ProducerRequestStatsRegistry in package producer is deprecated: This 
object has been deprecated and will be removed in a future release.
ProducerRequestStatsRegistry.removeProducerRequestStats(clientId)
^
/x1/jenkins/jenkins-slave/workspace/kafka-0.10.0-jdk7/core/src/main/scala/kafka/network/BlockingChannel.scala:129:
 method readFromReadableChannel in class NetworkReceive is deprecated: see 
corresponding Javadoc for more information.
  response.readFromReadableChannel(channel)
   ^
/x1/jenkins/jenkins-slave/workspace/kafka-0.10.0-jdk7/core/src/main/scala/kafka/server/KafkaApis.scala:301:
 value timestamp in class PartitionData is deprecated: see corresponding 
Javadoc for more information.
  if (partitionData.timestamp == 
OffsetCommitRequest.DEFAULT_TIMESTAMP)
^
/x1/jenkins/jenkins-slave/workspace/kafka-0.10.0-jdk7/core/src/main/scala/kafka/server/KafkaApis.scala:304:
 value timestamp in class PartitionData is deprecated: see corresponding 
Javadoc for more information.
offsetRetention + partitionData.timestamp
^
/x1/jenkins/jenkins-slave/workspace/kafka-0.10.0-jdk7/core/src/main/scala/kafka/tools/ConsoleProducer.scala:43:
 class OldProducer in package producer is deprecated: This class has been 
deprecated and will be removed in a future release. Please use 
org.apache.kafka.clients.producer.KafkaProducer instead.
new OldProducer(getOldProducerProps(config))
^
/x1/jenkins/jenkins-slave/workspace/kafka-0.10.0-jdk7/core/src/main/scala/kafka/tools/ConsoleProducer.scala:45:
 class NewShinyProducer in package producer is deprecated: This class has been 
deprecated and will be removed in a future release. Please use 
org.apache.kafka.clients.producer.KafkaProducer instead.
new NewShinyProducer(getNewProducerProps(config))
^
14 warnings found
:kafka-0.10.0-jdk7:core:processResources UP-TO-DATE
:kafka-0.10.0-jdk7:core:classes
:kafka-0.10.0-jdk7:clients:compileTestJavaNote: 
/x1/jenkins/jenkins-slave/workspace/kafka-0.10.0-jdk7/clients/src/test/java/org/apache/kafka/common/requests/RequestResponseTest.java
 uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.

:kafka-0.10.0-jdk7:clients:processTestResources
:kafka-0.10.0-jdk7:clients:testClasses
:kafka-0.10.0-jdk7:core:copyDependantLibs
:kafka-0.10.0-jdk7:core:copyDependantTestLibs
:kafka-0.10.0-jdk7:core:jar
:jar_core_2_11
Building project 'core' with Scala version 2.11.8
:kafka-0.10.0-jdk7:clients:compileJava UP-TO-DATE
:kafka-0.10.0-jdk7:clients:processResources UP-TO-DATE
:kafka-0.10.0-jdk7:clients:classes UP-TO-DATE
:kafka-0.10.0-jdk7:clients:determineCommitId UP-TO-DATE
:kafka-0.10.0-jdk7:clients:createVersionFile
:kafka-0.10.0-jdk7:clients:jar UP-TO-DATE
:kafka-0.10.0-jdk7:core:compileJava UP-TO-DATE
:kafka-0.10.0-jdk7:core:compileScala
/x1/jenkins/jenkins-slave/workspace/kafka-0.10.0-jdk7/core/src/main/scala/kafka/api/OffsetCommitRequest.scala:79:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.

org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP
 ^
/x1/jenkins/jenkins-slave/workspace/kafka-0.10.0-jdk7/core/src/main/scala/kafka/common/OffsetMetadataAndError.scala:36:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.
 commitTimestamp: Long = 
org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP,

  ^
/x1/jenkins/jenkins-slave/workspace/kafka-0.10.0-jdk7/core/src/main/scala/kafka/common/OffsetMetadataAndError.scala:37:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.
 expireTimestamp: Long = 
org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP) {

  ^
/x1/jenkins/jenkins-slave/workspace/kafka-0.10.0-jdk7/core/src/main/scala/kafka/coordinator/GroupMetadataManager.scala:401:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc 

Build failed in Jenkins: kafka-trunk-jdk8 #629

2016-05-16 Thread Apache Jenkins Server
See 

Changes:

[ismael] MINOR: document increased network bandwidth of 0.10 under replication

--
Started by an SCM change
[EnvInject] - Loading node environment variables.
Building remotely on ubuntu-us1 (Ubuntu ubuntu ubuntu-us golang-ppa) in 
workspace 
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url 
 > https://git-wip-us.apache.org/repos/asf/kafka.git # timeout=10
Fetching upstream changes from https://git-wip-us.apache.org/repos/asf/kafka.git
 > git --version # timeout=10
 > git -c core.askpass=true fetch --tags --progress 
 > https://git-wip-us.apache.org/repos/asf/kafka.git 
 > +refs/heads/*:refs/remotes/origin/*
 > git rev-parse refs/remotes/origin/trunk^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/trunk^{commit} # timeout=10
Checking out Revision 62985f313f4b66d3810b05650d2a3f42b6cb054e 
(refs/remotes/origin/trunk)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 62985f313f4b66d3810b05650d2a3f42b6cb054e
 > git rev-list 7ded19a29ec140de93d57a9eb01722e6a8f2012a # timeout=10
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
Setting 
JDK1_8_0_66_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk1.8.0_66
[kafka-trunk-jdk8] $ /bin/bash -xe /tmp/hudson7681099949110479923.sh
+ 
/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2/bin/gradle
To honour the JVM settings for this build a new JVM will be forked. Please 
consider using the daemon: 
http://gradle.org/docs/2.4-rc-2/userguide/gradle_daemon.html.
Building project 'core' with Scala version 2.10.6
:downloadWrapper

BUILD SUCCESSFUL

Total time: 20.08 secs
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
Setting 
JDK1_8_0_66_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk1.8.0_66
[kafka-trunk-jdk8] $ /bin/bash -xe /tmp/hudson5302167894172970418.sh
+ export GRADLE_OPTS=-Xmx1024m
+ GRADLE_OPTS=-Xmx1024m
+ ./gradlew -Dorg.gradle.project.maxParallelForks=1 clean jarAll testAll
To honour the JVM settings for this build a new JVM will be forked. Please 
consider using the daemon: 
https://docs.gradle.org/2.13/userguide/gradle_daemon.html.
Building project 'core' with Scala version 2.10.6
Build file ': 
line 230
useAnt has been deprecated and is scheduled to be removed in Gradle 3.0. The 
Ant-Based Scala compiler is deprecated, please see 
https://docs.gradle.org/current/userguide/scala_plugin.html.
:clean UP-TO-DATE
:clients:clean
:connect:clean UP-TO-DATE
:core:clean
:examples:clean
:log4j-appender:clean
:streams:clean
:tools:clean
:connect:api:clean
:connect:file:clean
:connect:json:clean
:connect:runtime:clean
:streams:examples:clean
:jar_core_2_10
Building project 'core' with Scala version 2.10.6
:kafka-trunk-jdk8:clients:compileJava
:jar_core_2_10 FAILED

FAILURE: Build failed with an exception.

* What went wrong:
Failed to capture snapshot of input files for task 'compileJava' during 
up-to-date check.
> Could not add entry 
> '/home/jenkins/.gradle/caches/modules-2/files-2.1/net.jpountz.lz4/lz4/1.3.0/c708bb2590c0652a642236ef45d9f99ff842a2ce/lz4-1.3.0.jar'
>  to cache fileHashes.bin 
> (

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug 
option to get more log output.

BUILD FAILED

Total time: 21.475 secs
Build step 'Execute shell' marked build as failure
Recording test results
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
Setting 
JDK1_8_0_66_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk1.8.0_66
ERROR: Step ‘Publish JUnit test result report’ failed: No test report files 
were found. Configuration error?
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
Setting 
JDK1_8_0_66_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk1.8.0_66


[GitHub] kafka pull request: MINOR: document increased network bandwidth of...

2016-05-16 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1389




Re: 0.9.0.1 - KafkaProducer keeps requesting for metadata of topic which is no longer present, leads to never ending UNKNOWN_TOPIC_OR_PARTITION logs

2016-05-16 Thread Jaikiran Pai

On Monday 16 May 2016 04:52 PM, Rajini Sivaram wrote:

Hi Jaikiran,

1) If you delete a topic with no outstanding sends, you will see one WARN
log entry and the topic is immediately removed from the metadata. If you
delete a topic with outstanding sends, you will see a sequence of WARN logs
until the outstanding send requests expire


That's reasonable.


2) The 0.10.0.0 release is imminent and I think it would be too late to
include this.


Fair enough.


Thank you for the quick and clear response.

-Jaikiran




On Mon, May 16, 2016 at 11:59 AM, Jaikiran Pai 
wrote:


Thank you for looking into this Rajini.

A few questions related to this:

1) Looking at that patch, it looks like that WARN message will still be
logged a "few" times until that topic is removed from the Set maintained
in the Metadata. Is my understanding correct? If yes, would it be possible
to log that message at DEBUG (or lower level) if the producer send is
initiated for a different topic and not for the topic which doesn't exist?
That way, the application where this gets logged doesn't really have to be
bothered by it. It does make sense to error out or WARN about the missing
topic only if the send was issued against that topic.

2) I see that the JIRA https://issues.apache.org/jira/browse/KAFKA-2948
is marked for 0.10.1.0. I haven't been following the Kafka release plans.
Does this mean that there won't be a 0.9.x release any more? I see a
0.10.0.0 release candidate being voted up. Is 0.10.0.0 the next planned
release? If yes, is there any chance this one can make it into that release?
Or will we have to wait for 0.10.1.0 for this to be available in a release?
If that's the case, our application might have to change the log level
setting in our logging configuration for NetworkClient to use ERROR level
since this message completely messes up the entire application logs. I had
a quick look at the NetworkClient code on 0.9.0 branch
https://github.com/apache/kafka/blob/0.9.0/clients/src/main/java/org/apache/kafka/clients/NetworkClient.java
and I think we should be fine if we bumped up the log level of that class
in our logging config to be ERROR, since it doesn't log anything really
useful below that level from that class.


-Jaikiran


On Monday 16 May 2016 02:30 PM, Rajini Sivaram wrote:


Sorry, that was the wrong JIRA.
https://issues.apache.org/jira/browse/KAFKA-2948 is the one which
addresses
this issue.

On Mon, May 16, 2016 at 7:52 AM, Rajini Sivaram <
rajinisiva...@googlemail.com> wrote:

There is an open JIRA for this issue (

https://issues.apache.org/jira/browse/KAFKA-3065). The PR is quite old
and needs rebasing. I will take a look at it today.

On Sun, May 15, 2016 at 6:14 AM, Jaikiran Pai 
wrote:

Hello Kafka team,


We are using 0.9.0.1 (latest stable) of Kafka server and client
libraries. We use Java client for communicating with the Kafka
installation. Our simple application uses a single instance of
KafkaProducer (since the javadoc of that class states it's thread safe
and
recommended to be shared across the threads) for the lifetime of our
application runtime. We seem to be running into a potential issue in the
Kafka producer and the way it expects metadata for topics it had
communicated with before but are no longer around.

The use case we have, where we run into this issue is as follows:

1. Our application is sent the name of the topic to which the
application
sends a message using the KafkaProducer
2. The topic is short-lived and after a while the topic is deleted via
Kafka tools externally
3. Our application continues to run and the next time it receives a
request to send a message to _some other topic_, it ends up running
into an
issue where it endlessly floods the logs with messages:

10:17:53,245  WARN [NetworkClient] - Error while fetching metadata with
correlation id 122 :
{name-of-the-topic-that-got-deleted=UNKNOWN_TOPIC_OR_PARTITION}
10:17:53,347  WARN [NetworkClient] - Error while fetching metadata with
correlation id 123 :
{name-of-the-topic-that-got-deleted=UNKNOWN_TOPIC_OR_PARTITION}
10:17:53,449  WARN [NetworkClient] - Error while fetching metadata with
correlation id 124 :
{name-of-the-topic-that-got-deleted=UNKNOWN_TOPIC_OR_PARTITION}

It appears that the KafkaProducer wants to get hold of the metadata for
this topic which we deleted externally and which we have no intention to
communicate to anymore. These logs never stop, till we bring down the
application or close that producer instance.

This looks like an issue to me since the producer should either be aware
that the topic got deleted and no longer request for metadata (unless
there
is an explicit call to send some message to it from the user
application)
or it should just ignore the fact that the metadata for this topic isn't
there and move on without logging these logs (unless, again, there is an
explicit call to send some message to the deleted topic, from the user
application).

Looking 

Re: 0.9.0.1 - KafkaProducer keeps requesting for metadata of topic which is no longer present, leads to never ending UNKNOWN_TOPIC_OR_PARTITION logs

2016-05-16 Thread Rajini Sivaram
Hi Jaikiran,

1) If you delete a topic with no outstanding sends, you will see one WARN
log entry and the topic is immediately removed from the metadata. If you
delete a topic with outstanding sends, you will see a sequence of WARN logs
until the outstanding send requests expire ("max.block.ms" property of the
producer).

2) The 0.10.0.0 release is imminent and I think it would be too late to
include this.


On Mon, May 16, 2016 at 11:59 AM, Jaikiran Pai 
wrote:

> Thank you for looking into this Rajini.
>
> A few questions related to this:
>
> 1) Looking at that patch, it looks like that WARN message will still be
> logged a "few" times until that topic is removed from the Set maintained
> in the Metadata. Is my understanding correct? If yes, would it be possible
> to log that message at DEBUG (or lower level) if the producer send is
> initiated for a different topic and not for the topic which doesn't exist?
> That way, the application where this gets logged doesn't really have to be
> bothered by it. It does make sense to error out or WARN about the missing
> topic only if the send was issued against that topic.
>
> 2) I see that the JIRA https://issues.apache.org/jira/browse/KAFKA-2948
> is marked for 0.10.1.0. I haven't been following the Kafka release plans.
> Does this mean that there won't be a 0.9.x release any more? I see a
> 0.10.0.0 release candidate being voted up. Is 0.10.0.0 the next planned
> release? If yes, is there any chance this one can make it into that release?
> Or will we have to wait for 0.10.1.0 for this to be available in a release?
> If that's the case, our application might have to change the log level
> setting in our logging configuration for NetworkClient to use ERROR level
> since this message completely messes up the entire application logs. I had
> a quick look at the NetworkClient code on 0.9.0 branch
> https://github.com/apache/kafka/blob/0.9.0/clients/src/main/java/org/apache/kafka/clients/NetworkClient.java
> and I think we should be fine if we bumped up the log level of that class
> in our logging config to be ERROR, since it doesn't log anything really
> useful below that level from that class.
>
>
> -Jaikiran
>
>
> On Monday 16 May 2016 02:30 PM, Rajini Sivaram wrote:
>
>> Sorry, that was the wrong JIRA.
>> https://issues.apache.org/jira/browse/KAFKA-2948 is the one which
>> addresses
>> this issue.
>>
>> On Mon, May 16, 2016 at 7:52 AM, Rajini Sivaram <
>> rajinisiva...@googlemail.com> wrote:
>>
>> There is an open JIRA for this issue (
>>> https://issues.apache.org/jira/browse/KAFKA-3065). The PR is quite old
>>> and needs rebasing. I will take a look at it today.
>>>
>>> On Sun, May 15, 2016 at 6:14 AM, Jaikiran Pai 
>>> wrote:
>>>
>>> Hello Kafka team,


 We are using 0.9.0.1 (latest stable) of Kafka server and client
 libraries. We use Java client for communicating with the Kafka
 installation. Our simple application uses a single instance of
 KafkaProducer (since the javadoc of that class states it's thread safe
 and
 recommended to be shared across the threads) for the lifetime of our
 application runtime. We seem to be running into a potential issue in the
 Kafka producer and the way it expects metadata for topics it had
 communicated with before but are no longer around.

 The use case we have, where we run into this issue is as follows:

 1. Our application is sent the name of the topic to which the
 application
 sends a message using the KafkaProducer
 2. The topic is short-lived and after a while the topic is deleted via
 Kafka tools externally
 3. Our application continues to run and the next time it receives a
 request to send a message to _some other topic_, it ends up running
 into an
 issue where it endlessly floods the logs with messages:

 10:17:53,245  WARN [NetworkClient] - Error while fetching metadata with
 correlation id 122 :
 {name-of-the-topic-that-got-deleted=UNKNOWN_TOPIC_OR_PARTITION}
 10:17:53,347  WARN [NetworkClient] - Error while fetching metadata with
 correlation id 123 :
 {name-of-the-topic-that-got-deleted=UNKNOWN_TOPIC_OR_PARTITION}
 10:17:53,449  WARN [NetworkClient] - Error while fetching metadata with
 correlation id 124 :
 {name-of-the-topic-that-got-deleted=UNKNOWN_TOPIC_OR_PARTITION}

 It appears that the KafkaProducer wants to get hold of the metadata for
 this topic which we deleted externally and which we have no intention to
 communicate to anymore. These logs never stop, till we bring down the
 application or close that producer instance.

 This looks like an issue to me since the producer should either be aware
 that the topic got deleted and no longer request for metadata (unless
 there
 is an explicit call to send some message to it from the user
 application)
 or it should just 

Re: 0.9.0.1 - KafkaProducer keeps requesting for metadata of topic which is no longer present, leads to never ending UNKNOWN_TOPIC_OR_PARTITION logs

2016-05-16 Thread Jaikiran Pai

Thank you for looking into this Rajini.

A few questions related to this:

1) Looking at that patch, it looks like that WARN message will still be 
logged a "few" times until that topic is removed from the Set 
maintained in the Metadata. Is my understanding correct? If yes, would 
it be possible to log that message at DEBUG (or lower level) if the 
producer send is initiated for a different topic and not for the topic 
which doesn't exist? That way, the application where this gets logged 
doesn't really have to be bothered by it. It does make sense to error 
out or WARN about the missing topic only if the send was issued against 
that topic.


2) I see that the JIRA https://issues.apache.org/jira/browse/KAFKA-2948 
is marked for 0.10.1.0. I haven't been following the Kafka release 
plans. Does this mean that there won't be a 0.9.x release any more? I 
see a 0.10.0.0 release candidate being voted up. Is 0.10.0.0 the next 
planned release? If yes, is there any chance this one can make it into that 
release? Or will we have to wait for 0.10.1.0 for this to be available 
in a release? If that's the case, our application might have to change 
the log level setting in our logging configuration for NetworkClient to 
use ERROR level since this message completely messes up the entire 
application logs. I had a quick look at the NetworkClient code on 0.9.0 
branch 
https://github.com/apache/kafka/blob/0.9.0/clients/src/main/java/org/apache/kafka/clients/NetworkClient.java 
and I think we should be fine if we bumped up the log level of that 
class in our logging config to be ERROR, since it doesn't log anything 
really useful below that level from that class.



-Jaikiran
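
For reference, the log-level workaround described above is a one-line change
in a log4j.properties file (assuming log4j is the logging binding in use):

    log4j.logger.org.apache.kafka.clients.NetworkClient=ERROR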

On Monday 16 May 2016 02:30 PM, Rajini Sivaram wrote:

Sorry, that was the wrong JIRA.
https://issues.apache.org/jira/browse/KAFKA-2948 is the one which addresses
this issue.

On Mon, May 16, 2016 at 7:52 AM, Rajini Sivaram <
rajinisiva...@googlemail.com> wrote:


There is an open JIRA for this issue (
https://issues.apache.org/jira/browse/KAFKA-3065). The PR is quite old
and needs rebasing. I will take a look at it today.

On Sun, May 15, 2016 at 6:14 AM, Jaikiran Pai 
wrote:


Hello Kafka team,


We are using 0.9.0.1 (latest stable) of Kafka server and client
libraries. We use Java client for communicating with the Kafka
installation. Our simple application uses a single instance of
KafkaProducer (since the javadoc of that class states it's thread safe and
recommended to be shared across the threads) for the lifetime of our
application runtime. We seem to be running into a potential issue in the
Kafka producer and the way it expects metadata for topics it had
communicated with before but are no longer around.

The use case we have, where we run into this issue is as follows:

1. Our application is sent the name of the topic to which the application
sends a message using the KafkaProducer
2. The topic is short-lived and after a while the topic is deleted via
Kafka tools externally
3. Our application continues to run and the next time it receives a
request to send a message to _some other topic_, it ends up running into an
issue where it endlessly floods the logs with messages:

10:17:53,245  WARN [NetworkClient] - Error while fetching metadata with
correlation id 122 :
{name-of-the-topic-that-got-deleted=UNKNOWN_TOPIC_OR_PARTITION}
10:17:53,347  WARN [NetworkClient] - Error while fetching metadata with
correlation id 123 :
{name-of-the-topic-that-got-deleted=UNKNOWN_TOPIC_OR_PARTITION}
10:17:53,449  WARN [NetworkClient] - Error while fetching metadata with
correlation id 124 :
{name-of-the-topic-that-got-deleted=UNKNOWN_TOPIC_OR_PARTITION}

It appears that the KafkaProducer wants to get hold of the metadata for
this topic which we deleted externally and which we have no intention to
communicate to anymore. These logs never stop, till we bring down the
application or close that producer instance.

This looks like an issue to me since the producer should either be aware
that the topic got deleted and no longer request for metadata (unless there
is an explicit call to send some message to it from the user application)
or it should just ignore the fact that the metadata for this topic isn't
there and move on without logging these logs (unless, again, there is an
explicit call to send some message to the deleted topic, from the user
application).

Looking at the code in Kafka, it appears that the name of any topic the
KafkaProducer communicates with gets added to the "topics" set. Once added,
that topic name never gets removed from that set for the lifetime of that
producer, even in cases like these where the topic is deleted and never
again used with that producer.

Do you think this is a bug in the Kafka code? I have a simple application
which reproduces this easily on a 0.9.0.1 setup here
https://gist.github.com/jaikiran/45e9ce510c259267b28821b84105a25a.

Let me know if you need more 

[jira] [Assigned] (KAFKA-3282) Change tools to use --new-consumer by default and introduce --old-consumer

2016-05-16 Thread Arun Mahadevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Mahadevan reassigned KAFKA-3282:
-

Assignee: Arun Mahadevan

> Change tools to use --new-consumer by default and introduce --old-consumer
> --
>
> Key: KAFKA-3282
> URL: https://issues.apache.org/jira/browse/KAFKA-3282
> Project: Kafka
>  Issue Type: Improvement
>  Components: tools
>Reporter: Ismael Juma
>Assignee: Arun Mahadevan
> Fix For: 0.10.1.0
>
>
> This only applies to tools that support the new consumer and it's similar to 
> what we did with the producer for 0.9.0.0.
> Part of this JIRA is updating the documentation to remove `--new-consumer` 
> from command invocations where appropriate. An example where this will be the 
> case is in the security documentation.
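
To illustrate the intended change with the console consumer (the command
shapes are a sketch, not final syntax): today the new consumer has to be
requested explicitly,

    bin/kafka-console-consumer.sh --new-consumer \
      --bootstrap-server localhost:9092 --topic test

whereas after this change the same invocation without --new-consumer would
use the new consumer by default, and --old-consumer would be needed to get
the old behaviour.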





Re: 0.9.0.1 - KafkaProducer keeps requesting for metadata of topic which is no longer present, leads to never ending UNKNOWN_TOPIC_OR_PARTITION logs

2016-05-16 Thread Rajini Sivaram
Sorry, that was the wrong JIRA.
https://issues.apache.org/jira/browse/KAFKA-2948 is the one which addresses
this issue.

On Mon, May 16, 2016 at 7:52 AM, Rajini Sivaram <
rajinisiva...@googlemail.com> wrote:

> There is an open JIRA for this issue (
> https://issues.apache.org/jira/browse/KAFKA-3065). The PR is quite old
> and needs rebasing. I will take a look at it today.
>
> On Sun, May 15, 2016 at 6:14 AM, Jaikiran Pai 
> wrote:
>
>> Hello Kafka team,
>>
>>
>> We are using 0.9.0.1 (latest stable) of Kafka server and client
>> libraries. We use Java client for communicating with the Kafka
>> installation. Our simple application uses a single instance of
>> KafkaProducer (since the javadoc of that class states it's thread safe and
>> recommended to be shared across the threads) for the lifetime of our
>> application runtime. We seem to be running into a potential issue in the
>> Kafka producer and the way it expects metadata for topics it had
>> communicated with before but are no longer around.
>>
>> The use case we have, where we run into this issue is as follows:
>>
>> 1. Our application is sent the name of the topic to which the application
>> sends a message using the KafkaProducer
>> 2. The topic is short-lived and after a while the topic is deleted via
>> Kafka tools externally
>> 3. Our application continues to run and the next time it receives a
>> request to send a message to _some other topic_, it ends up running into an
>> issue where it endlessly floods the logs with messages:
>>
>> 10:17:53,245  WARN [NetworkClient] - Error while fetching metadata with
>> correlation id 122 :
>> {name-of-the-topic-that-got-deleted=UNKNOWN_TOPIC_OR_PARTITION}
>> 10:17:53,347  WARN [NetworkClient] - Error while fetching metadata with
>> correlation id 123 :
>> {name-of-the-topic-that-got-deleted=UNKNOWN_TOPIC_OR_PARTITION}
>> 10:17:53,449  WARN [NetworkClient] - Error while fetching metadata with
>> correlation id 124 :
>> {name-of-the-topic-that-got-deleted=UNKNOWN_TOPIC_OR_PARTITION}
>>
>> It appears that the KafkaProducer wants to get hold of the metadata for
>> this topic which we deleted externally and which we have no intention to
>> communicate to anymore. These logs never stop, till we bring down the
>> application or close that producer instance.
>>
>> This looks like an issue to me since the producer should either be aware
>> that the topic got deleted and no longer request for metadata (unless there
>> is an explicit call to send some message to it from the user application)
>> or it should just ignore the fact that the metadata for this topic isn't
>> there and move on without logging these logs (unless, again, there is an
>> explicit call to send some message to the deleted topic, from the user
>> application).
>>
>> Looking at the code in Kafka, it appears that the name of any topic the
>> KafkaProducer communicates with gets added to the "topics" set. Once added,
>> that topic name never gets removed from that set for the lifetime of that
>> producer, even in cases like these where the topic is deleted and never
>> again used with that producer.
>>
>> Do you think this is a bug in the Kafka code? I have a simple application
>> which reproduces this easily on a 0.9.0.1 setup here
>> https://gist.github.com/jaikiran/45e9ce510c259267b28821b84105a25a.
>>
>> Let me know if you need more details about this.
>>
>> -Jaikiran
>>
>
>
>
> --
> Regards,
>
> Rajini
>



-- 
Regards,

Rajini


Re: 0.9.0.1 - KafkaProducer keeps requesting for metadata of topic which is no longer present, leads to never ending UNKNOWN_TOPIC_OR_PARTITION logs

2016-05-16 Thread Rajini Sivaram
There is an open JIRA for this issue (
https://issues.apache.org/jira/browse/KAFKA-3065). The PR is quite old and
needs rebasing. I will take a look at it today.

On Sun, May 15, 2016 at 6:14 AM, Jaikiran Pai 
wrote:

> Hello Kafka team,
>
>
> We are using 0.9.0.1 (latest stable) of Kafka server and client libraries.
> We use Java client for communicating with the Kafka installation. Our
> simple application uses a single instance of KafkaProducer (since the
> javadoc of that class states it's thread safe and recommended to be shared
> across the threads) for the lifetime of our application runtime. We seem to
> be running into a potential issue in the Kafka producer and the way it
> expects metadata for topics which it had communicated before but are no
> longer around.
>
> The use case we have, where we run into this issue is as follows:
>
> 1. Our application is sent the name of the topic to which the application
> sends a message using the KafkaProducer
> 2. The topic is short-lived and after a while the topic is deleted via
> Kafka tools externally
> 3. Our application continues to run and the next time it receives a
> request to send a message to _some other topic_, it ends up running into an
> issue where it endlessly floods the logs with messages:
>
> 10:17:53,245  WARN [NetworkClient] - Error while fetching metadata with
> correlation id 122 :
> {name-of-the-topic-that-got-deleted=UNKNOWN_TOPIC_OR_PARTITION}
> 10:17:53,347  WARN [NetworkClient] - Error while fetching metadata with
> correlation id 123 :
> {name-of-the-topic-that-got-deleted=UNKNOWN_TOPIC_OR_PARTITION}
> 10:17:53,449  WARN [NetworkClient] - Error while fetching metadata with
> correlation id 124 :
> {name-of-the-topic-that-got-deleted=UNKNOWN_TOPIC_OR_PARTITION}
>
> It appears that the KafkaProducer wants to get hold of the metadata for
> this topic which we deleted externally and which we have no intention to
> communicate to anymore. These logs never stop, till we bring down the
> application or close that producer instance.
>
> This looks like an issue to me since the producer should either be aware
> that the topic got deleted and no longer request for metadata (unless there
> is an explicit call to send some message to it from the user application)
> or it should just ignore the fact that the metadata for this topic isn't
> there and move on without logging these logs (unless, again, there is an
> explicit call to send some message to the deleted topic, from the user
> application).
>
> Looking at the code in Kafka, it appears that the name of any topic the
> KafkaProducer communicates with gets added to the "topics" set. Once added,
> that topic name never gets removed from that set for the lifetime of that
> producer, even in cases like these where the topic is deleted and never
> again used with that producer.
>
> Do you think this is a bug in the Kafka code? I have a simple application
> which reproduces this easily on a 0.9.0.1 setup here
> https://gist.github.com/jaikiran/45e9ce510c259267b28821b84105a25a.
>
> Let me know if you need more details about this.
>
> -Jaikiran
>



-- 
Regards,

Rajini
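
For anyone who wants to reproduce this without pulling the gist, a minimal
sketch along the lines Jaikiran describes (topic names, serializers, and the
send loop are illustrative; assumes a local 0.9.0.1 broker with topic
deletion enabled):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class DeletedTopicMetadataRepro {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("key.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            KafkaProducer<String, String> producer = new KafkaProducer<>(props);

            // 1. Send to the short-lived topic so its name enters the
            //    producer's metadata "topics" set.
            producer.send(new ProducerRecord<>("short-lived-topic", "k", "v")).get();

            // 2. Delete the topic externally, e.g.:
            //    bin/kafka-topics.sh --zookeeper localhost:2181 \
            //      --delete --topic short-lived-topic

            // 3. Keep sending to a different topic; every metadata refresh now
            //    logs WARN ... UNKNOWN_TOPIC_OR_PARTITION for the deleted topic.
            while (true) {
                producer.send(new ProducerRecord<>("other-topic", "k", "v")).get();
                Thread.sleep(1000);
            }
        }
    }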


Regarding kafka-rest plugin

2016-05-16 Thread basavaraj
Hi Kafka team,

Please let me know if there is any Kafka REST service plug-in available. If
there is, please share the details.

I went through some GitHub code, but it is not working.

Thanks & Regards,
Basavaraj



 



Re: [VOTE] 0.10.0.0 RC4

2016-05-16 Thread Gwen Shapira
Thanks, man!

Good to see Heroku being good friends to the Kafka community by
testing new releases, reporting issues and following up with the cause
and documentation PR.

With this out of the way, we closed all known blockers for 0.10.0.0.
I'll roll out a new RC tomorrow morning.

Gwen

On Sun, May 15, 2016 at 1:27 PM, Tom Crayford  wrote:
> https://github.com/apache/kafka/pull/1389
>
> On Sun, May 15, 2016 at 9:22 PM, Ismael Juma  wrote:
>
>> Hi Tom,
>>
>> Great to hear that the failure testing scenario went well. :)
>>
>> Your suggested improvement sounds good to me and a PR would be great. For
>> this kind of change, you can skip the JIRA, just prefix the PR title with
>> `MINOR:`.
>>
>> Thanks,
>> Ismael
>>
>> On Sun, May 15, 2016 at 9:17 PM, Tom Crayford 
>> wrote:
>>
>> > How about this?
>> >
>> > Note: Due to the additional timestamp introduced in each
>> message
>> > (8 bytes of data), producers sending small messages may see a
>> > message throughput degradation because of the increased overhead.
>> > Likewise, replication now transmits an additional 8 bytes per message.
>> > If you're running close to the network capacity of your cluster, it's
>> > possible that you'll overwhelm the network cards and see failures and
>> > performance
>> > issues due to the overload.
>> > When receiving compressed messages, 0.10.0
>> > brokers avoid recompressing the messages, which in general reduces
>> the
>> > latency and improves the throughput. In
>> > certain cases, this may reduce the batching size on the producer,
>> which
>> > could lead to worse throughput. If this
>> > happens, users can tune linger.ms and batch.size of the producer for
>> > better throughput.
>> >
>> > Would you like a Jira/PR with this kind of change so we can discuss them
>> in
>> > a more convenient format?
>> >
>> > Re our failure testing scenario: Kafka 0.10 RC behaves exactly the same
>> > under failure as 0.9 - the controller typically shifts the leader in
>> around
>> > 2 seconds or so, and the benchmark sees a small drop in throughput during
>> > that, then another drop whilst the replacement broker comes back to
>> speed.
>> > So, overall we're extremely happy and excited for this release! Thanks to
>> > the committers and maintainers for all their hard work.
>> >
>> > On Sun, May 15, 2016 at 9:03 PM, Ismael Juma  wrote:
>> >
>> > > Hi Tom,
>> > >
>> > > Thanks for the update and for all the testing you have done! No worries
>> > > about the chase here, I'd much rather have false positives by people
>> who
>> > > are validating the releases than false negatives because people don't
>> > > validate the releases. :)
>> > >
>> > > The upgrade note we currently have follows:
>> > >
>> > > https://github.com/apache/kafka/blob/0.10.0/docs/upgrade.html#L67
>> > >
>> > > Please feel free to suggest improvements.
>> > >
>> > > Thanks,
>> > > Ismael
>> > >
>> > > On Sun, May 15, 2016 at 6:39 PM, Tom Crayford 
>> > > wrote:
>> > >
>> > > > I've been digging into this some more. It seems like this may have
>> been
>> > > an
>> > > > issue with benchmarks maxing out the network card - under 0.10.0.0-RC
>> > the
>> > > > slightly additional bandwidth per message seems to have pushed the
>> > > broker's
>> > > > NIC into overload territory where it starts dropping packets
>> (verified
>> > > with
>> > > > ifconfig on each broker). This leads to it not being able to talk to
>> > > > Zookeeper properly, which leads to OfflinePartitions, which then
>> causes
>> > > > issues with the benchmark's validity, as throughput drops a lot when
>> > > brokers
>> > > > are flapping in and out of being online. 0.9.0.1 doing that 8 bytes
>> > less
>> > > > per message means the broker's NIC can sustain more messages/s. There
>> > was
>> > > > an "alignment" issue with the benchmarks here - under 0.9 we were
>> > *just*
>> > > at
>> > > > the barrier of the broker's NICs sustaining traffic, and under 0.10
>> we
>> > > > pushed over that (at 1.5 million messages/s, 8 bytes extra per
>> message
>> > is
>> > > > an extra 36 MB/s with replication factor 3 [if my math is right, and
>> > > that's
>> > > > before SSL encryption which may be additional overhead], which is as
>> > much
>> > > > as an additional producer machine).
>> > > >
>> > > > The dropped packets and the flapping weren't causing notable timeout
>> > > issues
>> > > > in the producer, but looking at the metrics on the brokers, offline
>> > > > partitions was clearly triggered and undergoing, and the broker logs
>> > show
>> > > > ZK session timeouts. This is consistent with earlier benchmarking
>> > > > experience - the number of producers we were running under 0.9.0.1
>> was
>> > > > carefully selected to be just under the limit here.
>> > > >
>> > > > The other issue with the benchmark where I reported an issue between
>> > two
>> > > > single producers was