Re: [ANNOUNCE] New committer: Jason Gustafson

2016-09-09 Thread Aditya Auradkar
Congrats Jason.

On Fri, Sep 9, 2016 at 5:30 AM, Michael Noll  wrote:

> My compliments, Jason -- well deserved! :-)
>
> -Michael
>
>
>
> On Wed, Sep 7, 2016 at 6:49 PM, Grant Henke  wrote:
>
> > Congratulations and thank you for all of your contributions to Apache
> > Kafka, Jason!
> >
> > On Wed, Sep 7, 2016 at 10:12 AM, Mayuresh Gharat <
> > gharatmayures...@gmail.com
> > > wrote:
> >
> > > congrats Jason !
> > >
> > > Thanks,
> > >
> > > Mayuresh
> > >
> > > On Wed, Sep 7, 2016 at 5:16 AM, Eno Thereska 
> > > wrote:
> > >
> > > > Congrats Jason!
> > > >
> > > > Eno
> > > > > On 7 Sep 2016, at 10:00, Rajini Sivaram <
> > rajinisiva...@googlemail.com>
> > > > wrote:
> > > > >
> > > > > Congrats, Jason!
> > > > >
> > > > > On Wed, Sep 7, 2016 at 8:29 AM, Flavio P JUNQUEIRA  >
> > > > wrote:
> > > > >
> > > > >> Congrats, Jason. Well done and great to see this project inviting
> > new
> > > > >> committers.
> > > > >>
> > > > >> -Flavio
> > > > >>
> > > > >> On 7 Sep 2016 04:58, "Ashish Singh"  wrote:
> > > > >>
> > > > >>> Congrats, Jason!
> > > > >>>
> > > > >>> On Tuesday, September 6, 2016, Jason Gustafson <
> ja...@confluent.io
> > >
> > > > >> wrote:
> > > > >>>
> > > >  Thanks all!
> > > > 
> > > >  On Tue, Sep 6, 2016 at 5:13 PM, Becket Qin <
> becket@gmail.com
> > > >  > wrote:
> > > > 
> > > > > Congrats, Jason!
> > > > >
> > > > > On Tue, Sep 6, 2016 at 5:09 PM, Onur Karaman
> > > >   > > > >>
> > > > > wrote:
> > > > >
> > > > >> congrats jason!
> > > > >>
> > > > >> On Tue, Sep 6, 2016 at 4:12 PM, Sriram Subramanian <
> > > > >> r...@confluent.io
> > > >  >
> > > > >> wrote:
> > > > >>
> > > > >>> Congratulations Jason!
> > > > >>>
> > > > >>> On Tue, Sep 6, 2016 at 3:40 PM, Vahid S Hashemian <
> > > > >>> vahidhashem...@us.ibm.com 
> > > >  wrote:
> > > > >>>
> > > >  Congratulations Jason on this very well deserved
> recognition.
> > > > 
> > > >  --Vahid
> > > > 
> > > > 
> > > > 
> > > >  From:   Neha Narkhede
> > > >  To: dev@kafka.apache.org, us...@kafka.apache.org
> > > >  Cc: priv...@kafka.apache.org
> > > >  Date:   09/06/2016 03:26 PM
> > > >  Subject: [ANNOUNCE] New committer: Jason Gustafson
> > > > 
> > > > 
> > > > 
> > > >  The PMC for Apache Kafka has invited Jason Gustafson to join as a
> > > >  committer and we are pleased to announce that he has accepted!
> > > > 
> > > >  Jason has contributed numerous patches to a wide range of areas,
> > > >  notably within the new consumer and the Kafka Connect layers. He has
> > > >  displayed great taste and judgement, which has been apparent through
> > > >  his involvement across the board, from mailing lists, JIRA, and code
> > > >  reviews to contributing features, bug fixes, and code and
> > > >  documentation improvements.
> > > > 
> > > >  Thank you for your contribution and welcome to Apache Kafka, Jason!
> > > >  --
> > > >  Thanks,
> > > >  Neha
> > > > 
> > > > 
> > > > 
> > > > 
> > > > 
> > > > >>>
> > > > >>
> > > > >
> > > > 
> > > > >>>
> > > > >>>
> > > > >>> --
> > > > >>> Ashish h
> > > > >>>
> > > > >>
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Regards,
> > > > >
> > > > > Rajini
> > > >
> > > >
> > >
> > >
> > > --
> > > -Regards,
> > > Mayuresh R. Gharat
> > > (862) 250-7125
> > >
> >
> >
> >
> > --
> > Grant Henke
> > Software Engineer | Cloudera
> > gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
> >
>


Re: [DISCUSS] KIP-55: Secure quotas for authenticated users

2016-05-25 Thread Aditya Auradkar
Hey Rajini -

If the quota.type is set to 'user', what happens to unauthenticated
clients? They don't supply a principal, so are they essentially
unthrottled?

This may be a nit, but I prefer 'quota.type' options to be
'authenticated-user' and 'client-id' as opposed to 'client' and 'user'. For
a new user, the options 'client' and 'user' sound essentially the same.

Aditya
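
To make the naming question concrete, here is a sketch of what a broker's
quota settings might look like under the proposal. The quota.type key and its
option values are hypothetical at this point in the discussion; only the
quota.producer.default and quota.consumer.default keys existed in 0.9:

import java.util.Properties;

public class QuotaConfigSketch {
    public static void main(String[] args) {
        Properties brokerProps = new Properties();
        // Hypothetical: throttle by authenticated principal rather than by
        // client.id. Neither this key nor these option names were final.
        brokerProps.put("quota.type", "authenticated-user"); // vs. "client-id"
        // Existing 0.9-era defaults, applied when no override is configured:
        brokerProps.put("quota.producer.default", "10485760"); // bytes/sec
        brokerProps.put("quota.consumer.default", "20971520"); // bytes/sec
        brokerProps.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}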

On Tue, May 24, 2016 at 5:55 AM, Rajini Sivaram <
rajinisiva...@googlemail.com> wrote:

> Jun,
>
> I have updated the KIP based on your suggestion. Can you take a look?
>
> Thank you,
>
> Rajini
>
> On Tue, May 24, 2016 at 11:20 AM, Rajini Sivaram <
> rajinisiva...@googlemail.com> wrote:
>
> > Jun,
> >
> > Thank you for the review. I agree that a simple user principal based
> quota
> > is sufficient to allocate broker resources fairly in a multi-user system.
> > Hierarchical quotas proposed in the KIP currently enables clients of a
> user
> > to be rate-limited as well. This allows a user to run multiple clients
> > which don't interfere with each other's quotas. Since there is no clear
> > requirement to support this at the moment, I am happy to limit the scope
> of
> > the KIP to a single-level user-based quota. Will update the KIP today.
> >
> > Regards,
> >
> > Rajini
> >
> > On Mon, May 23, 2016 at 5:24 PM, Jun Rao  wrote:
> >
> >> Rajini,
> >>
> >> Thanks for the KIP. When we first added the quota support, the intention
> >> was to be able to add a quota per application. Since we didn't have
> >> security at that time, we essentially simulated users with client-ids.
> >> Now that we do have security, it seems that we just need a way to set
> >> quotas at the user level. Setting quotas at the combination of users and
> >> client-ids seems more complicated, and I am not sure there is a good use
> >> case.
> >>
> >> Also, the new config quota.secure.enable seems a bit weird. Would it be
> >> better to add a new config quota.type? It would default to clientId for
> >> backward compatibility. If one sets it to user, then the default
> >> broker-level quota applies to users without a customized quota. In this
> >> setting, brokers will also only honor quotas set at the user level
> >> (i.e., quotas set at the clientId level will be ignored).
> >>
> >> Thanks,
> >>
> >> Jun
> >>
> >> On Tue, May 3, 2016 at 4:32 AM, Rajini Sivaram <
> >> rajinisiva...@googlemail.com
> >> > wrote:
> >>
> >> > Ewen,
> >> >
> >> > Thank you for the review. I agree that ideally we would have one
> >> definition
> >> > of quotas that handles all cases. But I couldn't quite fit all the
> >> > combinations that are possible today with client-id-based quotas into
> >> the
> >> > new configuration. I think upgrade path is not bad since quotas are
> >> > per-broker. You can configure quotas based on the new configuration,
> set
> >> > quota.secure.enable=true and restart the broker. Since there is no
> >> > requirement for both insecure client-id based quotas and secure
> >> user-based
> >> > quotas to co-exist in a cluster, isn't that sufficient? The
> >> implementation
> >> > does use a unified approach, so if an alternative configuration can be
> >> > defined (perhaps with some acceptable limitations?) which can express
> >> both,
> >> > it will be easy to implement. Suggestions welcome :-)
> >> >
> >> > The cases that the new configuration cannot express, but the old one
> can
> >> > are:
> >> >
> >> >1. SSL/SASL with multiple users, same client ids used by multiple
> >> users,
> >> >client-id based quotas where quotas are shared between multiple
> users
> >> >2. Default quotas for client-ids. In the new configuration, default
> >> >quotas are defined for users, and clients with no configured
> >> >sub-quota share the user's quota.
> >> >
> >> >
> >> >
> >> > On Sat, Apr 30, 2016 at 6:21 AM, Ewen Cheslack-Postava <
> >> e...@confluent.io>
> >> > wrote:
> >> >
> >> > > Rajini,
> >> > >
> >> > > I'm admittedly not very familiar with a lot of this code or
> >> > implementation,
> >> > > so correct me if I'm making any incorrect assumptions.
> >> > >
> >> > > I've only scanned the KIP, but my main concern is the rejection of
> the
> >> > > alternative -- unifying client-id and principal quotas. In
> particular,
> >> > > doesn't this make an upgrade for brokers using those different
> >> approaches
> >> > > difficult since you have to make a hard break between client-id and
> >> > > principal quotas? If people adopt client-id quotas to begin with, it
> >> > seems
> >> > > like we might not be providing a clean upgrade path.
> >> > >
> >> > > As I said, I haven't kept up to date with the details of the
> security
> >> and
> >> > > quota features, but I'd want to make sure we didn't suggest one path
> >> with
> >> > > 0.9, then add another that we can't provide a clean upgrade path to.
> >> > >
> >> > > -Ewen
> >> > >
> >> > > On Fri, Apr 22, 2016 at 7:22 AM, Rajini Sivaram <
> >> > > 

[jira] [Commented] (KAFKA-3492) support quota based on authenticated user name

2016-04-01 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15222349#comment-15222349
 ] 

Aditya Auradkar commented on KAFKA-3492:


Very cool. Jun - are you planning to drive this?

> support quota based on authenticated user name
> --
>
> Key: KAFKA-3492
> URL: https://issues.apache.org/jira/browse/KAFKA-3492
> Project: Kafka
>  Issue Type: New Feature
>  Components: core
>Reporter: Jun Rao
>
> Currently, quota is based on the client.id set in the client configuration, 
> which can be changed easily. Ideally, quota should be set on the 
> authenticated user name. We will need to have a KIP proposal/discussion on 
> this first.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3456) In-house KafkaMetric misreports metrics when periodically observed

2016-03-25 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15212514#comment-15212514
 ] 

Aditya Auradkar commented on KAFKA-3456:


Accidentally changed the thread title to gibberish :). Changed it back.

I think that the problem (and I'm not convinced it is a problem) is that when 
you have 2 windows, the rate can change significantly when a new window is 
created. You effectively throw away half of your samples and start seeding that 
data again. This can skew measurement. IIUC, the problem isn't that an 
incorrect rate is being reported, it is simply being reported over a 
potentially variable interval. Configuring a larger number of samples will 
reduce the time interval variability and can smooth this significantly. 

Extending your example: if you have 10 windows (6 seconds each) and you 
alternate between 999 and 1 req/sec across these samples, your rate over 60 
seconds will be 500. If you roll over your first sample of 999, the rate 
changes to ~450, which seems closer to what you want?
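
To make the arithmetic concrete, a small self-contained sketch (plain Java,
not Kafka's metrics code) reproduces both numbers:

public class RateRolloverSketch {
    public static void main(String[] args) {
        // 10 windows of 6 seconds each, alternating 999 and 1 req/sec.
        int windows = 10, windowSecs = 6;
        double total = 0;
        for (int i = 0; i < windows; i++) {
            total += (i % 2 == 0 ? 999 : 1) * windowSecs;
        }
        // Rate over the complete 60 seconds: (5 * 999 + 5 * 1) / 10 = 500.
        System.out.println(total / (windows * windowSecs)); // 500.0
        // Roll over the first 999-sample: 9 windows remain (4 * 999 + 5 * 1)
        // measured over 54 seconds, i.e. ~444.6 req/sec, roughly the ~450.
        double afterRollover = total - 999 * windowSecs;
        System.out.println(afterRollover / ((windows - 1) * windowSecs));
    }
}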

> In-house KafkaMetric misreports metrics when periodically observed
> --
>
> Key: KAFKA-3456
> URL: https://issues.apache.org/jira/browse/KAFKA-3456
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer, core, producer 
>Affects Versions: 0.9.0.0, 0.9.0.1, 0.10.0.0
>Reporter: The Data Lorax
>Assignee: Neha Narkhede
>Priority: Minor
>
> The metrics captured by Kafka through the in-house {{SampledStat}} suffer 
> from misreporting metrics if observed in a periodic manner.
> Consider a {{Rate}} metric that is using the default 2 samples and 30 second 
> sample window i.e. the {{Rate}} is capturing 60 seconds worth of data.  So, 
> to report this metric to some external system we might poll it every 60 
> seconds to observe the current value. Using a shorter period would, in the 
> case of a {{Rate}}, lead to smoothing of the plotted data, and worse, in the 
> case of a {{Count}}, would lead to double counting - so 60 seconds is the 
> only period at which we can poll the metrics if we are to report accurate 
> metrics.
> To demonstrate the issue consider the following somewhat extreme case:
> The {{Rate}}  is capturing data from a system which alternates between a 999 
> per sec rate and a 1 per sec rate every 30 seconds, with the different rates 
> aligned with the sample boundaries within the {{Rate}} instance i.e. after 60 
> seconds the first sample within the {{Rate}} instance will have a rate of 999 
> per sec, and the second 1 per sec. 
> If we were to ask the metric for its value at this 60 second boundary it 
> would correctly report 500 per sec. However, if we asked it again 1 
> millisecond later it would report 1 per sec, as the first sample window has 
> been aged out. Depending on how far into the 60 sec period of the metric 
> our periodic poll of the metric fell, we would observe a constant rate 
> somewhere in the range of 1 to 500 per second, most likely around the 250 
> mark. 
> Other metrics based off of the {{SampledStat}} type suffer from the same 
> issue e.g. the {{Count}} metric, given a constant rate of 1 per second, will 
> report a constant count somewhere between 30 and 60, rather than the correct 
> 60.
> This can be seen in the following test code:
> {code:java}
> public class MetricsTest {
> private MetricConfig metricsConfig;
> @Before
> public void setUp() throws Exception {
> metricsConfig = new MetricConfig();
> }
> private long t(final int bucket) {
> return metricsConfig.timeWindowMs() * bucket;
> }
> @Test
> public void testHowRateDropsMetrics() throws Exception {
> Rate rate = new Rate();
> metricsConfig.samples(2);
> metricsConfig.timeWindow(30, TimeUnit.SECONDS);
> // First sample window from t0 -> (t1 -1), with rate 999 per second:
> for (long time = t(0); time != t(1); time += 1000) {
> rate.record(metricsConfig, 999, time);
> }
> // Second sample window from t1 -> (t2 -1), with rate 1 per second:
> for (long time = t(1); time != t(2); time += 1000) {
> rate.record(metricsConfig, 1, time);
> }
> // Measure at bucket boundary, (though same issue exists all periodic 
> measurements)
> final double m1 = rate.measure(metricsConfig, t(2));// m1 = 1.0
> // Third sample window from t2 -> (t3 -1), with rate 999 per second:
> for (long time = t(2); time != t(3); time += 1000) {
> rate.record(metricsConfig, 999, time);
> }
> // Fourth sample window from t3 -> (t4 -1), with rate 1 per second:
> for (long time = t(3); time != t(4); time += 1000) {
> rate.record(metricsConfig, 1, time);
> }
> // Measure second pair of samples:
> final double m2 = rate.measure(metricsConfig, t(4));// m2 = 1.0
> assertEquals("Measurement of the rate over the first two samples", 
> 500.0, m1, 2.0);
> assertEquals("Measurement of the rate over the last two samples", 
> 500.0, m2, 2.0);
> }
> @Test
> public void testHowRateDropsMetricsWithRetardedObservations() throws 
> Exception {
> final long retardation = 1000;
> Rate rate = new Rate();
> metricsConfig.samples(2);
> metricsConfig.timeWindow(30, TimeUnit.SECONDS);

[jira] [Updated] (KAFKA-3456) bihtfbucbcceinvujekclljidcuf

2016-03-25 Thread Aditya Auradkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditya Auradkar updated KAFKA-3456:
---
Summary: bihtfbucbcceinvujekclljidcuf  (was: In-house KafkaMetric 
misreports metrics when periodically observed)

> bihtfbucbcceinvujekclljidcuf
> 
>
> Key: KAFKA-3456
> URL: https://issues.apache.org/jira/browse/KAFKA-3456
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer, core, producer 
>Affects Versions: 0.9.0.0, 0.9.0.1, 0.10.0.0
>Reporter: The Data Lorax
>Assignee: Neha Narkhede
>Priority: Minor
>

[jira] [Updated] (KAFKA-3456) In-house KafkaMetric misreports metrics when periodically observed

2016-03-25 Thread Aditya Auradkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditya Auradkar updated KAFKA-3456:
---
Summary: In-house KafkaMetric misreports metrics when periodically observed 
 (was: bihtfbucbcceinvujekclljidcuf)

> In-house KafkaMetric misreports metrics when periodically observed
> --
>
> Key: KAFKA-3456
> URL: https://issues.apache.org/jira/browse/KAFKA-3456
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer, core, producer 
>Affects Versions: 0.9.0.0, 0.9.0.1, 0.10.0.0
>Reporter: The Data Lorax
>Assignee: Neha Narkhede
>Priority: Minor
>

[GitHub] kafka pull request: KAFKA-2309; ISR shrink rate not updated on Lea...

2016-03-23 Thread auradkar
GitHub user auradkar reopened a pull request:

https://github.com/apache/kafka/pull/185

KAFKA-2309; ISR shrink rate not updated on LeaderAndIsr request with shrunk 
ISR

Currently, a LeaderAndIsrRequest does not mark the isrShrinkRate if the 
received ISR is smaller than the existing ISR. This can happen if one of the 
replicas is shut down.
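
Conceptually the change is small; the sketch below shows the idea with
hypothetical names standing in for the actual ReplicaManager and metrics
classes:

import java.util.Set;
import java.util.concurrent.atomic.AtomicLong;

public class IsrShrinkSketch {
    // Stand-in for the real isrShrinkRate meter.
    static final AtomicLong isrShrinkEvents = new AtomicLong();

    // Hypothetical hook: called when a broker applies a LeaderAndIsrRequest.
    static void onLeaderAndIsr(Set<Integer> currentIsr, Set<Integer> newIsr) {
        if (newIsr.size() < currentIsr.size()) {
            // Previously this path skipped the metric; a shrink arriving via
            // LeaderAndIsr (e.g. a replica was shut down) now counts too.
            isrShrinkEvents.incrementAndGet();
        }
    }

    public static void main(String[] args) {
        onLeaderAndIsr(Set.of(1, 2, 3), Set.of(1, 2));
        System.out.println("ISR shrink events: " + isrShrinkEvents.get()); // 1
    }
}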

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/auradkar/kafka K-2309

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/185.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #185


commit 090a78284478ba57c655fe7a931e8318641bceae
Author: Aditya Auradkar <aaurad...@linkedin.com>
Date:   2015-09-02T01:16:53Z

Fixing KAFKA-2309

commit 7d60319a99feadc0c13fd7639d39b6af786515e5
Author: Aditya Auradkar <aaurad...@linkedin.com>
Date:   2015-10-07T01:54:19Z

Addressing Joels comment

commit bd8dc02e28b94422735114275737f3efcf667a8b
Author: Aditya Auradkar <aaurad...@linkedin.com>
Date:   2015-10-07T02:00:21Z

addressing comments




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request: KAFKA-2309; ISR shrink rate not updated on Lea...

2016-03-23 Thread auradkar
Github user auradkar closed the pull request at:

https://github.com/apache/kafka/pull/185


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request: KAFKA-3427 - Broker should return correct vers...

2016-03-23 Thread auradkar
GitHub user auradkar opened a pull request:

https://github.com/apache/kafka/pull/1128

KAFKA-3427 - Broker should return correct version of FetchResponse on 
exception

Merging the fix from: https://issues.apache.org/jira/browse/KAFKA-3427
The original version of the code returned a response using V0 of the 
response protocol. This caused clients to break because they expected the 
throttle_time_ms field to be present.
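
The underlying pattern is to thread the request's API version through to the
error path instead of letting the response default to version 0. A minimal
sketch with illustrative types (not Kafka's actual FetchResponse):

public class VersionedErrorResponseSketch {
    static final class FetchResponse {
        final int correlationId;
        final int version;        // serialize the response per this version
        final int throttleTimeMs; // only on the wire for version >= 1

        FetchResponse(int correlationId, int version, int throttleTimeMs) {
            this.correlationId = correlationId;
            this.version = version;
            this.throttleTimeMs = throttleTimeMs;
        }
    }

    static FetchResponse handleError(int correlationId, int requestVersion) {
        // Echo the version from the request rather than defaulting to 0, so
        // the client can decode every field it expects (throttle_time_ms).
        return new FetchResponse(correlationId, requestVersion, 0);
    }

    public static void main(String[] args) {
        FetchResponse r = handleError(1, 1);
        System.out.println("version=" + r.version
                + " throttleTimeMs=" + r.throttleTimeMs);
    }
}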

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/auradkar/kafka k-34

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1128.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1128


commit e1e3cf7ce20e1f0b58e602b48d8b974a64f60cd8
Author: Aditya Auradkar <aaurad...@linkedin.com>
Date:   2016-03-24T00:15:56Z

KAFKA-3427: Broker should return correct version of fetch response on 
exception
Merging fix from trunk.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3427) broker can return incorrect version of fetch response when the broker hits an unknown exception

2016-03-20 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15203370#comment-15203370
 ] 

Aditya Auradkar commented on KAFKA-3427:


[~junrao] - Certainly, I can patch 0.9. 

> broker can return incorrect version of fetch response when the broker hits an 
> unknown exception
> ---
>
> Key: KAFKA-3427
> URL: https://issues.apache.org/jira/browse/KAFKA-3427
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.1, 0.10.0.0
>Reporter: Jun Rao
>Assignee: Jun Rao
>Priority: Blocker
> Fix For: 0.10.0.0
>
>
> In FetchResponse.handleError(), we generate FetchResponse like the following, 
> which always defaults to version 0 of the response. 
> FetchResponse(correlationId, fetchResponsePartitionData)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3310) fetch requests can trigger repeated NPE when quota is enabled

2016-03-01 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15174366#comment-15174366
 ] 

Aditya Auradkar commented on KAFKA-3310:


[~junrao] - can you take a look?

> fetch requests can trigger repeated NPE when quota is enabled
> -
>
> Key: KAFKA-3310
> URL: https://issues.apache.org/jira/browse/KAFKA-3310
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.1
>Reporter: Jun Rao
>
> We saw the following NPE when consumer quota is enabled. NPE is triggered on 
> every fetch request from the client.
> java.lang.NullPointerException
> at 
> kafka.server.ClientQuotaManager.recordAndMaybeThrottle(ClientQuotaManager.scala:122)
> at 
> kafka.server.KafkaApis.kafka$server$KafkaApis$$sendResponseCallback$3(KafkaApis.scala:419)
> at 
> kafka.server.KafkaApis$$anonfun$handleFetchRequest$1.apply(KafkaApis.scala:436)
> at 
> kafka.server.KafkaApis$$anonfun$handleFetchRequest$1.apply(KafkaApis.scala:436)
> at kafka.server.ReplicaManager.fetchMessages(ReplicaManager.scala:481)
> at kafka.server.KafkaApis.handleFetchRequest(KafkaApis.scala:431)
> at kafka.server.KafkaApis.handle(KafkaApis.scala:69)
> at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:60)
> at java.lang.Thread.run(Thread.java:745)
> One possible cause of this is the logic of removing inactive sensors. 
> Currently, in ClientQuotaManager, we create two sensors per clientId: a 
> throttleTimeSensor and a quotaSensor. Each sensor expires if it's not 
> actively updated for 1 hour. What can happen is that initially, the quota is 
> not exceeded. So, quotaSensor is being updated actively, but 
> throttleTimeSensor is not. At some point, throttleTimeSensor is removed by 
> the expiring thread. Now, we are in a situation that quotaSensor is 
> registered, but throttleTimeSensor is not. Later on, if the quota is 
> exceeded, we will hit the above NPE when trying to update throttleTimeSensor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-3310: Fix for NPEs observed when throttl...

2016-03-01 Thread auradkar
GitHub user auradkar opened a pull request:

https://github.com/apache/kafka/pull/989

KAFKA-3310: Fix for NPEs observed when throttling clients.

The fix ensures that the throttleTimeSensor is non-null before handing it 
off to record the metric value. We also record a throttle time of 0 on every 
request, so the sensor stays actively updated and doesn't have to be 
recreated after expiry.
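
A self-contained sketch of those two ideas follows; it is not the actual
ClientQuotaManager code, and the class and method names are stand-ins:

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class ThrottleSensorSketch {
    // Minimal stand-in for org.apache.kafka.common.metrics.Sensor.
    static class Sensor {
        volatile double lastRecorded;
        void record(double value) { lastRecorded = value; }
    }

    private final ConcurrentMap<String, Sensor> sensors = new ConcurrentHashMap<>();

    // Re-create the sensor on demand instead of holding a stale reference
    // that the idle-sensor expiry thread may already have removed (the NPE).
    Sensor getOrCreateThrottleTimeSensor(String clientId) {
        return sensors.computeIfAbsent(clientId, id -> new Sensor());
    }

    void recordAndMaybeThrottle(String clientId, double throttleTimeMs) {
        // Always record a value (0 when the client is not throttled) so the
        // sensor stays actively updated and is never reaped by expiry.
        getOrCreateThrottleTimeSensor(clientId).record(throttleTimeMs);
    }

    public static void main(String[] args) {
        ThrottleSensorSketch quotaManager = new ThrottleSensorSketch();
        quotaManager.recordAndMaybeThrottle("client-1", 0);  // unthrottled
        quotaManager.recordAndMaybeThrottle("client-1", 42); // throttled
    }
}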

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/auradkar/kafka KAFKA-3310

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/989.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #989


commit cd5007eb3c94ae2d1983cc6a4b9a9fe4e96ff1b1
Author: Aditya Auradkar <aaurad...@linkedin.com>
Date:   2016-03-01T20:18:59Z

KAFKA-3310: Fix for NPEs observed when throttling clients.

The fix ensures that the throttleTimeSensor is non-null before handing it 
off to record the metric value. We also record a throttle time of 0 on every 
request, so the sensor stays actively updated and doesn't have to be 
recreated after expiry.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3310) fetch requests can trigger repeated NPE when quota is enabled

2016-02-29 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173223#comment-15173223
 ] 

Aditya Auradkar commented on KAFKA-3310:


[~junrao] - Just making sure: you observe that the response is still delayed, 
right? The throttle time sensor is the last thing that is recorded, and the 
element has been added to the delay queue, so the fetchResponseCallback should 
fire after the throttle time. 

> fetch requests can trigger repeated NPE when quota is enabled
> -
>
> Key: KAFKA-3310
> URL: https://issues.apache.org/jira/browse/KAFKA-3310
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.1
>Reporter: Jun Rao
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3310) fetch requests can trigger repeated NPE when quota is enabled

2016-02-29 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173204#comment-15173204
 ] 

Aditya Auradkar commented on KAFKA-3310:


[~junrao] - Let me investigate this. If this is a problem, it should be easy to 
fix by recording 0 on the throttle time sensor every time. 

> fetch requests can trigger repeated NPE when quota is enabled
> -
>
> Key: KAFKA-3310
> URL: https://issues.apache.org/jira/browse/KAFKA-3310
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.1
>Reporter: Jun Rao
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

2016-02-23 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15159491#comment-15159491
 ] 

Aditya Auradkar commented on KAFKA-1215:


[~allenxwang] - Is this patch ready for review? I noticed you added several 
commits recently, but I'm not sure if you are done.

> Rack-Aware replica assignment option
> 
>
> Key: KAFKA-1215
> URL: https://issues.apache.org/jira/browse/KAFKA-1215
> Project: Kafka
>  Issue Type: Improvement
>  Components: replication
>Affects Versions: 0.8.0
>Reporter: Joris Van Remoortere
>Assignee: Allen Wang
> Fix For: 0.10.0.0
>
> Attachments: rack_aware_replica_assignment_v1.patch, 
> rack_aware_replica_assignment_v2.patch
>
>
> Adding a rack-id to kafka config. This rack-id can be used during replica 
> assignment by using the max-rack-replication argument in the admin scripts 
> (create topic, etc.). By default the original replication assignment 
> algorithm is used because max-rack-replication defaults to -1. 
> max-rack-replication > -1 is not honored if you are doing manual replica 
> assignment (preferred).
> If this looks good I can add some test cases specific to the rack-aware 
> assignment.
> I can also port this to trunk. We are currently running 0.8.0 in production 
> and need this, so I wrote the patch against that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSS] KIP-46: Self Healing

2016-02-16 Thread Aditya Auradkar
Thanks Neha. I'll discard this for now. We can pick it up once replica
throttling and the default policies are available and tested.

On Thu, Feb 11, 2016 at 5:45 PM, Neha Narkhede <n...@confluent.io> wrote:

> >
> > 1. Replica Throttling - I agree this is rather important to get done.
> > However, it may also be argued that this problem is orthogonal. We do not
> > have these protections currently yet we do run partition reassignment
> > fairly often. Having said that, I'm perfectly happy to tackle KIP-46
> after
> > this problem is solved. I understand it is actively being discussed in
> > KAFKA-1464.
>
>
> I think we are saying the same thing here. Replica throttling is required
> to be able to pull off any partition reassignment action. It removes the
> guesswork that comes from picking a batch size that is expressed in terms
> of partition count, which is an annoying hack.
>
>
> > 2. Pluggable policies - Can you elaborate on the need for pluggable
> > policies in the partition reassignment tool? Even if we make it pluggable
> > to begin with, this needs to ship with a default policy that makes sense
> > for most users. IMO, partition count is the most intuitive default and is
> > analogous to how we stripe partitions for new topics.
>
>
> Agree about the default. I was arguing for making it pluggable so we make
> it easy to test multiple policies. For instance, partition count is a
> decent one but I can imagine how one would want a policy that optimizes for
> balancing data sizes for instance.
>
>
> > 3. Even if the trigger were fully manual (as it is now), we could still
> > have the controller generate the assignment as per a configured policy
> i.e.
> > effectively the tool is built into Kafka itself. Following this approach
> to
> > begin with makes it easier to fully automate in the future since we will
> > only need to automate the trigger later.
>
>
> I would be much more comfortable adding the capability to move large
> amounts of data to the controller, after we are very sure that the default
> policy is well tested and the replica throttling works. If so, then it is
> just a matter of placing the trigger in the controller vs in the tool. But
> I'm skeptical of adding more things to the already messy controller,
> especially without being sure about how well it works.
>
> Thanks,
> Neha
>
> On Tue, Feb 9, 2016 at 12:53 PM, Aditya Auradkar <
> aaurad...@linkedin.com.invalid> wrote:
>
> > Hi Neha,
> >
> > Thanks for the detailed reply and apologies for my late response. I do
> have
> > a few comments.
> >
> > 1. Replica Throttling - I agree this is rather important to get done.
> > However, it may also be argued that this problem is orthogonal. We do not
> > have these protections currently yet we do run partition reassignment
> > fairly often. Having said that, I'm perfectly happy to tackle KIP-46
> after
> > this problem is solved. I understand it is actively being discussed in
> > KAFKA-1464.
> >
> > 2. Pluggable policies - Can you elaborate on the need for pluggable
> > policies in the partition reassignment tool? Even if we make it pluggable
> > to begin with, this needs to ship with a default policy that makes sense
> > for most users. IMO, partition count is the most intuitive default and is
> > analogous to how we stripe partitions for new topics.
> >
> > 3. Even if the trigger were fully manual (as it is now), we could still
> > have the controller generate the assignment as per a configured policy
> i.e.
> > effectively the tool is built into Kafka itself. Following this approach
> to
> > begin with makes it easier to fully automate in the future since we will
> > only need to automate the trigger later.
> >
> > Aditya
> >
> >
> >
> > On Wed, Feb 3, 2016 at 1:57 PM, Neha Narkhede <n...@confluent.io> wrote:
> >
> > > Adi,
> > >
> > > Thanks for the write-up. Here are my thoughts:
> > >
> > > I think you are suggesting a way of automating resurrecting a topic’s
> > > replication factor in the presence of a specific scenario: in the event
> > of
> > > permanent broker failures. I agree that the partition reassignment
> > > mechanism should be used to add replicas when they are lost to
> permanent
> > > broker failures. But I think the KIP probably chews off more than we
> can
> > > digest.
> > >
> > > Before we automate detection of permanent broker failures and have the
> > > controller mitigate through automatic data balancing, I’d like to poi

Re: [DISCUSS] KIP-46: Self Healing

2016-02-09 Thread Aditya Auradkar
> ...external script that
> uses a simple health check. (Part 1 of KIP-46).
>
> I agree that it will be great to *eventually* be able to fully automate
> both the trigger as well as the policies while also improving the
> mechanism. But I’m highly skeptical of big-bang approaches that go from a
> completely manual and cumbersome process to a fully automated one,
> especially when that involves large-scale data movement in a running
> cluster. Once we stabilize these changes and feel confident that they work,
> we can push the policy into the controller and have it automatically be
> triggered based on different events.
>
> Thanks,
> Neha
>
> On Tue, Feb 2, 2016 at 6:13 PM, Aditya Auradkar <
> aaurad...@linkedin.com.invalid> wrote:
>
> > Hey everyone,
> >
> > I just created a kip to discuss automated replica reassignment when we
> lose
> > a broker in the cluster.
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-46%3A+Self+Healing+Kafka
> >
> > Any feedback is welcome.
> >
> > Thanks,
> > Aditya
> >
>
>
>
> --
> Thanks,
> Neha
>


[DISCUSS] KIP-46: Self Healing

2016-02-02 Thread Aditya Auradkar
Hey everyone,

I just created a kip to discuss automated replica reassignment when we lose
a broker in the cluster.
https://cwiki.apache.org/confluence/display/KAFKA/KIP-46%3A+Self+Healing+Kafka

Any feedback is welcome.

Thanks,
Aditya


[jira] [Commented] (KAFKA-3088) 0.9.0.0 broker crash on receipt of produce request with empty client ID

2016-01-21 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15111924#comment-15111924
 ] 

Aditya Auradkar commented on KAFKA-3088:


I personally prefer preserving the old behavior, i.e. option (1). Everyone using 
an empty client-id receives a default quota shared by all such instances.

[~dspeterson] - Are you planning to submit a patch for this? If not, I can.
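
Option (1) amounts to normalizing a null or empty client-id to one shared
bucket before any sensors or MBeans are created, so the JMX MBean name is
never null. A minimal sketch; the bucket name here is a made-up placeholder,
not what any eventual patch used:

public class EmptyClientIdSketch {
    // All producers/consumers that send no client-id fall into this bucket
    // and therefore share one default quota (the pre-0.9 behavior).
    static final String EMPTY_CLIENT_ID_BUCKET = "<empty>"; // hypothetical

    static String quotaId(String clientId) {
        return (clientId == null || clientId.isEmpty())
                ? EMPTY_CLIENT_ID_BUCKET
                : clientId;
    }

    public static void main(String[] args) {
        System.out.println(quotaId(null));   // <empty>
        System.out.println(quotaId(""));     // <empty>
        System.out.println(quotaId("app1")); // app1
    }
}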

> 0.9.0.0 broker crash on receipt of produce request with empty client ID
> ---
>
> Key: KAFKA-3088
> URL: https://issues.apache.org/jira/browse/KAFKA-3088
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Affects Versions: 0.9.0.0
>Reporter: Dave Peterson
>Assignee: Jun Rao
>
> Sending a produce request with an empty client ID to a 0.9.0.0 broker causes 
> the broker to crash as shown below.  More details can be found in the 
> following email thread:
> http://mail-archives.apache.org/mod_mbox/kafka-users/201601.mbox/%3c5693ecd9.4050...@dspeterson.com%3e
>[2016-01-10 23:03:44,957] ERROR [KafkaApi-3] error when handling request 
> Name: ProducerRequest; Version: 0; CorrelationId: 1; ClientId: null; 
> RequiredAcks: 1; AckTimeoutMs: 1 ms; TopicAndPartition: [topic_1,3] -> 37 
> (kafka.server.KafkaApis)
>java.lang.NullPointerException
>   at 
> org.apache.kafka.common.metrics.JmxReporter.getMBeanName(JmxReporter.java:127)
>   at 
> org.apache.kafka.common.metrics.JmxReporter.addAttribute(JmxReporter.java:106)
>   at 
> org.apache.kafka.common.metrics.JmxReporter.metricChange(JmxReporter.java:76)
>   at 
> org.apache.kafka.common.metrics.Metrics.registerMetric(Metrics.java:288)
>   at org.apache.kafka.common.metrics.Sensor.add(Sensor.java:177)
>   at org.apache.kafka.common.metrics.Sensor.add(Sensor.java:162)
>   at 
> kafka.server.ClientQuotaManager.getOrCreateQuotaSensors(ClientQuotaManager.scala:209)
>   at 
> kafka.server.ClientQuotaManager.recordAndMaybeThrottle(ClientQuotaManager.scala:111)
>   at 
> kafka.server.KafkaApis.kafka$server$KafkaApis$$sendResponseCallback$2(KafkaApis.scala:353)
>   at 
> kafka.server.KafkaApis$$anonfun$handleProducerRequest$1.apply(KafkaApis.scala:371)
>   at 
> kafka.server.KafkaApis$$anonfun$handleProducerRequest$1.apply(KafkaApis.scala:371)
>   at 
> kafka.server.ReplicaManager.appendMessages(ReplicaManager.scala:348)
>   at kafka.server.KafkaApis.handleProducerRequest(KafkaApis.scala:366)
>   at kafka.server.KafkaApis.handle(KafkaApis.scala:68)
>   at 
> kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:60)
>   at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [VOTE] KIP-36 - Rack aware replica assignment

2016-01-19 Thread Aditya Auradkar
+1 (non-binding)

On Tue, Jan 19, 2016 at 8:28 AM, Grant Henke  wrote:

> +1 (non-binding)
>
> On Tue, Jan 19, 2016 at 7:46 AM, Ismael Juma  wrote:
>
> > +1 (non-binding)
> >
> > On Tue, Jan 19, 2016 at 6:12 AM, Guozhang Wang 
> wrote:
> >
> > > +1 (binding)
> > >
> > > On Mon, Jan 18, 2016 at 2:43 PM, Onur Karaman
> > >  > > > wrote:
> > >
> > > > +1 (non-binding)
> > > >
> > > > On Mon, Jan 18, 2016 at 2:32 PM, Jun Rao  wrote:
> > > >
> > > > > Allen,
> > > > >
> > > > > Thanks for updating the KIP wiki. +1
> > > > >
> > > > > Could other people take a look at the latest proposal and vote on
> it?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > > > On Mon, Jan 4, 2016 at 4:30 PM, Allen Wang 
> > > wrote:
> > > > >
> > > > > > I would like to call for a vote for KIP-36 - Rack aware replica
> > > > > assignment.
> > > > > >
> > > > > > The latest proposal is at
> > > > > >
> > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-36
> > > > > > +Rack+aware+replica+assignment
> > > > > >
> > > > > > Thanks,
> > > > > > Allen
> > > > > >
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > -- Guozhang
> > >
> >
>
>
>
> --
> Grant Henke
> Software Engineer | Cloudera
> gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
>


Re: [VOTE] KIP-32 Add CreateTime and LogAppendTime to Kafka message.

2016-01-08 Thread Aditya Auradkar
t; > > SimpleConsumer.
> > > > >
> > > > > My original question is actually whether we just bump up magic byte
> > in
> > > > > Message once to incorporate both the offset and the timestamp
> change.
> > > It
> > > > > seems that the answer is yes. Could you reflect that in the KIP?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > > >
> > > > > On Wed, Jan 6, 2016 at 7:01 AM, Becket Qin <becket@gmail.com>
> > > wrote:
> > > > >
> > > > > > Thanks a lot for the careful reading, Jun.
> > > > > > Please see inline replies.
> > > > > >
> > > > > >
> > > > > > > On Jan 6, 2016, at 3:24 AM, Jun Rao <j...@confluent.io> wrote:
> > > > > > >
> > > > > > > Jiangjie,
> > > > > > >
> > > > > > > Thanks for the updated KIP. Overall, a +1 on the proposal. A
> few
> > > > minor
> > > > > > > comments on the KIP.
> > > > > > >
> > > > > > > KIP-32:
> > > > > > > 50. 6.c says "The log rolling has to depend on the earliest
> > > > timestamp",
> > > > > > > which is inconsistent with KIP-33.
> > > > > > Corrected.
> > > > > > >
> > > > > > > 51. 8.b "If the time difference threshold is set to 0. The
> > > timestamp
> > > > in
> > > > > > the
> > > > > > > message is equivalent to LogAppendTime." If the time difference
> > is
> > > 0
> > > > > and
> > > > > > > CreateTime is used, all messages will likely be rejected in
> this
> > > > > > proposal.
> > > > > > > So, it's not equivalent to LogAppendTime.
> > > > > > Corrected.
> > > > > > >
> > > > > > > 52. Could you include the new value of magic byte in message
> > format
> > > > > > change?
> > > > > > > Also, do we have a single new message format that includes both
> > the
> > > > > > offset
> > > > > > > change (relative offset for inner messages) and the addition of
> > > > > > timestamp?
> > > > > > I am actually thinking about this when I am writing the patch.
> > > > > > The timestamp will be added to the o.a.k.common.record.Record and
> > > > > > Kafka.message.Message. The offset change is in
> > > > > > o.a.k.common.record.MemoryRecords and Kafka.message.MessageSet.
> To
> > > > avoid
> > > > > > unnecessary changes, my current patch did not merge them together
> > but
> > > > > > simply make sure the version of  Record(Message) and
> > > > > > MemoryRecords(MessageSet) matches.
> > > > > >
> > > > > > Currently new clients uses classes in o.a.k.common.record, and
> the
> > > > broker
> > > > > > and old clients uses classes in kafka.message.
> > > > > > I am thinking about doing the followings:
> > > > > > 1. Migrate broker to use o.a.k.common.record.
> > > > > > 2. Add message format V0 and V1 to
> o.a.k.common.protocol.Protocols.
> > > > > > Ideally we should be able to define all the wire protocols in
> > > > > > o.a.k.common.protocol.Protocol. So instead of having Record class
> > to
> > > > > parse
> > > > > > byte arrays by itself, we can use Schema to parse the records.
> > > > > >
> > > > > > Would that be better?
> > > > > > >
> > > > > > > 53. Could you document the changes in ProducerRequest V2 and
> > > > > FetchRequest
> > > > > > > V2 (and the responses)?
> > > > > > Done.
> > > > > > >
> > > > > > > 54. In migration phase 1, step 2, does internal ApiVersion mean
> > > > > > > inter.broker.protocol.version?
> > > > > > Yes.
> > > > > > >
> > > > > > > 55. In canary step 2.b, it says "It will only see
> > > > > > > ProduceRequest/FetchRequest V1 from other brokers and
> clients.".
> 

Re: [DISCUSS] KIP-36 - Rack aware replica assignment

2016-01-05 Thread Aditya Auradkar
> >> > >> I am busy with some time pressing issues for the last few days. I
> >> will
> >> > >> think about how the incomplete rack information will affect the
> >> balance
> >> > and
> >> > >> update the KIP by early next week.
> >> > >>
> >> > >> Thanks,
> >> > >> Allen
> >> > >>
> >> > >>
> >> > >> On Tue, Nov 3, 2015 at 9:03 AM, Neha Narkhede <n...@confluent.io>
> >> > wrote:
> >> > >>
> >> > >>> Few suggestions on improving the KIP
> >> > >>>
> >> > >>> *If some brokers have rack, and some do not, the algorithm will
> >> thrown
> >> > an
> >> > >>> > exception. This is to prevent incorrect assignment caused by
> user
> >> > >>> error.*
> >> > >>>
> >> > >>>
> >> > >>> In the KIP, can you clearly state the user-facing behavior when
> some
> >> > >>> brokers have rack information and some don't. Which actions and
> >> > requests
> >> > >>> will error out and how?
> >> > >>>
> >> > >>> *Even distribution of partition leadership among brokers*
> >> > >>>
> >> > >>>
> >> > >>> There is some information about arranging the sorted broker list
> >> > >>> interlaced
> >> > >>> with rack ids. Can you describe the changes to the current
> algorithm
> >> > in a
> >> > >>> little more detail? How does this interlacing work if only a
> subset
> >> of
> >> > >>> brokers have the rack id configured? Does this still work if
> uneven
> >> #
> >> > of
> >> > >>> brokers are assigned to each rack? It might work, I'm looking for
> >> more
> >> > >>> details on the changes, since it will affect the behavior seen by
> >> the
> >> > >>> user
> >> > >>> - imbalance on either the leaders or data or both.
> >> > >>>
> >> > >>> On Mon, Nov 2, 2015 at 6:39 PM, Aditya Auradkar <
> >> > aaurad...@linkedin.com>
> >> > >>> wrote:
> >> > >>>
> >> > >>> > I think this sounds reasonable. Anyone else have comments?
> >> > >>> >
> >> > >>> > Aditya
> >> > >>> >
> >> > >>> > On Tue, Oct 27, 2015 at 5:23 PM, Allen Wang <
> allenxw...@gmail.com
> >> >
> >> > >>> wrote:
> >> > >>> >
> >> > >>> > > During the discussion in the hangout, it was mentioned that it
> >> > would
> >> > >>> be
> >> > >>> > > desirable that consumers know the rack information of the
> >> brokers
> >> > so
> >> > >>> that
> >> > >>> > > they can consume from the broker in the same rack to reduce
> >> > latency.
> >> > >>> As I
> >> > >>> > > understand this will only be beneficial if consumer can
> consume
> >> > from
> >> > >>> any
> >> > >>> > > broker in ISR, which is not possible now.
> >> > >>> > >
> >> > >>> > > I suggest we skip the change to TMR. Once the change is made
> to
> >> > >>> consumer
> >> > >>> > to
> >> > >>> > > be able to consume from any broker in ISR, the rack
> information
> >> can
> >> > >>> be
> >> > >>> > > added to TMR.
> >> > >>> > >
> >> > >>> > > Another thing I want to confirm is  command line behavior. I
> >> think
> >> > >>> the
> >> > >>> > > desirable default behavior is to fail fast on command line for
> >> > >>> incomplete
> >> > >>> > > rack mapping. The error message can include further
> instruction
> >> > that
> >> > >>> > tells
> >> > >>> > > the user to add an extra argument (like
> >> "--allow-partial-rackinfo")
> >> >

Re: [VOTE] KIP-32 Add CreateTime and LogAppendTime to Kafka message.

2016-01-05 Thread Aditya Auradkar
+1. Great work Becket.

Aditya

On Tue, Jan 5, 2016 at 11:47 AM, Neha Narkhede <n...@confluent.io> wrote:

> +1 on the KIP. Thanks for the hard work, Becket!
>
> On Tue, Jan 5, 2016 at 11:24 AM, Jun Rao <j...@confluent.io> wrote:
>
> > Jiangjie,
> >
> > Thanks for the updated KIP. Overall, a +1 on the proposal. A few minor
> > comments on the KIP.
> >
> > KIP-32:
> > 50. 6.c says "The log rolling has to depend on the earliest timestamp",
> > which is inconsistent with KIP-33.
> >
> > 51. 8.b "If the time difference threshold is set to 0. The timestamp in
> the
> > message is equivalent to LogAppendTime." If the time difference is 0 and
> > CreateTime is used, all messages will likely be rejected in this
> proposal.
> > So, it's not equivalent to LogAppendTime.
> >
> > 52. Could you include the new value of magic byte in message format
> change?
> > Also, do we have a single new message format that includes both the
> offset
> > change (relative offset for inner messages) and the addition of
> timestamp?
> >
> > 53. Could you document the changes in ProducerRequest V2 and FetchRequest
> > V2 (and the responses)?
> >
> > 54. In migration phase 1, step 2, does internal ApiVersion mean
> > inter.broker.protocol.version?
> >
> > 55. In canary step 2.b, it says "It will only see
> > ProduceRequest/FetchRequest V1 from other brokers and clients.". But in
> > phase 2, a broker will receive FetchRequest V2 from other brokers.
> >
> >
> > KIP-33:
> > 60. The KIP still says maintaining index at "at minute granularity" even
> > though the index interval is configurable now.
> >
> > 61. In this design, it's possible for a log segment to have an empty time
> > index. In the worse case, we may have to scan more than the active
> segment
> > to recover the latest timestamp.
> >
> > Thanks,
> >
> > Jun
> >
> > On Mon, Jan 4, 2016 at 11:37 AM, Aditya Auradkar <
> > aaurad...@linkedin.com.invalid> wrote:
> >
> > > Hey Becket/Anna -
> > >
> > > I have a few comments about the KIP.
> > >
> > > 1. (Minor) Can we rename the KIP? It's currently "Add CreateTime and
> > > LogAppendTime etc..". This is actually the title of the now rejected
> > Option
> > > 1.
> > > 2. (Minor) Can we rename the proposed option? It isn't really "option
> 4"
> > > anymore.
> > > 3. I'm not clear on what exactly happens to compressed messages
> > > when message.timestamp.type=LogAppendTime? Does every batch get
> > > recompressed because the inner message gets rewritten with the server
> > > timestamp? Or does the message set on disk have the timestamp set to
> -1.
> > In
> > > that case, what do we use as timestamp for the message?
> > > 4. Do message.timestamp.type and max.message.time.difference.ms need
> to
> > be
> > > per-topic configs? It seems that this is really a client config i.e. a
> > > client is the source of timestamps not a topic. It could also be a
> > > broker-level config to keep things simple.
> > > 5. The "Proposed Changes" section in the KIP tries to build a
> time-based
> > > index for query but that is a separate proposal (KIP-33). Can we more
> > > crisply identify what exactly will change when this KIP (and 31) is
> > > implemented? It isn't super clear to me at this point.
> > >
> > > Aside from that, I think the "Rejected Alternatives" section of the KIP
> > is
> > > excellent. Very good insight into what options were discussed and
> > rejected.
> > >
> > > Aditya
> > >
> > > On Mon, Dec 28, 2015 at 3:57 PM, Becket Qin <becket@gmail.com>
> > wrote:
> > >
> > > > Thanks Guozhang, Gwen and Neha for the comments. Sorry for the late reply
> > > > because I only have occasional gmail access from my phone...
> > > >
> > > > I just updated the wiki for KIP-32.
> > > >
> > > > Gwen,
> > > >
> > > > Yes, the migration plan is what you described.
> > > >
> > > > I agree with your comments on the version.
> > > > I changed message.format.version to use the release version.
> > > > I did not change the internal version, we can discuss this in a
> > separate
> > > > thread.
> > > >
> > > > Thanks,
> > > >
> >

Re: [VOTE] KIP-32 Add CreateTime and LogAppendTime to Kafka message.

2016-01-04 Thread Aditya Auradkar
Hey Becket/Anna -

I have a few comments about the KIP.

1. (Minor) Can we rename the KIP? It's currently "Add CreateTime and
LogAppendTime etc..". This is actually the title of the now rejected Option
1.
2. (Minor) Can we rename the proposed option? It isn't really "option 4"
anymore.
3. I'm not clear on what exactly happens to compressed messages
when message.timestamp.type=LogAppendTime? Does every batch get
recompressed because the inner message gets rewritten with the server
timestamp? Or does the message set on disk have the timestamp set to -1. In
that case, what do we use as timestamp for the message?
4. Do message.timestamp.type and max.message.time.difference.ms need to be
per-topic configs? It seems that this is really a client config i.e. a
client is the source of timestamps not a topic. It could also be a
broker-level config to keep things simple.
5. The "Proposed Changes" section in the KIP tries to build a time-based
index for query but that is a separate proposal (KIP-33). Can we more
crisply identify what exactly will change when this KIP (and 31) is
implemented? It isn't super clear to me at this point.

Aside from that, I think the "Rejected Alternatives" section of the KIP is
excellent. Very good insight into what options were discussed and rejected.

Aditya

On Mon, Dec 28, 2015 at 3:57 PM, Becket Qin  wrote:

> Thanks Guozhang, Gwen and Neha for the comments. Sorry for the late reply
> because I only have occasional gmail access from my phone...
>
> I just updated the wiki for KIP-32.
>
> Gwen,
>
> Yes, the migration plan is what you described.
>
> I agree with your comments on the version.
> I changed message.format.version to use the release version.
> I did not change the internal version, we can discuss this in a separate
> thread.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
>
>
> > On Dec 24, 2015, at 5:38 AM, Guozhang Wang  wrote:
> >
> > Also I agree with Gwen that such changes may be worth a 0.10 release or even
> > 1.0, having it in 0.9.1 would be quite confusing to users.
> >
> > Guozhang
> >
> >> On Wed, Dec 23, 2015 at 1:36 PM, Guozhang Wang 
> wrote:
> >>
> >> Becket,
> >>
> >> Please let us know once you have updated the wiki page regarding the
> >> migration plan. Thanks!
> >>
> >> Guozhang
> >>
> >>> On Wed, Dec 23, 2015 at 11:52 AM, Gwen Shapira 
> wrote:
> >>>
> >>> Thanks Becket, Anne and Neha for responding to my concern.
> >>>
> >>> I had an offline discussion with Anne where she helped me understand
> the
> >>> migration process. It isn't as bad as it looks in the KIP :)
> >>>
> >>> If I understand it correctly, the process (for users) will be:
> >>>
> >>> 1. Prepare for upgrade (set format.version = 0, ApiVersion = 0.9.0)
> >>> 2. Rolling upgrade of brokers
> >>> 3. Bump ApiVersion to 0.9.0-1, so fetch requests between brokers will
> use
> >>> the new protocol
> >>> 4. Start upgrading clients
> >>> 5. When "enough" clients are upgraded, bump format.version to 1
> (rolling).
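
For illustration, a rough sketch of the staged broker settings behind the
steps above. Property names follow this thread (message.format.version,
inter.broker.protocol.version), but the exact names and values (including
whether "0.9.0-1" survives as a version string) were still under discussion,
so treat these purely as placeholders:

    # Step 1: prepare for upgrade (old format, old inter-broker protocol)
    inter.broker.protocol.version=0.9.0
    message.format.version=0

    # Step 3: all brokers on new code; switch the inter-broker protocol
    inter.broker.protocol.version=0.9.0-1

    # Step 5: once "enough" clients are upgraded, switch the message format
    message.format.version=1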
> >>>
> >>> Becket, can you confirm?
> >>>
> >>> Assuming this is the process, I'm +1 on the change.
> >>>
> >>> Reminder to coders and reviewers that pull-requests with user-facing
> >>> changes should include documentation changes as well as code changes.
> >>> And a polite request to try to be helpful to users on when to use
> >>> create-time and when to use log-append-time as configuration - this is
> not
> >>> a trivial decision.
> >>>
> >>> A separate point I'm going to raise in a different thread is that we
> need
> >>> to streamline our versions a bit:
> >>> 1. I'm afraid that 0.9.0-1 will be confusing to users who care about
> >>> released versions (what if we forget to change it before the release?
> Is
> >>> it
> >>> meaningful enough to someone running off trunk?), we need to come up
> with
> >>> something that will work for both LinkedIn and everyone else.
> >>> 2. ApiVersion has real version numbers. message.format.version has
> >>> sequence
> >>> numbers. This makes us look pretty silly :)
> >>>
> >>> My version concerns can be addressed separately and should not hold
> back
> >>> this KIP.
> >>>
> >>> Gwen
> >>>
> >>>
> >>>
> >>> On Tue, Dec 22, 2015 at 11:01 PM, Becket Qin 
> >>> wrote:
> >>>
>  Hi Anna,
> 
>  Thanks for initiating the voting process. I did not start the voting
>  process because there were still some ongoing discussion with Jun
> about
> >>> the
>  timestamp regarding compressed messages. That is why the wiki page
> >>> hasn't
>  reflected the latest conversation as Guozhang pointed out.
> 
>  Like Neha said I think we have reached general agreement on this KIP.
> So
>  it is probably fine to start the KIP voting. At least we draw more
>  attention to the KIP even if there are some new discussion to bring
> up.
> 
>  Regarding the upgrade plan, given we decided to implement KIP-31 and
>  KIP-32 in the same patch to 

[jira] [Commented] (KAFKA-2966) 0.9.0 docs missing upgrade notes regarding replica lag

2015-12-09 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15049357#comment-15049357
 ] 

Aditya Auradkar commented on KAFKA-2966:


I'll work on it since I made those changes.

> 0.9.0 docs missing upgrade notes regarding replica lag
> --
>
> Key: KAFKA-2966
> URL: https://issues.apache.org/jira/browse/KAFKA-2966
> Project: Kafka
>  Issue Type: Bug
>Reporter: Gwen Shapira
>    Assignee: Aditya Auradkar
>
> We should document that:
> * replica.lag.max.messages is gone
> * replica.lag.time.max.ms has a new meaning
> In the upgrade section. People can get caught by surprise.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (KAFKA-2966) 0.9.0 docs missing upgrade notes regarding replica lag

2015-12-09 Thread Aditya Auradkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditya Auradkar reassigned KAFKA-2966:
--

Assignee: Aditya Auradkar

> 0.9.0 docs missing upgrade notes regarding replica lag
> --
>
> Key: KAFKA-2966
> URL: https://issues.apache.org/jira/browse/KAFKA-2966
> Project: Kafka
>  Issue Type: Bug
>Reporter: Gwen Shapira
>    Assignee: Aditya Auradkar
>
> We should document that:
> * replica.lag.max.messages is gone
> * replica.lag.time.max.ms has a new meaning
> In the upgrade section. People can get caught by surprise.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [ANNOUNCE] New Kafka Committer Ewen Cheslack-Postava

2015-12-08 Thread Aditya Auradkar
Congrats Ewen!

On Tue, Dec 8, 2015 at 11:51 AM, Guozhang Wang  wrote:

> Congrats Ewen! Welcome onboard.
>
> Guozhang
>
> On Tue, Dec 8, 2015 at 11:42 AM, Liquan Pei  wrote:
>
> > Congrats, Ewen!
> >
> > On Tue, Dec 8, 2015 at 11:37 AM, Neha Narkhede 
> wrote:
> >
> > > I am pleased to announce that the Apache Kafka PMC has voted to
> > > invite Ewen Cheslack-Postava as a committer and Ewen has accepted.
> > >
> > > Ewen is an active member of the community and has contributed and
> > reviewed
> > > numerous patches to Kafka. His most significant contribution is Kafka
> > > Connect which was released a few days ago as part of 0.9.
> > >
> > > Please join me on welcoming and congratulating Ewen.
> > >
> > > Ewen, we look forward to your continued contributions to the Kafka
> > > community!
> > >
> > > --
> > > Thanks,
> > > Neha
> > >
> >
> >
> >
> > --
> > Liquan Pei
> > Department of Physics
> > University of Massachusetts Amherst
> >
>
>
>
> --
> -- Guozhang
>


[jira] [Commented] (KAFKA-2310) Add config to prevent broker becoming controller

2015-12-02 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15036359#comment-15036359
 ] 

Aditya Auradkar commented on KAFKA-2310:


[~abiletskyi] - I see you submitted a pull request for this recently. 
https://github.com/apache/kafka/pull/614/files

Can you elaborate on the reasoning behind this change a bit more? I actually 
think we need a KIP to discuss this at least (given that it adds a new 
config). I'm not really sure preventing a broker from becoming a controller 
solves the underlying problem of a broker being overloaded.

> Add config to prevent broker becoming controller
> 
>
> Key: KAFKA-2310
> URL: https://issues.apache.org/jira/browse/KAFKA-2310
> Project: Kafka
>  Issue Type: Bug
>Reporter: Andrii Biletskyi
>Assignee: Andrii Biletskyi
> Attachments: KAFKA-2310.patch, KAFKA-2310_0.8.1.patch, 
> KAFKA-2310_0.8.2.patch
>
>
> The goal is to be able to specify which cluster brokers can serve as a 
> controller and which cannot. This way it will be possible to "reserve" 
> particular, not overloaded with partitions and other operations, broker as 
> controller.
> Proposed to add config _controller.eligibility_ defaulted to true (for 
> backward compatibility, since now any broker can become a controller)
> Patch will be available for trunk, 0.8.2 and 0.8.1



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSS] KIP-36 - Rack aware replica assignment

2015-11-02 Thread Aditya Auradkar
I think this sounds reasonable. Anyone else have comments?

Aditya

On Tue, Oct 27, 2015 at 5:23 PM, Allen Wang <allenxw...@gmail.com> wrote:

> During the discussion in the hangout, it was mentioned that it would be
> desirable that consumers know the rack information of the brokers so that
> they can consume from the broker in the same rack to reduce latency. As I
> understand this will only be beneficial if consumer can consume from any
> broker in ISR, which is not possible now.
>
> I suggest we skip the change to TMR. Once the change is made to the consumer to
> be able to consume from any broker in ISR, the rack information can be
> added to TMR.
>
> Another thing I want to confirm is the command line behavior. I think the
> desirable default behavior is to fail fast on command line for incomplete
> rack mapping. The error message can include further instruction that tells
> the user to add an extra argument (like "--allow-partial-rackinfo") to
> suppress the error and do an imperfect rack aware assignment. If the
> default behavior is to allow incomplete mapping, the error can still be
> easily missed.
>
> The affected command line tools are TopicCommand and
> ReassignPartitionsCommand.
>
> Thanks,
> Allen
>
>
>
>
>
> On Mon, Oct 26, 2015 at 12:55 PM, Aditya Auradkar <aaurad...@linkedin.com>
> wrote:
>
> > Hi Allen,
> >
> > For TopicMetadataResponse to understand version, you can bump up the
> > request version itself. Based on the version of the request, the response
> > can be appropriately serialized. It shouldn't be a huge change. For
> > example: We went through something similar for ProduceRequest recently (
> > https://reviews.apache.org/r/33378/)
> > I guess the reason protocol information is not included in the TMR is
> > because the topic itself is independent of any particular protocol (SSL
> vs
> > Plaintext). Having said that, I'm not sure we even need rack information
> in
> > TMR. What use case were you thinking of initially?
> >
> > For 1 - I'd be fine with adding an option to the command line tools that
> > check rack assignment. For e.g. "--strict-assignment" or something
> similar.
> >
> > Aditya
> >
> > On Thu, Oct 22, 2015 at 6:44 PM, Allen Wang <allenxw...@gmail.com>
> wrote:
> >
> > > For 2 and 3, I have updated the KIP. Please take a look. One thing I
> have
> > > changed is removing the proposal to add rack to TopicMetadataResponse.
> > The
> > > reason is that unlike UpdateMetadataRequest, TopicMetadataResponse does
> > not
> > > understand version. I don't see a way to include rack without breaking
> > old
> > > version of clients. That's probably why secure protocol is not included
> > in
> > > the TopicMetadataResponse either. I think it will be a much bigger
> change
> > > to include rack in TopicMetadataResponse.
> > >
> > > For 1, my concern is that doing rack aware assignment without complete
> > > broker to rack mapping will result in assignment that is not rack aware
> > and
> > > fail to provide fault tolerance in the event of rack outage. This kind
> of
> > > problem will be difficult to surface. And the cost of this problem is
> > high:
> > > you have to do partition reassignment if you are lucky to spot the
> > problem
> > > early on or face the consequence of data loss during real rack outage.
> > >
> > > I do see the concern of fail-fast as it might also cause data loss if
> > > producer is not able to produce the message due to topic creation failure.
> > Is
> > > it feasible to treat dynamic topic creation and command tools
> > differently?
> > > We allow dynamic topic creation with incomplete broker-rack mapping and
> > > fail fast in command line. Another option is to let user determine the
> > > behavior for command line. For example, by default fail fast in command
> > > line but allow incomplete broker-rack mapping if another switch is
> > > provided.
> > >
> > >
> > >
> > >
> > > On Tue, Oct 20, 2015 at 10:05 AM, Aditya Auradkar <
> > > aaurad...@linkedin.com.invalid> wrote:
> > >
> > > > Hey Allen,
> > > >
> > > > 1. If we choose fail fast topic creation, we will have topic creation
> > > > failures while upgrading the cluster. I really doubt we want this
> > > behavior.
> > > > Ideally, this should be invisible to clients of a cluster. Currently,
> > > each
> > > > broker is effecti

[jira] [Commented] (KAFKA-2502) Quotas documentation for 0.8.3

2015-10-29 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14980712#comment-14980712
 ] 

Aditya Auradkar commented on KAFKA-2502:


[~gwenshap] - Published a patch. Please take a look.

> Quotas documentation for 0.8.3
> --
>
> Key: KAFKA-2502
> URL: https://issues.apache.org/jira/browse/KAFKA-2502
> Project: Kafka
>  Issue Type: Task
>    Reporter: Aditya Auradkar
>    Assignee: Aditya Auradkar
>Priority: Blocker
>  Labels: quotas
> Fix For: 0.9.0.0
>
>
> Complete quotas documentation
> Also, 
> https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol
>  needs to be updated with protocol changes introduced in KAFKA-2136



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-2502 - Documentation for quotas

2015-10-29 Thread auradkar
GitHub user auradkar opened a pull request:

https://github.com/apache/kafka/pull/381

KAFKA-2502 - Documentation for quotas

Followed the approach specified here: 
https://issues.apache.org/jira/browse/KAFKA-2502
I also made a minor fix to ConfigCommand to expose the right options on 
add-config.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/auradkar/kafka K-2502

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/381.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #381


commit 8cafbe60108b5bfb2dd869f5e6dd138f7ae1cc60
Author: Aditya Auradkar <aaurad...@linkedin.com>
Date:   2015-10-29T00:57:34Z

Adding documentation for quotas

commit 92fd19351b965fad553e3b185a639dbf4f869949
Author: Aditya Auradkar <aaurad...@linkedin.com>
Date:   2015-10-29T01:20:45Z

Added tab to ConfigCommand

commit 48118e932a0ade2a89617767712a9224df2d6a66
Author: Aditya Auradkar <aaurad...@linkedin.com>
Date:   2015-10-29T02:28:47Z

Added design section for quotas

commit 8f5ecb9591fa7a4f6f7311012b99a755a8862cd0
Author: Aditya Auradkar <aaurad...@linkedin.com>
Date:   2015-10-29T16:11:06Z

Minor corrections




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-2502) Quotas documentation for 0.8.3

2015-10-28 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979155#comment-14979155
 ] 

Aditya Auradkar commented on KAFKA-2502:


[~gwenshap][~ijuma] - Sorry for the delay.. been dealing with some internal 
stuff. I'll submit something by tomorrow.

> Quotas documentation for 0.8.3
> --
>
> Key: KAFKA-2502
> URL: https://issues.apache.org/jira/browse/KAFKA-2502
> Project: Kafka
>  Issue Type: Task
>    Reporter: Aditya Auradkar
>    Assignee: Aditya Auradkar
>Priority: Blocker
>  Labels: quotas
> Fix For: 0.9.0.0
>
>
> Complete quotas documentation
> Also, 
> https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol
>  needs to be updated with protocol changes introduced in KAFKA-2136



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-2663, KAFKA-2664 - [Minor] Bugfixes

2015-10-27 Thread auradkar
GitHub user auradkar opened a pull request:

https://github.com/apache/kafka/pull/369

KAFKA-2663, KAFKA-2664 - [Minor] Bugfixes

This has 2 fixes:
KAFKA-2664 - This patch changes the underlying map implementation of 
Metrics.java to a ConcurrentHashMap. Using a CopyOnWriteMap caused new metrics 
creation to get extremely slow when the existing corpus of metrics is large. 
Using a ConcurrentHashMap seems to speed up metric creation time significantly

KAFKA-2663 - Splitting out the throttleTime from the remote time. On 
throttled requests, the remote time went up artificially.
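
For context on the KAFKA-2664 change, a standalone sketch (simplified
stand-in classes, not the actual kafka.utils.CopyOnWriteMap or Metrics code)
of why copy-on-write registration slows down as the number of pre-existing
metrics grows:

    import java.util.HashMap;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class MetricRegistrationDemo {
        // Stand-in for a copy-on-write map: every put copies the whole
        // underlying map, so the i-th insert costs O(i).
        private static volatile Map<String, Object> cow = new HashMap<>();

        private static synchronized void cowPut(String key, Object value) {
            Map<String, Object> next = new HashMap<>(cow); // full copy per insert
            next.put(key, value);
            cow = next;
        }

        public static void main(String[] args) {
            int n = 20_000; // think: 20k distinct client-ids registering metrics
            long t0 = System.nanoTime();
            for (int i = 0; i < n; i++) cowPut("client-" + i, new Object());
            long cowMs = (System.nanoTime() - t0) / 1_000_000;

            Map<String, Object> chm = new ConcurrentHashMap<>();
            t0 = System.nanoTime();
            for (int i = 0; i < n; i++) chm.put("client-" + i, new Object());
            long chmMs = (System.nanoTime() - t0) / 1_000_000;

            System.out.println("copy-on-write puts: " + cowMs
                    + " ms, ConcurrentHashMap puts: " + chmMs + " ms");
        }
    }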

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/auradkar/kafka K-2664

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/369.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #369


commit 2dc50c39bb9ea2c29d4d7663cacc145bf4bcd758
Author: Aditya Auradkar <aaurad...@linkedin.com>
Date:   2015-10-27T16:29:29Z

Fix for KAFKA-2664, KAFKA-2663

commit f3abc741312a33fc2aba011fbc179519749af439
Author: Aditya Auradkar <aaurad...@linkedin.com>
Date:   2015-10-27T17:06:47Z

revert gradle changes




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (KAFKA-2699) Add test to validate times in RequestMetrics

2015-10-27 Thread Aditya Auradkar (JIRA)
Aditya Auradkar created KAFKA-2699:
--

 Summary: Add test to validate times in RequestMetrics
 Key: KAFKA-2699
 URL: https://issues.apache.org/jira/browse/KAFKA-2699
 Project: Kafka
  Issue Type: Test
Reporter: Aditya Auradkar
Assignee: Aditya Auradkar


No tests exist to validate the reported times in RequestMetrics. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSS] KIP-36 - Rack aware replica assignment

2015-10-26 Thread Aditya Auradkar
Hi Allen,

For TopicMetadataResponse to understand version, you can bump up the
request version itself. Based on the version of the request, the response
can be appropriately serialized. It shouldn't be a huge change. For
example: We went through something similar for ProduceRequest recently (
https://reviews.apache.org/r/33378/)
I guess the reason protocol information is not included in the TMR is
because the topic itself is independent of any particular protocol (SSL vs
Plaintext). Having said that, I'm not sure we even need rack information in
TMR. What use case were you thinking of initially?
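
To make the version-bump approach above concrete, a minimal sketch
(hypothetical field layout, not the actual Kafka wire format) of serializing
a broker entry according to the version of the request that came in, so old
clients keep receiving the old layout:

    import java.nio.ByteBuffer;
    import java.nio.charset.StandardCharsets;

    public class BrokerEntry {
        final int nodeId;
        final String host;
        final int port;
        final String rack; // written only for v1+ requests

        BrokerEntry(int nodeId, String host, int port, String rack) {
            this.nodeId = nodeId;
            this.host = host;
            this.port = port;
            this.rack = rack;
        }

        // Serialize based on the version of the request we received.
        void writeTo(ByteBuffer buf, short requestVersion) {
            buf.putInt(nodeId);
            writeShortString(buf, host);
            buf.putInt(port);
            if (requestVersion >= 1)
                writeShortString(buf, rack);
        }

        private static void writeShortString(ByteBuffer buf, String s) {
            byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
            buf.putShort((short) bytes.length);
            buf.put(bytes);
        }
    }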

For 1 - I'd be fine with adding an option to the command line tools that
check rack assignment. For e.g. "--strict-assignment" or something similar.

Aditya

On Thu, Oct 22, 2015 at 6:44 PM, Allen Wang <allenxw...@gmail.com> wrote:

> For 2 and 3, I have updated the KIP. Please take a look. One thing I have
> changed is removing the proposal to add rack to TopicMetadataResponse. The
> reason is that unlike UpdateMetadataRequest, TopicMetadataResponse does not
> understand version. I don't see a way to include rack without breaking old
> version of clients. That's probably why secure protocol is not included in
> the TopicMetadataResponse either. I think it will be a much bigger change
> to include rack in TopicMetadataResponse.
>
> For 1, my concern is that doing rack aware assignment without complete
> broker to rack mapping will result in assignment that is not rack aware and
> fail to provide fault tolerance in the event of rack outage. This kind of
> problem will be difficult to surface. And the cost of this problem is high:
> you have to do partition reassignment if you are lucky to spot the problem
> early on or face the consequence of data loss during real rack outage.
>
> I do see the concern of fail-fast as it might also cause data loss if
> producer is not able to produce the message due to topic creation failure. Is
> it feasible to treat dynamic topic creation and command tools differently?
> We allow dynamic topic creation with incomplete broker-rack mapping and
> fail fast in command line. Another option is to let user determine the
> behavior for command line. For example, by default fail fast in command
> line but allow incomplete broker-rack mapping if another switch is
> provided.
>
>
>
>
> On Tue, Oct 20, 2015 at 10:05 AM, Aditya Auradkar <
> aaurad...@linkedin.com.invalid> wrote:
>
> > Hey Allen,
> >
> > 1. If we choose fail fast topic creation, we will have topic creation
> > failures while upgrading the cluster. I really doubt we want this
> behavior.
> > Ideally, this should be invisible to clients of a cluster. Currently,
> each
> > broker is effectively its own rack. So we probably can use the rack
> > information whenever possible but not make it a hard requirement. To
> extend
> > Gwen's example, one badly configured broker should not degrade topic
> > creation for the entire cluster.
> >
> > 2. Upgrade scenario - Can you add a section on the upgrade piece to
> confirm
> > that old clients will not see errors? I believe
> ZookeeperConsumerConnector
> > reads the Broker objects from ZK. I wanted to confirm that this will not
> > cause any problems.
> >
> > 3. Could you elaborate your proposed changes to the UpdateMetadataRequest
> > in the "Public Interfaces" section? Personally, I find this format easy
> to
> > read in terms of wire protocol changes:
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations#KIP-4-Commandlineandcentralizedadministrativeoperations-CreateTopicRequest
> >
> > Aditya
> >
> > On Fri, Oct 16, 2015 at 3:45 PM, Allen Wang <allenxw...@gmail.com>
> wrote:
> >
> > > KIP is updated include rack as an optional property for broker. Please
> > take
> > > a look and let me know if more details are needed.
> > >
> > > For the case where some brokers have rack and some do not, the current
> > KIP
> > > uses the fail-fast behavior. If there are concerns, we can further
> > discuss
> > > this in the email thread or next hangout.
> > >
> > >
> > >
> > > On Thu, Oct 15, 2015 at 10:42 AM, Allen Wang <allenxw...@gmail.com>
> > wrote:
> > >
> > > > That's a good question. I can think of three actions if the rack
> > > > information is incomplete:
> > > >
> > > > 1. Treat the node without rack as if it is on its unique rack
> > > > 2. Disregard all rack information and fallback to current algorithm
> > > > 3. Fail-fast
> >> >

Re: [VOTE] KIP-38: ZooKeeper authentication

2015-10-21 Thread Aditya Auradkar
+1 (non-binding).

On Wed, Oct 21, 2015 at 3:57 PM, Ismael Juma  wrote:

> +1 (non-binding)
>
> On Wed, Oct 21, 2015 at 4:17 PM, Flavio Junqueira  wrote:
>
> > Thanks everyone for the feedback so far. At this point, I'd like to start
> > a vote for KIP-38.
> >
> > Summary: Add support for ZooKeeper authentication
> > KIP page:
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-38%3A+ZooKeeper+Authentication
> > <
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-38:+ZooKeeper+Authentication
> > >
> >
> > Thanks,
> > -Flavio
>


[jira] [Assigned] (KAFKA-2664) Adding a new metric with several pre-existing metrics is very expensive

2015-10-20 Thread Aditya Auradkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditya Auradkar reassigned KAFKA-2664:
--

Assignee: Aditya Auradkar  (was: Onur Karaman)

> Adding a new metric with several pre-existing metrics is very expensive
> ---
>
> Key: KAFKA-2664
> URL: https://issues.apache.org/jira/browse/KAFKA-2664
> Project: Kafka
>  Issue Type: Bug
>Reporter: Joel Koshy
>    Assignee: Aditya Auradkar
> Fix For: 0.9.0.1
>
>
> I know the summary sounds expected, but we recently ran into a socket server 
> request queue backup that I suspect was caused by a combination of improperly 
> implemented applications that reconnect with a different (random) client-id 
> each time; and the fact that for quotas we now register a new quota 
> metric-set for each client-id.
> So here is what happened: a broker went down and a handful of other brokers 
> starting seeing queue times go up significantly. This caused the request 
> queue to backup, which caused socket timeouts and a further deluge of 
> reconnects. The only way we could get out of this was to fire-wall the broker 
> and downgrade to a version without quotas (or I think it would have worked to 
> just restart the broker).
> My guess is that there were a ton of pre-existing client-id metrics. I don’t 
> know for sure but I’m basing that on the fact that there were several new 
> unique client-ids showing up in the public access logs and request local 
> times for fetches started going up inexplicably. (It would have been useful 
> to have a metric for the number of metrics.) So it turns out that in the 
> above scenario (with say 50k pre-existing client-ids), the avg local time for 
> fetch can go up to the order of 50-100ms (at least with tests on a linux box) 
> largely due to the time taken to create new metrics; and that’s because we 
> use a copy-on-write map underneath. If you have enough (say, hundreds) of 
> clients re-connecting at the same time with new client-id's, that can cause 
> the request queues to start backing up and the overall queuing system to 
> become unstable; and the line starts to spill out of the building.
> I think this is a fairly new scenario with quotas - i.e., I don’t think the 
> past per-X metrics (per-topic for e.g.,) creation rate would ever come this 
> close.
> To be clear, the clients are clearly doing the wrong thing but I think the 
> broker can and should protect itself adequately against such rogue scenarios.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Slow request log in Kafka

2015-10-20 Thread Aditya Auradkar
Fair points. Kafka doesn't really have slow queries. I was thinking about
this kind of log in response to a request processing slowdown we had during
an internal release.. it's unlikely a slow query log would have really
helped since it slowed down requests from all entities (see KAFKA-2664 for
more).

I suppose one example of a "slow" query is produce with "ack=-1" in case
the replicas aren't catching up. However, we do have other metrics to catch
this.

Aditya

On Thu, Oct 15, 2015 at 12:43 AM, Ewen Cheslack-Postava <e...@confluent.io>
wrote:

> Kafka doesn't have the same type of queries that RDBMS systems have. What
> "slow queries" would we be trying to capture info about?
>
> -Ewen
>
> On Wed, Oct 14, 2015 at 4:27 PM, Gwen Shapira <g...@confluent.io> wrote:
>
> > I had some experience with the feature in MySQL.
> >
> > Its main good use is to identify queries that are obviously bad (full
> scans
> > on OLTP system) and need optimization. You can't infer from it anything
> > about the system as a whole because it lacks context and information
> about
> > what the rest of the system was doing at the same time.
> >
> > I'd like to hear how you see yourself using it in Apache Kafka to better
> > understand its usefulness. Can you share some details about how you would
> > have used it in the recent issue you mentioned?
> >
> > What I see as helpful:
> > 1. Ability to enable/disable trace/debug level logging of request
> handling
> > for specific request types and clients without restarting the broker
> (i.e.
> > through JMX, protocol or ZK)
> > 2. Publish histograms of the existing request time metrics
> > 3. Capture detailed timing of a random sample of the requests and log it
> > (i.e sample metrics rather than avgs). Note that clients that send more
> > requests and longer requests are more likely to get sampled. I've found
> > this super useful in the past.
> >
> > Gwen
> >
> > On Wed, Oct 14, 2015 at 3:39 PM, Aditya Auradkar <
> > aaurad...@linkedin.com.invalid> wrote:
> >
> > > Hey everyone,
> > >
> > > We were recently discussing a small logging improvement for Kafka.
> > > Basically, add a request log for queries that took longer than a
> certain
> > > configurable time to execute. This can be quite useful for debugging
> > > purposes, in fact it would have proven handy while investigating a
> recent
> > > issue during one of our deployments at LinkedIn.
> > >
> > > This is also supported in several other projects. For example: MySQL
> and
> > > Postgres both have slow request logs.
> > > https://dev.mysql.com/doc/refman/5.0/en/slow-query-log.html
> > > https://wiki.postgresql.org/wiki/Logging_Difficult_Queries
> > >
> > > Thoughts?
> > >
> > > Thanks,
> > > Aditya
> > >
> >
>
>
>
> --
> Thanks,
> Ewen
>


Re: [DISCUSS] KIP-36 - Rack aware replica assignment

2015-10-20 Thread Aditya Auradkar
t today. The recommendation is
> to
> >> > make
> >> > > > rack as a broker property in ZooKeeper. For users with existing
> rack
> >> > > > information stored somewhere, they would need to retrieve the
> >> > information
> >> > > > at broker start up and dynamically set the rack property, which
> can
> >> be
> >> > > > implemented as a wrapper to bootstrap broker. There will be no
> >> > interface
> >> > > or
> >> > > > pluggable implementation to retrieve the rack information.
> >> > > >
> >> > > > The assumption is that you always need to restart the broker to
> >> make a
> >> > > > change to the rack.
> >> > > >
> >> > > > Once the rack becomes a broker property, it will be possible to
> make
> >> > rack
> >> > > > part of the meta data to help the consumer choose which in sync
> >> replica
> >> > > to
> >> > > > consume from as part of the future consumer enhancement.
> >> > > >
> >> > > > I will update the KIP.
> >> > > >
> >> > > > Thanks,
> >> > > > Allen
> >> > > >
> >> > > >
> >> > > > On Thu, Oct 8, 2015 at 9:23 AM, Allen Wang <allenxw...@gmail.com>
> >> > wrote:
> >> > > >
> >> > > > > I attended Tuesday's KIP hangout but this KIP was not discussed
> >> due
> >> > to
> >> > > > > time constraint.
> >> > > > >
> >> > > > > However, after hearing discussion of KIP-35, I have the feeling
> >> that
> >> > > > > incompatibility (caused by new broker property) between brokers
> >> with
> >> > > > > different versions will be solved there. In addition, having
> >> rack
> >> > in
> >> > > > > broker property as meta data may also help consumers in the
> >> future.
> >> > So
> >> > > I
> >> > > > am
> >> > > > > open to adding rack property to broker.
> >> > > > >
> >> > > > > Hopefully we can discuss this in the next KIP hangout.
> >> > > > >
> >> > > > > On Wed, Sep 30, 2015 at 2:46 PM, Allen Wang <
> allenxw...@gmail.com
> >> >
> >> > > > wrote:
> >> > > > >
> >> > > > >> Can you send me the information on the next KIP hangout?
> >> > > > >>
> >> > > > >> Currently the broker-rack mapping is not cached. In KafkaApis,
> >> > > > >> RackLocator.getRackInfo() is called each time the mapping is
> >> needed
> >> > > for
> >> > > > >> auto topic creation. This will ensure latest mapping is used at
> >> any
> >> > > > time.
> >> > > > >>
> >> > > > >> The ability to get the complete mapping makes it simple to
> reuse
> >> the
> >> > > > same
> >> > > > >> interface in command line tools.
> >> > > > >>
> >> > > > >>
> >> > > > >> On Wed, Sep 30, 2015 at 11:01 AM, Aditya Auradkar <
> >> > > > >> aaurad...@linkedin.com.invalid> wrote:
> >> > > > >>
> >> > > > >>> Perhaps we discuss this during the next KIP hangout?
> >> > > > >>>
> >> > > > >>> I do see that a pluggable rack locator can be useful but I do
> >> see a
> >> > > few
> >> > > > >>> concerns:
> >> > > > >>>
> >> > > > >>> - The RackLocator (as described in the document), implies that
> >> it
> >> > can
> >> > > > >>> discover rack information for any node in the cluster. How
> does
> >> it
> >> > > deal
> >> > > > >>> with rack location changes? For example, if I moved broker id
> >> (1)
> >> > > from
> >> > > > >>> rack
> >> > > > >>> X to Y, I only have to start that broker with a newer rack
> >> config.
> >> > If
> >> >

Re: [DISCUSS] KIP-39 Pinning controller to a broker

2015-10-20 Thread Aditya Auradkar
Hi Abhishek -

Perhaps it would help if you explained the motivation behind your proposal.
I know there was a bunch of discussion on KAFKA-1778, can you summarize?
Currently, I'd agree with Neha and Jay that there isn't really a strong
reason to pin the controller to a given broker or restrict it to a set of
brokers.

For rolling upgrades, it should be simpler to bounce the existing
controller last.
As for choosing a relatively lightly loaded broker, I think we should
ideally eliminate those by distributing partitions (and data rate) as
evenly as possible. If for some reason a broker cannot become the
controller, (by virtue of load or something else) arguably that is a
separate problem that needs addressing.

Thanks,
Aditya

On Tue, Oct 20, 2015 at 9:27 PM, Neha Narkhede  wrote:

> >
> > I will update the KIP on how we can optimize the placement of controller
> > (pinning it to a preferred broker id (potentially config enabled) ) if
> that
> > sounds reasonable.
>
>
> The point I (and I think Jay too) was making is that pinning a controller
> to a broker through config is what we should stay away from. This should be
> handled by whatever tool you may be using to bounce the cluster in a
> rolling restart fashion (where it detects the current controller and
> restarts it at the very end).
>
>
> On Tue, Oct 20, 2015 at 5:35 PM, Abhishek Nigam
>  > wrote:
>
> > Hi Jay/Neha,
> > I just subscribed to the mailing list so I read your response but did not
> > receive your email so adding the context into this email thread.
> >
> > "
> >
> > Agree with Jay on staying away from pinning roles to brokers. This is
> > actually harder to operate and monitor.
> >
> > Regarding the problems you mentioned-
> > 1. Reducing the controller moves during rolling bounce is useful but
> really
> > something that should be handled by the tooling. The root cause is that
> > currently the controller move is expensive. I think we'd be better off
> > investing time and effort in thinning out the controller. Just moving to
> > the batch write APIs in ZooKeeper will make a huge difference (a sketch
> > follows after this list).
> > 2. I'm not sure I understood the motivation behind moving partitions out
> of
> > the controller broker. That seems like a proposal for a solution, but can
> > you describe the problems you saw that affected controller functionality?
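
As an aside on the "batch write APIs" point in item 1 above: ZooKeeper's
multi() submits a list of operations in one round trip, so per-partition
state changes need not be written one znode at a time. A rough sketch (the
paths and payload are hypothetical, not the controller's actual layout):

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.zookeeper.KeeperException;
    import org.apache.zookeeper.Op;
    import org.apache.zookeeper.ZooKeeper;

    public class BatchedStateWrite {
        // Write state for many partitions in a single atomic batch
        // instead of one setData round trip per partition.
        static void writeStates(ZooKeeper zk, List<String> paths, byte[] stateJson)
                throws KeeperException, InterruptedException {
            List<Op> ops = new ArrayList<>();
            for (String path : paths)
                ops.add(Op.setData(path, stateJson, -1)); // -1 = any version
            zk.multi(ops);
        }
    }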
> >
> > Regarding the location of the controller, it seems there are 2 things you
> > are suggesting:
> > 1. Optimizing the strategy of picking a broker as the controller (e.g.
> > least loaded node)
> > 2. Moving the controller if a broker soft fails.
> >
> > I don't think #1 is worth the effort involved. The better way of
> addressing
> > it is to make the controller thinner and faster. #2 is interesting since
> > the problem is that while a broker fails, all state changes fail or are
> > queued up which globally impacts the cluster. There are 2 alternatives -
> > have a tool that allows you to move the controller or just kill the
> broker
> > so the controller moves. I prefer the latter since it is simple and also
> > because a misbehaving broker is better off shutdown anyway.
> >
> > Having said that, it will be helpful to know details of the problems you
> > saw while operating the controller. I think understanding those will help
> > guide the solution better.
> >
> > On Tue, Oct 20, 2015 at 12:49 PM, Jay Kreps  wrote:
> >
> > > This seems like a step backwards--we really don't want people to
> manually
> > > manage the location of the controller and try to manually balance
> > > partitions off that broker.
> > >
> > > I think it might make sense to consider directly fixing the things you
> > > actual want to fix:
> > > 1. Too many controller moves--we could either just make this cheaper or
> > > make the controller location more deterministic e.g. having the
> election
> > > prefer the node with the smallest node id so there were fewer failovers
> > in
> > > rolling bounces.
> > > 2. You seem to think having the controller on a normal node is a
> problem.
> > > Can you elaborate on what the negative consequences you've observed?
> > Let's
> > > focus on fixing those.
> > >
> > > In general we've worked very hard to avoid having a bunch of dedicated
> > > roles for different nodes and I would be very very loath to see us move
> > > away from that philosophy. I have a fair amount of experience with both
> > > homogenous systems that have a single role and also systems with many
> > > differentiated roles and I really think that the differentiated
> approach
> > > causes more problems than it solves for most deployments due to the
> added
> > > complexity.
> > >
> > > I think we could also fix up this KIP a bit. For example it says there
> > are
> > > no public interfaces involved but surely there are new admin commands
> to
> > > control the location? There are also some minor things like listing it
> as
> > > released in 0.8.3 that seem wrong.
> 

[jira] [Commented] (KAFKA-2502) Quotas documentation for 0.8.3

2015-10-19 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963909#comment-14963909
 ] 

Aditya Auradkar commented on KAFKA-2502:


Thanks Gwen and Ismael. I'll send a patch to review later this week.

> Quotas documentation for 0.8.3
> --
>
> Key: KAFKA-2502
> URL: https://issues.apache.org/jira/browse/KAFKA-2502
> Project: Kafka
>  Issue Type: Task
>    Reporter: Aditya Auradkar
>    Assignee: Aditya Auradkar
>Priority: Blocker
>  Labels: quotas
> Fix For: 0.9.0.0
>
>
> Complete quotas documentation
> Also, 
> https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol
>  needs to be updated with protocol changes introduced in KAFKA-2136



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2502) Quotas documentation for 0.8.3

2015-10-19 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963605#comment-14963605
 ] 

Aditya Auradkar commented on KAFKA-2502:


[~ijuma] - I assume I need to submit changes to the 0.9 site here: 
http://kafka.apache.org/documentation.html

I'll add the following changes:
1. Add newly introduced configs to the "Configuration" section
2. Add a section on quota design to the "Design" section
3. Add a piece on setting quotas dynamically via ConfigCommand in "Basic Kafka 
Operations"
4. In "Monitoring" add suggested metrics to monitor.

Sound ok?

> Quotas documentation for 0.8.3
> --
>
> Key: KAFKA-2502
> URL: https://issues.apache.org/jira/browse/KAFKA-2502
> Project: Kafka
>  Issue Type: Task
>Reporter: Aditya Auradkar
>Assignee: Aditya Auradkar
>Priority: Blocker
>  Labels: quotas
> Fix For: 0.9.0.0
>
>
> Complete quotas documentation
> Also, 
> https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol
>  needs to be updated with protocol changes introduced in KAFKA-2136



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2419) Allow certain Sensors to be garbage collected after inactivity

2015-10-15 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14958962#comment-14958962
 ] 

Aditya Auradkar commented on KAFKA-2419:


[~ijuma] - We create only a single metrics instance for the KafkaServer. We do 
create a separate Sensor per consumer and producer.

As for the ExpireSensor task, let me fix that. My patch added that to the 
thread pool executor, I nuked that while refactoring. Shall submit a fix today.

> Allow certain Sensors to be garbage collected after inactivity
> --
>
> Key: KAFKA-2419
> URL: https://issues.apache.org/jira/browse/KAFKA-2419
> Project: Kafka
>  Issue Type: New Feature
>Affects Versions: 0.9.0.0
>    Reporter: Aditya Auradkar
>    Assignee: Aditya Auradkar
>Priority: Blocker
>  Labels: quotas
> Fix For: 0.9.0.0
>
>
> Currently, metrics cannot be removed once registered. 
> Implement a feature to remove certain sensors after a certain period of 
> inactivity (perhaps configurable).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-2419 - Fix to prevent background thread ...

2015-10-15 Thread auradkar
GitHub user auradkar opened a pull request:

https://github.com/apache/kafka/pull/323

KAFKA-2419 - Fix to prevent background thread from getting created when not 
required

See here for more discussion: 
https://issues.apache.org/jira/browse/KAFKA-2419
Basically, the fix involves adding a param to Metrics to indicate if it is 
capable of metric cleanup or not.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/auradkar/kafka KAFKA-2419-fix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/323.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #323


commit c233d3006950bb3676bb731e1f23d1ce93f05302
Author: Aditya Auradkar <aaura...@aauradka-mn1.linkedin.biz>
Date:   2015-10-16T00:59:27Z

Followup commit to KAFKA-2419. This basically prevents a background thread 
from getting created if the Metrics instance does not support expiration




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-2419) Allow certain Sensors to be garbage collected after inactivity

2015-10-15 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14959588#comment-14959588
 ] 

Aditya Auradkar commented on KAFKA-2419:


I agree it would be nice to not need a tick thread.. but it seems like we can 
avoid this altogether on clients assuming we don't need sensors that can be 
GCed. 

You are right that we can check for expiry when creating new sensors or even 
when calling record() (though that might slow down the record quite a bit). But 
we can have a situation where a chunk of sensors get recorded during a short 
window and we have no new metrics for a long time. These sensor objects 
continue to occupy memory which could otherwise be freed.. this can be 
significant if they have several samples. 

[~junrao] - There isn't a strong reason (or use case) to support per-sensor 
expiration right now. I added it because it didn't seem to add any complexity 
to the implementation and could be useful down the road.
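
A minimal sketch (hypothetical names, not the actual Metrics internals) of
the per-sensor expiration described here: each sensor carries its own
inactivity delay, defaulting to "never expire", and a periodic task purges
the stale ones:

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class SensorExpiryDemo {
        static class Sensor {
            final long inactiveExpirationMs; // Long.MAX_VALUE means never expire
            volatile long lastRecordMs;

            Sensor(long inactiveExpirationMs) {
                this.inactiveExpirationMs = inactiveExpirationMs;
                this.lastRecordMs = System.currentTimeMillis();
            }

            void record(double value) {
                lastRecordMs = System.currentTimeMillis();
                // ... update samples with the recorded value ...
            }

            boolean hasExpired(long nowMs) {
                return nowMs - lastRecordMs > inactiveExpirationMs;
            }
        }

        final Map<String, Sensor> sensors = new ConcurrentHashMap<>();

        // Run from a scheduled background task; expiry only happens when the
        // task fires, which is fine since precision does not matter much here.
        void purgeExpiredSensors() {
            long now = System.currentTimeMillis();
            sensors.entrySet().removeIf(e -> e.getValue().hasExpired(now));
        }
    }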

> Allow certain Sensors to be garbage collected after inactivity
> --
>
> Key: KAFKA-2419
> URL: https://issues.apache.org/jira/browse/KAFKA-2419
> Project: Kafka
>  Issue Type: New Feature
>Affects Versions: 0.9.0.0
>    Reporter: Aditya Auradkar
>    Assignee: Aditya Auradkar
>Priority: Blocker
>  Labels: quotas
> Fix For: 0.9.0.0
>
>
> Currently, metrics cannot be removed once registered. 
> Implement a feature to remove certain sensors after a certain period of 
> inactivity (perhaps configurable).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2419) Allow certain Sensors to be garbage collected after inactivity

2015-10-15 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14959484#comment-14959484
 ] 

Aditya Auradkar commented on KAFKA-2419:


[~junrao] - That is indeed the current implementation i.e. sensors are expired 
whenever the expiry task runs because it is fine to not be super precise. 

My comment was a bit different. Currently each sensor has the ability to 
specify the "inactivity period" i.e. time after which it is eligible for GC. 
Are you saying we should have a single such value on the base metrics instance?

> Allow certain Sensors to be garbage collected after inactivity
> --
>
> Key: KAFKA-2419
> URL: https://issues.apache.org/jira/browse/KAFKA-2419
> Project: Kafka
>  Issue Type: New Feature
>Affects Versions: 0.9.0.0
>Reporter: Aditya Auradkar
>Assignee: Aditya Auradkar
>Priority: Blocker
>  Labels: quotas
> Fix For: 0.9.0.0
>
>
> Currently, metrics cannot be removed once registered. 
> Implement a feature to remove certain sensors after a certain period of 
> inactivity (perhaps configurable).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-2419) Allow certain Sensors to be garbage collected after inactivity

2015-10-15 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14959409#comment-14959409
 ] 

Aditya Auradkar edited comment on KAFKA-2419 at 10/15/15 6:52 PM:
--

[~junrao] - I concur that option 2 is simpler. One minor comment is that it may 
be better to leave the sensor objects as they are. Instead of a boolean, they 
are configured with a numeric expiration delay. By default, they won't ever 
expire.. basically the same as not being enabled for time based GC.

I can submit a patch for this today.


was (Author: aauradkar):
[~junrao] - I concur that option 2 is simpler. One minor comment is that it may 
be better to leave the sensor objects as they are. Instead of a boolean, they 
are configured with a numeric expiration delay. By default, they won't ever 
expire.. basically the same as not being enabled for time based GC.

> Allow certain Sensors to be garbage collected after inactivity
> --
>
> Key: KAFKA-2419
> URL: https://issues.apache.org/jira/browse/KAFKA-2419
> Project: Kafka
>  Issue Type: New Feature
>Affects Versions: 0.9.0.0
>    Reporter: Aditya Auradkar
>    Assignee: Aditya Auradkar
>Priority: Blocker
>  Labels: quotas
> Fix For: 0.9.0.0
>
>
> Currently, metrics cannot be removed once registered. 
> Implement a feature to remove certain sensors after a certain period of 
> inactivity (perhaps configurable).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Slow request log in Kafka

2015-10-14 Thread Aditya Auradkar
Hey everyone,

We were recently discussing a small logging improvement for Kafka.
Basically, add a request log for queries that took longer than a certain
configurable time to execute. This can be quite useful for debugging
purposes, in fact it would have proven handy while investigating a recent
issue during one of our deployments at LinkedIn.

This is also supported in several other projects. For example: MySQL and
Postgres both have slow request logs.
https://dev.mysql.com/doc/refman/5.0/en/slow-query-log.html
https://wiki.postgresql.org/wiki/Logging_Difficult_Queries
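
As a rough illustration (all names here are hypothetical), the check could
sit wherever the broker already measures a request's total time, e.g.
alongside the existing RequestMetrics timings:

    public class SlowRequestLog {
        private final long thresholdMs; // configurable slow-request threshold

        public SlowRequestLog(long thresholdMs) {
            this.thresholdMs = thresholdMs;
        }

        // Called once per completed request with its measured total time.
        public void maybeLog(String requestDescription, long totalTimeMs) {
            if (totalTimeMs >= thresholdMs) {
                System.err.printf("Slow request (%d ms >= %d ms): %s%n",
                        totalTimeMs, thresholdMs, requestDescription);
            }
        }
    }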

Thoughts?

Thanks,
Aditya


[jira] [Commented] (KAFKA-2536) topics tool should allow users to alter topic configuration

2015-10-13 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14955375#comment-14955375
 ] 

Aditya Auradkar commented on KAFKA-2536:


[~gwenshap] - Thanks for reporting. Do we plan to keep this functionality in 
kafka-topic for future releases? In the future it will have to change to use 
the AlterConfig command to the brokers.

> topics tool should allow users to alter topic configuration
> ---
>
> Key: KAFKA-2536
> URL: https://issues.apache.org/jira/browse/KAFKA-2536
> Project: Kafka
>  Issue Type: Bug
>Reporter: Gwen Shapira
>Assignee: Grant Henke
>
> When we added dynamic config, we added a kafka-config tool (which can be used 
> to maintain configs for non-topic entities), and remove the capability from 
> kafka-topic tool.
> Removing the capability from kafka-topic is:
> 1. Breaks backward compatibility in our most essential tools. This has 
> significant impact on usability.
> 2. Kinda confusing that --create --config works but --alter --config does 
> not. 
> I suggest fixing this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2527) System Test for Quotas in Ducktape

2015-10-13 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14955720#comment-14955720
 ] 

Aditya Auradkar commented on KAFKA-2527:


[~gwenshap] - Thanks!

> System Test for Quotas in Ducktape
> --
>
> Key: KAFKA-2527
> URL: https://issues.apache.org/jira/browse/KAFKA-2527
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Dong Lin
>Assignee: Dong Lin
>  Labels: quota
> Fix For: 0.9.0.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-2209 - Change quotas dynamically using D...

2015-10-12 Thread auradkar
GitHub user auradkar opened a pull request:

https://github.com/apache/kafka/pull/298

KAFKA-2209 - Change quotas dynamically using DynamicConfigManager

Changes in this patch are:
1. ClientIdConfigHandler now passes through the config changes to the quota 
manager.
2. Removed static KafkaConfigs for quota overrides. These are no longer 
needed since we can override configs through ZooKeeper.
3. Added test cases to verify that the config changes are propagated from ZK 
(written using AdminTools) to the actual Metric objects.
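
Roughly, change 1 might take the following shape (a simplified sketch; the
real handler and quota manager APIs in the patch may differ, and the
property keys shown are the per-client-id dynamic override names used in
the quotas work):

    import java.util.Properties;

    // Sketch of a handler invoked when a client-id config znode changes.
    public class ClientIdConfigHandler {
        interface QuotaManager {
            void updateQuota(String clientId, long bytesPerSecond);
        }

        private final QuotaManager producerQuotas;
        private final QuotaManager consumerQuotas;

        public ClientIdConfigHandler(QuotaManager producer, QuotaManager consumer) {
            this.producerQuotas = producer;
            this.consumerQuotas = consumer;
        }

        // Pass the new override values straight through to the quota managers.
        public void processConfigChanges(String clientId, Properties config) {
            if (config.containsKey("producer_byte_rate"))
                producerQuotas.updateQuota(clientId,
                        Long.parseLong(config.getProperty("producer_byte_rate")));
            if (config.containsKey("consumer_byte_rate"))
                consumerQuotas.updateQuota(clientId,
                        Long.parseLong(config.getProperty("consumer_byte_rate")));
        }
    }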

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/auradkar/kafka K-2209

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/298.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #298


commit 4a1914d86e37f32a5efae2b1840015f4dea4c139
Author: Aditya Auradkar <aaura...@aauradka-mn1.linkedin.biz>
Date:   2015-10-12T18:39:20Z

KAFKA-2209 - Dynamically change quotas per clientId

commit 42e8412f82d1199289ac2fa327502de43eb831d7
Author: Aditya Auradkar <aaura...@aauradka-mn1.linkedin.biz>
Date:   2015-10-12T22:08:39Z

Minor fixes

commit dfe35e2941ef76e5eb47b355296d3aa8c4a0e3f0
Author: Aditya Auradkar <aaura...@aauradka-mn1.linkedin.biz>
Date:   2015-10-12T22:11:38Z

Minor change




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-2606) Remove kafka.utils.Time in favour of o.a.kafka.common.utils.Time

2015-10-02 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14941984#comment-14941984
 ] 

Aditya Auradkar commented on KAFKA-2606:


[~ijuma] This is a duplicate of https://issues.apache.org/jira/browse/KAFKA-2247
If you plan to work on this, I'll close the other one. Otherwise, I can close 
this.

> Remove kafka.utils.Time in favour of o.a.kafka.common.utils.Time
> 
>
> Key: KAFKA-2606
> URL: https://issues.apache.org/jira/browse/KAFKA-2606
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 0.8.2.2
>Reporter: Ismael Juma
>Priority: Minor
>  Labels: newbie
>
> They duplicate each other at the moment and some server classes actually need 
> an instance of both types, which is annoying.
> It's worth noting that `kafka.utils.MockTime` includes a `scheduler` that is 
> used by some tests while `o.a.kafka.common.utils.Time` does not. We either 
> need to add this functionality or change the tests not to need it anymore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-2247) Merge kafka.utils.Time and kafka.common.utils.Time

2015-10-02 Thread Aditya Auradkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditya Auradkar updated KAFKA-2247:
---
Description: 
We currently have 2 different versions of Time in clients and core. These need 
to be merged.

It's worth noting that `kafka.utils.MockTime` includes a `scheduler` that is 
used by some tests while `o.a.kafka.common.utils.Time` does not. We either need 
to add this functionality or change the tests not to need it anymore.

  was:We currently have 2 different versions of Time in clients and core. These 
need to be merged


> Merge kafka.utils.Time and kafka.common.utils.Time
> --
>
> Key: KAFKA-2247
> URL: https://issues.apache.org/jira/browse/KAFKA-2247
> Project: Kafka
>  Issue Type: Improvement
>    Reporter: Aditya Auradkar
>    Assignee: Aditya Auradkar
>Priority: Minor
>
> We currently have 2 different versions of Time in clients and core. These 
> need to be merged.
> It's worth noting that `kafka.utils.MockTime` includes a `scheduler` that is 
> used by some tests while `o.a.kafka.common.utils.Time` does not. We either 
> need to add this functionality or change the tests not to need it anymore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (KAFKA-2606) Remove kafka.utils.Time in favour of o.a.kafka.common.utils.Time

2015-10-02 Thread Aditya Auradkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditya Auradkar resolved KAFKA-2606.

Resolution: Duplicate

Duplicate of: https://issues.apache.org/jira/browse/KAFKA-2247

Ismael - I copied the piece about the scheduler from this ticket onto 2247. 
Thanks!

> Remove kafka.utils.Time in favour of o.a.kafka.common.utils.Time
> 
>
> Key: KAFKA-2606
> URL: https://issues.apache.org/jira/browse/KAFKA-2606
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 0.8.2.2
>Reporter: Ismael Juma
>Priority: Minor
>  Labels: newbie
>
> They duplicate each other at the moment and some server classes actually need 
> an instance of both types, which is annoying.
> It's worth noting that `kafka.utils.MockTime` includes a `scheduler` that is 
> used by some tests while `o.a.kafka.common.utils.Time` does not. We either 
> need to add this functionality or change the tests not to need it anymore.





Re: [DISCUSS] KIP-36 - Rack aware replica assignment

2015-09-30 Thread Aditya Auradkar
Perhaps we discuss this during the next KIP hangout?

I can see that a pluggable rack locator would be useful, but I have a few
concerns:

- The RackLocator (as described in the document) implies that it can
discover rack information for any node in the cluster. How does it deal
with rack location changes? For example, if I moved broker 1 from rack
X to rack Y, I would only have to restart that broker with a new rack
config. If the RackLocator discovers broker -> rack information at startup
time, any change to a broker will require bouncing the entire cluster,
since createTopic requests can be sent to any node in the cluster.
For this reason it may be simpler to have each node be aware of its own
rack and persist it in ZK at startup.

- A pluggable RackLocator relies on an external service being available to
serve rack information.

Out of curiosity, I looked up how a couple of other systems deal with
zone/rack awareness.
For Cassandra, some interesting modes are:
(Property File configuration)
http://docs.datastax.com/en/cassandra/2.0/cassandra/architecture/architectureSnitchPFSnitch_t.html
(Dynamic inference)
http://docs.datastax.com/en/cassandra/2.0/cassandra/architecture/architectureSnitchRackInf_c.html

Voldemort does a static node -> zone assignment based on configuration.

Aditya

On Wed, Sep 30, 2015 at 10:05 AM, Allen Wang <allenxw...@gmail.com> wrote:

> I would like to see if we can do both:
>
> - Make RackLocator pluggable to facilitate migration with existing
> broker-rack mapping
>
> - Make rack an optional property for broker. If rack is available from
> broker, treat it as source of truth. For users with existing broker-rack
> mapping somewhere else, they can use the pluggable way or they can transfer
> the mapping to the broker rack property.
>
> One thing I am not sure about is what happens during a rolling upgrade when
> we have rack as a broker property. For brokers with an older version of
> Kafka, will it cause problems for them? If so, is there any workaround? I
> also think it would be better not to have rack in the controller wire
> protocol, but I am not sure if that is achievable.
>
> Thanks,
> Allen
>
>
>
>
>
> On Mon, Sep 28, 2015 at 4:55 PM, Todd Palino <tpal...@gmail.com> wrote:
>
> > I tend to like the idea of a pluggable locator. For example, we already
> > have an interface for discovering information about the physical location
> > of servers. I don't relish the idea of having to maintain data in
> > multiple places.
> >
> > -Todd
> >
> > On Mon, Sep 28, 2015 at 4:48 PM, Aditya Auradkar <
> > aaurad...@linkedin.com.invalid> wrote:
> >
> > > Thanks for starting this KIP Allen.
> > >
> > > I agree with Gwen that having a RackLocator class that is pluggable seems
> > > to be too complex. The KIP refers to potentially non-ZK storage for the
> > > rack info which I don't think is necessary.
> > >
> > > Perhaps we can persist this info in ZK under /brokers/ids/<broker_id>,
> > > similar to other broker properties, and add a config in KafkaConfig
> > > called "rack".
> > > {"jmx_port":-1,"endpoints":[...],"host":"xxx","port":yyy, "rack":
> "abc"}
> > >
> > > Aditya
> > >
> > > On Mon, Sep 28, 2015 at 2:30 PM, Gwen Shapira <g...@confluent.io> wrote:
> > >
> > > > Hi,
> > > >
> > > > First, thanks for putting out a KIP for this. This is super important
> > for
> > > > production deployments of Kafka.
> > > >
> > > > Few questions:
> > > >
> > > > 1) Are we sure we want "as many racks as possible"? I'd want to balance
> > > > between safety (more racks) and network utilization (traffic within a rack
> > > > uses the high-bandwidth TOR switch). One replica on a different rack and
> > > > the rest on same rack (if possible) sounds better to me.
> > > >
> > > > 2) Rack-locator class seems overly complex compared to adding a
> > > > rack.number property to the broker properties file. Why do we want that?
> > > >
> > > > Gwen
> > > >
> > > >
> > > >
> > > > On Mon, Sep 28, 2015 at 12:15 PM, Allen Wang <allenxw...@gmail.com> wrote:
> > > >
> > > > > Hello Kafka Developers,
> > > > >
> > > > > I just created KIP-36 for rack aware replica assignment.
> > > > >
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-36+Rack+aware+replica+assignment
> > > > >
> > > > > The goal is to utilize the isolation provided by the racks in data
> > > center
> > > > > and distribute replicas to racks to provide fault tolerance.
> > > > >
> > > > > Comments are welcome.
> > > > >
> > > > > Thanks,
> > > > > Allen
> > > > >
> > > >
> > >
> >
>


Re: Release plan for kafka security

2015-09-25 Thread Aditya Auradkar
Basically, 0.8.3 has been renamed to 0.9.0. The plan is to include security
in the 0.9 release, which should happen once all the blocker bugs have been
resolved and testing is complete (committers can provide more accurate
timelines).

On Fri, Sep 25, 2015 at 10:35 AM, Whitney, Adam 
wrote:

> Hello Kafka Developers,
>
> I’m looking for a queuing solution and Kafka is very near the top of my
> list … except that security is a primary concern (see the domain my email
> is coming from ;-)
>
> I’m a little confused about when security is going to be part of Kafka and
> in what release. On the Future Release Plan wiki
> (https://cwiki.apache.org/confluence/display/KAFKA/Future+release+plan) it
> shows it might be part of 0.8.3. But on JIRA
> (https://issues.apache.org/jira/browse/KAFKA-1686) it looks like it is
> targeted for the 0.9.0 release, and there is no 0.9.0 release date listed
> on the wiki.
>
> Any help figuring out when security might become part of Kafka would be
> greatly appreciated.
>
> Thanks,
> adam
>


[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

2015-09-25 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14908460#comment-14908460
 ] 

Aditya Auradkar commented on KAFKA-1215:


[~allenxwang] - bump. 

> Rack-Aware replica assignment option
> 
>
> Key: KAFKA-1215
> URL: https://issues.apache.org/jira/browse/KAFKA-1215
> Project: Kafka
>  Issue Type: Improvement
>  Components: replication
>Affects Versions: 0.8.0
>Reporter: Joris Van Remoortere
>Assignee: Jun Rao
> Fix For: 0.10.0.0
>
> Attachments: rack_aware_replica_assignment_v1.patch, 
> rack_aware_replica_assignment_v2.patch
>
>
> Adding a rack-id to kafka config. This rack-id can be used during replica 
> assignment by using the max-rack-replication argument in the admin scripts 
> (create topic, etc.). By default the original replication assignment 
> algorithm is used because max-rack-replication defaults to -1. 
> max-rack-replication > -1 is not honored if you are doing manual replica 
> assignment (preferred).
> If this looks good I can add some test cases specific to the rack-aware 
> assignment.
> I can also port this to trunk. We are currently running 0.8.0 in production 
> and need this, so I wrote the patch against that.





Re: [VOTE] KIP-31 - Move to relative offsets in compressed message sets.

2015-09-23 Thread Aditya Auradkar
+1

On Wed, Sep 23, 2015 at 8:03 PM, Neha Narkhede  wrote:

> +1
>
> On Wed, Sep 23, 2015 at 6:21 PM, Todd Palino  wrote:
>
> > +1000
> >
> > !
> >
> > -Todd
> >
> > On Wednesday, September 23, 2015, Jiangjie Qin  >
> > wrote:
> >
> > > Hi,
> > >
> > > Thanks a lot for the reviews and feedback on KIP-31. It looks all the
> > > concerns of the KIP has been addressed. I would like to start the
> voting
> > > process.
> > >
> > > The short summary for the KIP:
> > > We are going to use the relative offset in the message format to avoid
> > > server side recompression.
> > >
> > > In case you haven't got a chance to check, here is the KIP link.
> > >
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-31+-+Move+to+relative+offsets+in+compressed+message+sets
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> >
>
>
>
> --
> Thanks,
> Neha
>


[jira] [Updated] (KAFKA-2567) throttle-time shouldn't be NaN

2015-09-22 Thread Aditya Auradkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditya Auradkar updated KAFKA-2567:
---
Labels: quotas  (was: )

> throttle-time shouldn't be NaN
> --
>
> Key: KAFKA-2567
> URL: https://issues.apache.org/jira/browse/KAFKA-2567
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jun Rao
>    Assignee: Aditya Auradkar
>Priority: Minor
>  Labels: quotas
> Fix For: 0.9.0.0
>
>
> Currently, if throttling never happens, we get the NaN for throttle-time. It 
> seems it's better to default to 0.
> "kafka.server:client-id=eventsimgroup200343,type=Fetch" : { "byte-rate": 0.0, 
> "throttle-time": NaN }



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (KAFKA-2567) throttle-time shouldn't be NaN

2015-09-22 Thread Aditya Auradkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditya Auradkar reassigned KAFKA-2567:
--

Assignee: Aditya Auradkar

> throttle-time shouldn't be NaN
> --
>
> Key: KAFKA-2567
> URL: https://issues.apache.org/jira/browse/KAFKA-2567
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jun Rao
>    Assignee: Aditya Auradkar
>Priority: Minor
>  Labels: quotas
> Fix For: 0.9.0.0
>
>
> Currently, if throttling never happens, we get the NaN for throttle-time. It 
> seems it's better to default to 0.
> "kafka.server:client-id=eventsimgroup200343,type=Fetch" : { "byte-rate": 0.0, 
> "throttle-time": NaN }





[jira] [Commented] (KAFKA-2567) throttle-time shouldn't be NaN

2015-09-22 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903101#comment-14903101
 ] 

Aditya Auradkar commented on KAFKA-2567:


[~junrao] - I'll fix this in my next patch.

> throttle-time shouldn't be NaN
> --
>
> Key: KAFKA-2567
> URL: https://issues.apache.org/jira/browse/KAFKA-2567
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jun Rao
>    Assignee: Aditya Auradkar
>Priority: Minor
>  Labels: quotas
> Fix For: 0.9.0.0
>
>
> Currently, if throttling never happens, we get the NaN for throttle-time. It 
> seems it's better to default to 0.
> "kafka.server:client-id=eventsimgroup200343,type=Fetch" : { "byte-rate": 0.0, 
> "throttle-time": NaN }





[GitHub] kafka pull request: KAFKA-2419; Garbage collect unused sensors

2015-09-22 Thread auradkar
GitHub user auradkar opened a pull request:

https://github.com/apache/kafka/pull/233

KAFKA-2419; Garbage collect unused sensors

As discussed in KAFKA-2419, I've added a time-based sensor retention 
config to Sensor. Sensors that have not been recorded to for 'n' seconds are 
eligible for expiration.

In addition to the time-based retention, I've also altered several tests to 
close the Metrics and scheduler objects, since leaking them causes 
TestUtils.verifyNonDaemonThreadStatus to fail.
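
The test-cleanup point is worth illustrating: a Metrics instance owns
resources (reporters, and with this change any expiration work) that must be
closed, or thread-leak checks fail. A minimal JUnit 4 sketch of the pattern:

    import org.apache.kafka.common.metrics.Metrics;
    import org.junit.After;
    import org.junit.Before;
    import org.junit.Test;

    public class MetricsCleanupSketchTest {
        private Metrics metrics;

        @Before
        public void setUp() {
            metrics = new Metrics();
        }

        @After
        public void tearDown() {
            // Close the registry so reporters and background work shut down;
            // leaked non-daemon threads fail the thread-status verification.
            metrics.close();
        }

        @Test
        public void exampleUsage() {
            metrics.sensor("example-sensor").record(1.0);
        }
    }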

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/auradkar/kafka K-2419

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/233.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #233


commit b19603cb28a5ae45ba1e6f4288e8bf57ac36b1e3
Author: Aditya Auradkar <aaura...@aauradka-mn1.linkedin.biz>
Date:   2015-09-22T23:25:30Z

KAFKA-2419; Garbage collect unused sensors






Re: [ANNOUNCE] New Committer Sriharsha Chintalapani

2015-09-21 Thread Aditya Auradkar
Congrats Sriharsha! Great work.. especially on SSL.

On Mon, Sep 21, 2015 at 10:31 PM, Prabhjot Bharaj 
wrote:

> Congratulations. It's inspiring for newbies like me
>
> Regards,
> Prabhjot
> On Sep 22, 2015 10:30 AM, "Ashish Singh"  wrote:
>
> > Congrats Harsha!
> >
> > On Monday, September 21, 2015, Manikumar Reddy 
> > wrote:
> >
> > > congrats harsha!
> > >
> > > On Tue, Sep 22, 2015 at 9:48 AM, Dong Lin  > > > wrote:
> > >
> > > > Congratulations Sriharsha!
> > > >
> > > > Dong
> > > >
> > > > On Tue, Sep 22, 2015 at 4:17 AM, Guozhang Wang  > > > wrote:
> > > >
> > > > > Congrats Sriharsha!
> > > > >
> > > > > Guozhang
> > > > >
> > > > > On Mon, Sep 21, 2015 at 9:10 PM, Jun Rao  > > > wrote:
> > > > >
> > > > > > I am pleased to announce that the Apache Kafka PMC has voted to
> > > > > > invite Sriharsha Chintalapani as a committer and Sriharsha has
> > > > accepted.
> > > > > >
> > > > > > Sriharsha has contributed numerous patches to Kafka. The most
> > > > significant
> > > > > > one is the SSL support.
> > > > > >
> > > > > > Please join me on welcoming and congratulating Sriharsha.
> > > > > >
> > > > > > I look forward to your continued contributions and much more to
> > come!
> > > > > >
> > > > > > Jun
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > -- Guozhang
> > > > >
> > > >
> > >
> >
> >
> > --
> > Ashish h
> >
>


[jira] [Commented] (KAFKA-1599) Change preferred replica election admin command to handle large clusters

2015-09-21 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14901142#comment-14901142
 ] 

Aditya Auradkar commented on KAFKA-1599:


[~anigam] - Perhaps you can write up your proposal here? Based on what the 
committers say, you can write a KIP if required.

> Change preferred replica election admin command to handle large clusters
> 
>
> Key: KAFKA-1599
> URL: https://issues.apache.org/jira/browse/KAFKA-1599
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 0.8.2.0
>Reporter: Todd Palino
>Assignee: Abhishek Nigam
>  Labels: newbie++
>
> We ran into a problem with a cluster that has 70k partitions where we could 
> not trigger a preferred replica election for all topics and partitions using 
> the admin tool. Upon investigation, it was determined that this was because 
> the JSON object that was being written to the admin znode to tell the 
> controller to start the election was 1.8 MB in size. As the default Zookeeper 
> data size limit is 1MB, and it is non-trivial to change, we should come up 
> with a better way to represent the list of topics and partitions for this 
> admin command.
> I have several thoughts on this so far:
> 1) Trigger the command for all topics and partitions with a JSON object that 
> does not include an explicit list of them (i.e. a flag that says "all 
> partitions")
> 2) Use a more compact JSON representation. Currently, the JSON contains a 
> 'partitions' key which holds a list of dictionaries that each have a 'topic' 
> and 'partition' key, and there must be one list item for each partition. This 
> results in a lot of repetition of key names that is unneeded. Changing this 
> to a format like this would be much more compact:
> {'topics': {'topicName1': [0, 1, 2, 3], 'topicName2': [0,1]}, 'version': 1}
> 3) Use a representation other than JSON. Strings are inefficient. A binary 
> format would be the most compact. This does put a greater burden on tools and 
> scripts that do not use the inbuilt libraries, but it is not too high.
> 4) Use a representation that involves multiple znodes. A structured tree in 
> the admin command would probably provide the most complete solution. However, 
> we would need to make sure to not exceed the data size limit with a wide tree 
> (the list of children for any single znode cannot exceed the ZK data size of 
> 1MB)
> Obviously, there could be a combination of #1 with a change in the 
> representation, which would likely be appropriate as well.
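For illustration, a small sketch of emitting the compact form from option #2
(hand-rolled JSON purely for the example; a real tool would use a JSON
library):

    import java.util.*;

    public class CompactElectionJson {
        // Builds {"topics":{"topicName1":[0, 1, 2, 3],"topicName2":[0, 1]},"version":1}
        static String encode(Map<String, List<Integer>> topics) {
            StringBuilder sb = new StringBuilder("{\"topics\":{");
            Iterator<Map.Entry<String, List<Integer>>> it = topics.entrySet().iterator();
            while (it.hasNext()) {
                Map.Entry<String, List<Integer>> e = it.next();
                // List.toString() ("[0, 1, 2]") happens to be valid JSON.
                sb.append('"').append(e.getKey()).append("\":").append(e.getValue());
                if (it.hasNext()) sb.append(',');
            }
            return sb.append("},\"version\":1}").toString();
        }

        public static void main(String[] args) {
            Map<String, List<Integer>> m = new LinkedHashMap<>();
            m.put("topicName1", Arrays.asList(0, 1, 2, 3));
            m.put("topicName2", Arrays.asList(0, 1));
            System.out.println(encode(m));
        }
    }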





[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

2015-09-15 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14745803#comment-14745803
 ] 

Aditya Auradkar commented on KAFKA-1215:


[~allenxwang] One of the committers can provide you write access once you 
provide your Confluence/Apache ID. Please let me know if you need any help with 
the KIP/reviews etc. Thanks!

> Rack-Aware replica assignment option
> 
>
> Key: KAFKA-1215
> URL: https://issues.apache.org/jira/browse/KAFKA-1215
> Project: Kafka
>  Issue Type: Improvement
>  Components: replication
>Affects Versions: 0.8.0
>Reporter: Joris Van Remoortere
>Assignee: Jun Rao
> Fix For: 0.10.0.0
>
> Attachments: rack_aware_replica_assignment_v1.patch, 
> rack_aware_replica_assignment_v2.patch
>
>
> Adding a rack-id to kafka config. This rack-id can be used during replica 
> assignment by using the max-rack-replication argument in the admin scripts 
> (create topic, etc.). By default the original replication assignment 
> algorithm is used because max-rack-replication defaults to -1. 
> max-rack-replication > -1 is not honored if you are doing manual replica 
> assignment (preferred).
> If this looks good I can add some test cases specific to the rack-aware 
> assignment.
> I can also port this to trunk. We are currently running 0.8.0 in production 
> and need this, so I wrote the patch against that.





[GitHub] kafka pull request: KAFKA-2443 Expose windowSize on Rate

2015-09-14 Thread auradkar
GitHub user auradkar opened a pull request:

https://github.com/apache/kafka/pull/213

KAFKA-2443  Expose windowSize on Rate

This is a follow-up ticket from KAFKA-2084 to improve the windowSize 
calculation in quotas. I've made the following changes:

1. Added a windowSize function on Rate.
2. ClientQuotaManager now calls Rate.windowSize to obtain the exact window 
size to use when computing the delay time.
3. Changed the window size calculation subtly. The previous calculation had 
a bug: it measured the elapsed time from the "lastWindowSeconds" of the most 
recent Sample object. However, lastWindowSeconds is the time at which that 
sample was created, which implies the current window's elapsed time is always 
"0" at sample creation. This is incorrect, as demonstrated by a test case I 
added in MetricsTest. I've fixed the calculation to count the elapsed time 
from the "oldest" sample in the set, since that gives an accurate measure of 
the total time actually covered.
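
A simplified sketch of the corrected measurement (hypothetical sample
bookkeeping, not the actual o.a.kafka.common.metrics internals):

    import java.util.List;

    public class WindowSizeSketch {
        // Measure the window from the *oldest* retained sample's start time,
        // not from the newest sample's creation time (which reads as ~0).
        static long observedWindowMs(List<Long> sampleStartTimesMs, long nowMs) {
            if (sampleStartTimesMs.isEmpty()) {
                return 0L;
            }
            long oldest = Long.MAX_VALUE;
            for (long start : sampleStartTimesMs) {
                oldest = Math.min(oldest, start);
            }
            return nowMs - oldest; // time actually covered by the samples
        }
    }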

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/auradkar/kafka K-2443

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/213.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #213


commit 78868c50fd7966d20bf023509b9a444f6cea1443
Author: Aditya Auradkar <aaurad...@linkedin.com>
Date:   2015-09-14T17:43:55Z

Fixing K-2443






[jira] [Updated] (KAFKA-2443) Expose windowSize on Rate

2015-09-14 Thread Aditya Auradkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditya Auradkar updated KAFKA-2443:
---
Summary: Expose windowSize on Rate  (was: Expose windowSize on Measurable)

> Expose windowSize on Rate
> -
>
> Key: KAFKA-2443
> URL: https://issues.apache.org/jira/browse/KAFKA-2443
> Project: Kafka
>  Issue Type: Task
>    Reporter: Aditya Auradkar
>    Assignee: Aditya Auradkar
>  Labels: quotas
>
> Currently, we don't have a means to measure the size of the metric window 
> since the final sample can be incomplete.
> Expose windowSize(now) on Measurable





[jira] [Commented] (KAFKA-2500) Make logEndOffset available in the 0.8.3 Consumer

2015-09-10 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739003#comment-14739003
 ] 

Aditya Auradkar commented on KAFKA-2500:


[~hachikuji] - Are you planning on supporting this in the new consumer for the 
first release? This is quite useful for a few teams within LinkedIn.

> Make logEndOffset available in the 0.8.3 Consumer
> -
>
> Key: KAFKA-2500
> URL: https://issues.apache.org/jira/browse/KAFKA-2500
> Project: Kafka
>  Issue Type: Sub-task
>  Components: consumer
>Affects Versions: 0.8.3
>Reporter: Will Funnell
>Assignee: Jason Gustafson
>Priority: Critical
> Fix For: 0.8.3
>
>
> Originally created in the old consumer here: 
> https://issues.apache.org/jira/browse/KAFKA-1977
> The requirement is to create a snapshot from the Kafka topic but NOT do 
> continual reads after that point. For example you might be creating a backup 
> of the data to a file.
> This ticket covers the addition of the functionality to the new consumer.
> In order to achieve that, a recommended solution by Joel Koshy and Jay Kreps 
> was to expose the high watermark, as maxEndOffset, from the FetchResponse 
> object through to each MessageAndMetadata object in order to be aware when 
> the consumer has reached the end of each partition.
> The submitted patch achieves this by adding the maxEndOffset to the 
> PartitionTopicInfo, which is updated when a new message arrives in the 
> ConsumerFetcherThread and then exposed in MessageAndMetadata.
> See here for discussion:
> http://search-hadoop.com/m/4TaT4TpJy71
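For reference, the use case amounts to "consume each partition up to its end
offset, then stop". A sketch against the Java consumer, assuming an
endOffsets-style API (the exact API shape was still under discussion at the
time, so treat these calls as assumptions):

    import java.time.Duration;
    import java.util.*;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;

    public class SnapshotReaderSketch {
        // Consume each partition up to its current end offset, then stop.
        static void snapshot(KafkaConsumer<byte[], byte[]> consumer,
                             List<TopicPartition> parts) {
            consumer.assign(parts);
            consumer.seekToBeginning(parts);
            Map<TopicPartition, Long> end = consumer.endOffsets(parts); // high watermarks

            Set<TopicPartition> remaining = new HashSet<>();
            for (TopicPartition tp : parts) {
                if (consumer.position(tp) < end.get(tp)) {
                    remaining.add(tp); // skip partitions that are already empty
                }
            }
            while (!remaining.isEmpty()) {
                for (ConsumerRecord<byte[], byte[]> rec : consumer.poll(Duration.ofMillis(500))) {
                    // ... write rec to the snapshot/backup file here ...
                    TopicPartition tp = new TopicPartition(rec.topic(), rec.partition());
                    if (rec.offset() + 1 >= end.get(tp)) {
                        remaining.remove(tp);
                    }
                }
            }
        }
    }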





[jira] [Commented] (KAFKA-2419) Allow certain Sensors to be garbage collected after inactivity

2015-09-10 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739324#comment-14739324
 ] 

Aditya Auradkar commented on KAFKA-2419:


Thanks, [~ijuma]. I'll take a look today. If I can indeed use your patch, when 
do you expect your commit to go through?

> Allow certain Sensors to be garbage collected after inactivity
> --
>
> Key: KAFKA-2419
> URL: https://issues.apache.org/jira/browse/KAFKA-2419
> Project: Kafka
>  Issue Type: New Feature
>Affects Versions: 0.8.3
>    Reporter: Aditya Auradkar
>    Assignee: Aditya Auradkar
>Priority: Blocker
>  Labels: quotas
> Fix For: 0.8.3
>
>
> Currently, metrics cannot be removed once registered. 
> Implement a feature to remove certain sensors after a certain period of 
> inactivity (perhaps configurable).





[jira] [Commented] (KAFKA-2528) Quota Performance Evaluation

2015-09-10 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739899#comment-14739899
 ] 

Aditya Auradkar commented on KAFKA-2528:


One possible explanation for the difference is that we append to the log when 
the produce request is received. For example, in your experiment you have 12 
mirror makers, each sending a batch of data. When a batch is recorded, the 
clients get throttled until the measured rate is back within the quota. After 
receiving a response, each of them immediately sends another large batch to 
the brokers. Because the quota is so low and the request size can be much 
larger, there is a small absolute difference in this example, corresponding to 
the maximum size of the received request. I think if the throughput is 
measured over a period of time from the client's perspective, it will be very 
similar to the 1MB quota.
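
For reference, the throttling math is roughly: delay the response long enough
that the client's bytes, re-measured over (window + delay), fall back to the
quota. A simplified sketch of that relationship (not the exact
ClientQuotaManager formula):

    // Simplified: solve observedRate * window / (window + delay) = quota.
    static long delayMs(double observedRate, double quota, long windowMs) {
        if (observedRate <= quota) {
            return 0; // within quota, no throttling
        }
        return (long) ((observedRate - quota) / quota * windowMs);
    }

This also shows why a single oversized batch can briefly overshoot: the
overshoot is bounded by the size of one request, matching the behavior
described above.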



> Quota Performance Evaluation
> 
>
> Key: KAFKA-2528
> URL: https://issues.apache.org/jira/browse/KAFKA-2528
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Dong Lin
>Assignee: Dong Lin
> Attachments: QuotaPerformanceEvaluation.pdf
>
>
> In this document we present the results of experiments we did at LinkedIn, to 
> validate the basic functionality of quota, as well as the performances 
> benefits of using quota in a heterogeneous multi-tenant environment.





[jira] [Commented] (KAFKA-2528) Quota Performance Evaluation

2015-09-10 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14739438#comment-14739438
 ] 

Aditya Auradkar commented on KAFKA-2528:


I'm not quite sure why the actual rate is higher in this particular case. It 
seems to be a lot closer in the other tests Dong has posted. The difference is 
likely because of some measurement issue, perhaps a test issue. It should be 
straightforward to reproduce this.

> Quota Performance Evaluation
> 
>
> Key: KAFKA-2528
> URL: https://issues.apache.org/jira/browse/KAFKA-2528
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Dong Lin
>Assignee: Dong Lin
> Attachments: QuotaPerformanceEvaluation.pdf
>
>
> In this document we present the results of experiments we did at LinkedIn, to 
> validate the basic functionality of quota, as well as the performances 
> benefits of using quota in a heterogeneous multi-tenant environment.





[jira] [Commented] (KAFKA-2419) Allow certain Sensors to be garbage collected after inactivity

2015-09-09 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14737756#comment-14737756
 ] 

Aditya Auradkar commented on KAFKA-2419:


[~junrao][~jjkoshy] We have a couple of ways to solve this problem. Let me know 
which approach you prefer.

1. Generic, time-based
- In this approach, we can mark any sensor as "transient". After a certain 
period of inactivity, i.e. no record() call for x minutes, we remove the 
sensor and its associated metrics and unregister it from all MetricReporters. 
Here, we build the notion of time-based retention into the root Metrics object 
itself, i.e. periodically iterate through sensors and remove transient sensors 
that are deemed inactive. Is this a useful addition to the metrics library, or 
does it only make sense for quotas?

2. Time-based, but keep the time computation in the quota code
- In this approach, the ClientQuotaManager keeps track of the last record 
time and triggers deleteSensor requests to the Metrics object. The metrics 
library only needs to understand creation and deletion of sensors.

3. Explicit deletion
- (First suggested by Joel in KAFKA-2084.) Only delete sensors explicitly. For 
quotas, this involves keeping track of client disconnections. This is tricky 
to do correctly, because connections/disconnections are handled in the 
Selector per SocketChannel, and we would need to start tracking the number of 
connections per unique clientId.
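
A minimal sketch of option 1, assuming a hypothetical registry that a
background thread sweeps periodically (names illustrative, not the eventual
implementation):

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Sketch of option 1: per-sensor retention, swept by a periodic task.
    class ExpiringSensorRegistrySketch {
        static final class Entry {
            final long retentionMs;
            volatile long lastRecordMs;
            Entry(long retentionMs, long nowMs) {
                this.retentionMs = retentionMs;
                this.lastRecordMs = nowMs;
            }
        }

        private final Map<String, Entry> sensors = new ConcurrentHashMap<>();

        void register(String name, long retentionMs) {
            sensors.put(name, new Entry(retentionMs, System.currentTimeMillis()));
        }

        void record(String name) {
            Entry e = sensors.get(name);
            if (e != null) {
                e.lastRecordMs = System.currentTimeMillis(); // mark activity
            }
        }

        // Invoked periodically; removes transient sensors idle past retention.
        void purgeInactive() {
            long now = System.currentTimeMillis();
            sensors.entrySet().removeIf(
                en -> now - en.getValue().lastRecordMs > en.getValue().retentionMs);
        }
    }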

> Allow certain Sensors to be garbage collected after inactivity
> --
>
> Key: KAFKA-2419
> URL: https://issues.apache.org/jira/browse/KAFKA-2419
> Project: Kafka
>  Issue Type: New Feature
>Affects Versions: 0.8.3
>Reporter: Aditya Auradkar
>Assignee: Aditya Auradkar
>Priority: Blocker
>  Labels: quotas
> Fix For: 0.8.3
>
>
> Currently, metrics cannot be removed once registered. 
> Implement a feature to remove certain sensors after a certain period of 
> inactivity (perhaps configurable).





[jira] [Commented] (KAFKA-1215) Rack-Aware replica assignment option

2015-09-09 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14737918#comment-14737918
 ] 

Aditya Auradkar commented on KAFKA-1215:


[~allenxwang] Hi Allen. Thanks for the patch. Can you create a KIP to discuss 
the changes being proposed (since this patch adds configs and ZK structures)? 
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals

We are hoping to leverage this patch within LinkedIn as well.

> Rack-Aware replica assignment option
> 
>
> Key: KAFKA-1215
> URL: https://issues.apache.org/jira/browse/KAFKA-1215
> Project: Kafka
>  Issue Type: Improvement
>  Components: replication
>Affects Versions: 0.8.0
>Reporter: Joris Van Remoortere
>Assignee: Jun Rao
> Fix For: 0.9.0
>
> Attachments: rack_aware_replica_assignment_v1.patch, 
> rack_aware_replica_assignment_v2.patch
>
>
> Adding a rack-id to kafka config. This rack-id can be used during replica 
> assignment by using the max-rack-replication argument in the admin scripts 
> (create topic, etc.). By default the original replication assignment 
> algorithm is used because max-rack-replication defaults to -1. 
> max-rack-replication > -1 is not honored if you are doing manual replica 
> assignment (preferred).
> If this looks good I can add some test cases specific to the rack-aware 
> assignment.
> I can also port this to trunk. We are currently running 0.8.0 in production 
> and need this, so I wrote the patch against that.





[jira] [Commented] (KAFKA-2419) Allow certain Sensors to be garbage collected after inactivity

2015-09-09 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14737497#comment-14737497
 ] 

Aditya Auradkar commented on KAFKA-2419:


[~junrao] I'm working on this now and shall submit a patch within a few days.

> Allow certain Sensors to be garbage collected after inactivity
> --
>
> Key: KAFKA-2419
> URL: https://issues.apache.org/jira/browse/KAFKA-2419
> Project: Kafka
>  Issue Type: New Feature
>Affects Versions: 0.8.3
>    Reporter: Aditya Auradkar
>    Assignee: Aditya Auradkar
>Priority: Blocker
>  Labels: quotas
> Fix For: 0.8.3
>
>
> Currently, metrics cannot be removed once registered. 
> Implement a feature to remove certain sensors after a certain period of 
> inactivity (perhaps configurable).





Re: Maybe 0.8.3 should really be 0.9.0?

2015-09-08 Thread Aditya Auradkar
Hi Gwen,

I certainly think 0.9.0 is better than 0.8.3.
As regards semantic versioning, do we have a plan for a 1.0 release? IIUC,
compatibility rules don't really apply for pre-1.0 stuff. I'd argue that
Kafka already qualifies for 1.x.

Aditya

On Tue, Sep 8, 2015 at 10:26 AM, Gwen Shapira  wrote:

> We've been rather messy about this in the past, but I'm hoping to converge
> toward semantic versioning: http://semver.org/
>
> 0.9.0 will fit since we are adding new functionality in backward compatible
> manner.
>
> On Tue, Sep 8, 2015 at 10:23 AM, Flavio Junqueira  wrote:
>
> > Hi Gwen,
> >
> > What's the expected meaning of the individual digits of the version for
> > this community? Could you give me some insight here?
> >
> > -Flavio
> >
> > > On 08 Sep 2015, at 18:19, Gwen Shapira  wrote:
> > >
> > > Hi Kafka Fans,
> > >
> > > What do you think of making the next release (the one with security,
> new
> > > consumer, quotas, etc) a 0.9.0 instead of 0.8.3?
> > >
> > > It has lots of new features, and new consumer was pretty much scoped
> for
> > > 0.9.0, so it matches our original roadmap. I feel that so many awesome
> > > features deserve a better release number.
> > >
> > > The downside is mainly some confusion (we refer to 0.8.3 in bunch of
> > > places), and noisy emails from JIRA while we change "fix version" field
> > > everywhere.
> > >
> > > Thoughts?
> >
> >
>


[jira] [Resolved] (KAFKA-2136) Client side protocol changes to return quota delays

2015-09-02 Thread Aditya Auradkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditya Auradkar resolved KAFKA-2136.

Resolution: Fixed

> Client side protocol changes to return quota delays
> ---
>
> Key: KAFKA-2136
> URL: https://issues.apache.org/jira/browse/KAFKA-2136
> Project: Kafka
>  Issue Type: Sub-task
>    Reporter: Aditya Auradkar
>    Assignee: Aditya Auradkar
>Priority: Blocker
>  Labels: quotas
> Fix For: 0.8.3
>
> Attachments: KAFKA-2136.patch, KAFKA-2136_2015-05-06_18:32:48.patch, 
> KAFKA-2136_2015-05-06_18:35:54.patch, KAFKA-2136_2015-05-11_14:50:56.patch, 
> KAFKA-2136_2015-05-12_14:40:44.patch, KAFKA-2136_2015-06-09_10:07:13.patch, 
> KAFKA-2136_2015-06-09_10:10:25.patch, KAFKA-2136_2015-06-30_19:43:55.patch, 
> KAFKA-2136_2015-07-13_13:34:03.patch, KAFKA-2136_2015-08-18_13:19:57.patch, 
> KAFKA-2136_2015-08-18_13:24:00.patch, KAFKA-2136_2015-08-21_16:29:17.patch, 
> KAFKA-2136_2015-08-24_10:33:10.patch, KAFKA-2136_2015-08-25_11:29:52.patch
>
>
> As described in KIP-13, evolve the protocol to return a throttle_time_ms in 
> the Fetch and the ProduceResponse objects. Add client side metrics on the new 
> producer and consumer to expose the delay time.
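For the client-side half, a sketch of avg/max throttle-time sensors built on
the common metrics library (metric names are illustrative; only the
Metrics/Sensor/Avg/Max usage is taken from the library):

    import java.util.Collections;
    import org.apache.kafka.common.MetricName;
    import org.apache.kafka.common.metrics.Metrics;
    import org.apache.kafka.common.metrics.Sensor;
    import org.apache.kafka.common.metrics.stats.Avg;
    import org.apache.kafka.common.metrics.stats.Max;

    // Sketch: record the throttle_time_ms from each response into avg/max metrics.
    public class ThrottleMetricsSketch {
        private final Sensor throttleTime;

        public ThrottleMetricsSketch(Metrics metrics) {
            throttleTime = metrics.sensor("fetch-throttle-time");
            throttleTime.add(new MetricName("fetch-throttle-time-avg", "consumer-fetch",
                    "average fetch throttle time in ms",
                    Collections.<String, String>emptyMap()), new Avg());
            throttleTime.add(new MetricName("fetch-throttle-time-max", "consumer-fetch",
                    "maximum fetch throttle time in ms",
                    Collections.<String, String>emptyMap()), new Max());
        }

        public void recordThrottleTime(long throttleTimeMs) {
            throttleTime.record(throttleTimeMs);
        }
    }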





[jira] [Updated] (KAFKA-2502) Quotas documentation for 0.8.3

2015-09-02 Thread Aditya Auradkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditya Auradkar updated KAFKA-2502:
---
Labels: quotas  (was: )

> Quotas documentation for 0.8.3
> --
>
> Key: KAFKA-2502
> URL: https://issues.apache.org/jira/browse/KAFKA-2502
> Project: Kafka
>  Issue Type: Task
>    Reporter: Aditya Auradkar
>    Assignee: Aditya Auradkar
>Priority: Blocker
>  Labels: quotas
> Fix For: 0.8.3
>
>
> Complete quotas documentation





[jira] [Created] (KAFKA-2502) Quotas documentation for 0.8.3

2015-09-02 Thread Aditya Auradkar (JIRA)
Aditya Auradkar created KAFKA-2502:
--

 Summary: Quotas documentation for 0.8.3
 Key: KAFKA-2502
 URL: https://issues.apache.org/jira/browse/KAFKA-2502
 Project: Kafka
  Issue Type: Task
Reporter: Aditya Auradkar
Assignee: Aditya Auradkar
Priority: Blocker
 Fix For: 0.8.3


Complete quotas documentation





[jira] [Updated] (KAFKA-2209) Change client quotas dynamically using DynamicConfigManager

2015-09-02 Thread Aditya Auradkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditya Auradkar updated KAFKA-2209:
---
Labels: quotas  (was: )

> Change client quotas dynamically using DynamicConfigManager
> ---
>
> Key: KAFKA-2209
> URL: https://issues.apache.org/jira/browse/KAFKA-2209
> Project: Kafka
>  Issue Type: Sub-task
>    Reporter: Aditya Auradkar
>    Assignee: Aditya Auradkar
>  Labels: quotas
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-21+-+Dynamic+Configuration





[jira] [Updated] (KAFKA-2443) Expose windowSize on Measurable

2015-09-02 Thread Aditya Auradkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditya Auradkar updated KAFKA-2443:
---
Labels: quotas  (was: )

> Expose windowSize on Measurable
> ---
>
> Key: KAFKA-2443
> URL: https://issues.apache.org/jira/browse/KAFKA-2443
> Project: Kafka
>  Issue Type: Task
>    Reporter: Aditya Auradkar
>    Assignee: Aditya Auradkar
>  Labels: quotas
>
> Currently, we don't have a means to measure the size of the metric window 
> since the final sample can be incomplete.
> Expose windowSize(now) on Measurable





[GitHub] kafka pull request: Fixing KAFKA-2309

2015-09-01 Thread auradkar
GitHub user auradkar opened a pull request:

https://github.com/apache/kafka/pull/185

Fixing KAFKA-2309

Currently, a LeaderAndIsrRequest does not mark the isrShrinkRate if the 
received ISR is smaller than the existing ISR. This can happen if one of the 
replicas is shut down.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/auradkar/kafka K-2309

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/185.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #185








Open source Mysql -> Kafka connector

2015-09-01 Thread Aditya Auradkar
I thought this would be of interest:
https://developer.zendesk.com/blog/introducing-maxwell-a-mysql-to-kafka-binlog-processor

A copycat connector that parses MySQL binlogs would be rather useful, I
think. Streaming connectors using JDBC are tricky to implement because they
rely on an indexed timestamp column always being present.
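
To illustrate the JDBC limitation: an incremental poll has to page on some
monotonically increasing, indexed column, e.g. a last-updated timestamp. A
sketch with hypothetical table/column names:

    import java.sql.*;

    // Sketch: timestamp-based incremental polling only works if every table
    // has an indexed "updated_at"-style column to page on.
    public class JdbcIncrementalPollSketch {
        static void pollOnce(Connection conn, Timestamp lastSeen) throws SQLException {
            String sql = "SELECT id, payload, updated_at FROM events "
                       + "WHERE updated_at > ? ORDER BY updated_at ASC";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setTimestamp(1, lastSeen);
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        // ... produce the row to Kafka; track max(updated_at) ...
                    }
                }
            }
        }
    }

A binlog-based reader like Maxwell sidesteps this, since every committed
change appears in the log whether or not the schema has such a column.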

Thanks,
Aditya


Re: no Kafka KIP meeting tomorrow

2015-09-01 Thread Aditya Auradkar
Typically, it's to discuss anything listed here or other important changes:
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals

On Tue, Sep 1, 2015 at 1:28 AM, jinxing  wrote:

> Can I know what a KIP meeting is?
>
>
>
>
>
>
> At 2015-09-01 12:30:18, "Jun Rao"  wrote:
> >Since there are no new KIP issues for discussion, again, there is no
> >KIP meeting
> >tomorrow.
> >
> >Thanks,
> >
> >Jun
>


Re: Review Request 33378: Patch for KAFKA-2136

2015-08-25 Thread Aditya Auradkar


 On Aug. 22, 2015, 12:45 a.m., Joel Koshy wrote:
  clients/src/main/java/org/apache/kafka/common/requests/FetchResponse.java, 
  line 107
  https://reviews.apache.org/r/33378/diff/12-13/?file=1043780#file1043780line107
 
  This will probably need a versionId as well (as is done in the Scala 
  response) - i.e., when we move the broker over to use these protocol 
  schemas.
 
 Aditya Auradkar wrote:
 Makes sense. Do you want me to tackle this in this patch or should it be 
 tackled in the patch that migrates the broker to use these schemas?
 
 Joel Koshy wrote:
 I think it would be safer to do it in this patch itself.

Do you think we actually need to check the version id? The Struct contains the 
schema, which should be sufficient to determine the version number, right?


- Aditya


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33378/#review96100
---


On Aug. 24, 2015, 5:33 p.m., Aditya Auradkar wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/33378/
 ---
 
 (Updated Aug. 24, 2015, 5:33 p.m.)
 
 
 Review request for kafka, Joel Koshy and Jun Rao.
 
 
 Bugs: KAFKA-2136
 https://issues.apache.org/jira/browse/KAFKA-2136
 
 
 Repository: kafka
 
 
 Description
 ---
 
 Changes are
 - Addressing Joel's comments
 - protocol changes to the fetch request and response to return the 
 throttle_time_ms to clients
 - New producer/consumer metrics to expose the avg and max delay time for a 
 client
 - Test cases.
- Addressed Joel's and Jun's comments
 
 
 Diffs
 -
 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java
  9dc669728e6b052f5c6686fcf1b5696a50538ab4 
   
 clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java 
 0baf16e55046a2f49f6431e01d52c323c95eddf0 
   clients/src/main/java/org/apache/kafka/common/protocol/Protocol.java 
 3dc8b015afd2347a41c9a9dbc02b8e367da5f75f 
   clients/src/main/java/org/apache/kafka/common/requests/FetchRequest.java 
 df073a0e76cc5cc731861b9604d0e19a928970e0 
   clients/src/main/java/org/apache/kafka/common/requests/FetchResponse.java 
 eb8951fba48c335095cc43fc3672de1c733e07ff 
   clients/src/main/java/org/apache/kafka/common/requests/ProduceRequest.java 
 715504b32950666e9aa5a260fa99d5f897b2007a 
   clients/src/main/java/org/apache/kafka/common/requests/ProduceResponse.java 
 febfc70dabc23671fd8a85cf5c5b274dff1e10fb 
   
 clients/src/test/java/org/apache/kafka/clients/consumer/internals/FetcherTest.java
  a7c83cac47d41138d47d7590a3787432d675c1b0 
   
 clients/src/test/java/org/apache/kafka/clients/producer/internals/SenderTest.java
  8b1805d3d2bcb9fe2bacb37d870c3236aa9532c4 
   
 clients/src/test/java/org/apache/kafka/common/requests/RequestResponseTest.java
  8b2aca85fa738180e5420985fddc39a4bf9681ea 
   core/src/main/scala/kafka/api/FetchRequest.scala 
 5b38f8554898e54800abd65a7415dd0ac41fd958 
   core/src/main/scala/kafka/api/FetchResponse.scala 
 b9efec2efbd3ea0ee12b911f453c47e66ad34894 
   core/src/main/scala/kafka/api/ProducerRequest.scala 
 c866180d3680da03e48d374415f10220f6ca68c4 
   core/src/main/scala/kafka/api/ProducerResponse.scala 
 5d1fac4cb8943f5bfaa487f8e9d9d2856efbd330 
   core/src/main/scala/kafka/consumer/SimpleConsumer.scala 
 7ebc0405d1f309bed9943e7116051d1d8276f200 
   core/src/main/scala/kafka/server/AbstractFetcherThread.scala 
 f84306143c43049e3aa44e42beaffe7eb2783163 
   core/src/main/scala/kafka/server/ClientQuotaManager.scala 
 9f8473f1c64d10c04cf8cc91967688e29e54ae2e 
   core/src/main/scala/kafka/server/KafkaApis.scala 
 67f0cad802f901f255825aa2158545d7f5e7cc3d 
   core/src/main/scala/kafka/server/ReplicaFetcherThread.scala 
 fae22d2af8daccd528ac24614290f46ae8f6c797 
   core/src/main/scala/kafka/server/ReplicaManager.scala 
 d829e180c3943a90861a12ec184f9b4e4bbafe7d 
   core/src/main/scala/kafka/server/ThrottledResponse.scala 
 1f80d5480ccf7c411a02dd90296a7046ede0fae2 
   core/src/test/scala/unit/kafka/api/RequestResponseSerializationTest.scala 
 b4c2a228c3c9872e5817ac58d3022e4903e317b7 
   core/src/test/scala/unit/kafka/server/ClientQuotaManagerTest.scala 
 caf98e8f2e09d39ab8234b9f4b9478686865e908 
   core/src/test/scala/unit/kafka/server/ThrottledResponseExpirationTest.scala 
 c4b5803917e700965677d53624f1460c1a52bf52 
 
 Diff: https://reviews.apache.org/r/33378/diff/
 
 
 Testing
 ---
 
 New tests added
 
 
 Thanks,
 
 Aditya Auradkar
 




[jira] [Commented] (KAFKA-2136) Client side protocol changes to return quota delays

2015-08-25 Thread Aditya A Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14711745#comment-14711745
 ] 

Aditya A Auradkar commented on KAFKA-2136:
--

Updated reviewboard https://reviews.apache.org/r/33378/diff/
 against branch origin/trunk

 Client side protocol changes to return quota delays
 ---

 Key: KAFKA-2136
 URL: https://issues.apache.org/jira/browse/KAFKA-2136
 Project: Kafka
  Issue Type: Sub-task
Reporter: Aditya Auradkar
Assignee: Aditya Auradkar
  Labels: quotas
 Attachments: KAFKA-2136.patch, KAFKA-2136_2015-05-06_18:32:48.patch, 
 KAFKA-2136_2015-05-06_18:35:54.patch, KAFKA-2136_2015-05-11_14:50:56.patch, 
 KAFKA-2136_2015-05-12_14:40:44.patch, KAFKA-2136_2015-06-09_10:07:13.patch, 
 KAFKA-2136_2015-06-09_10:10:25.patch, KAFKA-2136_2015-06-30_19:43:55.patch, 
 KAFKA-2136_2015-07-13_13:34:03.patch, KAFKA-2136_2015-08-18_13:19:57.patch, 
 KAFKA-2136_2015-08-18_13:24:00.patch, KAFKA-2136_2015-08-21_16:29:17.patch, 
 KAFKA-2136_2015-08-24_10:33:10.patch, KAFKA-2136_2015-08-25_11:29:52.patch


 As described in KIP-13, evolve the protocol to return a throttle_time_ms in 
 the Fetch and the ProduceResponse objects. Add client side metrics on the new 
 producer and consumer to expose the delay time.





Re: Review Request 33378: Patch for KAFKA-2136

2015-08-25 Thread Aditya Auradkar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33378/
---

(Updated Aug. 25, 2015, 6:30 p.m.)


Review request for kafka, Joel Koshy and Jun Rao.


Bugs: KAFKA-2136
https://issues.apache.org/jira/browse/KAFKA-2136


Repository: kafka


Description (updated)
---

Addressing Joel's comments


Merging


Changing variable name


Addressing Joel's comments


Addressing Joel's comments


Addressing comments


Addressing Joel's comments


Addressing Joel's comments


Addressed Joel's comments


Diffs (updated)
-

  
clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java 
9dc669728e6b052f5c6686fcf1b5696a50538ab4 
  clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java 
0baf16e55046a2f49f6431e01d52c323c95eddf0 
  clients/src/main/java/org/apache/kafka/common/protocol/Protocol.java 
3dc8b015afd2347a41c9a9dbc02b8e367da5f75f 
  clients/src/main/java/org/apache/kafka/common/requests/FetchRequest.java 
df073a0e76cc5cc731861b9604d0e19a928970e0 
  clients/src/main/java/org/apache/kafka/common/requests/FetchResponse.java 
eb8951fba48c335095cc43fc3672de1c733e07ff 
  clients/src/main/java/org/apache/kafka/common/requests/ProduceRequest.java 
715504b32950666e9aa5a260fa99d5f897b2007a 
  clients/src/main/java/org/apache/kafka/common/requests/ProduceResponse.java 
febfc70dabc23671fd8a85cf5c5b274dff1e10fb 
  
clients/src/test/java/org/apache/kafka/clients/consumer/internals/FetcherTest.java
 a7c83cac47d41138d47d7590a3787432d675c1b0 
  
clients/src/test/java/org/apache/kafka/clients/producer/internals/SenderTest.java
 8b1805d3d2bcb9fe2bacb37d870c3236aa9532c4 
  
clients/src/test/java/org/apache/kafka/common/requests/RequestResponseTest.java 
8b2aca85fa738180e5420985fddc39a4bf9681ea 
  core/src/main/scala/kafka/api/FetchRequest.scala 
5b38f8554898e54800abd65a7415dd0ac41fd958 
  core/src/main/scala/kafka/api/FetchResponse.scala 
b9efec2efbd3ea0ee12b911f453c47e66ad34894 
  core/src/main/scala/kafka/api/ProducerRequest.scala 
c866180d3680da03e48d374415f10220f6ca68c4 
  core/src/main/scala/kafka/api/ProducerResponse.scala 
5d1fac4cb8943f5bfaa487f8e9d9d2856efbd330 
  core/src/main/scala/kafka/consumer/SimpleConsumer.scala 
7ebc0405d1f309bed9943e7116051d1d8276f200 
  core/src/main/scala/kafka/server/AbstractFetcherThread.scala 
f84306143c43049e3aa44e42beaffe7eb2783163 
  core/src/main/scala/kafka/server/ClientQuotaManager.scala 
9f8473f1c64d10c04cf8cc91967688e29e54ae2e 
  core/src/main/scala/kafka/server/KafkaApis.scala 
67f0cad802f901f255825aa2158545d7f5e7cc3d 
  core/src/main/scala/kafka/server/ReplicaFetcherThread.scala 
fae22d2af8daccd528ac24614290f46ae8f6c797 
  core/src/main/scala/kafka/server/ReplicaManager.scala 
d829e180c3943a90861a12ec184f9b4e4bbafe7d 
  core/src/main/scala/kafka/server/ThrottledResponse.scala 
1f80d5480ccf7c411a02dd90296a7046ede0fae2 
  core/src/test/scala/unit/kafka/api/RequestResponseSerializationTest.scala 
b4c2a228c3c9872e5817ac58d3022e4903e317b7 
  core/src/test/scala/unit/kafka/server/ClientQuotaManagerTest.scala 
caf98e8f2e09d39ab8234b9f4b9478686865e908 
  core/src/test/scala/unit/kafka/server/ThrottledResponseExpirationTest.scala 
c4b5803917e700965677d53624f1460c1a52bf52 

Diff: https://reviews.apache.org/r/33378/diff/


Testing
---

New tests added


Thanks,

Aditya Auradkar



[jira] [Updated] (KAFKA-2136) Client side protocol changes to return quota delays

2015-08-25 Thread Aditya A Auradkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditya A Auradkar updated KAFKA-2136:
-
Attachment: KAFKA-2136_2015-08-25_11:29:52.patch

 Client side protocol changes to return quota delays
 ---

 Key: KAFKA-2136
 URL: https://issues.apache.org/jira/browse/KAFKA-2136
 Project: Kafka
  Issue Type: Sub-task
Reporter: Aditya Auradkar
Assignee: Aditya Auradkar
  Labels: quotas
 Attachments: KAFKA-2136.patch, KAFKA-2136_2015-05-06_18:32:48.patch, 
 KAFKA-2136_2015-05-06_18:35:54.patch, KAFKA-2136_2015-05-11_14:50:56.patch, 
 KAFKA-2136_2015-05-12_14:40:44.patch, KAFKA-2136_2015-06-09_10:07:13.patch, 
 KAFKA-2136_2015-06-09_10:10:25.patch, KAFKA-2136_2015-06-30_19:43:55.patch, 
 KAFKA-2136_2015-07-13_13:34:03.patch, KAFKA-2136_2015-08-18_13:19:57.patch, 
 KAFKA-2136_2015-08-18_13:24:00.patch, KAFKA-2136_2015-08-21_16:29:17.patch, 
 KAFKA-2136_2015-08-24_10:33:10.patch, KAFKA-2136_2015-08-25_11:29:52.patch


 As described in KIP-13, evolve the protocol to return a throttle_time_ms in 
 the Fetch and the ProduceResponse objects. Add client side metrics on the new 
 producer and consumer to expose the delay time.





Re: Review Request 33378: Patch for KAFKA-2136

2015-08-25 Thread Aditya Auradkar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33378/
---

(Updated Aug. 25, 2015, 6:30 p.m.)


Review request for kafka, Joel Koshy and Jun Rao.


Bugs: KAFKA-2136
https://issues.apache.org/jira/browse/KAFKA-2136


Repository: kafka


Description (updated)
---

Changes are
- Addressing Joel's comments
- protocol changes to the fetch request and response to return the 
throttle_time_ms to clients
- New producer/consumer metrics to expose the avg and max delay time for a 
client
- Test cases.
- Addressed Joel's and Jun's comments


Diffs
-

  
clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java 
9dc669728e6b052f5c6686fcf1b5696a50538ab4 
  clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java 
0baf16e55046a2f49f6431e01d52c323c95eddf0 
  clients/src/main/java/org/apache/kafka/common/protocol/Protocol.java 
3dc8b015afd2347a41c9a9dbc02b8e367da5f75f 
  clients/src/main/java/org/apache/kafka/common/requests/FetchRequest.java 
df073a0e76cc5cc731861b9604d0e19a928970e0 
  clients/src/main/java/org/apache/kafka/common/requests/FetchResponse.java 
eb8951fba48c335095cc43fc3672de1c733e07ff 
  clients/src/main/java/org/apache/kafka/common/requests/ProduceRequest.java 
715504b32950666e9aa5a260fa99d5f897b2007a 
  clients/src/main/java/org/apache/kafka/common/requests/ProduceResponse.java 
febfc70dabc23671fd8a85cf5c5b274dff1e10fb 
  
clients/src/test/java/org/apache/kafka/clients/consumer/internals/FetcherTest.java
 a7c83cac47d41138d47d7590a3787432d675c1b0 
  
clients/src/test/java/org/apache/kafka/clients/producer/internals/SenderTest.java
 8b1805d3d2bcb9fe2bacb37d870c3236aa9532c4 
  
clients/src/test/java/org/apache/kafka/common/requests/RequestResponseTest.java 
8b2aca85fa738180e5420985fddc39a4bf9681ea 
  core/src/main/scala/kafka/api/FetchRequest.scala 
5b38f8554898e54800abd65a7415dd0ac41fd958 
  core/src/main/scala/kafka/api/FetchResponse.scala 
b9efec2efbd3ea0ee12b911f453c47e66ad34894 
  core/src/main/scala/kafka/api/ProducerRequest.scala 
c866180d3680da03e48d374415f10220f6ca68c4 
  core/src/main/scala/kafka/api/ProducerResponse.scala 
5d1fac4cb8943f5bfaa487f8e9d9d2856efbd330 
  core/src/main/scala/kafka/consumer/SimpleConsumer.scala 
7ebc0405d1f309bed9943e7116051d1d8276f200 
  core/src/main/scala/kafka/server/AbstractFetcherThread.scala 
f84306143c43049e3aa44e42beaffe7eb2783163 
  core/src/main/scala/kafka/server/ClientQuotaManager.scala 
9f8473f1c64d10c04cf8cc91967688e29e54ae2e 
  core/src/main/scala/kafka/server/KafkaApis.scala 
67f0cad802f901f255825aa2158545d7f5e7cc3d 
  core/src/main/scala/kafka/server/ReplicaFetcherThread.scala 
fae22d2af8daccd528ac24614290f46ae8f6c797 
  core/src/main/scala/kafka/server/ReplicaManager.scala 
d829e180c3943a90861a12ec184f9b4e4bbafe7d 
  core/src/main/scala/kafka/server/ThrottledResponse.scala 
1f80d5480ccf7c411a02dd90296a7046ede0fae2 
  core/src/test/scala/unit/kafka/api/RequestResponseSerializationTest.scala 
b4c2a228c3c9872e5817ac58d3022e4903e317b7 
  core/src/test/scala/unit/kafka/server/ClientQuotaManagerTest.scala 
caf98e8f2e09d39ab8234b9f4b9478686865e908 
  core/src/test/scala/unit/kafka/server/ThrottledResponseExpirationTest.scala 
c4b5803917e700965677d53624f1460c1a52bf52 

Diff: https://reviews.apache.org/r/33378/diff/


Testing
---

New tests added


Thanks,

Aditya Auradkar



Re: Review Request 33378: Patch for KAFKA-2136

2015-08-24 Thread Aditya Auradkar


 On Aug. 22, 2015, 12:45 a.m., Joel Koshy wrote:
  clients/src/main/java/org/apache/kafka/common/requests/FetchResponse.java, 
  line 107
  https://reviews.apache.org/r/33378/diff/12-13/?file=1043780#file1043780line107
 
  This will probably need a versionId as well (as is done in the Scala 
  response) - i.e., when we move the broker over to use these protocol 
  schemas.

Makes sense. Do you want me to tackle this in this patch or should it be 
tackled in the patch that migrates the broker to use these schemas?


- Aditya


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33378/#review96100
---


On Aug. 24, 2015, 5:33 p.m., Aditya Auradkar wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/33378/
 ---
 
 (Updated Aug. 24, 2015, 5:33 p.m.)
 
 
 Review request for kafka, Joel Koshy and Jun Rao.
 
 
 Bugs: KAFKA-2136
 https://issues.apache.org/jira/browse/KAFKA-2136
 
 
 Repository: kafka
 
 
 Description
 ---
 
 Changes are
 - Addressing Joel's comments
 - protocol changes to the fetch request and response to return the 
 throttle_time_ms to clients
 - New producer/consumer metrics to expose the avg and max delay time for a 
 client
 - Test cases.
  - Addressed Joel's and Jun's comments
 
 
 Diffs
 -
 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java
  9dc669728e6b052f5c6686fcf1b5696a50538ab4 
   
 clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java 
 0baf16e55046a2f49f6431e01d52c323c95eddf0 
   clients/src/main/java/org/apache/kafka/common/protocol/Protocol.java 
 3dc8b015afd2347a41c9a9dbc02b8e367da5f75f 
   clients/src/main/java/org/apache/kafka/common/requests/FetchRequest.java 
 df073a0e76cc5cc731861b9604d0e19a928970e0 
   clients/src/main/java/org/apache/kafka/common/requests/FetchResponse.java 
 eb8951fba48c335095cc43fc3672de1c733e07ff 
   clients/src/main/java/org/apache/kafka/common/requests/ProduceRequest.java 
 715504b32950666e9aa5a260fa99d5f897b2007a 
   clients/src/main/java/org/apache/kafka/common/requests/ProduceResponse.java 
 febfc70dabc23671fd8a85cf5c5b274dff1e10fb 
   
 clients/src/test/java/org/apache/kafka/clients/consumer/internals/FetcherTest.java
  a7c83cac47d41138d47d7590a3787432d675c1b0 
   
 clients/src/test/java/org/apache/kafka/clients/producer/internals/SenderTest.java
  8b1805d3d2bcb9fe2bacb37d870c3236aa9532c4 
   
 clients/src/test/java/org/apache/kafka/common/requests/RequestResponseTest.java
  8b2aca85fa738180e5420985fddc39a4bf9681ea 
   core/src/main/scala/kafka/api/FetchRequest.scala 
 5b38f8554898e54800abd65a7415dd0ac41fd958 
   core/src/main/scala/kafka/api/FetchResponse.scala 
 b9efec2efbd3ea0ee12b911f453c47e66ad34894 
   core/src/main/scala/kafka/api/ProducerRequest.scala 
 c866180d3680da03e48d374415f10220f6ca68c4 
   core/src/main/scala/kafka/api/ProducerResponse.scala 
 5d1fac4cb8943f5bfaa487f8e9d9d2856efbd330 
   core/src/main/scala/kafka/consumer/SimpleConsumer.scala 
 7ebc0405d1f309bed9943e7116051d1d8276f200 
   core/src/main/scala/kafka/server/AbstractFetcherThread.scala 
 f84306143c43049e3aa44e42beaffe7eb2783163 
   core/src/main/scala/kafka/server/ClientQuotaManager.scala 
 9f8473f1c64d10c04cf8cc91967688e29e54ae2e 
   core/src/main/scala/kafka/server/KafkaApis.scala 
 67f0cad802f901f255825aa2158545d7f5e7cc3d 
   core/src/main/scala/kafka/server/ReplicaFetcherThread.scala 
 fae22d2af8daccd528ac24614290f46ae8f6c797 
   core/src/main/scala/kafka/server/ReplicaManager.scala 
 d829e180c3943a90861a12ec184f9b4e4bbafe7d 
   core/src/main/scala/kafka/server/ThrottledResponse.scala 
 1f80d5480ccf7c411a02dd90296a7046ede0fae2 
   core/src/test/scala/unit/kafka/api/RequestResponseSerializationTest.scala 
 b4c2a228c3c9872e5817ac58d3022e4903e317b7 
   core/src/test/scala/unit/kafka/server/ClientQuotaManagerTest.scala 
 caf98e8f2e09d39ab8234b9f4b9478686865e908 
   core/src/test/scala/unit/kafka/server/ThrottledResponseExpirationTest.scala 
 c4b5803917e700965677d53624f1460c1a52bf52 
 
 Diff: https://reviews.apache.org/r/33378/diff/
 
 
 Testing
 ---
 
 New tests added
 
 
 Thanks,
 
 Aditya Auradkar
 




[jira] [Updated] (KAFKA-2136) Client side protocol changes to return quota delays

2015-08-24 Thread Aditya A Auradkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditya A Auradkar updated KAFKA-2136:
-
Attachment: KAFKA-2136_2015-08-24_10:33:10.patch

 Client side protocol changes to return quota delays
 ---

 Key: KAFKA-2136
 URL: https://issues.apache.org/jira/browse/KAFKA-2136
 Project: Kafka
  Issue Type: Sub-task
Reporter: Aditya Auradkar
Assignee: Aditya Auradkar
  Labels: quotas
 Attachments: KAFKA-2136.patch, KAFKA-2136_2015-05-06_18:32:48.patch, 
 KAFKA-2136_2015-05-06_18:35:54.patch, KAFKA-2136_2015-05-11_14:50:56.patch, 
 KAFKA-2136_2015-05-12_14:40:44.patch, KAFKA-2136_2015-06-09_10:07:13.patch, 
 KAFKA-2136_2015-06-09_10:10:25.patch, KAFKA-2136_2015-06-30_19:43:55.patch, 
 KAFKA-2136_2015-07-13_13:34:03.patch, KAFKA-2136_2015-08-18_13:19:57.patch, 
 KAFKA-2136_2015-08-18_13:24:00.patch, KAFKA-2136_2015-08-21_16:29:17.patch, 
 KAFKA-2136_2015-08-24_10:33:10.patch


 As described in KIP-13, evolve the protocol to return a throttle_time_ms in 
 the Fetch and the ProduceResponse objects. Add client side metrics on the new 
 producer and consumer to expose the delay time.
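
 For illustration, a self-contained sketch of the avg/max bookkeeping such 
 metrics perform. This is not the actual Kafka Metrics API; the class and 
 method names below are assumptions:

     // Tracks the throttle times reported in responses; avg/max are what
     // the new producer/consumer metrics would expose.
     class ThrottleTimeSensor {
       private var count = 0L
       private var sum = 0L
       private var maxMs = 0L

       // record one response's throttle_time_ms as it arrives from the broker
       def record(throttleTimeMs: Long): Unit = synchronized {
         count += 1
         sum += throttleTimeMs
         maxMs = math.max(maxMs, throttleTimeMs)
       }

       def avg: Double =
         synchronized { if (count == 0) 0.0 else sum.toDouble / count }

       def max: Long = synchronized { maxMs }
     }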



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2136) Client side protocol changes to return quota delays

2015-08-24 Thread Aditya A Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14709677#comment-14709677
 ] 

Aditya A Auradkar commented on KAFKA-2136:
--

Updated reviewboard https://reviews.apache.org/r/33378/diff/
 against branch origin/trunk

 Client side protocol changes to return quota delays
 ---

 Key: KAFKA-2136
 URL: https://issues.apache.org/jira/browse/KAFKA-2136
 Project: Kafka
  Issue Type: Sub-task
Reporter: Aditya Auradkar
Assignee: Aditya Auradkar
  Labels: quotas
 Attachments: KAFKA-2136.patch, KAFKA-2136_2015-05-06_18:32:48.patch, 
 KAFKA-2136_2015-05-06_18:35:54.patch, KAFKA-2136_2015-05-11_14:50:56.patch, 
 KAFKA-2136_2015-05-12_14:40:44.patch, KAFKA-2136_2015-06-09_10:07:13.patch, 
 KAFKA-2136_2015-06-09_10:10:25.patch, KAFKA-2136_2015-06-30_19:43:55.patch, 
 KAFKA-2136_2015-07-13_13:34:03.patch, KAFKA-2136_2015-08-18_13:19:57.patch, 
 KAFKA-2136_2015-08-18_13:24:00.patch, KAFKA-2136_2015-08-21_16:29:17.patch, 
 KAFKA-2136_2015-08-24_10:33:10.patch


 As described in KIP-13, evolve the protocol to return a throttle_time_ms in 
 the Fetch and the ProduceResponse objects. Add client side metrics on the new 
 producer and consumer to expose the delay time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 33378: Patch for KAFKA-2136

2015-08-24 Thread Aditya Auradkar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33378/
---

(Updated Aug. 24, 2015, 5:33 p.m.)


Review request for kafka, Joel Koshy and Jun Rao.


Bugs: KAFKA-2136
https://issues.apache.org/jira/browse/KAFKA-2136


Repository: kafka


Description (updated)
---

Changes are
- Addressing Joel's comments
- protocol changes to the fetch request and response to return the 
throttle_time_ms to clients
- New producer/consumer metrics to expose the avg and max delay time for a 
client
- Test cases.
- Addressed Joel's and Jun's comments


Diffs
-

  
clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java 
9dc669728e6b052f5c6686fcf1b5696a50538ab4 
  clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java 
0baf16e55046a2f49f6431e01d52c323c95eddf0 
  clients/src/main/java/org/apache/kafka/common/protocol/Protocol.java 
3dc8b015afd2347a41c9a9dbc02b8e367da5f75f 
  clients/src/main/java/org/apache/kafka/common/requests/FetchRequest.java 
df073a0e76cc5cc731861b9604d0e19a928970e0 
  clients/src/main/java/org/apache/kafka/common/requests/FetchResponse.java 
eb8951fba48c335095cc43fc3672de1c733e07ff 
  clients/src/main/java/org/apache/kafka/common/requests/ProduceRequest.java 
715504b32950666e9aa5a260fa99d5f897b2007a 
  clients/src/main/java/org/apache/kafka/common/requests/ProduceResponse.java 
febfc70dabc23671fd8a85cf5c5b274dff1e10fb 
  
clients/src/test/java/org/apache/kafka/clients/consumer/internals/FetcherTest.java
 a7c83cac47d41138d47d7590a3787432d675c1b0 
  
clients/src/test/java/org/apache/kafka/clients/producer/internals/SenderTest.java
 8b1805d3d2bcb9fe2bacb37d870c3236aa9532c4 
  
clients/src/test/java/org/apache/kafka/common/requests/RequestResponseTest.java 
8b2aca85fa738180e5420985fddc39a4bf9681ea 
  core/src/main/scala/kafka/api/FetchRequest.scala 
5b38f8554898e54800abd65a7415dd0ac41fd958 
  core/src/main/scala/kafka/api/FetchResponse.scala 
b9efec2efbd3ea0ee12b911f453c47e66ad34894 
  core/src/main/scala/kafka/api/ProducerRequest.scala 
c866180d3680da03e48d374415f10220f6ca68c4 
  core/src/main/scala/kafka/api/ProducerResponse.scala 
5d1fac4cb8943f5bfaa487f8e9d9d2856efbd330 
  core/src/main/scala/kafka/consumer/SimpleConsumer.scala 
7ebc0405d1f309bed9943e7116051d1d8276f200 
  core/src/main/scala/kafka/server/AbstractFetcherThread.scala 
f84306143c43049e3aa44e42beaffe7eb2783163 
  core/src/main/scala/kafka/server/ClientQuotaManager.scala 
9f8473f1c64d10c04cf8cc91967688e29e54ae2e 
  core/src/main/scala/kafka/server/KafkaApis.scala 
67f0cad802f901f255825aa2158545d7f5e7cc3d 
  core/src/main/scala/kafka/server/ReplicaFetcherThread.scala 
fae22d2af8daccd528ac24614290f46ae8f6c797 
  core/src/main/scala/kafka/server/ReplicaManager.scala 
d829e180c3943a90861a12ec184f9b4e4bbafe7d 
  core/src/main/scala/kafka/server/ThrottledResponse.scala 
1f80d5480ccf7c411a02dd90296a7046ede0fae2 
  core/src/test/scala/unit/kafka/api/RequestResponseSerializationTest.scala 
b4c2a228c3c9872e5817ac58d3022e4903e317b7 
  core/src/test/scala/unit/kafka/server/ClientQuotaManagerTest.scala 
caf98e8f2e09d39ab8234b9f4b9478686865e908 
  core/src/test/scala/unit/kafka/server/ThrottledResponseExpirationTest.scala 
c4b5803917e700965677d53624f1460c1a52bf52 

Diff: https://reviews.apache.org/r/33378/diff/


Testing
---

New tests added


Thanks,

Aditya Auradkar



[GitHub] kafka pull request: KAFKA-2442: Fixing transiently failing test

2015-08-21 Thread auradkar
GitHub user auradkar opened a pull request:

https://github.com/apache/kafka/pull/160

KAFKA-2442: Fixing transiently failing test

Made the following changes:
1. Made the quotas very small (100 and 10 bytes/sec for the producer and 
consumer respectively)
2. For the producer, I'm asserting the throttle_time with a timed loop 
using waitUntilTrue
3. For the consumer, I'm simply calling a timed poll in a loop until the 
check on the server-side throttle time metric returns true (a standalone 
sketch of this polling pattern follows below)
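
A standalone sketch of the polling pattern (kafka.utils.TestUtils provides a 
similar waitUntilTrue helper; the version below is illustrative and 
self-contained):

    object PollingAssertions {
      // Poll a condition until it holds or the deadline passes, instead of
      // asserting on a single sample that can miss on a slow machine.
      def waitUntilTrue(condition: () => Boolean, msg: String,
                        timeoutMs: Long = 15000L, pauseMs: Long = 100L): Unit = {
        val deadline = System.currentTimeMillis() + timeoutMs
        while (!condition()) {
          if (System.currentTimeMillis() > deadline)
            throw new AssertionError(msg)
          Thread.sleep(pauseMs)
        }
      }
    }

    // usage, with a hypothetical metric accessor:
    // PollingAssertions.waitUntilTrue(() => throttleTimeMetricValue() > 0,
    //   "producer was never throttled")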

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/auradkar/kafka failing-test

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/160.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #160


commit cd60472049dee0f515d4fc3f87bc1d6bba1eb923
Author: Aditya Auradkar aaurad...@linkedin.com
Date:   2015-08-21T21:21:51Z

KAFKA-2442: Fixing transiently failing test




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Assigned] (KAFKA-2442) QuotasTest should not fail when cpu is busy

2015-08-21 Thread Aditya Auradkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditya Auradkar reassigned KAFKA-2442:
--

Assignee: Aditya Auradkar  (was: Dong Lin)

 QuotasTest should not fail when cpu is busy
 ---

 Key: KAFKA-2442
 URL: https://issues.apache.org/jira/browse/KAFKA-2442
 Project: Kafka
  Issue Type: Bug
Reporter: Dong Lin
Assignee: Aditya Auradkar
 Fix For: 0.8.3


 We observed that testThrottledProducerConsumer in QuotasTest may fail or 
 succeed randomly. It appears that the test may fail when the system is slow. 
 We can add a timer in the integration test to avoid such random failures.
 See an example failure at 
 https://builds.apache.org/job/kafka-trunk-git-pr/166/console for patch 
 https://github.com/apache/kafka/pull/142.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2442) QuotasTest should not fail when cpu is busy

2015-08-21 Thread Aditya Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14707492#comment-14707492
 ] 

Aditya Auradkar commented on KAFKA-2442:


[~jjkoshy] - Can you take a look?

 QuotasTest should not fail when cpu is busy
 ---

 Key: KAFKA-2442
 URL: https://issues.apache.org/jira/browse/KAFKA-2442
 Project: Kafka
  Issue Type: Bug
Reporter: Dong Lin
Assignee: Aditya Auradkar
 Fix For: 0.8.3


 We observed that testThrottledProducerConsumer in QuotasTest may fail or 
 succeed randomly. It appears that the test may fail when the system is slow. 
 We can add a timer in the integration test to avoid such random failures.
 See an example failure at 
 https://builds.apache.org/job/kafka-trunk-git-pr/166/console for patch 
 https://github.com/apache/kafka/pull/142.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 33378: Patch for KAFKA-2136

2015-08-21 Thread Aditya Auradkar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33378/
---

(Updated Aug. 21, 2015, 11:30 p.m.)


Review request for kafka, Joel Koshy and Jun Rao.


Bugs: KAFKA-2136
https://issues.apache.org/jira/browse/KAFKA-2136


Repository: kafka


Description (updated)
---

Changes are
- Addressing Joel's comments
- protocol changes to the fetch request and response to return the 
throttle_time_ms to clients
- New producer/consumer metrics to expose the avg and max delay time for a 
client
- Test cases.
- Addressed Joel's and Jun's comments


Diffs
-

  
clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java 
9dc669728e6b052f5c6686fcf1b5696a50538ab4 
  clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java 
0baf16e55046a2f49f6431e01d52c323c95eddf0 
  clients/src/main/java/org/apache/kafka/common/protocol/Protocol.java 
3dc8b015afd2347a41c9a9dbc02b8e367da5f75f 
  clients/src/main/java/org/apache/kafka/common/requests/FetchRequest.java 
df073a0e76cc5cc731861b9604d0e19a928970e0 
  clients/src/main/java/org/apache/kafka/common/requests/FetchResponse.java 
eb8951fba48c335095cc43fc3672de1c733e07ff 
  clients/src/main/java/org/apache/kafka/common/requests/ProduceRequest.java 
715504b32950666e9aa5a260fa99d5f897b2007a 
  clients/src/main/java/org/apache/kafka/common/requests/ProduceResponse.java 
febfc70dabc23671fd8a85cf5c5b274dff1e10fb 
  
clients/src/test/java/org/apache/kafka/clients/consumer/internals/FetcherTest.java
 a7c83cac47d41138d47d7590a3787432d675c1b0 
  
clients/src/test/java/org/apache/kafka/clients/producer/internals/SenderTest.java
 8b1805d3d2bcb9fe2bacb37d870c3236aa9532c4 
  
clients/src/test/java/org/apache/kafka/common/requests/RequestResponseTest.java 
8b2aca85fa738180e5420985fddc39a4bf9681ea 
  core/src/main/scala/kafka/api/FetchRequest.scala 
5b38f8554898e54800abd65a7415dd0ac41fd958 
  core/src/main/scala/kafka/api/FetchResponse.scala 
b9efec2efbd3ea0ee12b911f453c47e66ad34894 
  core/src/main/scala/kafka/api/ProducerRequest.scala 
c866180d3680da03e48d374415f10220f6ca68c4 
  core/src/main/scala/kafka/api/ProducerResponse.scala 
5d1fac4cb8943f5bfaa487f8e9d9d2856efbd330 
  core/src/main/scala/kafka/consumer/SimpleConsumer.scala 
7ebc0405d1f309bed9943e7116051d1d8276f200 
  core/src/main/scala/kafka/server/AbstractFetcherThread.scala 
f84306143c43049e3aa44e42beaffe7eb2783163 
  core/src/main/scala/kafka/server/ClientQuotaManager.scala 
9f8473f1c64d10c04cf8cc91967688e29e54ae2e 
  core/src/main/scala/kafka/server/KafkaApis.scala 
67f0cad802f901f255825aa2158545d7f5e7cc3d 
  core/src/main/scala/kafka/server/ReplicaFetcherThread.scala 
fae22d2af8daccd528ac24614290f46ae8f6c797 
  core/src/main/scala/kafka/server/ReplicaManager.scala 
d829e180c3943a90861a12ec184f9b4e4bbafe7d 
  core/src/main/scala/kafka/server/ThrottledResponse.scala 
1f80d5480ccf7c411a02dd90296a7046ede0fae2 
  core/src/test/scala/unit/kafka/api/RequestResponseSerializationTest.scala 
b4c2a228c3c9872e5817ac58d3022e4903e317b7 
  core/src/test/scala/unit/kafka/server/ClientQuotaManagerTest.scala 
caf98e8f2e09d39ab8234b9f4b9478686865e908 
  core/src/test/scala/unit/kafka/server/ThrottledResponseExpirationTest.scala 
c4b5803917e700965677d53624f1460c1a52bf52 

Diff: https://reviews.apache.org/r/33378/diff/


Testing
---

New tests added


Thanks,

Aditya Auradkar



Re: Review Request 33378: Patch for KAFKA-2136

2015-08-21 Thread Aditya Auradkar


 On Aug. 21, 2015, 12:13 a.m., Joel Koshy wrote:
  core/src/main/scala/kafka/server/ClientQuotaManager.scala, line 142
  https://reviews.apache.org/r/33378/diff/12/?file=1043792#file1043792line142
 
  any specific reason for this change?

recordAndMaybeThrottle returns the delay as an int value, so it is nicer to 
have delayTime return an int as well.


 On Aug. 21, 2015, 12:13 a.m., Joel Koshy wrote:
  core/src/main/scala/kafka/api/FetchResponse.scala, line 175
  https://reviews.apache.org/r/33378/diff/12/?file=1043787#file1043787line175
 
  Since (in the event of multiple calls) this grouping would be repeated, 
  should we just have `responseSize` take the `FetchResponse` object and have 
  that look up the `lazy val dataGroupedByTopic`? That said, I think the 
  original version should have had `sizeInBytes` as a `lazy val` as well.

FetchResponse.responseSize is called from KafkaApis in order to figure out what 
value to record. We cannot pass in a FetchResponse object because the object 
doesn't exist yet at that point: the throttle time is not yet available.

Should I change the signature to accept a dataGroupedByTopic instead of a 
TopicPartition -> FetchResponsePartitionData map?
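
To make the question concrete, a rough self-contained sketch (the types are 
illustrative, not the actual FetchResponse):

    case class TopicAndPartition(topic: String, partition: Int)

    case class FetchResponsePartitionData(bytes: Array[Byte]) {
      def sizeInBytes: Int = 4 + bytes.length
    }

    object FetchResponseSketch {
      // Callable from the request-handling path before the response object
      // (and hence the throttle time) exists.
      def responseSize(grouped: Map[String, Map[TopicAndPartition, FetchResponsePartitionData]]): Int =
        grouped.valuesIterator.map(_.valuesIterator.map(_.sizeInBytes).sum).sum
    }

    case class FetchResponseSketch(data: Map[TopicAndPartition, FetchResponsePartitionData]) {
      // Grouped at most once per instance, per your lazy val suggestion.
      lazy val dataGroupedByTopic: Map[String, Map[TopicAndPartition, FetchResponsePartitionData]] =
        data.groupBy(_._1.topic)

      lazy val sizeInBytes: Int = FetchResponseSketch.responseSize(dataGroupedByTopic)
    }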


- Aditya


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33378/#review96009
---


On Aug. 18, 2015, 8:24 p.m., Aditya Auradkar wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/33378/
 ---
 
 (Updated Aug. 18, 2015, 8:24 p.m.)
 
 
 Review request for kafka, Joel Koshy and Jun Rao.
 
 
 Bugs: KAFKA-2136
 https://issues.apache.org/jira/browse/KAFKA-2136
 
 
 Repository: kafka
 
 
 Description
 ---
 
 Changes are
 - Addressing Joel's comments
 - protocol changes to the fetch request and response to return the 
 throttle_time_ms to clients
 - New producer/consumer metrics to expose the avg and max delay time for a 
 client
 - Test cases.
 - Addressed Joel's and Jun's comments
 
 
 Diffs
 -
 
   
 clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java
  9dc669728e6b052f5c6686fcf1b5696a50538ab4 
   
 clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java 
 0baf16e55046a2f49f6431e01d52c323c95eddf0 
   clients/src/main/java/org/apache/kafka/common/protocol/Protocol.java 
 3dc8b015afd2347a41c9a9dbc02b8e367da5f75f 
   clients/src/main/java/org/apache/kafka/common/requests/FetchRequest.java 
 df073a0e76cc5cc731861b9604d0e19a928970e0 
   clients/src/main/java/org/apache/kafka/common/requests/FetchResponse.java 
 eb8951fba48c335095cc43fc3672de1c733e07ff 
   clients/src/main/java/org/apache/kafka/common/requests/ProduceRequest.java 
 715504b32950666e9aa5a260fa99d5f897b2007a 
   clients/src/main/java/org/apache/kafka/common/requests/ProduceResponse.java 
 febfc70dabc23671fd8a85cf5c5b274dff1e10fb 
   
 clients/src/test/java/org/apache/kafka/clients/consumer/internals/FetcherTest.java
  a7c83cac47d41138d47d7590a3787432d675c1b0 
   
 clients/src/test/java/org/apache/kafka/clients/producer/internals/SenderTest.java
  8b1805d3d2bcb9fe2bacb37d870c3236aa9532c4 
   
 clients/src/test/java/org/apache/kafka/common/requests/RequestResponseTest.java
  8b2aca85fa738180e5420985fddc39a4bf9681ea 
   core/src/main/scala/kafka/api/FetchRequest.scala 
 5b38f8554898e54800abd65a7415dd0ac41fd958 
   core/src/main/scala/kafka/api/FetchResponse.scala 
 0b6b33ab6f7a732ff1322b6f48abd4c43e0d7215 
   core/src/main/scala/kafka/api/ProducerRequest.scala 
 c866180d3680da03e48d374415f10220f6ca68c4 
   core/src/main/scala/kafka/api/ProducerResponse.scala 
 5d1fac4cb8943f5bfaa487f8e9d9d2856efbd330 
   core/src/main/scala/kafka/consumer/SimpleConsumer.scala 
 7ebc0405d1f309bed9943e7116051d1d8276f200 
   core/src/main/scala/kafka/server/AbstractFetcherThread.scala 
 f84306143c43049e3aa44e42beaffe7eb2783163 
   core/src/main/scala/kafka/server/ClientQuotaManager.scala 
 9f8473f1c64d10c04cf8cc91967688e29e54ae2e 
   core/src/main/scala/kafka/server/KafkaApis.scala 
 67f0cad802f901f255825aa2158545d7f5e7cc3d 
   core/src/main/scala/kafka/server/ReplicaFetcherThread.scala 
 fae22d2af8daccd528ac24614290f46ae8f6c797 
   core/src/main/scala/kafka/server/ReplicaManager.scala 
 d829e180c3943a90861a12ec184f9b4e4bbafe7d 
   core/src/main/scala/kafka/server/ThrottledResponse.scala 
 1f80d5480ccf7c411a02dd90296a7046ede0fae2 
   core/src/test/scala/unit/kafka/api/RequestResponseSerializationTest.scala 
 b4c2a228c3c9872e5817ac58d3022e4903e317b7 
   core/src/test/scala/unit/kafka/server/ClientQuotaManagerTest.scala 
 97dcca8c96f955acb3d92b29d7faa1e031ba71d4 
   core/src/test/scala/unit/kafka/server/ThrottledResponseExpirationTest.scala 
 14a7f4538041d557c190127e3d5f85edf2a0e7c1 
 
 Diff: https://reviews.apache.org/r/33378/diff

Re: Review Request 33378: Patch for KAFKA-2136

2015-08-21 Thread Aditya Auradkar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33378/
---

(Updated Aug. 21, 2015, 11:29 p.m.)


Review request for kafka, Joel Koshy and Jun Rao.


Bugs: KAFKA-2136
https://issues.apache.org/jira/browse/KAFKA-2136


Repository: kafka


Description (updated)
---

Addressing Joel's comments


Merging


Changing variable name


Addressing Joel's comments


Addressing Joel's comments


Addressing comments


Addressing Joel's comments


Diffs (updated)
-

  
clients/src/main/java/org/apache/kafka/clients/consumer/internals/Fetcher.java 
9dc669728e6b052f5c6686fcf1b5696a50538ab4 
  clients/src/main/java/org/apache/kafka/clients/producer/internals/Sender.java 
0baf16e55046a2f49f6431e01d52c323c95eddf0 
  clients/src/main/java/org/apache/kafka/common/protocol/Protocol.java 
3dc8b015afd2347a41c9a9dbc02b8e367da5f75f 
  clients/src/main/java/org/apache/kafka/common/requests/FetchRequest.java 
df073a0e76cc5cc731861b9604d0e19a928970e0 
  clients/src/main/java/org/apache/kafka/common/requests/FetchResponse.java 
eb8951fba48c335095cc43fc3672de1c733e07ff 
  clients/src/main/java/org/apache/kafka/common/requests/ProduceRequest.java 
715504b32950666e9aa5a260fa99d5f897b2007a 
  clients/src/main/java/org/apache/kafka/common/requests/ProduceResponse.java 
febfc70dabc23671fd8a85cf5c5b274dff1e10fb 
  
clients/src/test/java/org/apache/kafka/clients/consumer/internals/FetcherTest.java
 a7c83cac47d41138d47d7590a3787432d675c1b0 
  
clients/src/test/java/org/apache/kafka/clients/producer/internals/SenderTest.java
 8b1805d3d2bcb9fe2bacb37d870c3236aa9532c4 
  
clients/src/test/java/org/apache/kafka/common/requests/RequestResponseTest.java 
8b2aca85fa738180e5420985fddc39a4bf9681ea 
  core/src/main/scala/kafka/api/FetchRequest.scala 
5b38f8554898e54800abd65a7415dd0ac41fd958 
  core/src/main/scala/kafka/api/FetchResponse.scala 
b9efec2efbd3ea0ee12b911f453c47e66ad34894 
  core/src/main/scala/kafka/api/ProducerRequest.scala 
c866180d3680da03e48d374415f10220f6ca68c4 
  core/src/main/scala/kafka/api/ProducerResponse.scala 
5d1fac4cb8943f5bfaa487f8e9d9d2856efbd330 
  core/src/main/scala/kafka/consumer/SimpleConsumer.scala 
7ebc0405d1f309bed9943e7116051d1d8276f200 
  core/src/main/scala/kafka/server/AbstractFetcherThread.scala 
f84306143c43049e3aa44e42beaffe7eb2783163 
  core/src/main/scala/kafka/server/ClientQuotaManager.scala 
9f8473f1c64d10c04cf8cc91967688e29e54ae2e 
  core/src/main/scala/kafka/server/KafkaApis.scala 
67f0cad802f901f255825aa2158545d7f5e7cc3d 
  core/src/main/scala/kafka/server/ReplicaFetcherThread.scala 
fae22d2af8daccd528ac24614290f46ae8f6c797 
  core/src/main/scala/kafka/server/ReplicaManager.scala 
d829e180c3943a90861a12ec184f9b4e4bbafe7d 
  core/src/main/scala/kafka/server/ThrottledResponse.scala 
1f80d5480ccf7c411a02dd90296a7046ede0fae2 
  core/src/test/scala/unit/kafka/api/RequestResponseSerializationTest.scala 
b4c2a228c3c9872e5817ac58d3022e4903e317b7 
  core/src/test/scala/unit/kafka/server/ClientQuotaManagerTest.scala 
caf98e8f2e09d39ab8234b9f4b9478686865e908 
  core/src/test/scala/unit/kafka/server/ThrottledResponseExpirationTest.scala 
c4b5803917e700965677d53624f1460c1a52bf52 

Diff: https://reviews.apache.org/r/33378/diff/


Testing
---

New tests added


Thanks,

Aditya Auradkar



[jira] [Updated] (KAFKA-2136) Client side protocol changes to return quota delays

2015-08-21 Thread Aditya A Auradkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditya A Auradkar updated KAFKA-2136:
-
Attachment: KAFKA-2136_2015-08-21_16:29:17.patch

 Client side protocol changes to return quota delays
 ---

 Key: KAFKA-2136
 URL: https://issues.apache.org/jira/browse/KAFKA-2136
 Project: Kafka
  Issue Type: Sub-task
Reporter: Aditya Auradkar
Assignee: Aditya Auradkar
  Labels: quotas
 Attachments: KAFKA-2136.patch, KAFKA-2136_2015-05-06_18:32:48.patch, 
 KAFKA-2136_2015-05-06_18:35:54.patch, KAFKA-2136_2015-05-11_14:50:56.patch, 
 KAFKA-2136_2015-05-12_14:40:44.patch, KAFKA-2136_2015-06-09_10:07:13.patch, 
 KAFKA-2136_2015-06-09_10:10:25.patch, KAFKA-2136_2015-06-30_19:43:55.patch, 
 KAFKA-2136_2015-07-13_13:34:03.patch, KAFKA-2136_2015-08-18_13:19:57.patch, 
 KAFKA-2136_2015-08-18_13:24:00.patch, KAFKA-2136_2015-08-21_16:29:17.patch


 As described in KIP-13, evolve the protocol to return a throttle_time_ms in 
 the Fetch and the ProduceResponse objects. Add client side metrics on the new 
 producer and consumer to expose the delay time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2136) Client side protocol changes to return quota delays

2015-08-21 Thread Aditya A Auradkar (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14707637#comment-14707637
 ] 

Aditya A Auradkar commented on KAFKA-2136:
--

Updated reviewboard https://reviews.apache.org/r/33378/diff/
 against branch origin/trunk

 Client side protocol changes to return quota delays
 ---

 Key: KAFKA-2136
 URL: https://issues.apache.org/jira/browse/KAFKA-2136
 Project: Kafka
  Issue Type: Sub-task
Reporter: Aditya Auradkar
Assignee: Aditya Auradkar
  Labels: quotas
 Attachments: KAFKA-2136.patch, KAFKA-2136_2015-05-06_18:32:48.patch, 
 KAFKA-2136_2015-05-06_18:35:54.patch, KAFKA-2136_2015-05-11_14:50:56.patch, 
 KAFKA-2136_2015-05-12_14:40:44.patch, KAFKA-2136_2015-06-09_10:07:13.patch, 
 KAFKA-2136_2015-06-09_10:10:25.patch, KAFKA-2136_2015-06-30_19:43:55.patch, 
 KAFKA-2136_2015-07-13_13:34:03.patch, KAFKA-2136_2015-08-18_13:19:57.patch, 
 KAFKA-2136_2015-08-18_13:24:00.patch, KAFKA-2136_2015-08-21_16:29:17.patch


 As described in KIP-13, evolve the protocol to return a throttle_time_ms in 
 the Fetch and the ProduceResponse objects. Add client side metrics on the new 
 producer and consumer to expose the delay time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

