[jira] [Assigned] (KAFKA-3058) remove the usage of deprecated config properties

2016-01-04 Thread Konrad Kalita (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konrad Kalita reassigned KAFKA-3058:


Assignee: Konrad Kalita

> remove the usage of deprecated config properties
> 
>
> Key: KAFKA-3058
> URL: https://issues.apache.org/jira/browse/KAFKA-3058
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 0.9.0.0
>Reporter: Jun Rao
>Assignee: Konrad Kalita
>  Labels: newbie
> Fix For: 0.9.1.0
>
>
> There are compilation warnings like the following, which can be avoided.
> core/src/main/scala/kafka/tools/EndToEndLatency.scala:74: value 
> BLOCK_ON_BUFFER_FULL_CONFIG in object ProducerConfig is deprecated: see 
> corresponding Javadoc for more information.
> producerProps.put(ProducerConfig.BLOCK_ON_BUFFER_FULL_CONFIG, "true")
>  ^
> kafka/core/src/main/scala/kafka/tools/MirrorMaker.scala:195: value 
> BLOCK_ON_BUFFER_FULL_CONFIG in object ProducerConfig is deprecated: see 
> corresponding Javadoc for more information.
>   maybeSetDefaultProperty(producerProps, 
> ProducerConfig.BLOCK_ON_BUFFER_FULL_CONFIG, "true")
> ^
> /Users/junrao/intellij/kafka/core/src/main/scala/kafka/tools/ProducerPerformance.scala:40:
>  @deprecated now takes two arguments; see the scaladoc.
> @deprecated
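
For the BLOCK_ON_BUFFER_FULL_CONFIG warnings above, the usual remedy is to move to
the non-deprecated max.block.ms setting. A minimal Java sketch (the affected tools
are Scala, but the ProducerConfig constants are shared); treating
max.block.ms=Long.MAX_VALUE as the rough equivalent of block.on.buffer.full=true is
an assumption to double-check against the Javadoc:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.ProducerConfig;

    public class NonDeprecatedProducerProps {
        public static void main(String[] args) {
            Properties producerProps = new Properties();
            producerProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

            // Deprecated since 0.9:
            // producerProps.put(ProducerConfig.BLOCK_ON_BUFFER_FULL_CONFIG, "true");

            // Rough replacement: block "forever" on a full buffer instead of raising an error.
            producerProps.put(ProducerConfig.MAX_BLOCK_MS_CONFIG, String.valueOf(Long.MAX_VALUE));

            System.out.println(producerProps);
        }
    }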



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: KIP-41: KafkaConsumer Max Records

2016-01-04 Thread Jason Gustafson
Hi Cliff,

I think we're all agreed that the current contract of poll() should be
kept. The consumer wouldn't wait for max messages to become available in
this proposal; it would only ensure that it never returns more than max
messages.

-Jason
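
For readers skimming the thread, a minimal sketch of what that contract looks like
from application code, assuming the bound ends up as a consumer config named
max.poll.records as proposed in the KIP (not released API at the time of this
discussion); broker address, topic and deserializers are placeholders:

    import java.util.Arrays;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class MaxRecordsSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("group.id", "kip41-demo");
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            // Proposed upper bound on records returned by a single poll(); poll() still
            // returns as soon as data (or the timeout) arrives, possibly with fewer records.
            props.put("max.poll.records", "500");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Arrays.asList("my-topic"));
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(1000);
                    // Never more than max.poll.records per iteration under the proposal.
                    for (ConsumerRecord<String, String> record : records)
                        System.out.println(record.offset() + ": " + record.value());
                }
            }
        }
    }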

On Mon, Jan 4, 2016 at 11:52 AM, Cliff Rhyne  wrote:

> Instead of a heartbeat, I'd prefer poll() to return whatever messages the
> client has.  Either a) I don't care if I get less than my max message limit
> or b) I do care and will set a larger timeout.  Case B is less common than
> A and is fairly easy to handle in the application's code.
>
> On Mon, Jan 4, 2016 at 1:47 PM, Gwen Shapira  wrote:
>
> > 1. Agree that TCP window style scaling will be cool. I'll try to think
> of a
> > good excuse to use it ;)
> >
> > 2. I'm very concerned about the challenges of getting the timeouts,
> > heartbeats and max messages right.
> >
> > Another option could be to expose a "heartbeat" API to consumers. If my app
> > is still processing data but is still alive, it could initiate a
> heartbeat
> > to signal it's alive without having to handle additional messages.
> >
> > I don't know if this improves more than it complicates though :(
> >
> > On Mon, Jan 4, 2016 at 11:40 AM, Jason Gustafson 
> > wrote:
> >
> > > Hey Gwen,
> > >
> > > I was thinking along the lines of TCP window scaling in order to
> > > dynamically find a good consumption rate. Basically you'd start off
> > > consuming say 100 records and you'd let it increase until the
> consumption
> > > took longer than half the session timeout (for example). You /might/ be
> > > able to achieve the same thing using pause/resume, but it would be a
> lot
> > > trickier since you have to do it at the granularity of partitions. But
> > > yeah, database write performance doesn't always scale in a predictable
> > > enough way to accommodate this, so I'm not sure how useful it would be
> in
> > > practice. It might also be more difficult to implement since it
> wouldn't
> > be
> > > as clear when to initiate the next fetch. With a static setting, the
> > > consumer knows exactly how many records will be returned on the next
> call
> > > to poll() and can send fetches accordingly.
> > >
> > > On the other hand, I do feel a little wary of the need to tune the
> > session
> > > timeout and max messages though since these settings might depend on
> the
> > > environment that the consumer is deployed in. It wouldn't be a big deal
> > if
> > > the impact was relatively minor, but getting them wrong can cause a lot
> > of
> > > rebalance churn which could keep the consumer from making any progress.
> > > It's not a particularly graceful failure.
> > >
> > > -Jason
> > >
> > > On Mon, Jan 4, 2016 at 10:49 AM, Gwen Shapira 
> wrote:
> > >
> > > > I can't speak to all use-cases, but for the database one, I think
> > > > pause-resume will be necessary in any case, and therefore dynamic
> batch
> > > > sizes are not needed.
> > > >
> > > > Databases are really unexpected regarding response times - load and
> > > locking
> > > > can affect this. I'm not sure there's a good way to know you are
> going
> > > into
> > > > rebalance hell before it is too late. So if I were writing code that
> > > > updates an RDBMS based on Kafka, I'd pick a reasonable batch size
> (say
> > > 5000
> > > > records), and basically pause, batch-insert all records, commit and
> > > resume.
> > > >
> > > > Does that make sense?
> > > >
> > > > On Mon, Jan 4, 2016 at 10:37 AM, Jason Gustafson  >
> > > > wrote:
> > > >
> > > > > Gwen and Ismael,
> > > > >
> > > > > I agree the configuration option is probably the way to go, but I
> was
> > > > > wondering whether there would be cases where it made sense to let
> the
> > > > > consumer dynamically set max messages to adjust for downstream
> > > slowness.
> > > > > For example, if the consumer is writing consumed records to another
> > > > > database, and that database is experiencing heavier than expected
> > load,
> > > > > then the consumer could halve its current max messages in order to
> > > adapt
> > > > > without risking rebalance hell. It could then increase max messages
> > as
> > > > the
> > > > > load on the database decreases. It's basically an easier way to
> > handle
> > > > flow
> > > > > control than we provide with pause/resume.
> > > > >
> > > > > -Jason
> > > > >
> > > > > On Mon, Jan 4, 2016 at 9:46 AM, Gwen Shapira 
> > > wrote:
> > > > >
> > > > > > The wiki you pointed to is no longer maintained and fell out of
> > sync
> > > > with
> > > > > > the code and protocol.
> > > > > >
> > > > > > You may want  to refer to:
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol
> > > > > >
> > > > > > On Mon, Jan 4, 2016 at 4:38 AM, Jens Rantil  >
> > > > wrote:
> > > > > >
> > > > > > > Hi 

Re: [VOTE] KIP-32 Add CreateTime and LogAppendTime to Kafka message.

2016-01-04 Thread Aditya Auradkar
Hey Becket/Anna -

I have a few comments about the KIP.

1. (Minor) Can we rename the KIP? It's currently "Add CreateTime and
LogAppendTime etc..". This is actually the title of the now rejected Option
1.
2. (Minor) Can we rename the proposed option? It isn't really "option 4"
anymore.
3. I'm not clear on what exactly happens to compressed messages
when message.timestamp.type=LogAppendTime. Does every batch get
recompressed because the inner message gets rewritten with the server
timestamp? Or does the message set on disk have the timestamp set to -1? In
that case, what do we use as the timestamp for the message?
4. Do message.timestamp.type and max.message.time.difference.ms need to be
per-topic configs? It seems that this is really a client config, i.e. a
client is the source of timestamps, not a topic. It could also be a
broker-level config to keep things simple.
5. The "Proposed Changes" section in the KIP also covers building a time-based
index for queries, but that is a separate proposal (KIP-33). Can we more
crisply identify what exactly will change when this KIP (and KIP-31) is
implemented? It isn't super clear to me at this point.

Aside from that, I think the "Rejected Alternatives" section of the KIP is
excellent. Very good insight into what options were discussed and rejected.

Aditya
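
For concreteness, the two settings in question written out as a hypothetical set of
topic-level overrides (whether they belong per-topic, per-client or broker-wide is
exactly what point 4 asks; the names follow this thread and the values are
illustrative only):

    import java.util.Properties;

    public class Kip32TimestampConfigSketch {
        public static void main(String[] args) {
            Properties topicOverrides = new Properties();
            // "CreateTime" (producer-assigned) or "LogAppendTime" (broker-assigned), per the KIP draft.
            topicOverrides.put("message.timestamp.type", "LogAppendTime");
            // Only meaningful for CreateTime: how far a client timestamp may differ from broker time.
            topicOverrides.put("max.message.time.difference.ms", "600000");
            System.out.println(topicOverrides);
        }
    }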

On Mon, Dec 28, 2015 at 3:57 PM, Becket Qin  wrote:

> Thanks Guozhang, Gwen and Neha for the comments. Sorry for late reply
> because I only have occasional gmail access from my phone...
>
> I just updated the wiki for KIP-32.
>
> Gwen,
>
> Yes, the migration plan is what you described.
>
> I agree with your comments on the version.
> I changed message.format.version to use the release version.
> I did not change the internal version, we can discuss this in a separate
> thread.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
>
>
> > On Dec 24, 2015, at 5:38 AM, Guozhang Wang  wrote:
> >
> > Also I agree with Gwen that such changes may worth a 0.10 release or even
> > 1.0, having it in 0.9.1 would be quite confusing to users.
> >
> > Guozhang
> >
> >> On Wed, Dec 23, 2015 at 1:36 PM, Guozhang Wang 
> wrote:
> >>
> >> Becket,
> >>
> >> Please let us know once you have updated the wiki page regarding the
> >> migration plan. Thanks!
> >>
> >> Guozhang
> >>
> >>> On Wed, Dec 23, 2015 at 11:52 AM, Gwen Shapira 
> wrote:
> >>>
> >>> Thanks Becket, Anne and Neha for responding to my concern.
> >>>
> >>> I had an offline discussion with Anne where she helped me understand
> the
> >>> migration process. It isn't as bad as it looks in the KIP :)
> >>>
> >>> If I understand it correctly, the process (for users) will be:
> >>>
> >>> 1. Prepare for upgrade (set format.version = 0, ApiVersion = 0.9.0)
> >>> 2. Rolling upgrade of brokers
> >>> 3. Bump ApiVersion to 0.9.0-1, so fetch requests between brokers will
> use
> >>> the new protocol
> >>> 4. Start upgrading clients
> >>> 5. When "enough" clients are upgraded, bump format.version to 1
> (rolling).
> >>>
> >>> Becket, can you confirm?
> >>>
> >>> Assuming this is the process, I'm +1 on the change.
> >>>
> >>> Reminder to coders and reviewers that pull-requests with user-facing
> >>> changes should include documentation changes as well as code changes.
> >>> And a polite request to try to be helpful to users on when to use
> >>> create-time and when to use log-append-time as configuration - this is
> not
> >>> a trivial decision.
> >>>
> >>> A separate point I'm going to raise in a different thread is that we
> need
> >>> to streamline our versions a bit:
> >>> 1. I'm afraid that 0.9.0-1 will be confusing to users who care about
> >>> released versions (what if we forget to change it before the release?
> Is
> >>> it
> >>> meaningful enough to someone running off trunk?), we need to come up
> with
> >>> something that will work for both LinkedIn and everyone else.
> >>> 2. ApiVersion has real version numbers. message.format.version has
> >>> sequence
> >>> numbers. This makes us look pretty silly :)
> >>>
> >>> My version concerns can be addressed separately and should not hold
> back
> >>> this KIP.
> >>>
> >>> Gwen
> >>>
> >>>
> >>>
> >>> On Tue, Dec 22, 2015 at 11:01 PM, Becket Qin 
> >>> wrote:
> >>>
>  Hi Anna,
> 
>  Thanks for initiating the voting process. I did not start the voting
>  process because there were still some ongoing discussion with Jun
> about
> >>> the
>  timestamp regarding compressed messages. That is why the wiki page
> >>> hasn't
>  reflected the latest conversation as Guozhang pointed out.
> 
>  Like Neha said I think we have reached general agreement on this KIP.
> So
>  it is probably fine to start the KIP voting. At least we draw more
>  attention to the KIP even if there are some new discussion to bring
> up.
> 
>  Regarding the upgrade plan, given we decided to implement KIP-31 and
>  KIP-32 in the same patch to 

[jira] [Updated] (KAFKA-3051) security.protocol documentation is inaccurate

2016-01-04 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao updated KAFKA-3051:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Issue resolved by pull request 724
[https://github.com/apache/kafka/pull/724]

> security.protocol documentation is inaccurate
> -
>
> Key: KAFKA-3051
> URL: https://issues.apache.org/jira/browse/KAFKA-3051
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Affects Versions: 0.9.0.0
>Reporter: Jun Rao
>Assignee: Ismael Juma
>  Labels: security
> Fix For: 0.9.0.1
>
>
> In CommonClientConfigs, the doc for security.protocol says "Currently only 
> PLAINTEXT and SSL are supported.". This is inaccurate since we support SASL 
> as well.
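
For reference, a small sketch of the setting in question; in 0.9 the accepted values
are PLAINTEXT, SSL, SASL_PLAINTEXT and SASL_SSL, which is what the doc string should
reflect:

    import java.util.Properties;
    import org.apache.kafka.clients.CommonClientConfigs;

    public class SecurityProtocolExample {
        public static void main(String[] args) {
            Properties clientProps = new Properties();
            // Any of: PLAINTEXT, SSL, SASL_PLAINTEXT, SASL_SSL
            clientProps.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "SASL_SSL");
            System.out.println(clientProps);
        }
    }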



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-3051 KAFKA-3048; Security config docs im...

2016-01-04 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/724


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request: MINOR: Update version to 0.9.0.1-SNAPSHOT

2016-01-04 Thread ewencp
GitHub user ewencp opened a pull request:

https://github.com/apache/kafka/pull/727

MINOR: Update version to 0.9.0.1-SNAPSHOT



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ewencp/kafka minor-0.9.0.1-SNAPSHOT

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/727.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #727


commit 0417bb23b84d1df59ae9fcb04f8ecf99a5b996a9
Author: Ewen Cheslack-Postava 
Date:   2016-01-04T18:27:01Z

MINOR: Update version to 0.9.0.1-SNAPSHOT




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: KIP-41: KafkaConsumer Max Records

2016-01-04 Thread Gwen Shapira
I can't speak to all use-cases, but for the database one, I think
pause-resume will be necessary in any case, and therefore dynamic batch
sizes are not needed.

Databases are really unpredictable when it comes to response times - load and locking
can affect this. I'm not sure there's a good way to know you are going into
rebalance hell before it is too late. So if I were writing code that
updates an RDBMS based on Kafka, I'd pick a reasonable batch size (say 5000
records), and basically pause, batch-insert all records, commit and resume.

Does that make sense?
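
A rough sketch of that pattern against the 0.9 consumer API, assuming the varargs
pause/resume signatures; the batch size, the poll timeout and the batch-insert step
are placeholders:

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;

    public class PauseBatchInsertResume {
        static final int BATCH_SIZE = 5000; // the "reasonable batch size" from this thread

        static void drainLoop(KafkaConsumer<String, String> consumer) {
            List<ConsumerRecord<String, String>> batch = new ArrayList<>();
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(100);
                for (ConsumerRecord<String, String> r : records)
                    batch.add(r);
                if (batch.size() >= BATCH_SIZE) {
                    TopicPartition[] assigned = consumer.assignment().toArray(new TopicPartition[0]);
                    consumer.pause(assigned);  // stop fetching; poll(0) can still be called to heartbeat
                    batchInsert(batch);        // placeholder: bulk insert into the RDBMS
                    consumer.commitSync();     // commit only after the insert succeeds
                    consumer.resume(assigned); // start fetching again
                    batch.clear();
                }
            }
        }

        static void batchInsert(List<ConsumerRecord<String, String>> batch) {
            // A JDBC batch insert would go here.
        }
    }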

On Mon, Jan 4, 2016 at 10:37 AM, Jason Gustafson  wrote:

> Gwen and Ismael,
>
> I agree the configuration option is probably the way to go, but I was
> wondering whether there would be cases where it made sense to let the
> consumer dynamically set max messages to adjust for downstream slowness.
> For example, if the consumer is writing consumed records to another
> database, and that database is experiencing heavier than expected load,
> then the consumer could halve its current max messages in order to adapt
> without risking rebalance hell. It could then increase max messages as the
> load on the database decreases. It's basically an easier way to handle flow
> control than we provide with pause/resume.
>
> -Jason
>
> On Mon, Jan 4, 2016 at 9:46 AM, Gwen Shapira  wrote:
>
> > The wiki you pointed to is no longer maintained and fell out of sync with
> > the code and protocol.
> >
> > You may want  to refer to:
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol
> >
> > On Mon, Jan 4, 2016 at 4:38 AM, Jens Rantil  wrote:
> >
> > > Hi guys,
> > >
> > > I realized I never thanked yall for your input - thanks!
> > > Jason: I apologize for assuming your stance on the issue! Feels like we
> > all
> > > agreed on the solution. +1
> > >
> > > Follow-up: Jason made a point about defining prefetch and fairness
> > > behaviour in the KIP. I am now working on putting that down in writing.
> > To
> > > be able to do this I think I need to understand the current prefetch
> > > behaviour in the new consumer API (0.9) a bit better. Some specific
> > > questions:
> > >
> > >- How does a specific consumer balance incoming messages from
> multiple
> > >partitions? Is the consumer simply issuing Multi-Fetch requests[1]
> for
> > > the
> > >consumed assigned partitions of the relevant topics? Or is the
> > consumer
> > >fetching from one partition at a time and balancing between them
> > >internally? That is, is the responsibility of partition balancing
> (and
> > >fairness) on the broker side or consumer side?
> > >- Is the above documented somewhere?
> > >
> > > [1]
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/Writing+a+Driver+for+Kafka
> > > ,
> > > see "Multi-Fetch".
> > >
> > > Thanks,
> > > Jens
> > >
> > > On Wed, Dec 23, 2015 at 2:44 AM, Ismael Juma 
> wrote:
> > >
> > > > On Wed, Dec 23, 2015 at 1:24 AM, Gwen Shapira 
> > wrote:
> > > >
> > > > > Given the background, it sounds like you'll generally want each
> call
> > to
> > > > > poll() to return the same number of events (which is the number you
> > > > planned
> > > > > on having enough memory / time for). It also sounds like tuning the
> > > > number
> > > > > of events will be closely tied to tuning the session timeout. That
> > is -
> > > > if
> > > > > I choose to lower the session timeout for some reason, I will have
> to
> > > > > modify the number of records returning too.
> > > > >
> > > > > If those assumptions are correct, I think a configuration makes
> more
> > > > sense.
> > > > > 1. We are unlikely to want this parameter to be changed at the
> > lifetime
> > > of
> > > > > the consumer
> > > > > 2. The correct value is tied to another configuration parameter, so
> > > they
> > > > > will be controlled together.
> > > > >
> > > >
> > > > I was thinking the same thing.
> > > >
> > > > Ismael
> > > >
> > >
> > >
> > >
> > > --
> > > Jens Rantil
> > > Backend engineer
> > > Tink AB
> > >
> > > Email: jens.ran...@tink.se
> > > Phone: +46 708 84 18 32
> > > Web: www.tink.se
> > >
> > > Facebook  Linkedin
> > > <
> > >
> >
> http://www.linkedin.com/company/2735919?trk=vsrp_companies_res_photo=VSRPsearchId%3A1057023381369207406670%2CVSRPtargetId%3A2735919%2CVSRPcmpt%3Aprimary
> > > >
> > >  Twitter 
> > >
> >
>


Re: KIP-41: KafkaConsumer Max Records

2016-01-04 Thread Cliff Rhyne
My team's main use-case benefits from flexibility between consumers, which
both options handle well.  If the poll timeout triggers before the full
batch size, I'd still want to get whatever is available.

On Mon, Jan 4, 2016 at 12:49 PM, Gwen Shapira  wrote:

> I can't speak to all use-cases, but for the database one, I think
> pause-resume will be necessary in any case, and therefore dynamic batch
> sizes are not needed.
>
> Databases are really unexpected regarding response times - load and locking
> can affect this. I'm not sure there's a good way to know you are going into
> rebalance hell before it is too late. So if I were writing code that
> updates an RDBMS based on Kafka, I'd pick a reasonable batch size (say 5000
> records), and basically pause, batch-insert all records, commit and resume.
>
> Does that make sense?
>
> On Mon, Jan 4, 2016 at 10:37 AM, Jason Gustafson 
> wrote:
>
> > Gwen and Ismael,
> >
> > I agree the configuration option is probably the way to go, but I was
> > wondering whether there would be cases where it made sense to let the
> > consumer dynamically set max messages to adjust for downstream slowness.
> > For example, if the consumer is writing consumed records to another
> > database, and that database is experiencing heavier than expected load,
> > then the consumer could halve its current max messages in order to adapt
> > without risking rebalance hell. It could then increase max messages as
> the
> > load on the database decreases. It's basically an easier way to handle
> flow
> > control than we provide with pause/resume.
> >
> > -Jason
> >
> > On Mon, Jan 4, 2016 at 9:46 AM, Gwen Shapira  wrote:
> >
> > > The wiki you pointed to is no longer maintained and fell out of sync
> with
> > > the code and protocol.
> > >
> > > You may want  to refer to:
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol
> > >
> > > On Mon, Jan 4, 2016 at 4:38 AM, Jens Rantil 
> wrote:
> > >
> > > > Hi guys,
> > > >
> > > > I realized I never thanked yall for your input - thanks!
> > > > Jason: I apologize for assuming your stance on the issue! Feels like
> we
> > > all
> > > > agreed on the solution. +1
> > > >
> > > > Follow-up: Jason made a point about defining prefetch and fairness
> > > > behaviour in the KIP. I am now working on putting that down in
> writing.
> > > To
> > > > be able to do this I think I need to understand the current
> prefetch
> > > > behaviour in the new consumer API (0.9) a bit better. Some specific
> > > > questions:
> > > >
> > > >- How does a specific consumer balance incoming messages from
> > multiple
> > > >partitions? Is the consumer simply issuing Multi-Fetch requests[1]
> > for
> > > > the
> > > >consumed assigned partitions of the relevant topics? Or is the
> > > consumer
> > > >fetching from one partition at a time and balancing between them
> > > >internally? That is, is the responsibility of partition balancing
> > (and
> > > >fairness) on the broker side or consumer side?
> > > >- Is the above documented somewhere?
> > > >
> > > > [1]
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/Writing+a+Driver+for+Kafka
> > > > ,
> > > > see "Multi-Fetch".
> > > >
> > > > Thanks,
> > > > Jens
> > > >
> > > > On Wed, Dec 23, 2015 at 2:44 AM, Ismael Juma 
> > wrote:
> > > >
> > > > > On Wed, Dec 23, 2015 at 1:24 AM, Gwen Shapira 
> > > wrote:
> > > > >
> > > > > > Given the background, it sounds like you'll generally want each
> > call
> > > to
> > > > > > poll() to return the same number of events (which is the number
> you
> > > > > planned
> > > > > > on having enough memory / time for). It also sounds like tuning
> the
> > > > > number
> > > > > > of events will be closely tied to tuning the session timeout.
> That
> > > is -
> > > > > if
> > > > > > I choose to lower the session timeout for some reason, I will
> have
> > to
> > > > > > modify the number of records returning too.
> > > > > >
> > > > > > If those assumptions are correct, I think a configuration makes
> > more
> > > > > sense.
> > > > > > 1. We are unlikely to want this parameter to be changed at the
> > > lifetime
> > > > of
> > > > > > the consumer
> > > > > > 2. The correct value is tied to another configuration parameter,
> so
> > > > they
> > > > > > will be controlled together.
> > > > > >
> > > > >
> > > > > I was thinking the same thing.
> > > > >
> > > > > Ismael
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Jens Rantil
> > > > Backend engineer
> > > > Tink AB
> > > >
> > > > Email: jens.ran...@tink.se
> > > > Phone: +46 708 84 18 32
> > > > Web: www.tink.se
> > > >
> > > > Facebook  Linkedin
> > > > <
> > > >
> > >
> >
> 

[GitHub] kafka pull request: KAFKA-2422: Allow copycat connector plugins to...

2016-01-04 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/687


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3051) security.protocol documentation is inaccurate

2016-01-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081780#comment-15081780
 ] 

ASF GitHub Bot commented on KAFKA-3051:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/724


> security.protocol documentation is inaccurate
> -
>
> Key: KAFKA-3051
> URL: https://issues.apache.org/jira/browse/KAFKA-3051
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Affects Versions: 0.9.0.0
>Reporter: Jun Rao
>Assignee: Ismael Juma
>  Labels: security
> Fix For: 0.9.0.1
>
>
> In CommonClientConfigs, the doc for security.protocol says "Currently only 
> PLAINTEXT and SSL are supported.". This is inaccurate since we support SASL 
> as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3048) incorrect property name ssl.want.client.auth

2016-01-04 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao updated KAFKA-3048:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Issue resolved by pull request 724
[https://github.com/apache/kafka/pull/724]

> incorrect property name ssl.want.client.auth 
> -
>
> Key: KAFKA-3048
> URL: https://issues.apache.org/jira/browse/KAFKA-3048
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 0.9.0.0
>Reporter: Jun Rao
>Assignee: Ismael Juma
>  Labels: security
> Fix For: 0.9.0.1
>
>
> In the description of ssl.client.auth, we mention ssl.want.client.auth, which 
> is incorrect.
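
The property that actually exists is ssl.client.auth; a small sketch of the
broker-side setting with its documented values (required, requested, none). There is
no separate ssl.want.client.auth property:

    import java.util.Properties;

    public class SslClientAuthExample {
        public static void main(String[] args) {
            Properties brokerProps = new Properties();
            // "required": clients must present a certificate;
            // "requested": a certificate is requested but optional (the "want" semantics);
            // "none": no client authentication.
            brokerProps.put("ssl.client.auth", "required");
            System.out.println(brokerProps);
        }
    }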



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Build failed in Jenkins: kafka-trunk-jdk8 #264

2016-01-04 Thread Apache Jenkins Server
See 

Changes:

[me] KAFKA-2422: Allow copycat connector plugins to be aliased to simpler

--
[...truncated 4758 lines...]

org.apache.kafka.connect.runtime.WorkerTest > testStopInvalidConnector PASSED

org.apache.kafka.connect.runtime.WorkerTest > testCleanupTasksOnStop PASSED

org.apache.kafka.connect.runtime.WorkerTest > testReconfigureConnectorTasks 
PASSED

org.apache.kafka.connect.runtime.WorkerTest > testAddRemoveTask PASSED

org.apache.kafka.connect.runtime.WorkerTest > testStopInvalidTask PASSED

org.apache.kafka.connect.runtime.WorkerTest > testAddConnectorByAlias PASSED

org.apache.kafka.connect.runtime.WorkerTest > testAddRemoveConnector PASSED
:testAll

BUILD SUCCESSFUL

Total time: 58 mins 12.753 secs
+ ./gradlew --stacktrace docsJarAll
To honour the JVM settings for this build a new JVM will be forked. Please 
consider using the daemon: 
https://docs.gradle.org/2.10/userguide/gradle_daemon.html.
Building project 'core' with Scala version 2.10.6
:docsJar_2_10
Building project 'core' with Scala version 2.10.6
:kafka-trunk-jdk8:clients:compileJava UP-TO-DATE
:kafka-trunk-jdk8:clients:processResources UP-TO-DATE
:kafka-trunk-jdk8:clients:classes UP-TO-DATE
:kafka-trunk-jdk8:clients:determineCommitId UP-TO-DATE
:kafka-trunk-jdk8:clients:createVersionFile
:kafka-trunk-jdk8:clients:jar UP-TO-DATE
:kafka-trunk-jdk8:clients:javadoc UP-TO-DATE
:kafka-trunk-jdk8:core:compileJava UP-TO-DATE
:kafka-trunk-jdk8:core:compileScala
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512m; support was removed in 8.0

:79:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.

org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP
 ^
:36:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.
 commitTimestamp: Long = 
org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP,

  ^
:37:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.
 expireTimestamp: Long = 
org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP) {

  ^
:394:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.
  if (value.expireTimestamp == 
org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP)

^
:284:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.
if (offsetAndMetadata.commitTimestamp == 
org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP)

  ^
:301:
 a pure expression does nothing in statement position; you may be omitting 
necessary parentheses
ControllerStats.uncleanLeaderElectionRate
^
:302:
 a pure expression does nothing in statement position; you may be omitting 
necessary parentheses
ControllerStats.leaderElectionTimer
^
:74:
 value BLOCK_ON_BUFFER_FULL_CONFIG in object ProducerConfig is deprecated: see 
corresponding Javadoc for more information.
producerProps.put(ProducerConfig.BLOCK_ON_BUFFER_FULL_CONFIG, "true")
 ^
:195:
 value BLOCK_ON_BUFFER_FULL_CONFIG in object ProducerConfig is deprecated: 

Re: KIP-41: KafkaConsumer Max Records

2016-01-04 Thread Jason Gustafson
Gwen and Ismael,

I agree the configuration option is probably the way to go, but I was
wondering whether there would be cases where it made sense to let the
consumer dynamically set max messages to adjust for downstream slowness.
For example, if the consumer is writing consumed records to another
database, and that database is experiencing heavier than expected load,
then the consumer could halve its current max messages in order to adapt
without risking rebalance hell. It could then increase max messages as the
load on the database decreases. It's basically an easier way to handle flow
control than we provide with pause/resume.

-Jason

On Mon, Jan 4, 2016 at 9:46 AM, Gwen Shapira  wrote:

> The wiki you pointed to is no longer maintained and fell out of sync with
> the code and protocol.
>
> You may want  to refer to:
>
> https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol
>
> On Mon, Jan 4, 2016 at 4:38 AM, Jens Rantil  wrote:
>
> > Hi guys,
> >
> > I realized I never thanked yall for your input - thanks!
> > Jason: I apologize for assuming your stance on the issue! Feels like we
> all
> > agreed on the solution. +1
> >
> > Follow-up: Jason made a point about defining prefetch and fairness
> > behaviour in the KIP. I am now working on putting that down in writing.
> To
> > be able to do this I think I need to understand the current prefetch
> > behaviour in the new consumer API (0.9) a bit better. Some specific
> > questions:
> >
> >- How does a specific consumer balance incoming messages from multiple
> >partitions? Is the consumer simply issuing Multi-Fetch requests[1] for
> > the
> >consumed assigned partitions of the relevant topics? Or is the
> consumer
> >fetching from one partition at a time and balancing between them
> >internally? That is, is the responsibility of partition balancing (and
> >fairness) on the broker side or consumer side?
> >- Is the above documented somewhere?
> >
> > [1]
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/Writing+a+Driver+for+Kafka
> > ,
> > see "Multi-Fetch".
> >
> > Thanks,
> > Jens
> >
> > On Wed, Dec 23, 2015 at 2:44 AM, Ismael Juma  wrote:
> >
> > > On Wed, Dec 23, 2015 at 1:24 AM, Gwen Shapira 
> wrote:
> > >
> > > > Given the background, it sounds like you'll generally want each call
> to
> > > > poll() to return the same number of events (which is the number you
> > > planned
> > > > on having enough memory / time for). It also sounds like tuning the
> > > number
> > > > of events will be closely tied to tuning the session timeout. That
> is -
> > > if
> > > > I choose to lower the session timeout for some reason, I will have to
> > > > modify the number of records returning too.
> > > >
> > > > If those assumptions are correct, I think a configuration makes more
> > > sense.
> > > > 1. We are unlikely to want this parameter to be changed at the
> lifetime
> > of
> > > > the consumer
> > > > 2. The correct value is tied to another configuration parameter, so
> > they
> > > > will be controlled together.
> > > >
> > >
> > > I was thinking the same thing.
> > >
> > > Ismael
> > >
> >
> >
> >
> > --
> > Jens Rantil
> > Backend engineer
> > Tink AB
> >
> > Email: jens.ran...@tink.se
> > Phone: +46 708 84 18 32
> > Web: www.tink.se
> >
> > Facebook  Linkedin
> > <
> >
> http://www.linkedin.com/company/2735919?trk=vsrp_companies_res_photo=VSRPsearchId%3A1057023381369207406670%2CVSRPtargetId%3A2735919%2CVSRPcmpt%3Aprimary
> > >
> >  Twitter 
> >
>


Re: KIP-41: KafkaConsumer Max Records

2016-01-04 Thread Cliff Rhyne
Instead of a heartbeat, I'd prefer poll() to return whatever messages the
client has.  Either a) I don't care if I get less than my max message limit
or b) I do care and will set a larger timeout.  Case B is less common than
A and is fairly easy to handle in the application's code.

On Mon, Jan 4, 2016 at 1:47 PM, Gwen Shapira  wrote:

> 1. Agree that TCP window style scaling will be cool. I'll try to think of a
> good excuse to use it ;)
>
> 2. I'm very concerned about the challenges of getting the timeouts,
> heartbeats and max messages right.
>
> Another option could be to expose a "heartbeat" API to consumers. If my app
> is still processing data but is still alive, it could initiate a heartbeat
> to signal it's alive without having to handle additional messages.
>
> I don't know if this improves more than it complicates though :(
>
> On Mon, Jan 4, 2016 at 11:40 AM, Jason Gustafson 
> wrote:
>
> > Hey Gwen,
> >
> > I was thinking along the lines of TCP window scaling in order to
> > dynamically find a good consumption rate. Basically you'd start off
> > consuming say 100 records and you'd let it increase until the consumption
> > took longer than half the session timeout (for example). You /might/ be
> > able to achieve the same thing using pause/resume, but it would be a lot
> > trickier since you have to do it at the granularity of partitions. But
> > yeah, database write performance doesn't always scale in a predictable
> > enough way to accommodate this, so I'm not sure how useful it would be in
> > practice. It might also be more difficult to implement since it wouldn't
> be
> > as clear when to initiate the next fetch. With a static setting, the
> > consumer knows exactly how many records will be returned on the next call
> > to poll() and can send fetches accordingly.
> >
> > On the other hand, I do feel a little wary of the need to tune the
> session
> > timeout and max messages though since these settings might depend on the
> > environment that the consumer is deployed in. It wouldn't be a big deal
> if
> > the impact was relatively minor, but getting them wrong can cause a lot
> of
> > rebalance churn which could keep the consumer from making any progress.
> > It's not a particularly graceful failure.
> >
> > -Jason
> >
> > On Mon, Jan 4, 2016 at 10:49 AM, Gwen Shapira  wrote:
> >
> > > I can't speak to all use-cases, but for the database one, I think
> > > pause-resume will be necessary in any case, and therefore dynamic batch
> > > sizes are not needed.
> > >
> > > Databases are really unexpected regarding response times - load and
> > locking
> > > can affect this. I'm not sure there's a good way to know you are going
> > into
> > > rebalance hell before it is too late. So if I were writing code that
> > > updates an RDBMS based on Kafka, I'd pick a reasonable batch size (say
> > 5000
> > > records), and basically pause, batch-insert all records, commit and
> > resume.
> > >
> > > Does that make sense?
> > >
> > > On Mon, Jan 4, 2016 at 10:37 AM, Jason Gustafson 
> > > wrote:
> > >
> > > > Gwen and Ismael,
> > > >
> > > > I agree the configuration option is probably the way to go, but I was
> > > > wondering whether there would be cases where it made sense to let the
> > > > consumer dynamically set max messages to adjust for downstream
> > slowness.
> > > > For example, if the consumer is writing consumed records to another
> > > > database, and that database is experiencing heavier than expected
> load,
> > > > then the consumer could halve its current max messages in order to
> > adapt
> > > > without risking rebalance hell. It could then increase max messages
> as
> > > the
> > > > load on the database decreases. It's basically an easier way to
> handle
> > > flow
> > > > control than we provide with pause/resume.
> > > >
> > > > -Jason
> > > >
> > > > On Mon, Jan 4, 2016 at 9:46 AM, Gwen Shapira 
> > wrote:
> > > >
> > > > > The wiki you pointed to is no longer maintained and fell out of
> sync
> > > with
> > > > > the code and protocol.
> > > > >
> > > > > You may want  to refer to:
> > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol
> > > > >
> > > > > On Mon, Jan 4, 2016 at 4:38 AM, Jens Rantil 
> > > wrote:
> > > > >
> > > > > > Hi guys,
> > > > > >
> > > > > > I realized I never thanked yall for your input - thanks!
> > > > > > Jason: I apologize for assuming your stance on the issue! Feels
> > like
> > > we
> > > > > all
> > > > > > agreed on the solution. +1
> > > > > >
> > > > > > Follow-up: Jason made a point about defining prefetch and
> fairness
> > > > > > behaviour in the KIP. I am now working on putting that down in
> > > writing.
> > > > > To
> > > > > be able to do this I think I need to understand the current
> > > prefetch
> > > > > > behaviour in the new consumer API (0.9) a bit 

[jira] [Resolved] (KAFKA-2884) Schema cache corruption in JsonConverter

2016-01-04 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava resolved KAFKA-2884.
--
Resolution: Duplicate

Oops, seems when KAFKA-3055 was filed we missed that we already had this 
instance of the bug. Marking this one as duplicate since we already applied a 
patch with KAFKA-3055.

> Schema cache corruption in JsonConverter
> 
>
> Key: KAFKA-2884
> URL: https://issues.apache.org/jira/browse/KAFKA-2884
> Project: Kafka
>  Issue Type: Bug
>  Components: copycat
>Affects Versions: 0.9.0.0
> Environment: MacBook Pro 15'' mid-2014
> Mac OS X 10.11.1
>Reporter: Juan Hernandez
>Assignee: Ewen Cheslack-Postava
>
> There is an issue with the schema cache in the {{JsonConverter}} class when 
> using Struct types that corrupts its entries.
> The schema cache returns a schema object for a given field type, which then 
> gets modified by setting the field name.
> {code}
> ObjectNode fieldJsonSchema = asJsonSchema(field.schema());   
> fieldJsonSchema.put(JsonSchema.STRUCT_FIELD_NAME_FIELD_NAME, field.name());
> {code}
> The same `fieldJsonSchema` object instance is returned for the following fields 
> (of the same type) and the field name attribute 
> ({{JsonSchema.STRUCT_FIELD_NAME_FIELD_NAME}}) is set, overwriting the 
> previous value. This causes a corrupted schema to be serialized in the message.
> {code:title=JsonConverter.java|borderStyle=solid}
> private ObjectNode asJsonSchema(Schema schema) {
> if (schema == null)
> return null;
> ObjectNode cached = fromConnectSchemaCache.get(schema);
> if (cached != null)
> return cached;
> // 
> case STRUCT:
> jsonSchema = 
> JsonNodeFactory.instance.objectNode().put(JsonSchema.SCHEMA_TYPE_FIELD_NAME, 
> JsonSchema.STRUCT_TYPE_NAME);
> ArrayNode fields = JsonNodeFactory.instance.arrayNode();
> for (Field field : schema.fields()) {
> ObjectNode fieldJsonSchema = asJsonSchema(field.schema());
> fieldJsonSchema.put(JsonSchema.STRUCT_FIELD_NAME_FIELD_NAME, 
> field.name());
> fields.add(fieldJsonSchema);
> }
> jsonSchema.set(JsonSchema.STRUCT_FIELDS_FIELD_NAME, fields);
> break;
> // ...
> }
> {code}
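
One general way to avoid the corruption is to stop mutating the instance held in the
cache, for example by copying it before setting the per-field name (the actual fix
landed with KAFKA-3055). A tiny standalone illustration of why a Jackson deepCopy()
leaves the cached node intact; the schema and field values here are made up:

    import com.fasterxml.jackson.databind.node.JsonNodeFactory;
    import com.fasterxml.jackson.databind.node.ObjectNode;

    public class DeepCopyExample {
        public static void main(String[] args) {
            // Stands in for the entry kept in fromConnectSchemaCache.
            ObjectNode cached = JsonNodeFactory.instance.objectNode().put("type", "int32");

            // Mutating a copy leaves the cached instance untouched.
            ObjectNode forField = cached.deepCopy();
            forField.put("field", "someFieldName");

            System.out.println(cached);   // {"type":"int32"}
            System.out.println(forField); // {"type":"int32","field":"someFieldName"}
        }
    }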



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: KIP-41: KafkaConsumer Max Records

2016-01-04 Thread Jason Gustafson
Hey Gwen,

I was thinking along the lines of TCP window scaling in order to
dynamically find a good consumption rate. Basically you'd start off
consuming say 100 records and you'd let it increase until the consumption
took longer than half the session timeout (for example). You /might/ be
able to achieve the same thing using pause/resume, but it would be a lot
trickier since you have to do it at the granularity of partitions. But
yeah, database write performance doesn't always scale in a predictable
enough way to accommodate this, so I'm not sure how useful it would be in
practice. It might also be more difficult to implement since it wouldn't be
as clear when to initiate the next fetch. With a static setting, the
consumer knows exactly how many records will be returned on the next call
to poll() and can send fetches accordingly.

On the other hand, I do feel a little wary of the need to tune the session
timeout and max messages though since these settings might depend on the
environment that the consumer is deployed in. It wouldn't be a big deal if
the impact was relatively minor, but getting them wrong can cause a lot of
rebalance churn which could keep the consumer from making any progress.
It's not a particularly graceful failure.

-Jason
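
A toy sketch of that windowing idea: keep a client-side cap on records per batch,
back off multiplicatively when processing a batch takes longer than half the session
timeout, and probe upward additively otherwise. The existence of a per-poll cap and
all of the constants are assumptions of the proposal, not released API:

    public class AdaptiveMaxRecords {
        private final long sessionTimeoutMs;
        private int maxRecords;

        public AdaptiveMaxRecords(long sessionTimeoutMs, int initialMaxRecords) {
            this.sessionTimeoutMs = sessionTimeoutMs;
            this.maxRecords = initialMaxRecords;
        }

        /** Call after each batch with the time it took to process it. */
        public int adjust(long processingTimeMs) {
            if (processingTimeMs > sessionTimeoutMs / 2) {
                // Falling behind: halve the window, as TCP does on congestion.
                maxRecords = Math.max(1, maxRecords / 2);
            } else {
                // Comfortably within budget: probe upward additively.
                maxRecords += 100;
            }
            return maxRecords;
        }

        public static void main(String[] args) {
            AdaptiveMaxRecords window = new AdaptiveMaxRecords(30_000, 100);
            System.out.println(window.adjust(5_000));  // fast batch -> grows to 200
            System.out.println(window.adjust(20_000)); // slow batch (over 15s) -> halves to 100
        }
    }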

On Mon, Jan 4, 2016 at 10:49 AM, Gwen Shapira  wrote:

> I can't speak to all use-cases, but for the database one, I think
> pause-resume will be necessary in any case, and therefore dynamic batch
> sizes are not needed.
>
> Databases are really unexpected regarding response times - load and locking
> can affect this. I'm not sure there's a good way to know you are going into
> rebalance hell before it is too late. So if I were writing code that
> updates an RDBMS based on Kafka, I'd pick a reasonable batch size (say 5000
> records), and basically pause, batch-insert all records, commit and resume.
>
> Does that make sense?
>
> On Mon, Jan 4, 2016 at 10:37 AM, Jason Gustafson 
> wrote:
>
> > Gwen and Ismael,
> >
> > I agree the configuration option is probably the way to go, but I was
> > wondering whether there would be cases where it made sense to let the
> > consumer dynamically set max messages to adjust for downstream slowness.
> > For example, if the consumer is writing consumed records to another
> > database, and that database is experiencing heavier than expected load,
> > then the consumer could halve its current max messages in order to adapt
> > without risking rebalance hell. It could then increase max messages as
> the
> > load on the database decreases. It's basically an easier way to handle
> flow
> > control than we provide with pause/resume.
> >
> > -Jason
> >
> > On Mon, Jan 4, 2016 at 9:46 AM, Gwen Shapira  wrote:
> >
> > > The wiki you pointed to is no longer maintained and fell out of sync
> with
> > > the code and protocol.
> > >
> > > You may want  to refer to:
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol
> > >
> > > On Mon, Jan 4, 2016 at 4:38 AM, Jens Rantil 
> wrote:
> > >
> > > > Hi guys,
> > > >
> > > > I realized I never thanked yall for your input - thanks!
> > > > Jason: I apologize for assuming your stance on the issue! Feels like
> we
> > > all
> > > > agreed on the solution. +1
> > > >
> > > > Follow-up: Jason made a point about defining prefetch and fairness
> > > > behaviour in the KIP. I am now working on putting that down in
> writing.
> > > To
> > > > be able to do this I think I need to understand the current
> prefetch
> > > > behaviour in the new consumer API (0.9) a bit better. Some specific
> > > > questions:
> > > >
> > > >- How does a specific consumer balance incoming messages from
> > multiple
> > > >partitions? Is the consumer simply issuing Multi-Fetch requests[1]
> > for
> > > > the
> > > >consumed assigned partitions of the relevant topics? Or is the
> > > consumer
> > > >fetching from one partition at a time and balancing between them
> > > >internally? That is, is the responsibility of partition balancing
> > (and
> > > >fairness) on the broker side or consumer side?
> > > >- Is the above documented somewhere?
> > > >
> > > > [1]
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/Writing+a+Driver+for+Kafka
> > > > ,
> > > > see "Multi-Fetch".
> > > >
> > > > Thanks,
> > > > Jens
> > > >
> > > > On Wed, Dec 23, 2015 at 2:44 AM, Ismael Juma 
> > wrote:
> > > >
> > > > > On Wed, Dec 23, 2015 at 1:24 AM, Gwen Shapira 
> > > wrote:
> > > > >
> > > > > > Given the background, it sounds like you'll generally want each
> > call
> > > to
> > > > > > poll() to return the same number of events (which is the number
> you
> > > > > planned
> > > > > > on having enough memory / time for). It also sounds like tuning
> the
> > > > > number
> > > > > > of events will be closely tied to tuning the session 

[jira] [Commented] (KAFKA-2422) Allow copycat connector plugins to be aliased to simpler names

2016-01-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081693#comment-15081693
 ] 

ASF GitHub Bot commented on KAFKA-2422:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/687


> Allow copycat connector plugins to be aliased to simpler names
> --
>
> Key: KAFKA-2422
> URL: https://issues.apache.org/jira/browse/KAFKA-2422
> Project: Kafka
>  Issue Type: Sub-task
>  Components: copycat
>Affects Versions: 0.9.0.0
>Reporter: Ewen Cheslack-Postava
>Assignee: Gwen Shapira
>Priority: Minor
> Fix For: 0.9.1.0
>
>
> Configurations of connectors can get quite verbose when you have to specify 
> the full class name, e.g. 
> connector.class=org.apache.kafka.copycat.file.FileStreamSinkConnector
> It would be nice to allow connector classes to provide shorter aliases, e.g. 
> something like "file-sink", to make this config less verbose. Flume does 
> this, so we can use it as an example.
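
A sketch of what the aliasing would buy in a connector configuration; the short name
"file-sink" is the hypothetical alias from the ticket, not an existing one:

    import java.util.Properties;

    public class ConnectorAliasExample {
        public static void main(String[] args) {
            Properties connectorProps = new Properties();
            connectorProps.put("name", "local-file-sink");

            // Today the fully-qualified class name is required:
            connectorProps.put("connector.class",
                "org.apache.kafka.copycat.file.FileStreamSinkConnector");

            // With aliasing, something like the following could work instead:
            // connectorProps.put("connector.class", "file-sink");

            System.out.println(connectorProps);
        }
    }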



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (KAFKA-2422) Allow copycat connector plugins to be aliased to simpler names

2016-01-04 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava resolved KAFKA-2422.
--
   Resolution: Fixed
Fix Version/s: 0.9.1.0

Issue resolved by pull request 687
[https://github.com/apache/kafka/pull/687]

> Allow copycat connector plugins to be aliased to simpler names
> --
>
> Key: KAFKA-2422
> URL: https://issues.apache.org/jira/browse/KAFKA-2422
> Project: Kafka
>  Issue Type: Sub-task
>  Components: copycat
>Affects Versions: 0.9.0.0
>Reporter: Ewen Cheslack-Postava
>Assignee: Gwen Shapira
>Priority: Minor
> Fix For: 0.9.1.0
>
>
> Configurations of connectors can get quite verbose when you have to specify 
> the full class name, e.g. 
> connector.class=org.apache.kafka.copycat.file.FileStreamSinkConnector
> It would be nice to allow connector classes to provide shorter aliases, e.g. 
> something like "file-sink", to make this config less verbose. Flume does 
> this, so we can use it as an example.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: MINOR: Add secondary constructor that takes on...

2016-01-04 Thread ijuma
GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/728

MINOR: Add secondary constructor that takes only `props` in `KafkaConfig`

This fixes a compatibility break for users that use `new KafkaConfig(map)`
instead of `KafkaConfig(map)`.
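
A small sketch of the call that the secondary constructor makes possible again,
assuming it accepts a plain Properties/Map as the description suggests; the broker
settings are illustrative:

    import java.util.Properties;
    import kafka.server.KafkaConfig;

    public class KafkaConfigFromProps {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("zookeeper.connect", "localhost:2181");
            props.put("broker.id", "0");

            // Constructing directly from props compiles again for callers that do not
            // go through the KafkaConfig.apply factory.
            KafkaConfig config = new KafkaConfig(props);
            System.out.println("Parsed config for broker " + props.getProperty("broker.id"));
        }
    }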

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka 
add-secondary-kafka-config-constructor

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/728.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #728


commit 25d6f664bfbd735ab1e5e38e9a1e7bb98fee0169
Author: Ismael Juma 
Date:   2016-01-04T20:10:10Z

Add secondary constructor that takes only `props` in `KafkaConfig`

This fixes a compatibility break for users that use `new KafkaConfig(map)`
instead of `KafkaConfig(map)`.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3059) ConsumerGroupCommand should allow resetting offsets for consumer groups

2016-01-04 Thread Stevo Slavic (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081756#comment-15081756
 ] 

Stevo Slavic commented on KAFKA-3059:
-

KAFKA-3057 was earlier created to fix related documentation.

> ConsumerGroupCommand should allow resetting offsets for consumer groups
> ---
>
> Key: KAFKA-3059
> URL: https://issues.apache.org/jira/browse/KAFKA-3059
> Project: Kafka
>  Issue Type: Bug
>Reporter: Gwen Shapira
>Assignee: Jason Gustafson
>
> As discussed here:
> http://mail-archives.apache.org/mod_mbox/kafka-users/201601.mbox/%3CCA%2BndhHpf3ib%3Ddsh9zvtfVjRiUjSz%2B%3D8umXm4myW%2BpBsbTYATAQ%40mail.gmail.com%3E
> * Given a consumer group, remove all stored offsets
> * Given a group and a topic, remove offset for group  and topic
> * Given a group, topic, partition and offset - set the offset for the 
> specified partition and group with the given value



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: KIP-41: KafkaConsumer Max Records

2016-01-04 Thread Gwen Shapira
1. Agree that TCP window style scaling will be cool. I'll try to think of a
good excuse to use it ;)

2. I'm very concerned about the challenges of getting the timeouts,
heartbeats and max messages right.

Another option could be to expose a "heartbeat" API to consumers. If my app
is still processing data but is still alive, it could initiate a heartbeat
to signal it's alive without having to handle additional messages.

I don't know if this improves more than it complicates though :(

On Mon, Jan 4, 2016 at 11:40 AM, Jason Gustafson  wrote:

> Hey Gwen,
>
> I was thinking along the lines of TCP window scaling in order to
> dynamically find a good consumption rate. Basically you'd start off
> consuming say 100 records and you'd let it increase until the consumption
> took longer than half the session timeout (for example). You /might/ be
> able to achieve the same thing using pause/resume, but it would be a lot
> trickier since you have to do it at the granularity of partitions. But
> yeah, database write performance doesn't always scale in a predictable
> enough way to accommodate this, so I'm not sure how useful it would be in
> practice. It might also be more difficult to implement since it wouldn't be
> as clear when to initiate the next fetch. With a static setting, the
> consumer knows exactly how many records will be returned on the next call
> to poll() and can send fetches accordingly.
>
> On the other hand, I do feel a little wary of the need to tune the session
> timeout and max messages though since these settings might depend on the
> environment that the consumer is deployed in. It wouldn't be a big deal if
> the impact was relatively minor, but getting them wrong can cause a lot of
> rebalance churn which could keep the consumer from making any progress.
> It's not a particularly graceful failure.
>
> -Jason
>
> On Mon, Jan 4, 2016 at 10:49 AM, Gwen Shapira  wrote:
>
> > I can't speak to all use-cases, but for the database one, I think
> > pause-resume will be necessary in any case, and therefore dynamic batch
> > sizes are not needed.
> >
> > Databases are really unexpected regarding response times - load and
> locking
> > can affect this. I'm not sure there's a good way to know you are going
> into
> > rebalance hell before it is too late. So if I were writing code that
> > updates an RDBMS based on Kafka, I'd pick a reasonable batch size (say
> 5000
> > records), and basically pause, batch-insert all records, commit and
> resume.
> >
> > Does that make sense?
> >
> > On Mon, Jan 4, 2016 at 10:37 AM, Jason Gustafson 
> > wrote:
> >
> > > Gwen and Ismael,
> > >
> > > I agree the configuration option is probably the way to go, but I was
> > > wondering whether there would be cases where it made sense to let the
> > > consumer dynamically set max messages to adjust for downstream
> slowness.
> > > For example, if the consumer is writing consumed records to another
> > > database, and that database is experiencing heavier than expected load,
> > > then the consumer could halve its current max messages in order to
> adapt
> > > without risking rebalance hell. It could then increase max messages as
> > the
> > > load on the database decreases. It's basically an easier way to handle
> > flow
> > > control than we provide with pause/resume.
> > >
> > > -Jason
> > >
> > > On Mon, Jan 4, 2016 at 9:46 AM, Gwen Shapira 
> wrote:
> > >
> > > > The wiki you pointed to is no longer maintained and fell out of sync
> > with
> > > > the code and protocol.
> > > >
> > > > You may want  to refer to:
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol
> > > >
> > > > On Mon, Jan 4, 2016 at 4:38 AM, Jens Rantil 
> > wrote:
> > > >
> > > > > Hi guys,
> > > > >
> > > > > I realized I never thanked yall for your input - thanks!
> > > > > Jason: I apologize for assuming your stance on the issue! Feels
> like
> > we
> > > > all
> > > > > agreed on the solution. +1
> > > > >
> > > > > Follow-up: Jason made a point about defining prefetch and fairness
> > > > > behaviour in the KIP. I am now working on putting that down in
> > writing.
> > > > To
> > > > > be able to do this I think I need to understand the current
> > prefetch
> > > > > behaviour in the new consumer API (0.9) a bit better. Some specific
> > > > > questions:
> > > > >
> > > > >- How does a specific consumer balance incoming messages from
> > > multiple
> > > > >partitions? Is the consumer simply issuing Multi-Fetch
> requests[1]
> > > for
> > > > > the
> > > > >consumed assigned partitions of the relevant topics? Or is the
> > > > consumer
> > > > >fetching from one partition at a time and balancing between them
> > > > >internally? That is, is the responsibility of partition
> balancing
> > > (and
> > > > >fairness) on the broker side or consumer side?
> > > > >- Is the 

[GitHub] kafka pull request: MINOR: Add secondary constructor that takes on...

2016-01-04 Thread ijuma
Github user ijuma closed the pull request at:

https://github.com/apache/kafka/pull/728


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3051) security.protocol documentation is inaccurate

2016-01-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081028#comment-15081028
 ] 

ASF GitHub Bot commented on KAFKA-3051:
---

GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/724

KAFKA-3051 KAFKA-3048; Security config docs improvements



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka minor-security-fixes

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/724.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #724


commit 9156fa3fb2dd90c3cf871d23a49b6ead149e29ad
Author: Ismael Juma 
Date:   2016-01-04T10:59:28Z

KAFKA-3051: security.protocol documentation is inaccurate

commit a9bae91039c63a32ec1c84c9a556a051a68d9373
Author: Ismael Juma 
Date:   2016-01-04T11:00:05Z

KAFKA-3048: incorrect property name ssl.want.client.auth

commit 6b3f87ee82266363584498df4a6fa99dc63eb47d
Author: Ismael Juma 
Date:   2016-01-04T11:17:44Z

Remove redundant default information in security-related config docs

I removed the cases where the value was the same as the one
that appears in the HTML table and left the cases where it adds
useful information.

commit 670363ec7846716c58d77ef266bea6a4e146084f
Author: Ismael Juma 
Date:   2016-01-04T11:18:41Z

Use `{` and `}` instead of `<` and `>` as placeholder to fix HTML rendering

commit b82ae369bd90fb63445efb6a022b5d4e5dbd93d6
Author: Ismael Juma 
Date:   2016-01-04T11:19:58Z

Fix `KafkaChannel.principal()` documentation

Also include minor `KafkaChannel` clean-ups.

commit 0fa85e5ab4acc7322fe07f79b9d2218fe2d08d2a
Author: Ismael Juma 
Date:   2016-01-04T11:20:36Z

Fix `SslTransportLayer` reference in `SslSelectorTest` comment




> security.protocol documentation is inaccurate
> -
>
> Key: KAFKA-3051
> URL: https://issues.apache.org/jira/browse/KAFKA-3051
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Affects Versions: 0.9.0.0
>Reporter: Jun Rao
>Assignee: Ismael Juma
>  Labels: security
> Fix For: 0.9.0.1
>
>
> In CommonClientConfigs, the doc for security.protocol says "Currently only 
> PLAINTEXT and SSL are supported.". This is inaccurate since we support SASL 
> as well.
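
For illustration, a minimal sketch of a client configuration that uses one of the SASL protocols that security.protocol actually accepts; the broker address and group id below are placeholders, and a real SASL_SSL setup would also need the JAAS and truststore settings that are omitted here.

    import java.util.Properties;

    import org.apache.kafka.clients.CommonClientConfigs;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.ByteArrayDeserializer;

    public class SecurityProtocolExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9094"); // placeholder
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "docs-example");          // placeholder
            // The point of the doc fix: security.protocol accepts more than
            // PLAINTEXT and SSL; SASL_PLAINTEXT and SASL_SSL are valid as well.
            props.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "SASL_SSL");

            KafkaConsumer<byte[], byte[]> consumer =
                new KafkaConsumer<>(props, new ByteArrayDeserializer(), new ByteArrayDeserializer());
            consumer.close();
        }
    }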



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-3051 KAFKA-3048; Security config docs im...

2016-01-04 Thread ijuma
GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/724

KAFKA-3051 KAFKA-3048; Security config docs improvements



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka minor-security-fixes

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/724.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #724


commit 9156fa3fb2dd90c3cf871d23a49b6ead149e29ad
Author: Ismael Juma 
Date:   2016-01-04T10:59:28Z

KAFKA-3051: security.protocol documentation is inaccurate

commit a9bae91039c63a32ec1c84c9a556a051a68d9373
Author: Ismael Juma 
Date:   2016-01-04T11:00:05Z

KAFKA-3048: incorrect property name ssl.want.client.auth

commit 6b3f87ee82266363584498df4a6fa99dc63eb47d
Author: Ismael Juma 
Date:   2016-01-04T11:17:44Z

Remove redundant default information in security-related config docs

I removed the cases where the value was the same as the one
that appears in the HTML table and left the cases where it adds
useful information.

commit 670363ec7846716c58d77ef266bea6a4e146084f
Author: Ismael Juma 
Date:   2016-01-04T11:18:41Z

Use `{` and `}` instead of `<` and `>` as placeholder to fix HTML rendering

commit b82ae369bd90fb63445efb6a022b5d4e5dbd93d6
Author: Ismael Juma 
Date:   2016-01-04T11:19:58Z

Fix `KafkaChannel.principal()` documentation

Also include minor `KafkaChannel` clean-ups.

commit 0fa85e5ab4acc7322fe07f79b9d2218fe2d08d2a
Author: Ismael Juma 
Date:   2016-01-04T11:20:36Z

Fix `SslTransportLayer` reference in `SslSelectorTest` comment




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (KAFKA-3051) security.protocol documentation is inaccurate

2016-01-04 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-3051:
---
Reviewer: Jun Rao
  Status: Patch Available  (was: Open)

> security.protocol documentation is inaccurate
> -
>
> Key: KAFKA-3051
> URL: https://issues.apache.org/jira/browse/KAFKA-3051
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Affects Versions: 0.9.0.0
>Reporter: Jun Rao
>Assignee: Ismael Juma
>  Labels: security
> Fix For: 0.9.0.1
>
>
> In CommonClientConfigs, the doc for security.protocol says "Currently only 
> PLAINTEXT and SSL are supported.". This is inaccurate since we support SASL 
> as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3057) "Checking consumer position" docs are referencing (only) deprecated ConsumerOffsetChecker

2016-01-04 Thread Stevo Slavic (JIRA)
Stevo Slavic created KAFKA-3057:
---

 Summary: "Checking consumer position" docs are referencing (only) 
deprecated ConsumerOffsetChecker
 Key: KAFKA-3057
 URL: https://issues.apache.org/jira/browse/KAFKA-3057
 Project: Kafka
  Issue Type: Bug
  Components: admin, website
Affects Versions: 0.9.0.0
Reporter: Stevo Slavic
Priority: Trivial


["Checking consumer position" operations 
instructions|http://kafka.apache.org/090/documentation.html#basic_ops_consumer_lag]
 are referencing only ConsumerOffsetChecker which is mentioned as deprecated in 
[Potential breaking changes in 
0.9.0.0|http://kafka.apache.org/documentation.html#upgrade_9_breaking]

Please consider updating the docs with the new ways of checking consumer 
position, covering the differences between the old and new way, and a 
recommendation on which one is preferred and why.

It would also be nice to document (and support, if not already available) not 
only how to read/fetch/check a consumer group's offset, but also how to set the 
offset for a consumer group using Kafka's operations tools.
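
As one possible replacement for the deprecated tool, here is a rough sketch of checking a group's committed offset and lag programmatically with the new (0.9) consumer; the bootstrap server, group, topic and partition below are made up, and error handling is left out.

    import java.util.Collections;
    import java.util.Properties;

    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.consumer.OffsetAndMetadata;
    import org.apache.kafka.common.TopicPartition;
    import org.apache.kafka.common.serialization.ByteArrayDeserializer;

    public class ConsumerPositionCheck {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group");                // placeholder
            props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");

            try (KafkaConsumer<byte[], byte[]> consumer =
                     new KafkaConsumer<>(props, new ByteArrayDeserializer(), new ByteArrayDeserializer())) {
                TopicPartition tp = new TopicPartition("my-topic", 0);            // placeholder
                consumer.assign(Collections.singletonList(tp));

                OffsetAndMetadata committed = consumer.committed(tp); // last committed offset, or null
                consumer.seekToEnd(tp);                               // move to the log end
                long logEnd = consumer.position(tp);                  // resolves the end offset

                long committedOffset = committed == null ? 0L : committed.offset();
                System.out.printf("committed=%d logEnd=%d lag=%d%n",
                    committedOffset, logEnd, logEnd - committedOffset);
            }
        }
    }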



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (KAFKA-3052) broker properties get logged twice if acl is enabled

2016-01-04 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma reassigned KAFKA-3052:
--

Assignee: Ismael Juma

> broker properties get logged twice if acl is enabled
> 
>
> Key: KAFKA-3052
> URL: https://issues.apache.org/jira/browse/KAFKA-3052
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 0.9.0.0
>Reporter: Jun Rao
>Assignee: Ismael Juma
>  Labels: newbie, security
> Fix For: 0.9.0.1
>
>
> This is because in SimpleAclAuthorizer.configure(), there is the following 
> statement which triggers the logging of all broker properties.
> val kafkaConfig = KafkaConfig.fromProps(props)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3048) incorrect property name ssl.want.client.auth

2016-01-04 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-3048:
---
Reviewer: Jun Rao
  Status: Patch Available  (was: Open)

PR (which also includes KAFKA-3051):

https://github.com/apache/kafka/pull/724

> incorrect property name ssl.want.client.auth 
> -
>
> Key: KAFKA-3048
> URL: https://issues.apache.org/jira/browse/KAFKA-3048
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 0.9.0.0
>Reporter: Jun Rao
>Assignee: Ismael Juma
>  Labels: security
> Fix For: 0.9.0.1
>
>
> In the description of ssl.client.auth, we mention ssl.want.client.auth, which 
> is incorrect.
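
For reference, a small sketch of the property under its correct name as it would be set for a broker; the values shown (required / requested / none) are the accepted ones, and the rest of the SSL setup (listeners, keystore, truststore) is deliberately omitted.

    import java.util.Properties;

    public class SslClientAuthConfig {
        // Mirrors what would go into server.properties; only the property this
        // ticket is about is shown.
        static Properties sslClientAuthProps() {
            Properties brokerProps = new Properties();
            brokerProps.put("ssl.client.auth", "required"); // correct name (not ssl.want.client.auth)
            // Other accepted values are "requested" and "none".
            return brokerProps;
        }
    }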



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-3062) Read from kafka replication to get data likely Version based

2016-01-04 Thread xingang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081989#comment-15081989
 ] 

xingang edited comment on KAFKA-3062 at 1/4/16 11:45 PM:
-

"Likely version based" means:

1) it's Ok for such None-Latency sensitive consumers to fetch data slower than 
from the leaders. or even less of them
2) the consistency of replicas could be version-based synced, thus, the 
consumers can get the "None-dup, None-loss" eventually, No requirement to be 
consistent for a real-time

It will be so GOOD in this way, No more extra storage required but get 
History/Realtime consuming separated


was (Author: itismewxg):
"Likely version based" means:

1) it's Ok for such None-Latency consumers to fetch data slower than from the 
leaders. or even less of them
2) the consistency of replicas could be version-based synced, thus, the 
consumers can get the "None-dup, None-loss" eventually, No requirement to be 
consistent for a real-time

It will be so GOOD in this way, No more extra storage required but get 
History/Realtime consuming separated

> Read from kafka replication to get data likely Version based
> 
>
> Key: KAFKA-3062
> URL: https://issues.apache.org/jira/browse/KAFKA-3062
> Project: Kafka
>  Issue Type: Improvement
>Reporter: xingang
>
> Since Kafka requires all reads to go to the leader for consistency, consumers 
> that are not latency-sensitive but are data-loss sensitive end up reading old 
> data from the leader's disk whenever they lag, polluting the page cache used by 
> the latency-sensitive consumers. If reads could also be served from replicas, 
> such consumers could fetch their data there instead and leave the leader's page 
> cache intact.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Build failed in Jenkins: kafka-trunk-jdk7 #933

2016-01-04 Thread Apache Jenkins Server
See 

Changes:

[me] KAFKA-2422: Allow copycat connector plugins to be aliased to simpler

--
[...truncated 4753 lines...]
org.apache.kafka.connect.storage.KafkaConfigStorageTest > testRestore PASSED

org.apache.kafka.connect.storage.KafkaConfigStorageTest > 
testPutTaskConfigsDoesNotResolveAllInconsistencies PASSED

org.apache.kafka.connect.util.KafkaBasedLogTest > testStartStop PASSED

org.apache.kafka.connect.util.KafkaBasedLogTest > testReloadOnStart PASSED

org.apache.kafka.connect.util.KafkaBasedLogTest > testSendAndReadToEnd PASSED

org.apache.kafka.connect.util.KafkaBasedLogTest > testConsumerError PASSED

org.apache.kafka.connect.util.KafkaBasedLogTest > testProducerError PASSED

org.apache.kafka.connect.util.ShutdownableThreadTest > testGracefulShutdown 
PASSED

org.apache.kafka.connect.util.ShutdownableThreadTest > testForcibleShutdown 
PASSED
:testAll

BUILD SUCCESSFUL

Total time: 1 hrs 33 mins 18.614 secs
+ ./gradlew --stacktrace docsJarAll
To honour the JVM settings for this build a new JVM will be forked. Please 
consider using the daemon: 
https://docs.gradle.org/2.10/userguide/gradle_daemon.html.
Building project 'core' with Scala version 2.10.6
:docsJar_2_10
Building project 'core' with Scala version 2.10.6
:kafka-trunk-jdk7:clients:compileJava UP-TO-DATE
:kafka-trunk-jdk7:clients:processResources UP-TO-DATE
:kafka-trunk-jdk7:clients:classes UP-TO-DATE
:kafka-trunk-jdk7:clients:determineCommitId UP-TO-DATE
:kafka-trunk-jdk7:clients:createVersionFile
:kafka-trunk-jdk7:clients:jar UP-TO-DATE
:kafka-trunk-jdk7:clients:javadoc UP-TO-DATE
:kafka-trunk-jdk7:core:compileJava UP-TO-DATE
:kafka-trunk-jdk7:core:compileScala
:79:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.

org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP
 ^
:36:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.
 commitTimestamp: Long = 
org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP,

  ^
:37:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.
 expireTimestamp: Long = 
org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP) {

  ^
:394:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.
  if (value.expireTimestamp == 
org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP)

^
:284:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.
if (offsetAndMetadata.commitTimestamp == 
org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP)

  ^
:301:
 a pure expression does nothing in statement position; you may be omitting 
necessary parentheses
ControllerStats.uncleanLeaderElectionRate
^
:302:
 a pure expression does nothing in statement position; you may be omitting 
necessary parentheses
ControllerStats.leaderElectionTimer
^
:74:
 value BLOCK_ON_BUFFER_FULL_CONFIG in object ProducerConfig is deprecated: see 
corresponding Javadoc for more information.
producerProps.put(ProducerConfig.BLOCK_ON_BUFFER_FULL_CONFIG, "true")
 ^
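
The deprecated producer setting flagged above has a direct replacement path; the sketch below assumes the goal is simply to bound how long send() may block when the buffer is full, which is what max.block.ms controls in 0.9 (the broker address and the 60000 ms value are arbitrary).

    import java.util.Properties;

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.common.serialization.ByteArraySerializer;

    public class NonDeprecatedProducerProps {
        public static void main(String[] args) {
            Properties producerProps = new Properties();
            producerProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
            // Instead of the deprecated BLOCK_ON_BUFFER_FULL_CONFIG, bound how long
            // send() may block when the buffer is full:
            producerProps.put(ProducerConfig.MAX_BLOCK_MS_CONFIG, "60000");

            KafkaProducer<byte[], byte[]> producer =
                new KafkaProducer<>(producerProps, new ByteArraySerializer(), new ByteArraySerializer());
            producer.close();
        }
    }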

[GitHub] kafka pull request: KAFKA-2937 : Disable the leaderIsr check if th...

2016-01-04 Thread MayureshGharat
GitHub user MayureshGharat opened a pull request:

https://github.com/apache/kafka/pull/729

KAFKA-2937 : Disable the leaderIsr check if the topic is to be deleted.

The check was implemented in KAFKA-340 : If we are shutting down a broker 
when the ISR of a partition includes only that broker, we could lose some 
messages that have been previously committed. For clean shutdown, we need to 
guarantee that there is at least 1 other broker in ISR after the broker is shut 
down.

When we are deleting the topic, this check can be avoided.
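
Roughly, the change boils down to making the ISR safety check conditional on whether the topic is queued for deletion. The snippet below is only an illustrative sketch with invented names (the real logic lives in the Scala controller), not the patch itself.

    import java.util.Set;

    public class ControlledShutdownCheck {
        /**
         * Sketch only: returns true if it is safe to move the leader of this
         * partition off the shutting-down broker. It mirrors the intent described
         * above -- the "at least one other ISR member" rule is skipped for topics
         * that are queued for deletion. All names here are illustrative.
         */
        static boolean safeToProceed(String topic,
                                     Set<Integer> isr,
                                     int shuttingDownBrokerId,
                                     Set<String> topicsQueuedForDeletion) {
            if (topicsQueuedForDeletion.contains(topic)) {
                return true; // data is going away anyway, no need to protect the last replica
            }
            // Original KAFKA-340 rule: require another in-sync replica besides the
            // broker being shut down.
            return !(isr.size() == 1 && isr.contains(shuttingDownBrokerId));
        }
    }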


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/MayureshGharat/kafka kafka-2937

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/729.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #729


commit 9d5afd0f29f2f4311e534eb375e1c9ddb23b33dd
Author: Mayuresh Gharat 
Date:   2016-01-04T22:56:01Z

Disable the leaderIsr check if the topic is to be deleted.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3062) Read from kafka replication to get data likely Version based

2016-01-04 Thread xingang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15082004#comment-15082004
 ] 

xingang commented on KAFKA-3062:


Yes! Example: 

Huge volume data producing to >60 partition, and 15 consumers will works on 
this data.

10 of them are time-latency sensitive, which is nearly real-time processing, 
it's better for them to consume from the page cache to get the data, sometime a 
little data loss even can be tolerant as its processing shows processing result 
for realtime

5 of them are reports processing from the data, it's Ok to be hours or even 
daily jobs, it does not require to show its result in a short time. 

considering, if the 5 stats-processing are in a lag, and they will consume from 
the disk, and make the page cache full of them, since such history data 
consuming are N times faster than the producing rate. hence, the 10 
time-latency sensitive processing are sad, since they always see the page cache 
missing~~ once they get a short time lag


Thanks for your quick response!


> Read from kafka replication to get data likely Version based
> 
>
> Key: KAFKA-3062
> URL: https://issues.apache.org/jira/browse/KAFKA-3062
> Project: Kafka
>  Issue Type: Improvement
>Reporter: xingang
>
> Since Kafka requires all reads to go to the leader for consistency, consumers 
> that are not latency-sensitive but are data-loss sensitive end up reading old 
> data from the leader's disk whenever they lag, polluting the page cache used by 
> the latency-sensitive consumers. If reads could also be served from replicas, 
> such consumers could fetch their data there instead and leave the leader's page 
> cache intact.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2937) Topics marked for delete in Zookeeper may become undeletable

2016-01-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081958#comment-15081958
 ] 

ASF GitHub Bot commented on KAFKA-2937:
---

GitHub user MayureshGharat opened a pull request:

https://github.com/apache/kafka/pull/729

KAFKA-2937 : Disable the leaderIsr check if the topic is to be deleted.

The check was implemented in KAFKA-340 : If we are shutting down a broker 
when the ISR of a partition includes only that broker, we could lose some 
messages that have been previously committed. For clean shutdown, we need to 
guarantee that there is at least 1 other broker in ISR after the broker is shut 
down.

When we are deleting the topic, this check can be avoided.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/MayureshGharat/kafka kafka-2937

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/729.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #729


commit 9d5afd0f29f2f4311e534eb375e1c9ddb23b33dd
Author: Mayuresh Gharat 
Date:   2016-01-04T22:56:01Z

Disable the leaderIsr check if the topic is to be deleted.




> Topics marked for delete in Zookeeper may become undeletable
> 
>
> Key: KAFKA-2937
> URL: https://issues.apache.org/jira/browse/KAFKA-2937
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.9.0.0
>Reporter: Rajini Sivaram
>Assignee: Mayuresh Gharat
>
> In our clusters, we occasionally see topics marked for delete, but never 
> actually deleted. It may be due to brokers being restarted while tests were 
> running, but further restarts of Kafka dont fix the problem. The topics 
> remain marked for delete in Zookeeper.
> Topic describe shows:
> {quote}
> Topic:testtopic   PartitionCount:1   ReplicationFactor:3   Configs:
>   Topic: testtopic   Partition: 0   Leader: none   Replicas: 3,4,0   Isr: 
> {quote}
> Kafka logs show:
> {quote}
> [2015-12-02 15:53:30,152] ERROR Controller 2 epoch 213 initiated state change 
> of replica 3 for partition [testtopic,0] from OnlineReplica to OfflineReplica 
> failed (state.change.logger)
> kafka.common.StateChangeFailedException: Failed to change state of replica 3 
> for partition [testtopic,0] since the leader and isr path in zookeeper is 
> empty
> at 
> kafka.controller.ReplicaStateMachine.handleStateChange(ReplicaStateMachine.scala:269)
> at 
> kafka.controller.ReplicaStateMachine$$anonfun$handleStateChanges$2.apply(ReplicaStateMachine.scala:114)
> at 
> kafka.controller.ReplicaStateMachine$$anonfun$handleStateChanges$2.apply(ReplicaStateMachine.scala:114)
> at 
> scala.collection.immutable.HashSet$HashSet1.foreach(HashSet.scala:322)
> at 
> scala.collection.immutable.HashSet$HashTrieSet.foreach(HashSet.scala:978)
> at 
> kafka.controller.ReplicaStateMachine.handleStateChanges(ReplicaStateMachine.scala:114)
> at 
> kafka.controller.TopicDeletionManager$$anonfun$startReplicaDeletion$2.apply(TopicDeletionManager.scala:342)
> at 
> kafka.controller.TopicDeletionManager$$anonfun$startReplicaDeletion$2.apply(TopicDeletionManager.scala:334)
> at scala.collection.immutable.Map$Map1.foreach(Map.scala:116)
> at 
> kafka.controller.TopicDeletionManager.startReplicaDeletion(TopicDeletionManager.scala:334)
> at 
> kafka.controller.TopicDeletionManager.kafka$controller$TopicDeletionManager$$onPartitionDeletion(TopicDeletionManager.scala:367)
> at 
> kafka.controller.TopicDeletionManager$$anonfun$kafka$controller$TopicDeletionManager$$onTopicDeletion$2.apply(TopicDeletionManager.scala:313)
> at 
> kafka.controller.TopicDeletionManager$$anonfun$kafka$controller$TopicDeletionManager$$onTopicDeletion$2.apply(TopicDeletionManager.scala:312)
> at scala.collection.immutable.Set$Set1.foreach(Set.scala:79)
> at 
> kafka.controller.TopicDeletionManager.kafka$controller$TopicDeletionManager$$onTopicDeletion(TopicDeletionManager.scala:312)
> at 
> kafka.controller.TopicDeletionManager$DeleteTopicsThread$$anonfun$doWork$1$$anonfun$apply$mcV$sp$4.apply(TopicDeletionManager.scala:431)
> at 
> kafka.controller.TopicDeletionManager$DeleteTopicsThread$$anonfun$doWork$1$$anonfun$apply$mcV$sp$4.apply(TopicDeletionManager.scala:403)
> at scala.collection.immutable.Set$Set2.foreach(Set.scala:111)
> at 
> kafka.controller.TopicDeletionManager$DeleteTopicsThread$$anonfun$doWork$1.apply$mcV$sp(TopicDeletionManager.scala:403)
> at 
> 

[jira] [Comment Edited] (KAFKA-3062) Read from kafka replication to get data likely Version based

2016-01-04 Thread xingang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15082004#comment-15082004
 ] 

xingang edited comment on KAFKA-3062 at 1/4/16 11:30 PM:
-

Yes! Example: 

Huge volume data producing to >60 partition, and 15 consumers will work on this 
data.

10 of them are time-latency sensitive, which is nearly real-time processing, 
it's better for them to consume from the page cache to get the data, sometime a 
little data loss even can be tolerant as its processing shows processing result 
for realtime

5 of them are reports processing from the data, it's Ok to be hours or even 
daily jobs, it does not require to show its result in a short time. 

considering, if the 5 stats-processing are in a lag, and they will consume from 
the disk, and make the page cache full of them, since such history data 
consuming are N times faster than the producing rate. hence, the 10 
time-latency sensitive processing are sad, since they always see the page cache 
missing~~ once they get a short time lag


Thanks for your quick response!



was (Author: itismewxg):
Yes! Example: 

Huge volume data producing to >60 partition, and 15 consumers will works on 
this data.

10 of them are time-latency sensitive, which is nearly real-time processing, 
it's better for them to consume from the page cache to get the data, sometime a 
little data loss even can be tolerant as its processing shows processing result 
for realtime

5 of them are reports processing from the data, it's Ok to be hours or even 
daily jobs, it does not require to show its result in a short time. 

considering, if the 5 stats-processing are in a lag, and they will consume from 
the disk, and make the page cache full of them, since such history data 
consuming are N times faster than the producing rate. hence, the 10 
time-latency sensitive processing are sad, since they always see the page cache 
missing~~ once they get a short time lag


Thanks for your quick response!


> Read from kafka replication to get data likely Version based
> 
>
> Key: KAFKA-3062
> URL: https://issues.apache.org/jira/browse/KAFKA-3062
> Project: Kafka
>  Issue Type: Improvement
>Reporter: xingang
>
> Since Kafka requires all reads to go to the leader for consistency, consumers 
> that are not latency-sensitive but are data-loss sensitive end up reading old 
> data from the leader's disk whenever they lag, polluting the page cache used by 
> the latency-sensitive consumers. If reads could also be served from replicas, 
> such consumers could fetch their data there instead and leave the leader's page 
> cache intact.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3062) Read from kafka replication to get data likely Version based

2016-01-04 Thread xingang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081989#comment-15081989
 ] 

xingang commented on KAFKA-3062:


"Likely version based" means:

1) it's Ok for such None-Latency consumers to fetch data slower than from the 
leaders. or even less of them
2) the consistency of replicas could be version-based synced, thus, the 
consumers can get the "None-dup, None-loss" eventually, No requirement to be 
consistent for a real-time

It will be so GOOD in this way, No more extra storage required but get 
History/Realtime consuming separated

> Read from kafka replication to get data likely Version based
> 
>
> Key: KAFKA-3062
> URL: https://issues.apache.org/jira/browse/KAFKA-3062
> Project: Kafka
>  Issue Type: Improvement
>Reporter: xingang
>
> Since Kafka requires all reads to go to the leader for consistency, consumers 
> that are not latency-sensitive but are data-loss sensitive end up reading old 
> data from the leader's disk whenever they lag, polluting the page cache used by 
> the latency-sensitive consumers. If reads could also be served from replicas, 
> such consumers could fetch their data there instead and leave the leader's page 
> cache intact.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[VOTE] KIP-36 - Rack aware replica assignment

2016-01-04 Thread Allen Wang
I would like to call for a vote for KIP-36 - Rack aware replica assignment.

The latest proposal is at

https://cwiki.apache.org/confluence/display/KAFKA/KIP-36+Rack+aware+replica+assignment

Thanks,
Allen


[jira] [Created] (KAFKA-3060) Refactor MeteredXXStore

2016-01-04 Thread Yasuhiro Matsuda (JIRA)
Yasuhiro Matsuda created KAFKA-3060:
---

 Summary: Refactor MeteredXXStore
 Key: KAFKA-3060
 URL: https://issues.apache.org/jira/browse/KAFKA-3060
 Project: Kafka
  Issue Type: Sub-task
  Components: kafka streams
Affects Versions: 0.9.0.1
Reporter: Yasuhiro Matsuda
Priority: Minor


** copied from a github comment by Guozhang Wang **

The original motivation for MeteredXXStore was to wrap all metrics / logging 
semantics in one place so they do not need to be re-implemented for each store, 
but this now seems to be an obstacle with the current pattern. For example, 
MeteredWindowStore.putAndReturnInternalKey is only used for logging, and 
MeteredWindowStore.putInternal / MeteredWindowStore.getInternal are never used, 
since only the inner store would trigger them. So how about refactoring this 
piece as follows (a rough sketch of the resulting shape is given after the 
list):

1. WindowStore only exposes two APIs: put(K, V) and get(K, long).
2. Add a RollingRocksDBStores that does not extend any interface, but only 
implements putInternal, getInternal and putAndReturnInternalKey, using the 
underlying RocksDBStore instances as segments.
3. RocksDBWindowStore implements WindowStore with a RollingRocksDBStores as its 
inner store.
4. Let MeteredXXStore only maintain the metrics recording logic, and let the 
different stores implement their own logging logic, since logging now differs 
across store types and is better handled separately. Also, some types of stores 
may not even need a loggingEnabled flag, if they will always log or never log.
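
A rough sketch of the shape points 1 and 4 describe, using invented type names rather than the real Kafka Streams classes: the window store exposes only put/get, and the metered wrapper is reduced to metrics recording around an inner store. Logging, where enabled, would then live in the concrete store implementations rather than in the wrapper.

    // Illustrative only; type and method names are made up for this sketch.
    interface SimpleWindowStore<K, V> {
        void put(K key, V value);
        V get(K key, long timestamp);
    }

    class MeteredSimpleWindowStore<K, V> implements SimpleWindowStore<K, V> {
        private final SimpleWindowStore<K, V> inner;

        MeteredSimpleWindowStore(SimpleWindowStore<K, V> inner) {
            this.inner = inner;
        }

        @Override
        public void put(K key, V value) {
            long start = System.nanoTime();
            try {
                inner.put(key, value);   // delegate; logging (if any) is the inner store's job
            } finally {
                recordLatency("put", System.nanoTime() - start);
            }
        }

        @Override
        public V get(K key, long timestamp) {
            long start = System.nanoTime();
            try {
                return inner.get(key, timestamp);
            } finally {
                recordLatency("get", System.nanoTime() - start);
            }
        }

        private void recordLatency(String op, long nanos) {
            // Stand-in for the real sensor/metrics recording.
            System.out.printf("%s took %d ns%n", op, nanos);
        }
    }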



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3062) Read from kafka replication to get data likely Version based

2016-01-04 Thread Mayuresh Gharat (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081987#comment-15081987
 ] 

Mayuresh Gharat commented on KAFKA-3062:


Hi [~itismewxg], can you explain the usecase here with an example?

Thanks,

Mayuresh

> Read from kafka replication to get data likely Version based
> 
>
> Key: KAFKA-3062
> URL: https://issues.apache.org/jira/browse/KAFKA-3062
> Project: Kafka
>  Issue Type: Improvement
>Reporter: xingang
>
> Since Kafka requires all reads to go to the leader for consistency, consumers 
> that are not latency-sensitive but are data-loss sensitive end up reading old 
> data from the leader's disk whenever they lag, polluting the page cache used by 
> the latency-sensitive consumers. If reads could also be served from replicas, 
> such consumers could fetch their data there instead and leave the leader's page 
> cache intact.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-3062) Read from kafka replication to get data likely Version based

2016-01-04 Thread xingang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15082004#comment-15082004
 ] 

xingang edited comment on KAFKA-3062 at 1/4/16 11:31 PM:
-

Yes! Example: 

Huge volume data producing to >60 partition, and 15 consumers will work on this 
data.

10 of them are time-latency sensitive, which is nearly real-time processing, 
it's better for them to consume from the page cache to get the data, sometime a 
little data loss even can be tolerant as its processing shows processing result 
for realtime

5 of them are reports processing from the data, it's Ok to be hours or even 
daily jobs, it does not require to show its result in a short time. 

considering, if the 5 stats-processing are in a lag, and they will consume from 
the disk, and make the page cache full of them, since such history data 
consuming are N times faster than the producing rate. hence, the 10 
time-latency sensitive consumers are sad, since they always see the page cache 
missing~~ once they get a short time lag


Thanks for your quick response!



was (Author: itismewxg):
Yes! Example: 

Huge volume data producing to >60 partition, and 15 consumers will work on this 
data.

10 of them are time-latency sensitive, which is nearly real-time processing, 
it's better for them to consume from the page cache to get the data, sometime a 
little data loss even can be tolerant as its processing shows processing result 
for realtime

5 of them are reports processing from the data, it's Ok to be hours or even 
daily jobs, it does not require to show its result in a short time. 

considering, if the 5 stats-processing are in a lag, and they will consume from 
the disk, and make the page cache full of them, since such history data 
consuming are N times faster than the producing rate. hence, the 10 
time-latency sensitive processing are sad, since they always see the page cache 
missing~~ once they get a short time lag


Thanks for your quick response!


> Read from kafka replication to get data likely Version based
> 
>
> Key: KAFKA-3062
> URL: https://issues.apache.org/jira/browse/KAFKA-3062
> Project: Kafka
>  Issue Type: Improvement
>Reporter: xingang
>
> Since Kafka requires all reads to go to the leader for consistency, consumers 
> that are not latency-sensitive but are data-loss sensitive end up reading old 
> data from the leader's disk whenever they lag, polluting the page cache used by 
> the latency-sensitive consumers. If reads could also be served from replicas, 
> such consumers could fetch their data there instead and leave the leader's page 
> cache intact.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Build failed in Jenkins: kafka-trunk-jdk8 #265

2016-01-04 Thread Apache Jenkins Server
See 

Changes:

[junrao] KAFKA-3051 KAFKA-3048; Security config docs improvements

--
[...truncated 3364 lines...]

org.apache.kafka.clients.NetworkClientTest > testClose PASSED

org.apache.kafka.clients.NetworkClientTest > testLeastLoadedNode PASSED

org.apache.kafka.clients.NetworkClientTest > testRequestTimeout PASSED

org.apache.kafka.clients.NetworkClientTest > 
testSimpleRequestResponseWithStaticNodes PASSED

org.apache.kafka.clients.NetworkClientTest > testSendToUnreadyNode PASSED

org.apache.kafka.clients.MetadataTest > testListenerCanUnregister PASSED

org.apache.kafka.clients.MetadataTest > testFailedUpdate PASSED

org.apache.kafka.clients.MetadataTest > testMetadataUpdateWaitTime PASSED

org.apache.kafka.clients.MetadataTest > testUpdateWithNeedMetadataForAllTopics 
PASSED

org.apache.kafka.clients.MetadataTest > testMetadata PASSED

org.apache.kafka.clients.MetadataTest > testListenerGetsNotifiedOfUpdate PASSED

org.apache.kafka.common.serialization.SerializationTest > testStringSerializer 
PASSED

org.apache.kafka.common.serialization.SerializationTest > testIntegerSerializer 
PASSED

org.apache.kafka.common.config.AbstractConfigTest > testOriginalsWithPrefix 
PASSED

org.apache.kafka.common.config.AbstractConfigTest > testConfiguredInstances 
PASSED

org.apache.kafka.common.config.ConfigDefTest > testBasicTypes PASSED

org.apache.kafka.common.config.ConfigDefTest > testNullDefault PASSED

org.apache.kafka.common.config.ConfigDefTest > testInvalidDefaultRange PASSED

org.apache.kafka.common.config.ConfigDefTest > testValidators PASSED

org.apache.kafka.common.config.ConfigDefTest > testInvalidDefaultString PASSED

org.apache.kafka.common.config.ConfigDefTest > testSslPasswords PASSED

org.apache.kafka.common.config.ConfigDefTest > testInvalidDefault PASSED

org.apache.kafka.common.config.ConfigDefTest > testMissingRequired PASSED

org.apache.kafka.common.config.ConfigDefTest > testDefinedTwice PASSED

org.apache.kafka.common.config.ConfigDefTest > testBadInputs PASSED

org.apache.kafka.common.protocol.types.ProtocolSerializationTest > testArray 
PASSED

org.apache.kafka.common.protocol.types.ProtocolSerializationTest > testNulls 
PASSED

org.apache.kafka.common.protocol.types.ProtocolSerializationTest > testDefault 
PASSED

org.apache.kafka.common.protocol.types.ProtocolSerializationTest > testSimple 
PASSED

org.apache.kafka.common.requests.RequestResponseTest > testSerialization PASSED

org.apache.kafka.common.requests.RequestResponseTest > fetchResponseVersionTest 
PASSED

org.apache.kafka.common.requests.RequestResponseTest > 
produceResponseVersionTest PASSED

org.apache.kafka.common.requests.RequestResponseTest > 
testControlledShutdownResponse PASSED

org.apache.kafka.common.security.auth.KafkaPrincipalTest > 
testPrincipalNameCanContainSeparator PASSED

org.apache.kafka.common.security.auth.KafkaPrincipalTest > 
testEqualsAndHashCode PASSED

org.apache.kafka.common.security.kerberos.KerberosNameTest > testParse PASSED

org.apache.kafka.common.security.ssl.SslFactoryTest > testClientMode PASSED

org.apache.kafka.common.security.ssl.SslFactoryTest > 
testSslFactoryConfiguration PASSED

org.apache.kafka.common.metrics.MetricsTest > testSimpleStats PASSED

org.apache.kafka.common.metrics.MetricsTest > testOldDataHasNoEffect PASSED

org.apache.kafka.common.metrics.MetricsTest > testQuotasEquality PASSED

org.apache.kafka.common.metrics.MetricsTest > testRemoveInactiveMetrics PASSED

org.apache.kafka.common.metrics.MetricsTest > testMetricName PASSED

org.apache.kafka.common.metrics.MetricsTest > testRateWindowing PASSED

org.apache.kafka.common.metrics.MetricsTest > testTimeWindowing PASSED

org.apache.kafka.common.metrics.MetricsTest > testEventWindowing PASSED

org.apache.kafka.common.metrics.MetricsTest > testRemoveMetric PASSED

org.apache.kafka.common.metrics.MetricsTest > testBadSensorHierarchy PASSED

org.apache.kafka.common.metrics.MetricsTest > testRemoveSensor PASSED

org.apache.kafka.common.metrics.MetricsTest > testPercentiles PASSED

org.apache.kafka.common.metrics.MetricsTest > testDuplicateMetricName PASSED

org.apache.kafka.common.metrics.MetricsTest > testQuotas PASSED

org.apache.kafka.common.metrics.MetricsTest > testHierarchicalSensors PASSED

org.apache.kafka.common.metrics.JmxReporterTest > testJmxRegistration PASSED

org.apache.kafka.common.metrics.stats.HistogramTest > testHistogram PASSED

org.apache.kafka.common.metrics.stats.HistogramTest > testConstantBinScheme 
PASSED

org.apache.kafka.common.metrics.stats.HistogramTest > testLinearBinScheme PASSED

org.apache.kafka.common.utils.CrcTest > testUpdateInt PASSED

org.apache.kafka.common.utils.CrcTest > testUpdate PASSED

org.apache.kafka.common.utils.UtilsTest > testAbs PASSED

org.apache.kafka.common.utils.UtilsTest > testMin PASSED

org.apache.kafka.common.utils.UtilsTest > testJoin PASSED


Build failed in Jenkins: kafka-trunk-jdk7 #934

2016-01-04 Thread Apache Jenkins Server
See 

Changes:

[junrao] KAFKA-3051 KAFKA-3048; Security config docs improvements

--
Started by an SCM change
[EnvInject] - Loading node environment variables.
Building remotely on ubuntu-6 (docker Ubuntu ubuntu) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url 
 > https://git-wip-us.apache.org/repos/asf/kafka.git # timeout=10
Fetching upstream changes from https://git-wip-us.apache.org/repos/asf/kafka.git
 > git --version # timeout=10
 > git -c core.askpass=true fetch --tags --progress 
 > https://git-wip-us.apache.org/repos/asf/kafka.git 
 > +refs/heads/*:refs/remotes/origin/*
 > git rev-parse refs/remotes/origin/trunk^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/trunk^{commit} # timeout=10
Checking out Revision 57df460f8d7b225509dd8c061a5b6efa65c8ac9c 
(refs/remotes/origin/trunk)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 57df460f8d7b225509dd8c061a5b6efa65c8ac9c
 > git rev-list b93f48f7494e1db4d564b6c28772712ee7681620 # timeout=10
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
Setting 
JDK_1_7U51_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk-1.7u51
[kafka-trunk-jdk7] $ /bin/bash -xe /tmp/hudson4648177113352469578.sh
+ 
/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2/bin/gradle
To honour the JVM settings for this build a new JVM will be forked. Please 
consider using the daemon: 
http://gradle.org/docs/2.4-rc-2/userguide/gradle_daemon.html.
Building project 'core' with Scala version 2.10.6
:downloadWrapper

BUILD SUCCESSFUL

Total time: 11.95 secs
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
Setting 
JDK_1_7U51_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk-1.7u51
[kafka-trunk-jdk7] $ /bin/bash -xe /tmp/hudson4439769173402714160.sh
+ export GRADLE_OPTS=-Xmx1024m
+ GRADLE_OPTS=-Xmx1024m
+ ./gradlew -Dorg.gradle.project.maxParallelForks=1 clean jarAll testAll
To honour the JVM settings for this build a new JVM will be forked. Please 
consider using the daemon: 
https://docs.gradle.org/2.10/userguide/gradle_daemon.html.
Building project 'core' with Scala version 2.10.6
:clean UP-TO-DATE
:clients:clean
:connect:clean UP-TO-DATE
:core:clean
:examples:clean
:log4j-appender:clean
:streams:clean
:tools:clean
:connect:api:clean
:connect:file:clean
:connect:json:clean
:connect:runtime:clean
:jar_core_2_10
Building project 'core' with Scala version 2.10.6
:kafka-trunk-jdk7:clients:compileJavaNote: 

 uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.

:kafka-trunk-jdk7:clients:processResources UP-TO-DATE
:kafka-trunk-jdk7:clients:classes
:kafka-trunk-jdk7:clients:determineCommitId UP-TO-DATE
:kafka-trunk-jdk7:clients:createVersionFile
:kafka-trunk-jdk7:clients:jar
:kafka-trunk-jdk7:core:compileJava UP-TO-DATE
:kafka-trunk-jdk7:core:compileScala
:79:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.

org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP
 ^
:36:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.
 commitTimestamp: Long = 
org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP,

  ^
:37:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.
 expireTimestamp: Long = 
org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP) {

  ^
:394:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 

[jira] [Created] (KAFKA-3062) Read from kafka replication to get data likely Version based

2016-01-04 Thread xingang (JIRA)
xingang created KAFKA-3062:
--

 Summary: Read from kafka replication to get data likely Version 
based
 Key: KAFKA-3062
 URL: https://issues.apache.org/jira/browse/KAFKA-3062
 Project: Kafka
  Issue Type: Improvement
Reporter: xingang


Since Kafka requires all reads to go to the leader for consistency, consumers 
that are not latency-sensitive but are data-loss sensitive end up reading old 
data from the leader's disk whenever they lag, polluting the page cache used by 
the latency-sensitive consumers.

If reads could also be served from replicas, such consumers could fetch their 
data there instead and leave the leader's page cache intact.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3016) Add KStream-KStream window joins

2016-01-04 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15082117#comment-15082117
 ] 

Guozhang Wang commented on KAFKA-3016:
--

[~yasuhiro.matsuda] I'm leaving this ticket as in progress for the second phase 
PR.

> Add KStream-KStream window joins
> 
>
> Key: KAFKA-3016
> URL: https://issues.apache.org/jira/browse/KAFKA-3016
> Project: Kafka
>  Issue Type: Sub-task
>  Components: kafka streams
>Affects Versions: 0.9.0.1
>Reporter: Yasuhiro Matsuda
>Assignee: Yasuhiro Matsuda
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3016) Add KStream-KStream window joins

2016-01-04 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-3016:
-
Status: In Progress  (was: Patch Available)

> Add KStream-KStream window joins
> 
>
> Key: KAFKA-3016
> URL: https://issues.apache.org/jira/browse/KAFKA-3016
> Project: Kafka
>  Issue Type: Sub-task
>  Components: kafka streams
>Affects Versions: 0.9.0.1
>Reporter: Yasuhiro Matsuda
>Assignee: Yasuhiro Matsuda
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (KAFKA-3045) ZkNodeChangeNotificationListener shouldn't log interrupted exception as error

2016-01-04 Thread Dong Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dong Lin reassigned KAFKA-3045:
---

Assignee: Dong Lin

> ZkNodeChangeNotificationListener shouldn't log interrupted exception as error
> -
>
> Key: KAFKA-3045
> URL: https://issues.apache.org/jira/browse/KAFKA-3045
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 0.9.0.0
>Reporter: Jun Rao
>Assignee: Dong Lin
>  Labels: security
> Fix For: 0.9.0.1
>
>
> Saw the following when running /opt/kafka/bin/kafka-acls.sh --authorizer 
> kafka.security.auth.SimpleAclAuthorizer.
> [2015-12-28 08:04:39,382] ERROR Error processing notification change for path 
> = /kafka-acl-changes and notification= [acl_changes_04, 
> acl_changes_03, acl_changes_02, acl_changes_01, 
> acl_changes_00] : (kafka.common.ZkNodeChangeNotificationListener)
> org.I0Itec.zkclient.exception.ZkInterruptedException: 
> java.lang.InterruptedException
> at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:997)
> at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:1090)
> at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:1085)
> at kafka.utils.ZkUtils.readDataMaybeNull(ZkUtils.scala:525)
> at 
> kafka.security.auth.SimpleAclAuthorizer.kafka$security$auth$SimpleAclAuthorizer$$getAclsFromZk(SimpleAclAuthorizer.scala:213)
> at 
> kafka.security.auth.SimpleAclAuthorizer$AclChangedNotificaitonHandler$.processNotification(SimpleAclAuthorizer.scala:273)
> at 
> kafka.common.ZkNodeChangeNotificationListener$$anonfun$kafka$common$ZkNodeChangeNotificationListener$$processNotifications$2$$anonfun$apply$2.apply(ZkNodeChangeNotificationListener.scala:84)
> at 
> kafka.common.ZkNodeChangeNotificationListener$$anonfun$kafka$common$ZkNodeChangeNotificationListener$$processNotifications$2$$anonfun$apply$2.apply(ZkNodeChangeNotificationListener.scala:84)
> at scala.Option.map(Option.scala:146)
> at 
> kafka.common.ZkNodeChangeNotificationListener$$anonfun$kafka$common$ZkNodeChangeNotificationListener$$processNotifications$2.apply(ZkNodeChangeNotificationListener.scala:84)
> at 
> kafka.common.ZkNodeChangeNotificationListener$$anonfun$kafka$common$ZkNodeChangeNotificationListener$$processNotifications$2.apply(ZkNodeChangeNotificationListener.scala:79)
> at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
> at 
> kafka.common.ZkNodeChangeNotificationListener.kafka$common$ZkNodeChangeNotificationListener$$processNotifications(ZkNodeChangeNotificationListener.scala:79)
> at 
> kafka.common.ZkNodeChangeNotificationListener$NodeChangeListener$.handleChildChange(ZkNodeChangeNotificationListener.scala:121)
> at org.I0Itec.zkclient.ZkClient$10.run(ZkClient.java:842)
> at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71)
> Caused by: java.lang.InterruptedException
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:502)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1342)
> at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1153)
> at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1184)
> at org.I0Itec.zkclient.ZkConnection.readData(ZkConnection.java:119)
> at org.I0Itec.zkclient.ZkClient$12.call(ZkClient.java:1094)
> at org.I0Itec.zkclient.ZkClient$12.call(ZkClient.java:1090)
> at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:985)
> When SimpleAclAuthorizer terminates, we close zkclient, which interrupts the 
> watcher processor thread. Since this is expected, we shouldn't log this as an 
> error.
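
In other words, an interruption caused by our own shutdown of the ZkClient is expected and should be demoted from error to debug. A small sketch of that pattern, with invented helper names (the actual listener is Scala code in core):

    import org.I0Itec.zkclient.exception.ZkInterruptedException;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    public class NotificationProcessor {
        private static final Logger log = LoggerFactory.getLogger(NotificationProcessor.class);

        // Illustrative stand-in for the work done when a change notification fires.
        void onNotification(Runnable readAclsFromZk) {
            try {
                readAclsFromZk.run();
            } catch (ZkInterruptedException e) {
                // Expected when the ZkClient is closed during shutdown; not an error.
                log.debug("Ignoring ZooKeeper interruption during shutdown", e);
            } catch (RuntimeException e) {
                log.error("Error processing notification change", e);
            }
        }
    }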



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-3045; ZkNodeChangeNotificationListener s...

2016-01-04 Thread lindong28
GitHub user lindong28 opened a pull request:

https://github.com/apache/kafka/pull/731

KAFKA-3045; ZkNodeChangeNotificationListener shouldn't log 
InterruptedException as error



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lindong28/kafka KAFKA-3045

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/731.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #731


commit c140a114c4d86ef5ecc408f38a4640f3e8cef5f3
Author: Dong Lin 
Date:   2016-01-05T02:23:30Z

KAFKA-3045; ZkNodeChangeNotificationListener shouldn't log interrupted 
exception as error




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (KAFKA-3016) Add KStream-KStream window joins

2016-01-04 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-3016:
-
Status: Patch Available  (was: Open)

> Add KStream-KStream window joins
> 
>
> Key: KAFKA-3016
> URL: https://issues.apache.org/jira/browse/KAFKA-3016
> Project: Kafka
>  Issue Type: Sub-task
>  Components: kafka streams
>Affects Versions: 0.9.0.1
>Reporter: Yasuhiro Matsuda
>Assignee: Yasuhiro Matsuda
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-2653: Alternative Kafka Streams Stateful...

2016-01-04 Thread guozhangwang
GitHub user guozhangwang opened a pull request:

https://github.com/apache/kafka/pull/730

KAFKA-2653: Alternative Kafka Streams Stateful API Design

ping @ymatsuda for reviews.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/guozhangwang/kafka K2653r

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/730.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #730


commit e46d649c2e40078ed161c83fdc1690456f09f43a
Author: Guozhang Wang 
Date:   2015-12-10T04:31:25Z

v1

commit 2167f29ff630577fe63abc93fd8a58aa6c7d3c1c
Author: Guozhang Wang 
Date:   2015-12-10T19:32:34Z

option 1 of windowing opeartions

commit fb92b2b20f7be6f17c006de6e48cb04065808477
Author: Guozhang Wang 
Date:   2015-12-11T05:47:51Z

v1

commit 0862ec2b4ecb151ea1b3395c74787e4de99891fe
Author: Guozhang Wang 
Date:   2015-12-11T22:15:02Z

v1

commit 9558891bdaccc0b8861f882b957b5131556f896c
Author: Guozhang Wang 
Date:   2015-12-15T00:30:20Z

address Yasu's comments

commit e6373cbc4229637100c97bbb440555c2f0719d03
Author: Guozhang Wang 
Date:   2015-12-15T01:50:17Z

add built-in aggregates

commit 66e122adc8911334e924921bc7fa67275445bd71
Author: Guozhang Wang 
Date:   2015-12-15T03:17:59Z

add built-in aggregates in KTable

commit 13c15ada1edbff51e34022484bcde3955cdf99cd
Author: Guozhang Wang 
Date:   2015-12-15T19:28:12Z

Merge branch 'trunk' of https://git-wip-us.apache.org/repos/asf/kafka into 
K2653r

commit 1f360a25022d0286f6ebbf1a6735201ba8fdab53
Author: Guozhang Wang 
Date:   2015-12-15T19:43:53Z

address Yasu's comments

commit 2b027bf8614026cbec05404dffd5e9c2598db6f4
Author: Guozhang Wang 
Date:   2015-12-15T20:58:11Z

add missing files

commit 5214b12fcd66eb4cfa9af4258ca2146c11aa2e89
Author: Guozhang Wang 
Date:   2015-12-15T23:11:27Z

address Yasu's comments

commit a603a9afde8a86906d085b6cf942df67d2082fb9
Author: Guozhang Wang 
Date:   2015-12-15T23:15:29Z

rename aggregateSupplier to aggregatorSupplier

commit e186710bc3b66e88148ab81087276cedffa2bad3
Author: Guozhang Wang 
Date:   2015-12-16T22:20:59Z

modify built-in aggregates

commit 5bb1e8c95e0c1ab131d5212d1a7d793ce8b49414
Author: Guozhang Wang 
Date:   2015-12-16T22:24:10Z

add missing files

commit 4570dd0d98526f8388c13ef5fe4af12d372f73c6
Author: Guozhang Wang 
Date:   2015-12-17T00:01:58Z

further comments addressed

commit d1ce4b85eff2364a34593cefbe0e8db4070d7fb9
Author: Guozhang Wang 
Date:   2016-01-05T00:53:43Z

Merge branch 'trunk' of https://git-wip-us.apache.org/repos/asf/kafka into 
K2653r

commit 13b6b995dff5fef42690852cd5cbb1d3f3b589d4
Author: Guozhang Wang 
Date:   2016-01-05T01:34:41Z

revision v1




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-2653) Stateful operations in the KStream DSL layer

2016-01-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15082183#comment-15082183
 ] 

ASF GitHub Bot commented on KAFKA-2653:
---

GitHub user guozhangwang opened a pull request:

https://github.com/apache/kafka/pull/730

KAFKA-2653: Alternative Kafka Streams Stateful API Design

ping @ymatsuda for reviews.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/guozhangwang/kafka K2653r

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/730.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #730


commit e46d649c2e40078ed161c83fdc1690456f09f43a
Author: Guozhang Wang 
Date:   2015-12-10T04:31:25Z

v1

commit 2167f29ff630577fe63abc93fd8a58aa6c7d3c1c
Author: Guozhang Wang 
Date:   2015-12-10T19:32:34Z

option 1 of windowing opeartions

commit fb92b2b20f7be6f17c006de6e48cb04065808477
Author: Guozhang Wang 
Date:   2015-12-11T05:47:51Z

v1

commit 0862ec2b4ecb151ea1b3395c74787e4de99891fe
Author: Guozhang Wang 
Date:   2015-12-11T22:15:02Z

v1

commit 9558891bdaccc0b8861f882b957b5131556f896c
Author: Guozhang Wang 
Date:   2015-12-15T00:30:20Z

address Yasu's comments

commit e6373cbc4229637100c97bbb440555c2f0719d03
Author: Guozhang Wang 
Date:   2015-12-15T01:50:17Z

add built-in aggregates

commit 66e122adc8911334e924921bc7fa67275445bd71
Author: Guozhang Wang 
Date:   2015-12-15T03:17:59Z

add built-in aggregates in KTable

commit 13c15ada1edbff51e34022484bcde3955cdf99cd
Author: Guozhang Wang 
Date:   2015-12-15T19:28:12Z

Merge branch 'trunk' of https://git-wip-us.apache.org/repos/asf/kafka into 
K2653r

commit 1f360a25022d0286f6ebbf1a6735201ba8fdab53
Author: Guozhang Wang 
Date:   2015-12-15T19:43:53Z

address Yasu's comments

commit 2b027bf8614026cbec05404dffd5e9c2598db6f4
Author: Guozhang Wang 
Date:   2015-12-15T20:58:11Z

add missing files

commit 5214b12fcd66eb4cfa9af4258ca2146c11aa2e89
Author: Guozhang Wang 
Date:   2015-12-15T23:11:27Z

address Yasu's comments

commit a603a9afde8a86906d085b6cf942df67d2082fb9
Author: Guozhang Wang 
Date:   2015-12-15T23:15:29Z

rename aggregateSupplier to aggregatorSupplier

commit e186710bc3b66e88148ab81087276cedffa2bad3
Author: Guozhang Wang 
Date:   2015-12-16T22:20:59Z

modify built-in aggregates

commit 5bb1e8c95e0c1ab131d5212d1a7d793ce8b49414
Author: Guozhang Wang 
Date:   2015-12-16T22:24:10Z

add missing files

commit 4570dd0d98526f8388c13ef5fe4af12d372f73c6
Author: Guozhang Wang 
Date:   2015-12-17T00:01:58Z

further comments addressed

commit d1ce4b85eff2364a34593cefbe0e8db4070d7fb9
Author: Guozhang Wang 
Date:   2016-01-05T00:53:43Z

Merge branch 'trunk' of https://git-wip-us.apache.org/repos/asf/kafka into 
K2653r

commit 13b6b995dff5fef42690852cd5cbb1d3f3b589d4
Author: Guozhang Wang 
Date:   2016-01-05T01:34:41Z

revision v1




> Stateful operations in the KStream DSL layer
> 
>
> Key: KAFKA-2653
> URL: https://issues.apache.org/jira/browse/KAFKA-2653
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
>
> This includes the interface design the implementation for stateful operations 
> including:
> 0. table representation in KStream.
> 1. stream-stream join.
> 2. stream-table join.
> 3. table-table join.
> 4. stream / table aggregations.
> With 0 and 3 being tackled in KAFKA-2856 and KAFKA-2962 separately, this 
> ticket is going to only focus on windowing definition and 1 / 2 / 4 above.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Build failed in Jenkins: kafka-trunk-jdk8 #266

2016-01-04 Thread Apache Jenkins Server
See 

Changes:

[wangguoz] KAFKA-3016: phase-1. A local store for join window

--
Started by an SCM change
[EnvInject] - Loading node environment variables.
Building remotely on ubuntu-2 (docker Ubuntu ubuntu) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url 
 > https://git-wip-us.apache.org/repos/asf/kafka.git # timeout=10
Fetching upstream changes from https://git-wip-us.apache.org/repos/asf/kafka.git
 > git --version # timeout=10
 > git -c core.askpass=true fetch --tags --progress 
 > https://git-wip-us.apache.org/repos/asf/kafka.git 
 > +refs/heads/*:refs/remotes/origin/*
 > git rev-parse refs/remotes/origin/trunk^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/trunk^{commit} # timeout=10
Checking out Revision b0b3e5aebf381faf81bd9454ef7b448e2ad922c7 
(refs/remotes/origin/trunk)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f b0b3e5aebf381faf81bd9454ef7b448e2ad922c7
 > git rev-list 57df460f8d7b225509dd8c061a5b6efa65c8ac9c # timeout=10
Setting 
JDK1_8_0_45_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk1.8.0_45
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
[kafka-trunk-jdk8] $ /bin/bash -xe /tmp/hudson2400726089512515786.sh
+ 
/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2/bin/gradle
To honour the JVM settings for this build a new JVM will be forked. Please 
consider using the daemon: 
http://gradle.org/docs/2.4-rc-2/userguide/gradle_daemon.html.
Building project 'core' with Scala version 2.10.6
:downloadWrapper

BUILD SUCCESSFUL

Total time: 8.462 secs
Setting 
JDK1_8_0_45_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk1.8.0_45
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
[kafka-trunk-jdk8] $ /bin/bash -xe /tmp/hudson7309766309063465237.sh
+ export GRADLE_OPTS=-Xmx1024m
+ GRADLE_OPTS=-Xmx1024m
+ ./gradlew -Dorg.gradle.project.maxParallelForks=1 clean jarAll testAll
To honour the JVM settings for this build a new JVM will be forked. Please 
consider using the daemon: 
https://docs.gradle.org/2.10/userguide/gradle_daemon.html.
Building project 'core' with Scala version 2.10.6
:clean UP-TO-DATE
:clients:clean
:connect:clean UP-TO-DATE
:core:clean
:examples:clean
:log4j-appender:clean
:streams:clean
:tools:clean
:connect:api:clean
:connect:file:clean
:connect:json:clean
:connect:runtime:clean
:jar_core_2_10
Building project 'core' with Scala version 2.10.6
:kafka-trunk-jdk8:clients:compileJavawarning: [options] bootstrap class path 
not set in conjunction with -source 1.7
Note: 

 uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
1 warning

:jar_core_2_10 FAILED

FAILURE: Build failed with an exception.

* What went wrong:
Could not add entry 
'
 to cache fileHashes.bin 
(
> Corrupted FreeListBlock 674085 found in cache 
> '

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug 
option to get more log output.

BUILD FAILED

Total time: 13.577 secs
Build step 'Execute shell' marked build as failure
Recording test results
Setting 
JDK1_8_0_45_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk1.8.0_45
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
ERROR: Publisher 'Publish JUnit test result report' failed: No test report 
files were found. Configuration error?
Setting 
JDK1_8_0_45_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk1.8.0_45
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2


[jira] [Commented] (KAFKA-3064) Improve resync method to waste less time and data transfer

2016-01-04 Thread Dong Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15082367#comment-15082367
 ] 

Dong Lin commented on KAFKA-3064:
-

[~Skandragon], can you clarify why the consumer would throw away data, w.r.t. 
"...just to toss it away because the source broker has discarded it already"?

By "source broker has discarded", do you mean the data has expired on the broker? 
If so, what is the data expiration configuration used in your setup?

> Improve resync method to waste less time and data transfer
> --
>
> Key: KAFKA-3064
> URL: https://issues.apache.org/jira/browse/KAFKA-3064
> Project: Kafka
>  Issue Type: Improvement
>  Components: controller, network
>Reporter: Michael Graff
>Assignee: Neha Narkhede
>
> We have several topics which are large (65 GB per partition) with 12 
> partitions.  Data rates into each topic vary, but in general each one has its 
> own rate.
> After a RAID rebuild, we are pulling all the data over to the newly rebuilt 
> RAID.  This takes forever, and has yet to complete after nearly 8 hours.
> Here are my observations:
> (1)  The Kafka broker seems to pull from all topics on all partitions at the 
> same time, starting at the oldest message.
> (2)  When you divide total disk bandwidth available across all partitions 
> (really, only 48 of which have significant amounts of data, about 65 GB each 
> topic) the ingest rate of 36 out of 48 of them is higher than the available 
> bandwidth.
> (3)  The effect of (2) is that one topic SLOWLY catches up, while the other 4 
> topics continue to retrieve data at 75% of the bandwidth, just to toss it 
> away because the source broker has discarded it already.
> (4)  Eventually that one topic catches up, and the remaining bandwidth is 
> then divided into the remaining 36 partitions, one group of which starts to 
> catch up again.
> What I want to see is a way to say “don’t transfer more than X partitions at 
> the same time” and ideally a priority rule that says, “Transfer partitions 
> you are responsible for first, then transfer ones you are not.  Also, 
> transfer these first, then those, but no more than 1 topic at a time”
> What I REALLY want is for Kafka to track the new data (track the head of the 
> log) and then ask for the tail in chunks.  Ideally this would request from 
> the source, “what is the next logical older starting point?” and then start 
> there.  This way, the transfer basically becomes a file transfer of the log 
> stored on the source’s disk. Once that block is retrieved, it moves on to the 
> next oldest.  This way, there is almost zero waste as both the head and tail 
> grow, but the tail runs the risk of losing the final chunk only.  Thus, 
> bandwidth is not significantly wasted.
> All this changes the ISR check to be “am I caught up on head AND tail?”, 
> whereas right now the tail part is only implied.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3064) Improve resync method to waste less time and data transfer

2016-01-04 Thread Michael Graff (JIRA)
Michael Graff created KAFKA-3064:


 Summary: Improve resync method to waste less time and data transfer
 Key: KAFKA-3064
 URL: https://issues.apache.org/jira/browse/KAFKA-3064
 Project: Kafka
  Issue Type: Improvement
  Components: controller, network
Reporter: Michael Graff
Assignee: Neha Narkhede


We have several topics which are large (65 GB per partition) with 12 
partitions.  Data rates into each topic vary, but in general each one has its 
own rate.

After a RAID rebuild, we are pulling all the data over to the newly rebuilt 
RAID.  This takes forever, and has yet to complete after nearly 8 hours.

Here are my observations:

(1)  The Kafka broker seems to pull from all topics on all partitions at the 
same time, starting at the oldest message.

(2)  When you divide total disk bandwidth available across all partitions 
(really, only 48 of which have significant amounts of data, about 65 GB each 
topic) the ingest rate of 36 out of 48 of them is higher than the available 
bandwidth.

(3)  The effect of (2) is that one topic SLOWLY catches up, while the other 4 
topics continue to retrieve data at 75% of the bandwidth, just to toss it away 
because the source broker has discarded it already.

(4)  Eventually that one topic catches up, and the remaining bandwidth is then 
divided into the remaining 36 partitions, one group of which starts to catch up 
again.

What I want to see is a way to say “don’t transfer more than X partitions at 
the same time” and ideally a priority rule that says, “Transfer partitions you 
are responsible for first, then transfer ones you are not.  Also, transfer 
these first, then those, but no more than 1 topic at a time”

What I REALLY want is for Kafka to track the new data (track the head of the 
log) and then ask for the tail in chunks.  Ideally this would request from the 
source, “what is the next logical older starting point?” and then start there.  
This way, the transfer basically becomes a file transfer of the log stored on 
the source’s disk. Once that block is retrieved, it moves on to the next 
oldest.  This way, there is almost zero waste as both the head and tail grow, 
but the tail runs the risk of losing the final chunk only.  Thus, bandwidth is 
not significantly wasted.

All this changes the ISR check to be “am I caught up on head AND tail?”, whereas 
right now the tail part is only implied.
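
Purely as an illustration of the policy being requested here (hypothetical pseudocode, 
not an existing Kafka API, configuration or patch; ReplicaTransfer is a made-up handle), 
the "no more than X partitions at a time, own partitions first" rule could look 
something like this:

{code}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical pseudocode only: cap how many partitions resync at once,
// preferring partitions this broker is responsible for. Not a Kafka API.
public class BoundedResyncScheduler {

    public interface ReplicaTransfer {      // made-up handle for one partition's catch-up
        boolean isOwnPartition();
        boolean isCaughtUp();
        void fetchNextOldestChunk();        // pull the next oldest chunk of the log
    }

    private final int maxConcurrentTransfers;

    public BoundedResyncScheduler(int maxConcurrentTransfers) {
        this.maxConcurrentTransfers = maxConcurrentTransfers;
    }

    /** One scheduling pass: only the first maxConcurrentTransfers lagging
     *  partitions (own partitions first) get to transfer data. */
    public void runOnce(List<ReplicaTransfer> transfers) {
        List<ReplicaTransfer> lagging = new ArrayList<>();
        for (ReplicaTransfer t : transfers)
            if (!t.isCaughtUp())
                lagging.add(t);
        // false sorts before true, so own partitions come first
        lagging.sort(Comparator.comparing((ReplicaTransfer t) -> !t.isOwnPartition()));
        int active = Math.min(maxConcurrentTransfers, lagging.size());
        for (int i = 0; i < active; i++)
            lagging.get(i).fetchNextOldestChunk();
    }
}
{code}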




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: MINOR: ignore subproject .gitignore file for e...

2016-01-04 Thread vesense
Github user vesense closed the pull request at:

https://github.com/apache/kafka/pull/389


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Comment Edited] (KAFKA-3064) Improve resync method to waste less time and data transfer

2016-01-04 Thread Dong Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15082367#comment-15082367
 ] 

Dong Lin edited comment on KAFKA-3064 at 1/5/16 4:44 AM:
-

[~Skandragon], can you clarify why the consumer would throw away data, w.r.t. 
"...just to toss it away because the source broker has discarded it already"?

By "source broker has discarded", do you mean the data has expired on the broker? 
If so, what are the log.retention.* settings configured on your broker?


was (Author: lindong):
[~Skandragon], can you clarify why consumer would throw away data, w.r.t. 
"...just to toss it away because the source broker has discarded it already"?

By "source broker has discarded", do you mean data has expired on the broker? 
If so, what is data expiration configuration used in your setup?

> Improve resync method to waste less time and data transfer
> --
>
> Key: KAFKA-3064
> URL: https://issues.apache.org/jira/browse/KAFKA-3064
> Project: Kafka
>  Issue Type: Improvement
>  Components: controller, network
>Reporter: Michael Graff
>Assignee: Neha Narkhede
>
> We have several topics which are large (65 GB per partition) with 12 
> partitions.  Data rates into each topic vary, but in general each one has its 
> own rate.
> After a RAID rebuild, we are pulling all the data over to the newly rebuilt 
> RAID.  This takes forever, and has yet to complete after nearly 8 hours.
> Here are my observations:
> (1)  The Kafka broker seems to pull from all topics on all partitions at the 
> same time, starting at the oldest message.
> (2)  When you divide total disk bandwidth available across all partitions 
> (really, only 48 of which have significant amounts of data, about 65 GB each 
> topic) the ingest rate of 36 out of 48 of them is higher than the available 
> bandwidth.
> (3)  The effect of (2) is that one topic SLOWLY catches up, while the other 4 
> topics continue to retrieve data at 75% of the bandwidth, just to toss it 
> away because the source broker has discarded it already.
> (4)  Eventually that one topic catches up, and the remaining bandwidth is 
> then divided into the remaining 36 partitions, one group of which starts to 
> catch up again.
> What I want to see is a way to say “don’t transfer more than X partitions at 
> the same time” and ideally a priority rule that says, “Transfer partitions 
> you are responsible for first, then transfer ones you are not.  Also, 
> transfer these first, then those, but no more than 1 topic at a time”
> What I REALLY want is for Kafka to track the new data (track the head of the 
> log) and then ask for the tail in chunks.  Ideally this would request from 
> the source, “what is the next logical older starting point?” and then start 
> there.  This way, the transfer basically becomes a file transfer of the log 
> stored on the source’s disk. Once that block is retrieved, it moves on to the 
> next oldest.  This way, there is almost zero waste as both the head and tail 
> grow, but the tail runs the risk of losing the final chunk only.  Thus, 
> bandwidth is not significantly wasted.
> All this changes the ISR check to be “am I caught up on head AND tail?”, 
> whereas right now the tail part is only implied.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3057) "Checking consumer position" docs are referencing (only) deprecated ConsumerOffsetChecker

2016-01-04 Thread Jens Rantil (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081104#comment-15081104
 ] 

Jens Rantil commented on KAFKA-3057:


Also, should the documentation cover how to check the consumer group 
offset programmatically?

> "Checking consumer position" docs are referencing (only) deprecated 
> ConsumerOffsetChecker
> -
>
> Key: KAFKA-3057
> URL: https://issues.apache.org/jira/browse/KAFKA-3057
> Project: Kafka
>  Issue Type: Bug
>  Components: admin, website
>Affects Versions: 0.9.0.0
>Reporter: Stevo Slavic
>Priority: Trivial
>
> ["Checking consumer position" operations 
> instructions|http://kafka.apache.org/090/documentation.html#basic_ops_consumer_lag]
>  are referencing only ConsumerOffsetChecker which is mentioned as deprecated 
> in [Potential breaking changes in 
> 0.9.0.0|http://kafka.apache.org/documentation.html#upgrade_9_breaking]
> Please consider updating docs with new ways for checking consumer position, 
> covering differences between old and new way, and recommendation which one is 
> preferred and why.
> Would be nice to document (and support if not already available), not only 
> how to read/fetch/check consumer (group) offset, but also how to set offset 
> for consumer group using Kafka's operations tools.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3057) "Checking consumer position" docs are referencing (only) deprecated ConsumerOffsetChecker

2016-01-04 Thread Stevo Slavic (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081128#comment-15081128
 ] 

Stevo Slavic commented on KAFKA-3057:
-

[~ztyx] that is already covered on the wiki page: 
https://cwiki.apache.org/confluence/display/KAFKA/Committing+and+fetching+consumer+offsets+in+Kafka
Not sure if it needs updating with any 0.9.0.x changes.
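
For the programmatic side with the new (0.9) consumer, a minimal sketch of reading a 
group's committed offset and lag; the bootstrap server, group, topic and partition 
below are placeholders:

{code}
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class OffsetCheck {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // placeholder
        props.put("group.id", "my-group");                   // placeholder
        props.put("key.deserializer",
            "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("value.deserializer",
            "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        TopicPartition tp = new TopicPartition("my-topic", 0); // placeholder
        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            consumer.assign(Collections.singletonList(tp));
            // Last committed offset for this group, or null if nothing was committed yet.
            OffsetAndMetadata committed = consumer.committed(tp);
            // Log end offset: seek to the end and ask for the position.
            consumer.seekToEnd(tp);
            long logEnd = consumer.position(tp);
            long lag = committed == null ? logEnd : logEnd - committed.offset();
            System.out.println("committed=" + committed + " logEnd=" + logEnd + " lag=" + lag);
        }
    }
}
{code}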

> "Checking consumer position" docs are referencing (only) deprecated 
> ConsumerOffsetChecker
> -
>
> Key: KAFKA-3057
> URL: https://issues.apache.org/jira/browse/KAFKA-3057
> Project: Kafka
>  Issue Type: Bug
>  Components: admin, website
>Affects Versions: 0.9.0.0
>Reporter: Stevo Slavic
>Priority: Trivial
>
> ["Checking consumer position" operations 
> instructions|http://kafka.apache.org/090/documentation.html#basic_ops_consumer_lag]
>  are referencing only ConsumerOffsetChecker which is mentioned as deprecated 
> in [Potential breaking changes in 
> 0.9.0.0|http://kafka.apache.org/documentation.html#upgrade_9_breaking]
> Please consider updating docs with new ways for checking consumer position, 
> covering differences between old and new way, and recommendation which one is 
> preferred and why.
> Would be nice to document (and support if not already available), not only 
> how to read/fetch/check consumer (group) offset, but also how to set offset 
> for consumer group using Kafka's operations tools.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: KIP-41: KafkaConsumer Max Records

2016-01-04 Thread Jens Rantil
Hi guys,

I realized I never thanked yall for your input - thanks!
Jason: I apologize for assuming your stance on the issue! Feels like we all
agreed on the solution. +1

Follow-up: Jason made a point about defining prefetch and fairness
behaviour in the KIP. I am now working on putting that down in writing. To
be able to do this, I think I need to understand the current prefetch
behaviour in the new consumer API (0.9) a bit better. Some specific
questions:

   - How does a specific consumer balance incoming messages from multiple
   partitions? Is the consumer simply issuing Multi-Fetch requests[1] for the
   consumed assigned partitions of the relevant topics? Or is the consumer
   fetching from one partition at a time and balancing between them
   internally? That is, is the responsibility of partition balancing (and
   fairness) on the broker side or consumer side?
   - Is the above documented somewhere?

[1]
https://cwiki.apache.org/confluence/display/KAFKA/Writing+a+Driver+for+Kafka,
see "Multi-Fetch".

Thanks,
Jens

On Wed, Dec 23, 2015 at 2:44 AM, Ismael Juma  wrote:

> On Wed, Dec 23, 2015 at 1:24 AM, Gwen Shapira  wrote:
>
> > Given the background, it sounds like you'll generally want each call to
> > poll() to return the same number of events (which is the number you
> planned
> > on having enough memory / time for). It also sounds like tuning the
> number
> > of events will be closely tied to tuning the session timeout. That is -
> if
> > I choose to lower the session timeout for some reason, I will have to
> > modify the number of records returning too.
> >
> > If those assumptions are correct, I think a configuration makes more
> sense.
> > 1. We are unlikely to want this parameter to be change at the lifetime of
> > the consumer
> > 2. The correct value is tied to another configuration parameter, so they
> > will be controlled together.
> >
>
> I was thinking the same thing.
>
> Ismael
>



-- 
Jens Rantil
Backend engineer
Tink AB

Email: jens.ran...@tink.se
Phone: +46 708 84 18 32
Web: www.tink.se

Facebook  Linkedin

 Twitter 
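
On the prefetch/fairness questions above: until something like a max-records setting 
exists, one application-level approximation (a sketch only, assuming the 0.9 
pause/resume API; the max, the poll timeout and the processing step are placeholders) 
is to buffer whatever poll() returns, pause the assignment while draining it in bounded 
chunks, and resume once the buffer is empty:

{code}
import java.util.ArrayDeque;
import java.util.Deque;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

// Sketch: cap the work done per loop iteration at maxRecords, without
// changing how many records poll() itself returns.
public class BoundedPollLoop {
    private final Deque<ConsumerRecord<byte[], byte[]>> buffer = new ArrayDeque<>();
    private final int maxRecords;

    public BoundedPollLoop(int maxRecords) {
        this.maxRecords = maxRecords;
    }

    public void runOnce(KafkaConsumer<byte[], byte[]> consumer) {
        if (buffer.isEmpty()) {
            // Nothing pending: fetch more, then pause until we have drained it.
            ConsumerRecords<byte[], byte[]> records = consumer.poll(100);
            for (ConsumerRecord<byte[], byte[]> r : records)
                buffer.add(r);
            consumer.pause(consumer.assignment().toArray(new TopicPartition[0]));
        }
        int processed = 0;
        while (!buffer.isEmpty() && processed < maxRecords) {
            process(buffer.poll());
            processed++;
        }
        if (buffer.isEmpty()) {
            consumer.resume(consumer.assignment().toArray(new TopicPartition[0]));
        } else {
            // Keep heartbeating via poll() while paused; no records are returned.
            consumer.poll(0);
        }
    }

    private void process(ConsumerRecord<byte[], byte[]> record) {
        // application-specific handling goes here
    }
}
{code}

The trade-off is that commits and heartbeats still ride on poll(), so the chunk size 
has to stay small relative to the session timeout.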


[jira] [Updated] (KAFKA-2072) Add StopReplica request/response to o.a.k.common.requests and replace the usage in core module

2016-01-04 Thread David Jacot (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Jacot updated KAFKA-2072:
---
Status: Patch Available  (was: In Progress)

> Add StopReplica request/response to o.a.k.common.requests and replace the 
> usage in core module
> --
>
> Key: KAFKA-2072
> URL: https://issues.apache.org/jira/browse/KAFKA-2072
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Gwen Shapira
>Assignee: David Jacot
>
> Add StopReplica request/response to o.a.k.common.requests and replace the 
> usage in core module



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-3058: remove the usage of deprecated con...

2016-01-04 Thread konradkalita
GitHub user konradkalita opened a pull request:

https://github.com/apache/kafka/pull/732

KAFKA-3058: remove the usage of deprecated config properties



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/konradkalita/kafka Kafka-3058

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/732.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #732


commit 197f9d868655901355bb650cea33b164cd03672e
Author: Konrad 
Date:   2016-01-05T07:45:45Z

Deprecated config properties deleted




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (KAFKA-2072) Add StopReplica request/response to o.a.k.common.requests and replace the usage in core module

2016-01-04 Thread David Jacot (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-2072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Jacot updated KAFKA-2072:
---
Status: In Progress  (was: Patch Available)

> Add StopReplica request/response to o.a.k.common.requests and replace the 
> usage in core module
> --
>
> Key: KAFKA-2072
> URL: https://issues.apache.org/jira/browse/KAFKA-2072
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Gwen Shapira
>Assignee: David Jacot
>
> Add StopReplica request/response to o.a.k.common.requests and replace the 
> usage in core module



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3045) ZkNodeChangeNotificationListener shouldn't log interrupted exception as error

2016-01-04 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081237#comment-15081237
 ] 

Ismael Juma commented on KAFKA-3045:


AclChangedNotificationHandler also has a similar issue. Both cases can be seen 
when running `AclCommandTest`.

> ZkNodeChangeNotificationListener shouldn't log interrupted exception as error
> -
>
> Key: KAFKA-3045
> URL: https://issues.apache.org/jira/browse/KAFKA-3045
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 0.9.0.0
>Reporter: Jun Rao
>  Labels: security
> Fix For: 0.9.0.1
>
>
> Saw the following when running /opt/kafka/bin/kafka-acls.sh --authorizer 
> kafka.security.auth.SimpleAclAuthorizer.
> [2015-12-28 08:04:39,382] ERROR Error processing notification change for path 
> = /kafka-acl-changes and notification= [acl_changes_04, 
> acl_changes_03, acl_changes_02, acl_changes_01, 
> acl_changes_00] : (kafka.common.ZkNodeChangeNotificationListener)
> org.I0Itec.zkclient.exception.ZkInterruptedException: 
> java.lang.InterruptedException
> at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:997)
> at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:1090)
> at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:1085)
> at kafka.utils.ZkUtils.readDataMaybeNull(ZkUtils.scala:525)
> at 
> kafka.security.auth.SimpleAclAuthorizer.kafka$security$auth$SimpleAclAuthorizer$$getAclsFromZk(SimpleAclAuthorizer.scala:213)
> at 
> kafka.security.auth.SimpleAclAuthorizer$AclChangedNotificaitonHandler$.processNotification(SimpleAclAuthorizer.scala:273)
> at 
> kafka.common.ZkNodeChangeNotificationListener$$anonfun$kafka$common$ZkNodeChangeNotificationListener$$processNotifications$2$$anonfun$apply$2.apply(ZkNodeChangeNotificationListener.scala:84)
> at 
> kafka.common.ZkNodeChangeNotificationListener$$anonfun$kafka$common$ZkNodeChangeNotificationListener$$processNotifications$2$$anonfun$apply$2.apply(ZkNodeChangeNotificationListener.scala:84)
> at scala.Option.map(Option.scala:146)
> at 
> kafka.common.ZkNodeChangeNotificationListener$$anonfun$kafka$common$ZkNodeChangeNotificationListener$$processNotifications$2.apply(ZkNodeChangeNotificationListener.scala:84)
> at 
> kafka.common.ZkNodeChangeNotificationListener$$anonfun$kafka$common$ZkNodeChangeNotificationListener$$processNotifications$2.apply(ZkNodeChangeNotificationListener.scala:79)
> at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
> at 
> kafka.common.ZkNodeChangeNotificationListener.kafka$common$ZkNodeChangeNotificationListener$$processNotifications(ZkNodeChangeNotificationListener.scala:79)
> at 
> kafka.common.ZkNodeChangeNotificationListener$NodeChangeListener$.handleChildChange(ZkNodeChangeNotificationListener.scala:121)
> at org.I0Itec.zkclient.ZkClient$10.run(ZkClient.java:842)
> at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71)
> Caused by: java.lang.InterruptedException
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:502)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1342)
> at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1153)
> at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1184)
> at org.I0Itec.zkclient.ZkConnection.readData(ZkConnection.java:119)
> at org.I0Itec.zkclient.ZkClient$12.call(ZkClient.java:1094)
> at org.I0Itec.zkclient.ZkClient$12.call(ZkClient.java:1090)
> at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:985)
> When SimpleAclAuthorizer terminates, we close zkclient, which interrupts the 
> watcher processor thread. Since this is expected, we shouldn't log this as an 
> error.
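
A minimal sketch of the suggested handling (illustrative only, not the actual patch; 
it uses slf4j and the zkclient exception type from the stack trace above): treat 
interruption during shutdown as expected, and keep the error level for everything else.

{code}
import org.I0Itec.zkclient.exception.ZkInterruptedException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class NotificationHandlerSketch {
    private static final Logger log = LoggerFactory.getLogger(NotificationHandlerSketch.class);

    void processNotificationSafely(Runnable processNotification) {
        try {
            processNotification.run();
        } catch (ZkInterruptedException e) {
            // Expected when the authorizer / zkclient is being shut down.
            log.debug("Interrupted while processing notification, likely due to shutdown", e);
            Thread.currentThread().interrupt();   // preserve the interrupt status
        } catch (Exception e) {
            log.error("Error processing notification change", e);
        }
    }
}
{code}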



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-3052; Broker properties get logged twice...

2016-01-04 Thread ijuma
GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/725

KAFKA-3052; Broker properties get logged twice if acl enabled

Fix it by making it possible to pass the `doLog` parameter to 
`AbstractConfig`. As explained in the code comments, this means that we can 
continue to benefit from ZK default settings as specified in `KafkaConfig` 
without having to duplicate code.

Also:
* Removed unused private methods from `KafkaConfig`
* Removed `case` modifier from `KafkaConfig` so that `hashCode`, `equals`
and `toString` from `AbstractConfig` are used.
* Made `props` a `val` and added `apply` method to `KafkaConfig` to
remain binary compatible.
* Call authorizer.close even if an exception is thrown during `configure`.
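
As a rough illustration of the `doLog` idea (a sketch only, not the actual Kafka 
change): thread a boolean through the config constructor and only log the parsed 
values when it is set, so a second, internal instantiation stays quiet.

{code}
import java.util.Map;

// Sketch of the idea only: an extra constructor flag controls whether the
// parsed configuration is logged, so internal re-instantiations stay quiet.
public abstract class LoggingConfigSketch {
    private final Map<String, ?> values;

    protected LoggingConfigSketch(Map<String, ?> values) {
        this(values, true);                      // default: log once, as before
    }

    protected LoggingConfigSketch(Map<String, ?> values, boolean doLog) {
        this.values = values;
        if (doLog)
            logAll();
    }

    private void logAll() {
        StringBuilder b = new StringBuilder(getClass().getSimpleName()).append(" values:\n");
        for (Map.Entry<String, ?> e : values.entrySet())
            b.append('\t').append(e.getKey()).append(" = ").append(e.getValue()).append('\n');
        System.out.println(b);                   // stand-in for the real logger
    }
}
{code}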


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka 
kafka-3052-broker-properties-get-logged-twice-if-acl-enabled

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/725.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #725


commit 7f2b2ec9a9f333bdbb042c68338d5f437c1df5af
Author: Ismael Juma 
Date:   2016-01-04T14:27:20Z

Fix duplicate logging of broker properties when ACL is enabled

Also:
* Removed unused private methods from `KafkaConfig`
* Removed `case` modifier so that `hashCode`, `equals`
and `toString` from `AbstractConfig` are used.
* Made `props` a `val` and added `apply` method to
remain binary compatible.

commit c4f3ab3013a79ca22a3bda954d54b1226c094220
Author: Ismael Juma 
Date:   2016-01-04T14:34:53Z

Call authorizer close even if an exception is thrown during `configure`




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3050) Space in the value for "host.name" causes "Unresolved address"

2016-01-04 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081233#comment-15081233
 ] 

Ismael Juma commented on KAFKA-3050:


Thanks for the report. To be clear, are you requesting that we trim trailing 
whitespace from the config value before using it?
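
For example, a small sketch of that kind of defensive handling (behaviour as observed 
on a typical system; the port and property name mirror the report below):

{code}
import java.net.InetSocketAddress;
import java.util.Properties;

public class HostNameTrim {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("host.name", "localhost ");   // note the trailing space

        String raw = props.getProperty("host.name");
        // Untrimmed, the trailing space typically makes the name unresolvable.
        System.out.println(new InetSocketAddress(raw, 9092).isUnresolved());
        // Trimmed, "localhost" resolves as expected.
        System.out.println(new InetSocketAddress(raw.trim(), 9092).isUnresolved());
    }
}
{code}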

> Space in the value for "host.name" causes "Unresolved address"
> --
>
> Key: KAFKA-3050
> URL: https://issues.apache.org/jira/browse/KAFKA-3050
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.2.1
>Reporter: Navin Markandeya
>
> In {{/config/server.properties}},  after updating the 
> {{host.name}}  to a value with a space after "localhost", received
> {code}
> [2015-12-30 11:13:43,014] FATAL Fatal error during KafkaServerStartable 
> startup. Prepare to shutdown (kafka.server.KafkaServerStartable)
> kafka.common.KafkaException: Socket server failed to bind to localhost :9092: 
> Unresolved address.
>   at kafka.network.Acceptor.openServerSocket(SocketServer.scala:260)
>   at kafka.network.Acceptor.<init>(SocketServer.scala:205)
>   at kafka.network.SocketServer.startup(SocketServer.scala:86)
>   at kafka.server.KafkaServer.startup(KafkaServer.scala:99)
>   at 
> kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:29)
>   at kafka.Kafka$.main(Kafka.scala:46)
>   at kafka.Kafka.main(Kafka.scala)
> Caused by: java.net.SocketException: Unresolved address
>   at sun.nio.ch.Net.translateToSocketException(Net.java:131)
>   at sun.nio.ch.Net.translateException(Net.java:157)
>   at sun.nio.ch.Net.translateException(Net.java:163)
>   at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:76)
>   at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67)
>   at kafka.network.Acceptor.openServerSocket(SocketServer.scala:256)
>   ... 6 more
> Caused by: java.nio.channels.UnresolvedAddressException
>   at sun.nio.ch.Net.checkAddress(Net.java:101)
>   at 
> sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:218)
>   at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>   ... 8 more
> {code}
>  I am running {{kafka_2.9.1-0.8.2.2}} on CentOS 6.5 with Java 8
> {code}
> java version "1.8.0_60"
> Java(TM) SE Runtime Environment (build 1.8.0_60-b27)
> Java HotSpot(TM) 64-Bit Server VM (build 25.60-b23, mixed mode)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3052) broker properties get logged twice if acl is enabled

2016-01-04 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-3052:
---
Reviewer: Gwen Shapira
  Status: Patch Available  (was: Open)

> broker properties get logged twice if acl is enabled
> 
>
> Key: KAFKA-3052
> URL: https://issues.apache.org/jira/browse/KAFKA-3052
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 0.9.0.0
>Reporter: Jun Rao
>Assignee: Ismael Juma
>  Labels: newbie, security
> Fix For: 0.9.0.1
>
>
> This is because in SimpleAclAuthorizer.configure(), there is the following 
> statement which triggers the logging of all broker properties.
> val kafkaConfig = KafkaConfig.fromProps(props)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3052) broker properties get logged twice if acl is enabled

2016-01-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081217#comment-15081217
 ] 

ASF GitHub Bot commented on KAFKA-3052:
---

GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/725

KAFKA-3052; Broker properties get logged twice if acl enabled

Fix it by making it possible to pass the `doLog` parameter to 
`AbstractConfig`. As explained in the code comments, this means that we can 
continue to benefit from ZK default settings as specified in `KafkaConfig` 
without having to duplicate code.

Also:
* Removed unused private methods from `KafkaConfig`
* Removed `case` modifier from `KafkaConfig` so that `hashCode`, `equals`
and `toString` from `AbstractConfig` are used.
* Made `props` a `val` and added `apply` method to `KafkaConfig` to
remain binary compatible.
* Call authorizer.close even if an exception is thrown during `configure`.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka 
kafka-3052-broker-properties-get-logged-twice-if-acl-enabled

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/725.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #725


commit 7f2b2ec9a9f333bdbb042c68338d5f437c1df5af
Author: Ismael Juma 
Date:   2016-01-04T14:27:20Z

Fix duplicate logging of broker properties when ACL is enabled

Also:
* Removed unused private methods from `KafkaConfig`
* Removed `case` modifier so that `hashCode`, `equals`
and `toString` from `AbstractConfig` are used.
* Made `props` a `val` and added `apply` method to
remain binary compatible.

commit c4f3ab3013a79ca22a3bda954d54b1226c094220
Author: Ismael Juma 
Date:   2016-01-04T14:34:53Z

Call authorizer close even if an exception is thrown during `configure`




> broker properties get logged twice if acl is enabled
> 
>
> Key: KAFKA-3052
> URL: https://issues.apache.org/jira/browse/KAFKA-3052
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 0.9.0.0
>Reporter: Jun Rao
>Assignee: Ismael Juma
>  Labels: newbie, security
> Fix For: 0.9.0.1
>
>
> This is because in SimpleAclAuthorizer.configure(), there is the following 
> statement which triggers the logging of all broker properties.
> val kafkaConfig = KafkaConfig.fromProps(props)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (KAFKA-3041) NullPointerException in new Consumer API on broker restart

2016-01-04 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma resolved KAFKA-3041.

Resolution: Duplicate
  Assignee: Jason Gustafson  (was: Neha Narkhede)

> NullPointerException in new Consumer API on broker restart
> --
>
> Key: KAFKA-3041
> URL: https://issues.apache.org/jira/browse/KAFKA-3041
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.9.0.0
>Reporter: Enrico Olivelli
>Assignee: Jason Gustafson
>Priority: Blocker
>
> I'm running a brand new Kafka cluster (version 0.9.0.0). During my tests I 
> noticed this error at Consumer.partitionsFor during a full cluster restart.
> My DEV cluster is made of 4 brokers.
> I cannot reproduce the problem consistently, but it happens sometimes during 
> the restart of the brokers.
> This is my code:
>this.properties = new Properties();
> properties.put("bootstrap.servers", "list of servers"));
> properties.put("acks", "all");
> properties.put("retries", 0);
> properties.put("batch.size", 16384);
> properties.put("linger.ms", 1);
> properties.put("buffer.memory", 33554432);
> properties.put("group.id", "test");
> properties.put("session.timeout.ms", "3");
> properties.put("key.deserializer", 
> "org.apache.kafka.common.serialization.ByteArrayDeserializer");
> properties.put("value.deserializer", 
> "org.apache.kafka.common.serialization.ByteArrayDeserializer");
>String topic = "xxx";
> try (KafkaConsumer<byte[], byte[]> consumer = new 
> KafkaConsumer<>(properties)) {
> List<PartitionInfo> partitions = consumer.partitionsFor(topic);
>  ….
> }
> This is the error:
> java.lang.NullPointerException
> at 
> org.apache.kafka.common.requests.MetadataResponse.<init>(MetadataResponse.java:130)
> at 
> org.apache.kafka.clients.consumer.internals.Fetcher.getTopicMetadata(Fetcher.java:203)
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.partitionsFor(KafkaConsumer.java:1143)
> at 
> magnews.datastream.KafkaDataStreamConsumer.fetchNewData(KafkaDataStreamConsumer.java:44



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-3052; Broker properties get logged twice...

2016-01-04 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/725


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (KAFKA-3058) remove the usage of deprecated config properties

2016-01-04 Thread Jun Rao (JIRA)
Jun Rao created KAFKA-3058:
--

 Summary: remove the usage of deprecated config properties
 Key: KAFKA-3058
 URL: https://issues.apache.org/jira/browse/KAFKA-3058
 Project: Kafka
  Issue Type: Improvement
  Components: core
Affects Versions: 0.9.0.0
Reporter: Jun Rao
 Fix For: 0.9.1.0


There are compilation warnings like the following, which can be avoided.

core/src/main/scala/kafka/tools/EndToEndLatency.scala:74: value 
BLOCK_ON_BUFFER_FULL_CONFIG in object ProducerConfig is deprecated: see 
corresponding Javadoc for more information.
producerProps.put(ProducerConfig.BLOCK_ON_BUFFER_FULL_CONFIG, "true")
 ^
kafka/core/src/main/scala/kafka/tools/MirrorMaker.scala:195: value 
BLOCK_ON_BUFFER_FULL_CONFIG in object ProducerConfig is deprecated: see 
corresponding Javadoc for more information.
  maybeSetDefaultProperty(producerProps, 
ProducerConfig.BLOCK_ON_BUFFER_FULL_CONFIG, "true")
^
/Users/junrao/intellij/kafka/core/src/main/scala/kafka/tools/ProducerPerformance.scala:40:
 @deprecated now takes two arguments; see the scaladoc.
@deprecated
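
For the BLOCK_ON_BUFFER_FULL cases, the usual migration is to express the same intent 
via max.block.ms. A sketch (in Java, although the affected call sites are Scala tools; 
the bootstrap server is a placeholder):

{code}
import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;

public class ProducerPropsMigration {
    public static Properties buildProducerProps() {
        Properties producerProps = new Properties();
        producerProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder

        // Old (deprecated): block indefinitely when the buffer is full.
        // producerProps.put(ProducerConfig.BLOCK_ON_BUFFER_FULL_CONFIG, "true");

        // New: the same intent expressed through max.block.ms.
        producerProps.put(ProducerConfig.MAX_BLOCK_MS_CONFIG, String.valueOf(Long.MAX_VALUE));
        return producerProps;
    }
}
{code}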




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (KAFKA-3055) JsonConverter mangles schema during serialization (fromConnectData)

2016-01-04 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava resolved KAFKA-3055.
--
   Resolution: Fixed
Fix Version/s: 0.9.1.0
   0.9.0.1

Issue resolved by pull request 722
[https://github.com/apache/kafka/pull/722]

> JsonConverter mangles schema during serialization (fromConnectData)
> ---
>
> Key: KAFKA-3055
> URL: https://issues.apache.org/jira/browse/KAFKA-3055
> Project: Kafka
>  Issue Type: Bug
>  Components: copycat
>Affects Versions: 0.9.0.0
>Reporter: Kishore Senji
>Assignee: Ewen Cheslack-Postava
> Fix For: 0.9.0.1, 0.9.1.0
>
>
> Test case is here: 
> https://github.com/ksenji/kafka-connect-test/tree/master/src/test/java/org/apache/kafka/connect/json
> If Caching is disabled, it behaves correctly and JsonConverterWithNoCacheTest 
> runs successfully. Otherwise the test JsonConverterTest fails.
> The reason is that the JsonConverter has a bug where it mangles the schema: it 
> assigns all String fields the same name (and similarly for all Int32 fields).
> This is how the schema & payload get serialized for the Person Struct (with 
> caching disabled):
> {code}
> {"schema":{"type":"struct","fields":[{"type":"string","optional":false,"field":"firstName"},{"type":"string","optional":false,"field":"lastName"},{"type":"string","optional":false,"field":"email"},{"type":"int32","optional":false,"field":"age"},{"type":"int32","optional":false,"field":"weightInKgs"}],"optional":false,"name":"Person"},"payload":{"firstName":"Eric","lastName":"Cartman","email":"eric.cart...@southpark.com","age":10,"weightInKgs":40}}
> {code}
> whereas with caching enabled the same Struct gets serialized as:
> {code}
> {"schema":{"type":"struct","fields":[{"type":"string","optional":false,"field":"email"},{"type":"string","optional":false,"field":"email"},{"type":"string","optional":false,"field":"email"},{"type":"int32","optional":false,"field":"weightInKgs"},{"type":"int32","optional":false,"field":"weightInKgs"}],"optional":false,"name":"Person"},"payload":{"firstName":"Eric","lastName":"Cartman","email":"eric.cart...@southpark.com","age":10,"weightInKgs":40}}
> {code}
> As we can see all String fields became "email" and all int32 fields became 
> "weightInKgs". 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2937) Topics marked for delete in Zookeeper may become undeletable

2016-01-04 Thread Mayuresh Gharat (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081352#comment-15081352
 ] 

Mayuresh Gharat commented on KAFKA-2937:


[~djh] does your integration test check if the topic is created successfully 
before attempting a delete?

> Topics marked for delete in Zookeeper may become undeletable
> 
>
> Key: KAFKA-2937
> URL: https://issues.apache.org/jira/browse/KAFKA-2937
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.9.0.0
>Reporter: Rajini Sivaram
>Assignee: Mayuresh Gharat
>
> In our clusters, we occasionally see topics marked for delete, but never 
> actually deleted. It may be due to brokers being restarted while tests were 
> running, but further restarts of Kafka don't fix the problem. The topics 
> remain marked for delete in Zookeeper.
> Topic describe shows:
> {quote}
> Topic:testtopic   PartitionCount:1ReplicationFactor:3 Configs:
>   Topic: testtopicPartition: 0Leader: noneReplicas: 3,4,0 
> Isr: 
> {quote}
> Kafka logs show:
> {quote}
> 2015-12-02 15:53:30,152] ERROR Controller 2 epoch 213 initiated state change 
> of replica 3 for partition [testtopic,0] from OnlineReplica to OfflineReplica 
> failed (state.change.logger)
> kafka.common.StateChangeFailedException: Failed to change state of replica 3 
> for partition [testtopic,0] since the leader and isr path in zookeeper is 
> empty
> at 
> kafka.controller.ReplicaStateMachine.handleStateChange(ReplicaStateMachine.scala:269)
> at 
> kafka.controller.ReplicaStateMachine$$anonfun$handleStateChanges$2.apply(ReplicaStateMachine.scala:114)
> at 
> kafka.controller.ReplicaStateMachine$$anonfun$handleStateChanges$2.apply(ReplicaStateMachine.scala:114)
> at 
> scala.collection.immutable.HashSet$HashSet1.foreach(HashSet.scala:322)
> at 
> scala.collection.immutable.HashSet$HashTrieSet.foreach(HashSet.scala:978)
> at 
> kafka.controller.ReplicaStateMachine.handleStateChanges(ReplicaStateMachine.scala:114)
> at 
> kafka.controller.TopicDeletionManager$$anonfun$startReplicaDeletion$2.apply(TopicDeletionManager.scala:342)
> at 
> kafka.controller.TopicDeletionManager$$anonfun$startReplicaDeletion$2.apply(TopicDeletionManager.scala:334)
> at scala.collection.immutable.Map$Map1.foreach(Map.scala:116)
> at 
> kafka.controller.TopicDeletionManager.startReplicaDeletion(TopicDeletionManager.scala:334)
> at 
> kafka.controller.TopicDeletionManager.kafka$controller$TopicDeletionManager$$onPartitionDeletion(TopicDeletionManager.scala:367)
> at 
> kafka.controller.TopicDeletionManager$$anonfun$kafka$controller$TopicDeletionManager$$onTopicDeletion$2.apply(TopicDeletionManager.scala:313)
> at 
> kafka.controller.TopicDeletionManager$$anonfun$kafka$controller$TopicDeletionManager$$onTopicDeletion$2.apply(TopicDeletionManager.scala:312)
> at scala.collection.immutable.Set$Set1.foreach(Set.scala:79)
> at 
> kafka.controller.TopicDeletionManager.kafka$controller$TopicDeletionManager$$onTopicDeletion(TopicDeletionManager.scala:312)
> at 
> kafka.controller.TopicDeletionManager$DeleteTopicsThread$$anonfun$doWork$1$$anonfun$apply$mcV$sp$4.apply(TopicDeletionManager.scala:431)
> at 
> kafka.controller.TopicDeletionManager$DeleteTopicsThread$$anonfun$doWork$1$$anonfun$apply$mcV$sp$4.apply(TopicDeletionManager.scala:403)
> at scala.collection.immutable.Set$Set2.foreach(Set.scala:111)
> at 
> kafka.controller.TopicDeletionManager$DeleteTopicsThread$$anonfun$doWork$1.apply$mcV$sp(TopicDeletionManager.scala:403)
> at 
> kafka.controller.TopicDeletionManager$DeleteTopicsThread$$anonfun$doWork$1.apply(TopicDeletionManager.scala:397)
> at 
> kafka.controller.TopicDeletionManager$DeleteTopicsThread$$anonfun$doWork$1.apply(TopicDeletionManager.scala:397)
> at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)
> at 
> kafka.controller.TopicDeletionManager$DeleteTopicsThread.doWork(TopicDeletionManager.scala:397)
> at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
> {quote}  
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Build failed in Jenkins: kafka-trunk-jdk8 #263

2016-01-04 Thread Apache Jenkins Server
See 

Changes:

[me] KAFKA-3055; Fix JsonConverter mangling the Schema in Connect

[wangguoz] KAFKA-3052; Broker properties get logged twice if acl enabled

--
Started by an SCM change
[EnvInject] - Loading node environment variables.
Building remotely on H11 (Ubuntu ubuntu) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url 
 > https://git-wip-us.apache.org/repos/asf/kafka.git # timeout=10
Fetching upstream changes from https://git-wip-us.apache.org/repos/asf/kafka.git
 > git --version # timeout=10
 > git -c core.askpass=true fetch --tags --progress 
 > https://git-wip-us.apache.org/repos/asf/kafka.git 
 > +refs/heads/*:refs/remotes/origin/*
 > git rev-parse refs/remotes/origin/trunk^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/trunk^{commit} # timeout=10
Checking out Revision f9642e2a9878faff81366dbc885a206727bd7c7b 
(refs/remotes/origin/trunk)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f f9642e2a9878faff81366dbc885a206727bd7c7b
 > git rev-list b905d489188768ba1c55226857db9713b9272918 # timeout=10
Setting 
JDK1_8_0_45_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk1.8.0_45
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
[kafka-trunk-jdk8] $ /bin/bash -xe /tmp/hudson5209640037247393859.sh
+ 
/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2/bin/gradle
To honour the JVM settings for this build a new JVM will be forked. Please 
consider using the daemon: 
http://gradle.org/docs/2.4-rc-2/userguide/gradle_daemon.html.
Building project 'core' with Scala version 2.10.6
:downloadWrapper

BUILD SUCCESSFUL

Total time: 10.42 secs
Setting 
JDK1_8_0_45_HOME=/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk1.8.0_45
Setting 
GRADLE_2_4_RC_2_HOME=/home/jenkins/jenkins-slave/tools/hudson.plugins.gradle.GradleInstallation/Gradle_2.4-rc-2
[kafka-trunk-jdk8] $ /bin/bash -xe /tmp/hudson7917740434252849124.sh
+ export GRADLE_OPTS=-Xmx1024m
+ GRADLE_OPTS=-Xmx1024m
+ ./gradlew -Dorg.gradle.project.maxParallelForks=1 clean jarAll testAll
To honour the JVM settings for this build a new JVM will be forked. Please 
consider using the daemon: 
https://docs.gradle.org/2.10/userguide/gradle_daemon.html.
Building project 'core' with Scala version 2.10.6
:clean UP-TO-DATE
:clients:clean
:connect:clean UP-TO-DATE
:core:clean
:examples:clean
:log4j-appender:clean
:streams:clean
:tools:clean
:connect:api:clean
:connect:file:clean
:connect:json:clean
:connect:runtime:clean
:jar_core_2_10
Building project 'core' with Scala version 2.10.6
:kafka-trunk-jdk8:clients:compileJavawarning: [options] bootstrap class path 
not set in conjunction with -source 1.7
Note: 

 uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
1 warning

:kafka-trunk-jdk8:clients:processResources UP-TO-DATE
:kafka-trunk-jdk8:clients:classes
:kafka-trunk-jdk8:clients:determineCommitId UP-TO-DATE
:kafka-trunk-jdk8:clients:createVersionFile
:kafka-trunk-jdk8:clients:jar
:kafka-trunk-jdk8:core:compileJava UP-TO-DATE
:kafka-trunk-jdk8:core:compileScalaJava HotSpot(TM) 64-Bit Server VM warning: 
ignoring option MaxPermSize=512m; support was removed in 8.0

:79:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.

org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP
 ^
:36:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.
 commitTimestamp: Long = 
org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP,

  ^
:37:
 value DEFAULT_TIMESTAMP in object OffsetCommitRequest is deprecated: see 
corresponding Javadoc for more information.
 expireTimestamp: Long = 
org.apache.kafka.common.requests.OffsetCommitRequest.DEFAULT_TIMESTAMP) {
 

[jira] [Commented] (KAFKA-2937) Topics marked for delete in Zookeeper may become undeletable

2016-01-04 Thread Henri Pihkala (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081401#comment-15081401
 ] 

Henri Pihkala commented on KAFKA-2937:
--

[~mgharat] Yes, like this:

AdminUtils.createTopic(...)
assert AdminUtils.topicExists(...)
AdminUtils.deleteTopic(...)
Thread.sleep(...)
assert !AdminUtils.topicExists(...)

When this test one day suddenly started failing (on the last line), I first 
suspected that my sleep wasn't long enough anymore. However, digging deeper 
showed that topics created by my test were ending up in the undeletable state 
described in this JIRA.
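
Independent of the root cause, a small sketch of making such a test less 
timing-sensitive: replace the fixed sleep with a bounded poll on the condition (the 
existence check itself is left as a placeholder).

{code}
import java.util.function.BooleanSupplier;

// Sketch: wait up to timeoutMs for a condition instead of a fixed Thread.sleep.
public final class TestWaits {
    public static boolean waitUntil(BooleanSupplier condition, long timeoutMs, long pollMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (condition.getAsBoolean())
                return true;
            Thread.sleep(pollMs);
        }
        return condition.getAsBoolean();
    }
}

// Usage in the test, with the existence check as a placeholder:
//   assert TestWaits.waitUntil(() -> !topicExists("testtopic"), 30_000, 200);
{code}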

> Topics marked for delete in Zookeeper may become undeletable
> 
>
> Key: KAFKA-2937
> URL: https://issues.apache.org/jira/browse/KAFKA-2937
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.9.0.0
>Reporter: Rajini Sivaram
>Assignee: Mayuresh Gharat
>
> In our clusters, we occasionally see topics marked for delete, but never 
> actually deleted. It may be due to brokers being restarted while tests were 
> running, but further restarts of Kafka don't fix the problem. The topics 
> remain marked for delete in Zookeeper.
> Topic describe shows:
> {quote}
> Topic:testtopic   PartitionCount:1ReplicationFactor:3 Configs:
>   Topic: testtopicPartition: 0Leader: noneReplicas: 3,4,0 
> Isr: 
> {quote}
> Kafka logs show:
> {quote}
> 2015-12-02 15:53:30,152] ERROR Controller 2 epoch 213 initiated state change 
> of replica 3 for partition [testtopic,0] from OnlineReplica to OfflineReplica 
> failed (state.change.logger)
> kafka.common.StateChangeFailedException: Failed to change state of replica 3 
> for partition [testtopic,0] since the leader and isr path in zookeeper is 
> empty
> at 
> kafka.controller.ReplicaStateMachine.handleStateChange(ReplicaStateMachine.scala:269)
> at 
> kafka.controller.ReplicaStateMachine$$anonfun$handleStateChanges$2.apply(ReplicaStateMachine.scala:114)
> at 
> kafka.controller.ReplicaStateMachine$$anonfun$handleStateChanges$2.apply(ReplicaStateMachine.scala:114)
> at 
> scala.collection.immutable.HashSet$HashSet1.foreach(HashSet.scala:322)
> at 
> scala.collection.immutable.HashSet$HashTrieSet.foreach(HashSet.scala:978)
> at 
> kafka.controller.ReplicaStateMachine.handleStateChanges(ReplicaStateMachine.scala:114)
> at 
> kafka.controller.TopicDeletionManager$$anonfun$startReplicaDeletion$2.apply(TopicDeletionManager.scala:342)
> at 
> kafka.controller.TopicDeletionManager$$anonfun$startReplicaDeletion$2.apply(TopicDeletionManager.scala:334)
> at scala.collection.immutable.Map$Map1.foreach(Map.scala:116)
> at 
> kafka.controller.TopicDeletionManager.startReplicaDeletion(TopicDeletionManager.scala:334)
> at 
> kafka.controller.TopicDeletionManager.kafka$controller$TopicDeletionManager$$onPartitionDeletion(TopicDeletionManager.scala:367)
> at 
> kafka.controller.TopicDeletionManager$$anonfun$kafka$controller$TopicDeletionManager$$onTopicDeletion$2.apply(TopicDeletionManager.scala:313)
> at 
> kafka.controller.TopicDeletionManager$$anonfun$kafka$controller$TopicDeletionManager$$onTopicDeletion$2.apply(TopicDeletionManager.scala:312)
> at scala.collection.immutable.Set$Set1.foreach(Set.scala:79)
> at 
> kafka.controller.TopicDeletionManager.kafka$controller$TopicDeletionManager$$onTopicDeletion(TopicDeletionManager.scala:312)
> at 
> kafka.controller.TopicDeletionManager$DeleteTopicsThread$$anonfun$doWork$1$$anonfun$apply$mcV$sp$4.apply(TopicDeletionManager.scala:431)
> at 
> kafka.controller.TopicDeletionManager$DeleteTopicsThread$$anonfun$doWork$1$$anonfun$apply$mcV$sp$4.apply(TopicDeletionManager.scala:403)
> at scala.collection.immutable.Set$Set2.foreach(Set.scala:111)
> at 
> kafka.controller.TopicDeletionManager$DeleteTopicsThread$$anonfun$doWork$1.apply$mcV$sp(TopicDeletionManager.scala:403)
> at 
> kafka.controller.TopicDeletionManager$DeleteTopicsThread$$anonfun$doWork$1.apply(TopicDeletionManager.scala:397)
> at 
> kafka.controller.TopicDeletionManager$DeleteTopicsThread$$anonfun$doWork$1.apply(TopicDeletionManager.scala:397)
> at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)
> at 
> kafka.controller.TopicDeletionManager$DeleteTopicsThread.doWork(TopicDeletionManager.scala:397)
> at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
> {quote}  
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3059) ConsumerGroupCommand should allow resetting offsets for consumer groups

2016-01-04 Thread Gwen Shapira (JIRA)
Gwen Shapira created KAFKA-3059:
---

 Summary: ConsumerGroupCommand should allow resetting offsets for 
consumer groups
 Key: KAFKA-3059
 URL: https://issues.apache.org/jira/browse/KAFKA-3059
 Project: Kafka
  Issue Type: Bug
Reporter: Gwen Shapira


As discussed here:
http://mail-archives.apache.org/mod_mbox/kafka-users/201601.mbox/%3CCA%2BndhHpf3ib%3Ddsh9zvtfVjRiUjSz%2B%3D8umXm4myW%2BpBsbTYATAQ%40mail.gmail.com%3E

* Given a consumer group, remove all stored offsets
* Given a group and a topic, remove the offsets stored for that group and topic
* Given a group, topic, partition and offset - set the offset for the specified 
partition and group to the given value
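
Until such a tool exists, the third item can already be approximated from a client. 
A sketch assuming the 0.9 consumer; broker address, group, topic, partition and 
offset are placeholders:

{code}
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class SetGroupOffset {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");  // placeholder
        props.put("group.id", "my-group");                  // the group to adjust
        props.put("enable.auto.commit", "false");
        props.put("key.deserializer",
            "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("value.deserializer",
            "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        TopicPartition tp = new TopicPartition("my-topic", 0);  // placeholder
        long desiredOffset = 42L;                                // placeholder
        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            consumer.assign(Collections.singletonList(tp));
            // Commit the desired offset for this group and partition explicitly.
            consumer.commitSync(Collections.singletonMap(tp, new OffsetAndMetadata(desiredOffset)));
        }
    }
}
{code}

Note this only works cleanly while the group is otherwise inactive, since an active 
consumer would simply commit over the value.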



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3061) Get rid of Guava dependency

2016-01-04 Thread Gwen Shapira (JIRA)
Gwen Shapira created KAFKA-3061:
---

 Summary: Get rid of Guava dependency
 Key: KAFKA-3061
 URL: https://issues.apache.org/jira/browse/KAFKA-3061
 Project: Kafka
  Issue Type: Bug
Reporter: Gwen Shapira


KAFKA-2422 adds the Reflections library to Kafka Connect, which depends on Guava.
Since lots of people want to use Guava themselves, having it in the framework will 
lead to dependency conflicts.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-2937) Topics marked for delete in Zookeeper may become undeletable

2016-01-04 Thread Mayuresh Gharat (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081910#comment-15081910
 ] 

Mayuresh Gharat commented on KAFKA-2937:


Yes, that might be possible but it should be rare. The 
AdminUtils.topicExists(topicName) call checks only whether the entry 
/brokers/topics/topicName exists in ZooKeeper. createTopic() should 
write to the path /brokers/topics/topicName and also update the 
replica assignment for the topic in ZooKeeper.

If somehow the second part got messed up, the above error can occur. 

> Topics marked for delete in Zookeeper may become undeletable
> 
>
> Key: KAFKA-2937
> URL: https://issues.apache.org/jira/browse/KAFKA-2937
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.9.0.0
>Reporter: Rajini Sivaram
>Assignee: Mayuresh Gharat
>
> In our clusters, we occasionally see topics marked for delete, but never 
> actually deleted. It may be due to brokers being restarted while tests were 
> running, but further restarts of Kafka don't fix the problem. The topics 
> remain marked for delete in Zookeeper.
> Topic describe shows:
> {quote}
> Topic:testtopic   PartitionCount:1ReplicationFactor:3 Configs:
>   Topic: testtopicPartition: 0Leader: noneReplicas: 3,4,0 
> Isr: 
> {quote}
> Kafka logs show:
> {quote}
> 2015-12-02 15:53:30,152] ERROR Controller 2 epoch 213 initiated state change 
> of replica 3 for partition [testtopic,0] from OnlineReplica to OfflineReplica 
> failed (state.change.logger)
> kafka.common.StateChangeFailedException: Failed to change state of replica 3 
> for partition [testtopic,0] since the leader and isr path in zookeeper is 
> empty
> at 
> kafka.controller.ReplicaStateMachine.handleStateChange(ReplicaStateMachine.scala:269)
> at 
> kafka.controller.ReplicaStateMachine$$anonfun$handleStateChanges$2.apply(ReplicaStateMachine.scala:114)
> at 
> kafka.controller.ReplicaStateMachine$$anonfun$handleStateChanges$2.apply(ReplicaStateMachine.scala:114)
> at 
> scala.collection.immutable.HashSet$HashSet1.foreach(HashSet.scala:322)
> at 
> scala.collection.immutable.HashSet$HashTrieSet.foreach(HashSet.scala:978)
> at 
> kafka.controller.ReplicaStateMachine.handleStateChanges(ReplicaStateMachine.scala:114)
> at 
> kafka.controller.TopicDeletionManager$$anonfun$startReplicaDeletion$2.apply(TopicDeletionManager.scala:342)
> at 
> kafka.controller.TopicDeletionManager$$anonfun$startReplicaDeletion$2.apply(TopicDeletionManager.scala:334)
> at scala.collection.immutable.Map$Map1.foreach(Map.scala:116)
> at 
> kafka.controller.TopicDeletionManager.startReplicaDeletion(TopicDeletionManager.scala:334)
> at 
> kafka.controller.TopicDeletionManager.kafka$controller$TopicDeletionManager$$onPartitionDeletion(TopicDeletionManager.scala:367)
> at 
> kafka.controller.TopicDeletionManager$$anonfun$kafka$controller$TopicDeletionManager$$onTopicDeletion$2.apply(TopicDeletionManager.scala:313)
> at 
> kafka.controller.TopicDeletionManager$$anonfun$kafka$controller$TopicDeletionManager$$onTopicDeletion$2.apply(TopicDeletionManager.scala:312)
> at scala.collection.immutable.Set$Set1.foreach(Set.scala:79)
> at 
> kafka.controller.TopicDeletionManager.kafka$controller$TopicDeletionManager$$onTopicDeletion(TopicDeletionManager.scala:312)
> at 
> kafka.controller.TopicDeletionManager$DeleteTopicsThread$$anonfun$doWork$1$$anonfun$apply$mcV$sp$4.apply(TopicDeletionManager.scala:431)
> at 
> kafka.controller.TopicDeletionManager$DeleteTopicsThread$$anonfun$doWork$1$$anonfun$apply$mcV$sp$4.apply(TopicDeletionManager.scala:403)
> at scala.collection.immutable.Set$Set2.foreach(Set.scala:111)
> at 
> kafka.controller.TopicDeletionManager$DeleteTopicsThread$$anonfun$doWork$1.apply$mcV$sp(TopicDeletionManager.scala:403)
> at 
> kafka.controller.TopicDeletionManager$DeleteTopicsThread$$anonfun$doWork$1.apply(TopicDeletionManager.scala:397)
> at 
> kafka.controller.TopicDeletionManager$DeleteTopicsThread$$anonfun$doWork$1.apply(TopicDeletionManager.scala:397)
> at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)
> at 
> kafka.controller.TopicDeletionManager$DeleteTopicsThread.doWork(TopicDeletionManager.scala:397)
> at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
> {quote}  
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (KAFKA-3059) ConsumerGroupCommand should allow resetting offsets for consumer groups

2016-01-04 Thread Jason Gustafson (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson reassigned KAFKA-3059:
--

Assignee: Jason Gustafson

> ConsumerGroupCommand should allow resetting offsets for consumer groups
> ---
>
> Key: KAFKA-3059
> URL: https://issues.apache.org/jira/browse/KAFKA-3059
> Project: Kafka
>  Issue Type: Bug
>Reporter: Gwen Shapira
>Assignee: Jason Gustafson
>
> As discussed here:
> http://mail-archives.apache.org/mod_mbox/kafka-users/201601.mbox/%3CCA%2BndhHpf3ib%3Ddsh9zvtfVjRiUjSz%2B%3D8umXm4myW%2BpBsbTYATAQ%40mail.gmail.com%3E
> * Given a consumer group, remove all stored offsets
> * Given a group and a topic, remove the offset for that group and topic
> * Given a group, topic, partition and offset, set the offset for the 
> specified partition and group to the given value



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: KIP-41: KafkaConsumer Max Records

2016-01-04 Thread Gwen Shapira
The wiki you pointed to is no longer maintained and fell out of sync with
the code and protocol.

You may want to refer to:
https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol

On Mon, Jan 4, 2016 at 4:38 AM, Jens Rantil  wrote:

> Hi guys,
>
> I realized I never thanked yall for your input - thanks!
> Jason: I apologize for assuming your stance on the issue! Feels like we all
> agreed on the solution. +1
>
> Follow-up: Jason made a point about defining prefetch and fairness
> behaviour in the KIP. I am now working on putting that down in writing. To
> be able to do this I think I need to understand the current prefetch
> behaviour in the new consumer API (0.9) a bit better. Some specific
> questions:
>
>    - How does a specific consumer balance incoming messages from multiple
>    partitions? Is the consumer simply issuing Multi-Fetch requests[1] for
>    its assigned partitions of the relevant topics? Or is the consumer
>    fetching from one partition at a time and balancing between them
>    internally? That is, is the responsibility of partition balancing (and
>    fairness) on the broker side or the consumer side?
>    - Is the above documented somewhere?
>
> [1]
>
> https://cwiki.apache.org/confluence/display/KAFKA/Writing+a+Driver+for+Kafka
> ,
> see "Multi-Fetch".
>
> Thanks,
> Jens
>
> On Wed, Dec 23, 2015 at 2:44 AM, Ismael Juma  wrote:
>
> > On Wed, Dec 23, 2015 at 1:24 AM, Gwen Shapira  wrote:
> >
> > > Given the background, it sounds like you'll generally want each call to
> > > poll() to return the same number of events (which is the number you
> > > planned on having enough memory / time for). It also sounds like tuning
> > > the number of events will be closely tied to tuning the session timeout.
> > > That is - if I choose to lower the session timeout for some reason, I
> > > will have to modify the number of records returned too.
> > >
> > > If those assumptions are correct, I think a configuration makes more
> > > sense.
> > > 1. We are unlikely to want this parameter to change during the lifetime
> > > of the consumer
> > > 2. The correct value is tied to another configuration parameter, so
> > > they will be controlled together.
> > >
> >
> > I was thinking the same thing.
> >
> > Ismael
> >
>
>
>
> --
> Jens Rantil
> Backend engineer
> Tink AB
>
> Email: jens.ran...@tink.se
> Phone: +46 708 84 18 32
> Web: www.tink.se
>
>
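
For concreteness, here is a minimal sketch of what the configuration-based
approach discussed in the quoted thread could look like from the application
side. The property name "max.poll.records" is assumed (it is the name under
discussion for KIP-41 and does not exist in the 0.9.0.0 consumer), so both the
name and the capping behaviour below are illustrative, not current API:

    import java.util.Arrays;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class BoundedPollExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("group.id", "example-group");
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("session.timeout.ms", "30000");
            // Proposed (not yet existing) cap on the number of records a single poll() may return.
            props.put("max.poll.records", "100");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Arrays.asList("example-topic"));
                while (true) {
                    // With the cap in place, the time spent processing one batch stays
                    // predictable relative to session.timeout.ms, so the consumer can
                    // get back to poll() (and heartbeat) in time.
                    ConsumerRecords<String, String> records = consumer.poll(1000);
                    for (ConsumerRecord<String, String> record : records) {
                        // placeholder for per-record work
                        System.out.println(record.offset() + ": " + record.value());
                    }
                }
            }
        }
    }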


[GitHub] kafka pull request: KAFKA-3016: phase-1. A local store for join wi...

2016-01-04 Thread ymatsuda
GitHub user ymatsuda opened a pull request:

https://github.com/apache/kafka/pull/726

KAFKA-3016: phase-1. A local store for join window

@guozhangwang 
An implementation of a local store for the join window. This implementation uses 
"rolling" RocksDB instances for timestamp-based truncation.

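A conceptual sketch of the rolling-segment idea, not the code in this pull
request: records are routed to a segment chosen by a coarse timestamp, and
truncation drops whole segments, the analogue of closing and deleting an
entire RocksDB instance rather than deleting keys one by one.

    import java.util.HashMap;
    import java.util.Map;
    import java.util.TreeMap;

    public class RollingSegments<V> {
        private final long segmentIntervalMs;
        private final long retentionMs;
        // segmentId -> segment contents; in the real store each segment would be its own RocksDB instance
        private final TreeMap<Long, Map<String, V>> segments = new TreeMap<>();

        public RollingSegments(long segmentIntervalMs, long retentionMs) {
            this.segmentIntervalMs = segmentIntervalMs;
            this.retentionMs = retentionMs;
        }

        public void put(String key, V value, long timestampMs) {
            long segmentId = timestampMs / segmentIntervalMs;
            Map<String, V> segment = segments.get(segmentId);
            if (segment == null) {
                segment = new HashMap<>();
                segments.put(segmentId, segment);
            }
            segment.put(key, value);
            expire(timestampMs);
        }

        private void expire(long nowMs) {
            long oldestAllowedSegment = (nowMs - retentionMs) / segmentIntervalMs;
            // Truncation drops entire segments at once instead of deleting keys one by one.
            segments.headMap(oldestAllowedSegment).clear();
        }
    }
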
You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ymatsuda/kafka windowed_join

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/726.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #726


commit 87734776268d8f9d7315cc7552cdfb1fe86ecb69
Author: Yasuhiro Matsuda 
Date:   2016-01-04T17:34:57Z

join window store

commit 096a83941f97af5da8192a71c8d5bb6e66130a45
Author: Yasuhiro Matsuda 
Date:   2016-01-04T17:42:53Z

Merge branch 'trunk' of github.com:apache/kafka into windowed_join




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3016) Add KStream-KStream window joins

2016-01-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081461#comment-15081461
 ] 

ASF GitHub Bot commented on KAFKA-3016:
---

GitHub user ymatsuda opened a pull request:

https://github.com/apache/kafka/pull/726

KAFKA-3016: phase-1. A local store for join window

@guozhangwang 
An implementation of a local store for the join window. This implementation uses 
"rolling" RocksDB instances for timestamp-based truncation.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ymatsuda/kafka windowed_join

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/726.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #726


commit 87734776268d8f9d7315cc7552cdfb1fe86ecb69
Author: Yasuhiro Matsuda 
Date:   2016-01-04T17:34:57Z

join window store

commit 096a83941f97af5da8192a71c8d5bb6e66130a45
Author: Yasuhiro Matsuda 
Date:   2016-01-04T17:42:53Z

Merge branch 'trunk' of github.com:apache/kafka into windowed_join




> Add KStream-KStream window joins
> 
>
> Key: KAFKA-3016
> URL: https://issues.apache.org/jira/browse/KAFKA-3016
> Project: Kafka
>  Issue Type: Sub-task
>  Components: kafka streams
>Affects Versions: 0.9.0.1
>Reporter: Yasuhiro Matsuda
>Assignee: Yasuhiro Matsuda
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3059) ConsumerGroupCommand should allow resetting offsets for consumer groups

2016-01-04 Thread Jason Gustafson (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081466#comment-15081466
 ] 

Jason Gustafson commented on KAFKA-3059:


Would this call for a new request API (e.g. DeleteGroup or DeleteOffsets)? I'm 
not sure we'd be able to have the command write to the offsets topic directly 
without some significant changes. Currently, I think GroupMetadataManager only 
consumes from the offsets topic when it assumes leadership for one of the 
partitions, so external writes to these topics would lead to cache 
inconsistency.
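
In the meantime, a minimal sketch of the workaround available today for the
"set the offset" case using the 0.9 consumer API: a short-lived consumer joins
the group (assuming no other members are active), assigns the target partition
and commits the desired offset through the coordinator, which keeps the group
metadata cache consistent with the offsets topic. Topic, partition and offset
values below are illustrative.

{code}
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class ResetGroupOffset {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "my-group"); // group whose offset we want to overwrite
        props.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props);
        try {
            TopicPartition tp = new TopicPartition("my-topic", 0);
            consumer.assign(Collections.singletonList(tp));
            // Commit the desired offset via the group coordinator rather than
            // writing to the offsets topic directly.
            consumer.commitSync(Collections.singletonMap(tp, new OffsetAndMetadata(42L)));
        } finally {
            consumer.close();
        }
    }
}
{code}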

> ConsumerGroupCommand should allow resetting offsets for consumer groups
> ---
>
> Key: KAFKA-3059
> URL: https://issues.apache.org/jira/browse/KAFKA-3059
> Project: Kafka
>  Issue Type: Bug
>Reporter: Gwen Shapira
>Assignee: Jason Gustafson
>
> As discussed here:
> http://mail-archives.apache.org/mod_mbox/kafka-users/201601.mbox/%3CCA%2BndhHpf3ib%3Ddsh9zvtfVjRiUjSz%2B%3D8umXm4myW%2BpBsbTYATAQ%40mail.gmail.com%3E
> * Given a consumer group, remove all stored offsets
> * Given a group and a topic, remove the offset for that group and topic
> * Given a group, topic, partition and offset, set the offset for the 
> specified partition and group to the given value



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3052) broker properties get logged twice if acl is enabled

2016-01-04 Thread Ismael Juma (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma updated KAFKA-3052:
---
Reviewer: Guozhang Wang  (was: Gwen Shapira)

> broker properties get logged twice if acl is enabled
> 
>
> Key: KAFKA-3052
> URL: https://issues.apache.org/jira/browse/KAFKA-3052
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 0.9.0.0
>Reporter: Jun Rao
>Assignee: Ismael Juma
>  Labels: newbie, security
> Fix For: 0.9.0.1
>
>
> This is because in SimpleAclAuthorizer.configure(), there is the following 
> statement which triggers the logging of all broker properties.
> val kafkaConfig = KafkaConfig.fromProps(props)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3054) Connect Herder fail forever if sent a wrong connector config or task config

2016-01-04 Thread Ewen Cheslack-Postava (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081370#comment-15081370
 ] 

Ewen Cheslack-Postava commented on KAFKA-3054:
--

[~jinxing6...@126.com] Yes, this sounds like a bug. In general I think there 
are a number of places where we need to do a better job handling errors -- e.g. 
this is during startup, but also during execution of tasks when an error can 
keep happening repeatedly such that a task can't even make any progress 
(whether the issue is in Connect or the other system). In order to better 
handle this generally we're going to need to keep track of status info, expose 
it via the REST API, and allow users to take corrective action (e.g. 
reconfiguring, restarting tasks, etc).

However, that's a pretty big project. For this bug, it sounds like we're just 
missing a {{catch}} block around connector/task startup; we should catch the 
exception and handle it by, at a minimum, logging some info at {{ERROR}} level.
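
Roughly the shape of the missing handling, as a sketch rather than the actual 
Herder code:

{code}
import java.util.Map;
import org.apache.kafka.connect.connector.Connector;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// A sketch of the missing error handling: catch startup failures, log them at
// ERROR, and keep the herder itself running.
public class SafeConnectorStart {
    private static final Logger log = LoggerFactory.getLogger(SafeConnectorStart.class);

    public boolean startConnector(String connName, Connector connector, Map<String, String> config) {
        try {
            connector.start(config);
            return true;
        } catch (Throwable t) {
            // Don't let one bad config take the herder down; record the failure instead.
            log.error("Failed to start connector {} with config {}", connName, config, t);
            return false;
        }
    }
}
{code}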

> Connect Herder fail forever if sent a wrong connector config or task config
> ---
>
> Key: KAFKA-3054
> URL: https://issues.apache.org/jira/browse/KAFKA-3054
> Project: Kafka
>  Issue Type: Bug
>  Components: copycat
>Affects Versions: 0.9.0.0
>Reporter: jin xing
>Assignee: jin xing
>
> The Connect Herder throws a ConnectException and shuts down if it is sent a wrong 
> config, and restarting the herder keeps failing with that config. It makes sense 
> for the herder to stay available when starting a connector or task fails. After 
> receiving a delete-connector request, the herder can delete the wrong config 
> from "config storage".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3055) JsonConverter mangles schema during serialization (fromConnectData)

2016-01-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081345#comment-15081345
 ] 

ASF GitHub Bot commented on KAFKA-3055:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/722


> JsonConverter mangles schema during serialization (fromConnectData)
> ---
>
> Key: KAFKA-3055
> URL: https://issues.apache.org/jira/browse/KAFKA-3055
> Project: Kafka
>  Issue Type: Bug
>  Components: copycat
>Affects Versions: 0.9.0.0
>Reporter: Kishore Senji
>Assignee: Ewen Cheslack-Postava
> Fix For: 0.9.0.1, 0.9.1.0
>
>
> Test case is here: 
> https://github.com/ksenji/kafka-connect-test/tree/master/src/test/java/org/apache/kafka/connect/json
> If caching is disabled, it behaves correctly and JsonConverterWithNoCacheTest 
> runs successfully. Otherwise, the test JsonConverterTest fails.
> The reason is that JsonConverter has a bug where it mangles the schema: it 
> assigns all String fields the same name (and similarly for all Int32 fields).
> This is how the schema & payload get serialized for the Person Struct (with 
> caching disabled):
> {code}
> {"schema":{"type":"struct","fields":[{"type":"string","optional":false,"field":"firstName"},{"type":"string","optional":false,"field":"lastName"},{"type":"string","optional":false,"field":"email"},{"type":"int32","optional":false,"field":"age"},{"type":"int32","optional":false,"field":"weightInKgs"}],"optional":false,"name":"Person"},"payload":{"firstName":"Eric","lastName":"Cartman","email":"eric.cart...@southpark.com","age":10,"weightInKgs":40}}
> {code}
> whereas with caching enabled the same Struct gets serialized as:
> {code}
> {"schema":{"type":"struct","fields":[{"type":"string","optional":false,"field":"email"},{"type":"string","optional":false,"field":"email"},{"type":"string","optional":false,"field":"email"},{"type":"int32","optional":false,"field":"weightInKgs"},{"type":"int32","optional":false,"field":"weightInKgs"}],"optional":false,"name":"Person"},"payload":{"firstName":"Eric","lastName":"Cartman","email":"eric.cart...@southpark.com","age":10,"weightInKgs":40}}
> {code}
> As we can see, all String fields became "email" and all Int32 fields became 
> "weightInKgs".
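
A minimal, self-contained sketch (an assumed setup, not the linked test case; 
field values are made up) that exercises the cached serialization path 
described above:

{code}
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import org.apache.kafka.connect.data.Schema;
import org.apache.kafka.connect.data.SchemaBuilder;
import org.apache.kafka.connect.data.Struct;
import org.apache.kafka.connect.json.JsonConverter;

public class PersonConverterRepro {
    public static void main(String[] args) {
        Schema schema = SchemaBuilder.struct().name("Person")
                .field("firstName", Schema.STRING_SCHEMA)
                .field("lastName", Schema.STRING_SCHEMA)
                .field("email", Schema.STRING_SCHEMA)
                .field("age", Schema.INT32_SCHEMA)
                .field("weightInKgs", Schema.INT32_SCHEMA)
                .build();

        Struct person = new Struct(schema)
                .put("firstName", "Eric")
                .put("lastName", "Cartman")
                .put("email", "eric.cartman@example.com") // illustrative value
                .put("age", 10)
                .put("weightInKgs", 40);

        JsonConverter converter = new JsonConverter();
        // Defaults apply: schemas enabled and the schema cache enabled.
        converter.configure(Collections.<String, Object>emptyMap(), false);

        // Serializing goes through the schema cache; per the report, the cached
        // schema ends up with duplicated field names.
        byte[] serialized = converter.fromConnectData("test-topic", schema, person);
        System.out.println(new String(serialized, StandardCharsets.UTF_8));
    }
}
{code}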



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3052) broker properties get logged twice if acl is enabled

2016-01-04 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081349#comment-15081349
 ] 

ASF GitHub Bot commented on KAFKA-3052:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/725


> broker properties get logged twice if acl is enabled
> 
>
> Key: KAFKA-3052
> URL: https://issues.apache.org/jira/browse/KAFKA-3052
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 0.9.0.0
>Reporter: Jun Rao
>Assignee: Ismael Juma
>  Labels: newbie, security
> Fix For: 0.9.0.1
>
>
> This is because in SimpleAclAuthorizer.configure(), there is the following 
> statement which triggers the logging of all broker properties.
> val kafkaConfig = KafkaConfig.fromProps(props)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3052) broker properties get logged twice if acl is enabled

2016-01-04 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-3052:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Issue resolved by pull request 725
[https://github.com/apache/kafka/pull/725]

> broker properties get logged twice if acl is enabled
> 
>
> Key: KAFKA-3052
> URL: https://issues.apache.org/jira/browse/KAFKA-3052
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 0.9.0.0
>Reporter: Jun Rao
>Assignee: Ismael Juma
>  Labels: newbie, security
> Fix For: 0.9.0.1
>
>
> This is because in SimpleAclAuthorizer.configure(), there is the following 
> statement which triggers the logging of all broker properties.
> val kafkaConfig = KafkaConfig.fromProps(props)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-3055: Fix the JsonConverter mangling the...

2016-01-04 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/722


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3040) Broker didn't report new data after change in leader

2016-01-04 Thread Mayuresh Gharat (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15081519#comment-15081519
 ] 

Mayuresh Gharat commented on KAFKA-3040:


At LinkedIn, we do have a separate controller log file on the broker that is 
the controller for the cluster. In that log, can you see something like "Broker 
HOST-NAME starting become controller state transition"?

> Broker didn't report new data after change in leader
> 
>
> Key: KAFKA-3040
> URL: https://issues.apache.org/jira/browse/KAFKA-3040
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.9.0.0
> Environment: Debian 3.2.54-2 x86_64 GNU/Linux
>Reporter: Imran Patel
>Priority: Critical
>
> Recently we had an event that caused large Kafka backlogs to develop 
> suddenly. This happened across multiple partitions. We noticed that after a 
> brief connection loss to Zookeeper, Kafka brokers were not reporting any new 
> data to our (SimpleConsumer) consumer although the producers were enqueueing 
> fine. This went on until another zk blip led to a reconfiguration, which 
> suddenly caused the consumers to "see" the data. Our consumers and our 
> monitoring tools did not see the offsets move during the outage window. Here 
> is the sequence of events for a single partition (with logs attached below). 
> The brokers are running 0.9, the producer is using library version 
> kafka_2.10:0.8.2.1 and the consumer is using kafka_2.10:0.8.0 (both are Java 
> programs). Our monitoring tool uses kafka-python-9.0
> Can you tell us if this could be due to a consumer bug (e.g., the libraries 
> being too "old" to operate with a 0.9 broker)? Or does it look like a Kafka 
> core issue? Please note that we recently upgraded the brokers to 0.9 and 
> hadn't seen a similar issue prior to that.
> - after a brief connection loss to zookeeper, the partition leader (broker 9 
> for partition 29 in the logs below) came back and shrank the ISR to itself. 
> - producers kept successfully sending data to Kafka and the remaining 
> replicas (brokers 3 and 4) recorded this data. AFAICT, 3 was the new leader. 
> Broker 9 did NOT replicate this data; it just kept printing the ISR 
> shrinking message over and over.
> - the consumer, on the other hand, reported no new data, presumably because 
> it was talking to 9 and that broker was doing nothing.
> - 6 hours later, another zookeeper blip caused the brokers to reconfigure and 
> the consumers started seeing new data.
> Broker 9:
> [2015-12-16 19:46:01,523] INFO Partition [messages,29] on broker 9: Expanding 
> ISR for partition [messages,29] from 9,4 to 9,4,3 (kafka.cluster.Partition
> [2015-12-18 00:59:25,511] INFO New leader is 9 
> (kafka.server.ZookeeperLeaderElector$LeaderChangeListener)
> [2015-12-18 01:00:18,451] INFO Partition [messages,29] on broker 9: Shrinking 
> ISR for partition [messages,29] from 9,4,3 to 9 (kafka.cluster.Partition)
> [2015-12-18 01:00:18,458] INFO Partition [messages,29] on broker 9: Cached 
> zkVersion [472] not equal to that in zookeeper, skip updating ISR 
> (kafka.cluster.Partition)
> [2015-12-18 07:04:44,552] INFO Truncating log messages-29 to offset 
> 14169556269. (kafka.log.Log)
> [2015-12-18 07:04:44,649] INFO [ReplicaFetcherManager on broker 9] Added 
> fetcher for partitions List([[messages,61], initOffset 14178575900 to broker 
> BrokerEndPoint(6,kafka006-prod.c.foo.internal,9092)] , [[messages,13], 
> initOffset 14156091271 to broker 
> BrokerEndPoint(2,kafka002-prod.c.foo.internal,9092)] , [[messages,45], 
> initOffset 14135826155 to broker 
> BrokerEndPoint(4,kafka004-prod.c.foo.internal,9092)] , [[messages,41], 
> initOffset 14157926400 to broker 
> BrokerEndPoint(1,kafka001-prod.c.foo.internal,9092)] , [[messages,29], 
> initOffset 14169556269 to broker 
> BrokerEndPoint(3,kafka003-prod.c.foo.internal,9092)] , [[messages,57], 
> initOffset 14175218230 to broker 
> BrokerEndPoint(1,kafka001-prod.c.foo.internal,9092)] ) 
> (kafka.server.ReplicaFetcherManager)
> Broker 3:
> [2015-12-18 01:00:01,763] INFO [ReplicaFetcherManager on broker 3] Removed 
> fetcher for partitions [messages,29] (kafka.server.ReplicaFetcherManager)
> [2015-12-18 07:09:04,631] INFO Partition [messages,29] on broker 3: Expanding 
> ISR for partition [messages,29] from 4,3 to 4,3,9 (kafka.cluster.Partition)
> [2015-12-18 07:09:49,693] INFO [ReplicaFetcherManager on broker 3] Removed 
> fetcher for partitions [messages,29] (kafka.server.ReplicaFetcherManager)
> Broker 4:
> [2015-12-18 01:00:01,783] INFO [ReplicaFetcherManager on broker 4] Removed 
> fetcher for partitions [messages,29] (kafka.server.ReplicaFetcherManager)
> [2015-12-18 01:00:01,866] INFO [ReplicaFetcherManager on broker 4] Added 
> fetcher for partitions