[jira] [Updated] (KAFKA-1153) typos in documentation

2013-12-02 Thread Neha Narkhede (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neha Narkhede updated KAFKA-1153:
-

Assignee: Joe Stein

 typos in documentation
 --

 Key: KAFKA-1153
 URL: https://issues.apache.org/jira/browse/KAFKA-1153
 Project: Kafka
  Issue Type: Bug
Reporter: Joe Stein
Assignee: Joe Stein
 Attachments: KAFKA-1153.patch


 Dan Hoffman hoffman...@gmail.com via kafka.apache.org 
 9:45 AM (1 hour ago)
 to users 
 *'Not that partitioning means Kafka only provides a total order over
 messages within a partition. This combined with the ability to partition
 data by key is sufficient for the vast majority of applications. However,
 if you require a total order over messages this can be achieved with a
 topic that has only one partition, though this will mean only one consumer
 process.'*
 The first word should say *NOTE*, right?  Otherwise, I don't understand the
 meaning.
 ...
 Marc Labbe via kafka.apache.org 
 12:57 PM (12 minutes ago)
 to users 
 while we're at it... I noticed the following typos in
 section 4.1 Motivation (
 http://kafka.apache.org/documentation.html#majordesignelements)
 we knew instead of we new
 
 Finally in cases where the stream is fed into other data systems for
 serving we new the system would have to be able to guarantee
 fault-tolerance in the presence of machine failures.
 
 led us instead of led use
 
 Supporting these uses led use to a design with a number of unique elements,
 more akin to a database log then a traditional messaging system. We will
 outline some elements of the design in the following sections.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Closed] (KAFKA-1153) typos in documentation

2013-12-02 Thread Neha Narkhede (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neha Narkhede closed KAFKA-1153.



 typos in documentation
 --

 Key: KAFKA-1153
 URL: https://issues.apache.org/jira/browse/KAFKA-1153
 Project: Kafka
  Issue Type: Bug
Reporter: Joe Stein
Assignee: Joe Stein
 Attachments: KAFKA-1153.patch


 Dan Hoffman hoffman...@gmail.com via kafka.apache.org 
 9:45 AM (1 hour ago)
 to users 
 *'Not that partitioning means Kafka only provides a total order over
 messages within a partition. This combined with the ability to partition
 data by key is sufficient for the vast majority of applications. However,
 if you require a total order over messages this can be achieved with a
 topic that has only one partition, though this will mean only one consumer
 process.'*
 The first word should say *NOTE*, right?  Otherwise, I don't understand the
 meaning.
 ...
 Marc Labbe via kafka.apache.org 
 12:57 PM (12 minutes ago)
 to users 
 while we're at it... I noticed the following typos in
 section 4.1 Motivation (
 http://kafka.apache.org/documentation.html#majordesignelements)
 we knew instead of we new
 
 Finally in cases where the stream is fed into other data systems for
 serving we new the system would have to be able to guarantee
 fault-tolerance in the presence of machine failures.
 
 led us instead of led use
 
 Supporting these uses led use to a design with a number of unique elements,
 more akin to a database log then a traditional messaging system. We will
 outline some elements of the design in the following sections.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (KAFKA-1151) The Hadoop consumer API doc is not referencing the contrib consumer

2013-12-02 Thread Joe Stein (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe Stein updated KAFKA-1151:
-

Status: Patch Available  (was: Open)

 The Hadoop consumer API doc is not referencing the contrib consumer
 ---

 Key: KAFKA-1151
 URL: https://issues.apache.org/jira/browse/KAFKA-1151
 Project: Kafka
  Issue Type: Bug
Reporter: Joe Stein
 Fix For: 0.8.1

 Attachments: KAFKA-1151.patch


 http://kafka.apache.org/documentation.html#kafkahadoopconsumerapi
 it is pointing to https://github.com/linkedin/camus/tree/camus-kafka-0.8/
 if we are still supporting the contrib/hadoop-consumer then we should point 
 to the read me (maybe this link instead 
 https://github.com/apache/kafka/tree/0.8/contrib/hadoop-consumer)
 thoughts?



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (KAFKA-1151) The Hadoop consumer API doc is not referencing the contrib consumer

2013-12-02 Thread Joe Stein (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe Stein updated KAFKA-1151:
-

Attachment: KAFKA-1151.patch

I took a stab at what I think would preserve the spirit of the Apache contrib 
code and the progress that Camus has also brought.  Not sure if the Hadoop 
consumer can benefit from some of the Camus upstream changes (without having 
to take on and require Avro), but that's probably a discussion for the list or another 
JIRA; I figured I'd start to touch/ask about that here (or a sub-project or 
something... dunno).

 The Hadoop consumer API doc is not referencing the contrib consumer
 ---

 Key: KAFKA-1151
 URL: https://issues.apache.org/jira/browse/KAFKA-1151
 Project: Kafka
  Issue Type: Bug
Reporter: Joe Stein
 Fix For: 0.8.1

 Attachments: KAFKA-1151.patch


 http://kafka.apache.org/documentation.html#kafkahadoopconsumerapi
 it is pointing to https://github.com/linkedin/camus/tree/camus-kafka-0.8/
 if we are still supporting the contrib/hadoop-consumer then we should point 
 to the read me (maybe this link instead 
 https://github.com/apache/kafka/tree/0.8/contrib/hadoop-consumer)
 thoughts?



--
This message was sent by Atlassian JIRA
(v6.1#6144)


Re: [VOTE] Apache Kafka Release 0.8.0 - Candidate 5

2013-12-02 Thread David Arthur
Seems like most people are verifying the src, so I'll pick on the 
binaries and Maven stuff ;)


A few problems I see:

There are some vestigial Git files in the src download: an empty .git 
and .gitignore


In the source download, I see the SBT license in LICENSE which seems 
correct (since we distribute an SBT binary), but in the binary download 
I see the same license. Don't we need the Scala license 
(http://www.scala-lang.org/license.html) in the binary distribution?


I created a simple Ant+Ivy project to test resolving the artifacts 
published to the Apache staging repo: https://github.com/mumrah/kafka-ivy. 
It will fetch the Kafka libs from the Apache staging area and other things 
from Maven Central, put the jars into lib/ivy/{conf}, and 
generate a report of the dependencies, conflicts, and licenses into 
ivy-report. Notice I had to add three exclusions to get things working. 
Maybe we should add these to our pom?
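
For illustration only (the exact three exclusions aren't listed here, and the 
artifact names below are just a guess at the usual offenders when resolving 
Kafka 0.8, expressed in sbt rather than Ivy):

// Hedged illustration; not taken from the kafka-ivy project itself.
libraryDependencies += ("org.apache.kafka" % "kafka_2.10" % "0.8.0")
  .exclude("com.sun.jdmk", "jmxtools")
  .exclude("com.sun.jmx", "jmxri")
  .exclude("javax.jms", "jms")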


I think I'll have to -1 the release due to the missing Scala license in 
the binary dist. We should check the other licenses as well (see 
ivy-report from my little Ant project).


-David

On 11/26/13 5:34 PM, Joe Stein wrote:

This is the fifth candidate for release of Apache Kafka 0.8.0.   This
release candidate is now built from JDK 6 as RC4 was built with JDK 7.

Release Notes for the 0.8.0 release
http://people.apache.org/~joestein/kafka-0.8.0-candidate5/RELEASE_NOTES.html

*** Please download, test and vote by Monday December, 2nd, 12pm PDT

Kafka's KEYS file containing PGP keys we use to sign the release:
http://svn.apache.org/repos/asf/kafka/KEYS in addition to the md5 and sha1
checksum

* Release artifacts to be voted upon (source and binary):
http://people.apache.org/~joestein/kafka-0.8.0-candidate5/

* Maven artifacts to be voted upon prior to release:
https://repository.apache.org/content/groups/staging/

(i.e. in sbt land this can be added to the build.sbt to use Kafka
resolvers += "Apache Staging" at "https://repository.apache.org/content/groups/staging/"
libraryDependencies += "org.apache.kafka" % "kafka_2.10" % "0.8.0"
)

* The tag to be voted upon (off the 0.8 branch) is the 0.8.0 tag
https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=2c20a71a010659e25af075a024cbd692c87d4c89

/***
  Joe Stein
  Founder, Principal Consultant
  Big Data Open Source Security LLC
  http://www.stealth.ly
  Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop
/





[jira] [Commented] (KAFKA-1154) replicas may not have consistent data after becoming follower

2013-12-02 Thread Neha Narkhede (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13836683#comment-13836683
 ] 

Neha Narkhede commented on KAFKA-1154:
--

We used to do this; it looks like this was a regression introduced in KAFKA-1001. 

1. ReplicaManager
Do we still need this TODO: "the above may need to be fixed later"?

2. We had added the ability for a special consumer to read the replica log for 
troubleshooting. This patch takes that convenience away. We should probably 
look for another way to prevent the replica verification tool from giving false 
negatives. Can it use a different consumer id?


 replicas may not have consistent data after becoming follower
 -

 Key: KAFKA-1154
 URL: https://issues.apache.org/jira/browse/KAFKA-1154
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.8.1
Reporter: Jun Rao
Assignee: Jun Rao
 Fix For: 0.8.1

 Attachments: KAFKA-1154.patch


 This is an issue introduced in KAFKA-1001. The issue is that in 
 ReplicaManager.makeFollowers(), we truncate the log before marking the 
 replica as the follower. New messages from the producer can still be added to 
 the log after the log is truncated, but before the replica is marked as the 
 follower. Those newly produced messages can actually be committed, which 
 implies those truncated messages are also committed. However, the new leader 
 is not guaranteed to have those truncated messages.
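
As a rough sketch of the fix implied above (method names here are placeholders, 
not the actual patch), the ordering simply needs to be reversed so the replica 
stops accepting leader-side appends before its log is truncated:

// Sketch only; markAsFollower/addFetcher are placeholder names, not the real ReplicaManager API.
def makeFollower(partition: Partition, newLeaderId: Int, highWatermark: Long) {
  partition.markAsFollower(newLeaderId)                 // 1. reject further appends for this partition
  partition.log.foreach(_.truncateTo(highWatermark))    // 2. only now is it safe to truncate
  replicaFetcherManager.addFetcher(partition, newLeaderId, highWatermark) // 3. start fetching from the new leader
}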



--
This message was sent by Atlassian JIRA
(v6.1#6144)


Re: Review Request 15938: replicas may not have consistent data after becoming follower

2013-12-02 Thread Neha Narkhede

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15938/#review29586
---



core/src/main/scala/kafka/server/KafkaApis.scala
https://reviews.apache.org/r/15938/#comment56981

We had added the ability for a special consumer to read the replica log for 
troubleshooting. This patch takes that convenience away. We should probably 
look for another way to prevent the replica verification tool from giving false 
negatives. Can it use a different consumer id?



core/src/main/scala/kafka/server/ReplicaManager.scala
https://reviews.apache.org/r/15938/#comment56980

Do we still need this TODO: "the above may need to be fixed later"?


- Neha Narkhede


On Dec. 1, 2013, 11:33 p.m., Jun Rao wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/15938/
 ---
 
 (Updated Dec. 1, 2013, 11:33 p.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1154
 https://issues.apache.org/jira/browse/KAFKA-1154
 
 
 Repository: kafka
 
 
 Description
 ---
 
 kafka-1154; fix 1
 
 
 Diffs
 -
 
   core/src/main/scala/kafka/api/FetchRequest.scala 
 fb2a2306003ac64a8a3b2fc5fc999e0be273f48d 
   core/src/main/scala/kafka/api/RequestOrResponse.scala 
 b62330be6241c8ff4acd21f0fa7e80b7636e0d42 
   core/src/main/scala/kafka/server/KafkaApis.scala 
 80a70f1e5e3a7670b2238fe63b8d9e0eac6b46ac 
   core/src/main/scala/kafka/server/ReplicaManager.scala 
 54f6e1674255f62eba9d90aab0db371c82baf749 
   core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala 
 f1f139e485d98e42be17cdcc327961420cd8c012 
 
 Diff: https://reviews.apache.org/r/15938/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jun Rao
 




Re: [VOTE] Apache Kafka Release 0.8.0 - Candidate 5

2013-12-02 Thread Kostya Golikov
Talking about the binary release, do we really need to include bin/run-rat.sh?
As far as I understand, it is only used to check licenses and is quite
redundant for an already-baked release.

Next, I'm not quite sure, but it probably makes sense to drop
libs/scala-compiler.jar -- Kafka does not perform compilation at runtime,
and this step will trim some fat from the resulting release (from 17 MB
down to 9.5 MB*).

I managed to satisfy Maven with only two exclusions, but yes, it would be
good to see them in the original pom.

* By the way, using the best possible compression method (-9 instead of the
default -6) plus dropping the compiler lib gave me the very same result -- 9.5 MB


2013/12/2 David Arthur mum...@gmail.com

 Seems like most people are verifying the src, so I'll pick on the binaries
 and Maven stuff ;)

 A few problems I see:

 There are some vestigial Git files in the src download: an empty .git and
 .gitignore

 In the source download, I see the SBT license in LICENSE which seems
 correct (since we distribute an SBT binary), but in the binary download I
 see the same license. Don't we need the Scala license (
 http://www.scala-lang.org/license.html) in the binary distribution?

 I create a simple Ant+Ivy project to test resolving the artifacts
 published to Apache staging repo: https://github.com/mumrah/kafka-ivy.
 This will fetch Kafka libs from the Apache staging area and other things
 from Maven Central. It will fetch the jars into lib/ivy/{conf} and generate
 a report of the dependencies, conflicts, and licenses into ivy-report.
 Notice I had to add three exclusions to get things working. Maybe we should
 add these to our pom?

 I think I'll have to -1 the release due to the missing Scala license in
 the binary dist. We should check the other licenses as well (see ivy-report
 from my little Ant project).

 -David

 On 11/26/13 5:34 PM, Joe Stein wrote:

 This is the fifth candidate for release of Apache Kafka 0.8.0.   This
 release candidate is now built from JDK 6 as RC4 was built with JDK 7.

 Release Notes for the 0.8.0 release
 http://people.apache.org/~joestein/kafka-0.8.0-
 candidate5/RELEASE_NOTES.html

 *** Please download, test and vote by Monday December, 2nd, 12pm PDT

 Kafka's KEYS file containing PGP keys we use to sign the release:
 http://svn.apache.org/repos/asf/kafka/KEYS in addition to the md5 and
 sha1
 checksum

 * Release artifacts to be voted upon (source and binary):
 http://people.apache.org/~joestein/kafka-0.8.0-candidate5/

 * Maven artifacts to be voted upon prior to release:
 https://repository.apache.org/content/groups/staging/

 (i.e. in sbt land this can be added to the build.sbt to use Kafka
 resolvers += "Apache Staging" at "https://repository.apache.org/content/groups/staging/"
 libraryDependencies += "org.apache.kafka" % "kafka_2.10" % "0.8.0"
 )

 * The tag to be voted upon (off the 0.8 branch) is the 0.8.0 tag
 https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=
 2c20a71a010659e25af075a024cbd692c87d4c89

 /***
   Joe Stein
   Founder, Principal Consultant
   Big Data Open Source Security LLC
   http://www.stealth.ly
   Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop
 /





Re: Review Request 15901: Patch for KAFKA-1152

2013-12-02 Thread Neha Narkhede

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15901/#review29588
---



core/src/main/scala/kafka/server/ReplicaManager.scala
https://reviews.apache.org/r/15901/#comment56984

the check should probably be leaderId >= 0. The leaders field in the 
LeaderAndIsrRequest is misleading, cannot be trusted, and needs to be deprecated.



core/src/main/scala/kafka/server/ReplicaManager.scala
https://reviews.apache.org/r/15901/#comment56983

this format statement is broken. We need parentheses surrounding the 
entire trace statement.
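
For context, the gotcha is that without parentheses .format binds only to the 
last string literal in a concatenation. A made-up example (not the actual trace 
line from this patch):

// Broken: .format applies only to "for partition %s", so "%d" is left as-is
// and brokerId (not partition) is substituted for "%s".
trace("Broker %d became follower " + "for partition %s".format(brokerId, partition))

// Fixed: parentheses make .format see the whole concatenated message.
trace(("Broker %d became follower " + "for partition %s").format(brokerId, partition))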


- Neha Narkhede


On Nov. 29, 2013, 6:41 a.m., Swapnil Ghike wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/15901/
 ---
 
 (Updated Nov. 29, 2013, 6:41 a.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1152
 https://issues.apache.org/jira/browse/KAFKA-1152
 
 
 Repository: kafka
 
 
 Description
 ---
 
 ReplicaManager's handling of the leaderAndIsrRequest should gracefully handle 
 leader == -1
 
 
 ReplicaManager's handling of the leaderAndIsrRequest should gracefully handle 
 leader == -1
 
 
 Diffs
 -
 
   core/src/main/scala/kafka/server/ReplicaManager.scala 
 161f58134f20f9335dbd2bee6ac3f71897cbef7c 
 
 Diff: https://reviews.apache.org/r/15901/diff/
 
 
 Testing
 ---
 
 Builds with all scala versions; unit tests pass
 
 
 Thanks,
 
 Swapnil Ghike
 




[jira] [Commented] (KAFKA-1036) Unable to rename replication offset checkpoint in windows

2013-12-02 Thread Neha Narkhede (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13836698#comment-13836698
 ] 

Neha Narkhede commented on KAFKA-1036:
--

[~jantxu] Are you sure this is required? If we always delete the destination 
file and then execute renameTo, it should work in all cases, no? [~sriramsub] 
What do you think?

 Unable to rename replication offset checkpoint in windows
 -

 Key: KAFKA-1036
 URL: https://issues.apache.org/jira/browse/KAFKA-1036
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8.1
 Environment: windows
Reporter: Timothy Chen
Priority: Critical
  Labels: windows
 Fix For: 0.8.1

 Attachments: filelock.patch.diff


 Although there was a fix for checkpoint file renaming on Windows that tries 
 to delete the existing checkpoint file if the rename failed, I'm still seeing 
 renaming errors on Windows even though the destination file doesn't exist.
 A bit of investigation shows that the file couldn't be renamed because the 
 Kafka JVM still holds a file lock on the tmp file.
 Attaching a patch that calls an explicit writer.close so it can release the 
 lock and the rename can succeed.
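
A minimal sketch of the pattern being described, assuming the checkpoint file 
names (this is not the attached patch itself):

import java.io.{File, FileWriter}

// Close the writer (releasing the JVM's lock) before deleting the old file and
// renaming the temp file into place; File.renameTo on Windows fails otherwise.
def writeCheckpoint(dir: File, contents: String) {
  val temp   = new File(dir, "replication-offset-checkpoint.tmp")
  val target = new File(dir, "replication-offset-checkpoint")
  val writer = new FileWriter(temp)
  try {
    writer.write(contents)
    writer.flush()
  } finally {
    writer.close()                        // release the lock first
  }
  if (target.exists()) target.delete()    // renameTo will not overwrite on Windows
  if (!temp.renameTo(target))
    throw new java.io.IOException("failed to rename " + temp + " to " + target)
}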



--
This message was sent by Atlassian JIRA
(v6.1#6144)


Re: [VOTE] Apache Kafka Release 0.8.0 - Candidate 5

2013-12-02 Thread Neha Narkhede
I think we should maintain a wiki describing the release process in detail,
so we save the turnaround time on a release. We can have a VOTE thread to
agree on the release guidelines and follow it. Having  said that, it is
worth having the correct .pom file at the very least, since the release is
not very useful if people cannot consume it without pain.

Thanks,
Neha


On Mon, Dec 2, 2013 at 8:59 AM, Joe Stein joe.st...@stealth.ly wrote:

 General future thought comment first: lets be careful please to raising
 issues as show stoppers that have been there previously (especially if
 greater than one version previous release back also has the problem) and
 can get fixed in a subsequent release and is only now more pressing because
 we know about them... seeing something should not necessarily always create
 priority (sometimes sure, of course but not always that is not the best way
 to manage changes).  The VOTE thread should be to artifacts and what we are
 releasing as proper and correct per Apache guidelines... and to make sure
 that the person doing the release doesn't do something incorrect ... like
 using the wrong version of JDK to build =8^/.  If we are not happy with
 release as ready to ship then lets not call a VOTE and save the prolonged
 weeks that drag out with so many release candidates.  The community suffers
 from this.

 ok, now on to RC5 ...lets extend the vote until 12pm PT tomorrow ...
 hopefully a few more hours for other folks to comment and discuss the
 issues you raised with my $0.02852425 included below and follow-ups as they
 become necessary... I am also out of pocket in a few hours until tomorrow
 morning so if it passed I would not be able to publish and announce or if
 failed look towards RC6 anyways =8^)

 /***
  Joe Stein
  Founder, Principal Consultant
  Big Data Open Source Security LLC
  http://www.stealth.ly
  Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop
 /


 On Mon, Dec 2, 2013 at 11:00 AM, David Arthur mum...@gmail.com wrote:

  Seems like most people are verifying the src, so I'll pick on the
 binaries
  and Maven stuff ;)
 
  A few problems I see:
 
  There are some vestigial Git files in the src download: an empty .git and
  .gitignore
 

 Ok, I can do a better job with 0.8.1 but I am not sure this is very
 different than beta1 and not necessarily a show stopper for 0.8.0 requiring
 another release candidate, is it?  I think updating the release docs and
 rmdir .git after the rm -fr and rm .gitignore moving forward makes sense.


 
  In the source download, I see the SBT license in LICENSE which seems
  correct (since we distribute an SBT binary), but in the binary download I
  see the same license. Don't we need the Scala license (
  http://www.scala-lang.org/license.html) in the binary distribution?
 

 I fixed this already not only in the binary release
 https://issues.apache.org/jira/browse/KAFKA-1131 but also in the JAR files
 that are published to Maven
 https://issues.apache.org/jira/browse/KAFKA-1133are you checking from
 http://people.apache.org/~joestein/kafka-0.8.0-candidate5/ because I just
 downloaded again and it looks alright to me.  If not then definitely this
 RC should be shot down because it does not do what we are saying it is
 doing.. but if it is wrong can you be more specific and create a JIRA with
 the fix because I thought I got it right already... but if not then lets
 get it right because that is why we pulled the release in RC3


 
  I create a simple Ant+Ivy project to test resolving the artifacts
  published to Apache staging repo: https://github.com/mumrah/kafka-ivy.
  This will fetch Kafka libs from the Apache staging area and other things
  from Maven Central. It will fetch the jars into lib/ivy/{conf} and
 generate
  a report of the dependencies, conflicts, and licenses into ivy-report.
  Notice I had to add three exclusions to get things working. Maybe we
 should
  add these to our pom?
 

 I don't think this is a showstopper is it?  can't this wait for 0.8.1 and
 not hold up the 0.8.0 release?

 I didn't have this issue with java maven pom or scala sbt so maybe
 something more ivy ant specific causing this?  folks use gradle too so I
 expect some feedback at some point to that working or not perhaps in 0.8.1
 or even 0.9 we can try to cover every way everyone uses and make sure they
 are all good to go moving forward... perhaps even some vagrant, docker,
 puppet and chef love too (which I can contribute if folks are interested)
 =8^)

 In any case can you create a JIRA and throw a patch up on it please,
 thanks! IMHO this is for 0.8.1 though ... what are thoughts here...


 
  I think I'll have to -1 the release due to the missing Scala license in
  the binary dist. We should check the other licenses as well (see
 ivy-report
  from my little Ant project).
 

 it would break my heart to have lots of binding +1 votes and 2 

Re: [VOTE] Apache Kafka Release 0.8.0 - Candidate 5

2013-12-02 Thread Joe Stein
General future-thought comment first: let's please be careful about raising
issues as show stoppers when they have been there previously (especially if
more than one prior release also has the problem), can get fixed in a subsequent
release, and are only now more pressing because we know about them... seeing
something should not necessarily always create priority (sometimes, sure, of
course, but not always; that is not the best way to manage changes).  The VOTE
thread should be about the artifacts and whether what we are releasing is proper
and correct per Apache guidelines... and to make sure that the person doing the
release doesn't do something incorrect... like using the wrong version of JDK to
build =8^/.  If we are not happy with the release as ready to ship then let's not
call a VOTE, and save the prolonged weeks that drag out with so many release
candidates.  The community suffers from this.

ok, now on to RC5 ...lets extend the vote until 12pm PT tomorrow ...
hopefully a few more hours for other folks to comment and discuss the
issues you raised with my $0.02852425 included below and follow-ups as they
become necessary... I am also out of pocket in a few hours until tomorrow
morning so if it passed I would not be able to publish and announce or if
failed look towards RC6 anyways =8^)

/***
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop
/


On Mon, Dec 2, 2013 at 11:00 AM, David Arthur mum...@gmail.com wrote:

 Seems like most people are verifying the src, so I'll pick on the binaries
 and Maven stuff ;)

 A few problems I see:

 There are some vestigial Git files in the src download: an empty .git and
 .gitignore


Ok, I can do a better job with 0.8.1 but I am not sure this is very
different than beta1 and not necessarily a show stopper for 0.8.0 requiring
another release candidate, is it?  I think updating the release docs and
rmdir .git after the rm -fr and rm .gitignore moving forward makes sense.



 In the source download, I see the SBT license in LICENSE which seems
 correct (since we distribute an SBT binary), but in the binary download I
 see the same license. Don't we need the Scala license (
 http://www.scala-lang.org/license.html) in the binary distribution?


I fixed this already not only in the binary release
https://issues.apache.org/jira/browse/KAFKA-1131 but also in the JAR files
that are published to Maven
https://issues.apache.org/jira/browse/KAFKA-1133. Are you checking from
http://people.apache.org/~joestein/kafka-0.8.0-candidate5/? Because I just
downloaded it again and it looks alright to me.  If not then definitely this
RC should be shot down because it does not do what we are saying it is
doing.. but if it is wrong can you be more specific and create a JIRA with
the fix because I thought I got it right already... but if not then lets
get it right because that is why we pulled the release in RC3



 I create a simple Ant+Ivy project to test resolving the artifacts
 published to Apache staging repo: https://github.com/mumrah/kafka-ivy.
 This will fetch Kafka libs from the Apache staging area and other things
 from Maven Central. It will fetch the jars into lib/ivy/{conf} and generate
 a report of the dependencies, conflicts, and licenses into ivy-report.
 Notice I had to add three exclusions to get things working. Maybe we should
 add these to our pom?


I don't think this is a showstopper is it?  can't this wait for 0.8.1 and
not hold up the 0.8.0 release?

I didn't have this issue with java maven pom or scala sbt so maybe
something more ivy ant specific causing this?  folks use gradle too so I
expect some feedback at some point to that working or not perhaps in 0.8.1
or even 0.9 we can try to cover every way everyone uses and make sure they
are all good to go moving forward... perhaps even some vagrant, docker,
puppet and chef love too (which I can contribute if folks are interested)
=8^)

In any case can you create a JIRA and throw a patch up on it please,
thanks! IMHO this is for 0.8.1 though ... what are thoughts here...



 I think I'll have to -1 the release due to the missing Scala license in
 the binary dist. We should check the other licenses as well (see ivy-report
 from my little Ant project).


it would break my heart to have lots of binding +1 votes and 2 non-binding
votes one +1 and one -1, I still haven't cast my vote yet was hoping
everyone would get their voices and everything in before calling the VOTE
closed or canceled.  I really don't mind preparing a release candidate 6
that is not the issue at all but I think we need to be thoughtful about
using the release candidates to fix things that should be fixed and part
of the releases themselves where the release candidates are to make sure
that the preparation of the build is not wrong (like it was in RC4 where I
used JDK 7 

Re: Review Request 15659: Incorporate Joel/Jun's comments, MM system test passed, rebased

2013-12-02 Thread Neha Narkhede

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15659/#review29593
---



core/src/main/scala/kafka/consumer/ZookeeperTopicEventWatcher.scala
https://reviews.apache.org/r/15659/#comment56989

Can we improve this WARN? It implies that we will not shutdown the client, 
but we proceed with the shutdown anyways :)


- Neha Narkhede


On Nov. 21, 2013, 7:22 p.m., Guozhang Wang wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/15659/
 ---
 
 (Updated Nov. 21, 2013, 7:22 p.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1103
 https://issues.apache.org/jira/browse/KAFKA-1103
 
 
 Repository: kafka
 
 
 Description
 ---
 
 KAFKA-1103.v2
 
 
 Dummy
 
 
 KAFKA-1103.v1
 
 
 Diffs
 -
 
   core/src/main/scala/kafka/consumer/TopicFilter.scala 
 cf3853b223095e1fe0921175c407a906828b8113 
   core/src/main/scala/kafka/consumer/ZookeeperConsumerConnector.scala 
 6d0cfa665e90a168a70501a81f10fa4d3c7a7f22 
   core/src/main/scala/kafka/consumer/ZookeeperTopicEventWatcher.scala 
 a67c193df9f7cbfc52f75dc1b71dc017de1b5fe2 
   core/src/test/scala/unit/kafka/consumer/TopicFilterTest.scala 
 40a2bf7a9277eb5f94bc07b40d7726d81860cefc 
   system_test/migration_tool_testsuite/0.7/config/test-log4j.properties 
 a3ae33f20e4b7cff87d8cf8368d0639b8bea73a6 
 
 Diff: https://reviews.apache.org/r/15659/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Guozhang Wang
 




[jira] [Updated] (KAFKA-1133) LICENSE and NOTICE files need to get into META-INF when jars are built before they're signed for publishing to maven

2013-12-02 Thread Neha Narkhede (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neha Narkhede updated KAFKA-1133:
-

Assignee: Joe Stein

 LICENSE and NOTICE files need to get into  META-INF when jars are built 
 before they're signed for publishing to maven
 -

 Key: KAFKA-1133
 URL: https://issues.apache.org/jira/browse/KAFKA-1133
 Project: Kafka
  Issue Type: Bug
Reporter: Joe Stein
Assignee: Joe Stein
 Fix For: 0.8, 0.8.1

 Attachments: KAFKA-1133.patch


 This needs to happen in our Build.scala. The sbt packaging docs 
 (http://www.scala-sbt.org/release/docs/Howto/package.html) suggest it is probably a 
 straightforward line of code or ten to do this.
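
 A hedged guess at what that line of code or ten could look like (modern-style sbt 
 mappings; whether Build.scala is wired this way is an assumption, not the actual 
 KAFKA-1133 patch):

 // Copy LICENSE and NOTICE from the project root into META-INF of the packaged jar.
 mappings in (Compile, packageBin) ++= Seq(
   (baseDirectory.value / "LICENSE") -> "META-INF/LICENSE",
   (baseDirectory.value / "NOTICE")  -> "META-INF/NOTICE"
 )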



--
This message was sent by Atlassian JIRA
(v6.1#6144)


Re: Review Request 15901: Patch for KAFKA-1152

2013-12-02 Thread Swapnil Ghike


 On Dec. 2, 2013, 5:06 p.m., Neha Narkhede wrote:
  core/src/main/scala/kafka/server/ReplicaManager.scala, line 358
  https://reviews.apache.org/r/15901/diff/3/?file=392523#file392523line358
 
  the check should probably be leaderId >= 0. The leaders field in the 
  LeaderAndIsrRequest is misleading, cannot be trusted, and needs to be 
  deprecated.

On the controller, leaders exclude shutdown brokers. 

val leaders = controllerContext.liveOrShuttingDownBrokers.filter(b => 
leaderIds.contains(b.id))

On the broker, should we not check whether the leader that it is being asked to 
follow is alive or not?


- Swapnil


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15901/#review29588
---


On Nov. 29, 2013, 6:41 a.m., Swapnil Ghike wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/15901/
 ---
 
 (Updated Nov. 29, 2013, 6:41 a.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1152
 https://issues.apache.org/jira/browse/KAFKA-1152
 
 
 Repository: kafka
 
 
 Description
 ---
 
 ReplicaManager's handling of the leaderAndIsrRequest should gracefully handle 
 leader == -1
 
 
 ReplicaManager's handling of the leaderAndIsrRequest should gracefully handle 
 leader == -1
 
 
 Diffs
 -
 
   core/src/main/scala/kafka/server/ReplicaManager.scala 
 161f58134f20f9335dbd2bee6ac3f71897cbef7c 
 
 Diff: https://reviews.apache.org/r/15901/diff/
 
 
 Testing
 ---
 
 Builds with all scala versions; unit tests pass
 
 
 Thanks,
 
 Swapnil Ghike
 




Re: [VOTE] Apache Kafka Release 0.8.0 - Candidate 5

2013-12-02 Thread Joe Stein
Neha, as far as the release process is this what you had in mind
https://cwiki.apache.org/confluence/display/KAFKA/Release+Process or
different content or more of something or such?

Per the POM, I was able to use the artifacts from the Maven repository
without having to do anything more than just specifying the artifacts with
sbt.

resolvers += "Apache Staging" at "https://repository.apache.org/content/groups/staging/"

libraryDependencies ++= Seq(
  ...,
  "org.apache.kafka" % "kafka_2.10" % "0.8.0"
)

and on the pure maven side
<repositories>
  <repository>
    <id>ApacheStaging</id>
    <url>https://repository.apache.org/content/groups/staging/</url>
  </repository>
...
<dependency>
  <groupId>org.apache.kafka</groupId>
  <artifactId>kafka_2.9.2</artifactId>
  <version>0.8.0</version>
  <exclusions>
    <exclusion>
      <groupId>log4j</groupId>
      <artifactId>log4j</artifactId>
    </exclusion>
  </exclusions>
</dependency>

which very closely mirrors what David was talking about with ivy as well...
I didn't really think much of it, just a matter of XML we can document
(there is actually no documentation about using Maven on the site at all; we
should correct that in any case, TBD post release), but if folks find it to
be a pain then we should definitely fix it for sure.  Off the top of my
head I don't see how to do that in the Build.scala, but I really don't
expect it to be too difficult to figure out... the question is do we hold
it off for 0.8.1 since technically nothing is breaking (like the null
pointer exceptions we had for the bonked pom in beta1 that I shipped to
maven central).

Before canceling the vote can we at least get consensus to what we are
canceling and exactly what fixes should be in RC6 or ... agree to ship RC5
and hold whatever is left for 0.8.1

I am totally fine with working on RC6 (actually just cancelled my plans for
the evening because of a whole slew of client work that hit my plate) but I
want to make sure we have everything covered that everyone that is voting
expects to be in there.

David, a few items below don't make sense; I sent another email on the
thread in regards to the LICENSE


/***
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop
/


On Mon, Dec 2, 2013 at 12:19 PM, Neha Narkhede neha.narkh...@gmail.comwrote:

 I think we should maintain a wiki describing the release process in detail,
 so we save the turnaround time on a release. We can have a VOTE thread to
 agree on the release guidelines and follow it. Having  said that, it is
 worth having the correct .pom file at the very least, since the release is
 not very useful if people cannot consume it without pain.

 Thanks,
 Neha


 On Mon, Dec 2, 2013 at 8:59 AM, Joe Stein joe.st...@stealth.ly wrote:

  General future thought comment first: lets be careful please to raising
  issues as show stoppers that have been there previously (especially if
  greater than one version previous release back also has the problem) and
  can get fixed in a subsequent release and is only now more pressing
 because
  we know about them... seeing something should not necessarily always
 create
  priority (sometimes sure, of course but not always that is not the best
 way
  to manage changes).  The VOTE thread should be to artifacts and what we
 are
  releasing as proper and correct per Apache guidelines... and to make sure
  that the person doing the release doesn't do something incorrect ... like
  using the wrong version of JDK to build =8^/.  If we are not happy with
  release as ready to ship then lets not call a VOTE and save the prolonged
  weeks that drag out with so many release candidates.  The community
 suffers
  from this.
 
  ok, now on to RC5 ...lets extend the vote until 12pm PT tomorrow ...
  hopefully a few more hours for other folks to comment and discuss the
  issues you raised with my $0.02852425 included below and follow-ups as
 they
  become necessary... I am also out of pocket in a few hours until tomorrow
  morning so if it passed I would not be able to publish and announce or if
  failed look towards RC6 anyways =8^)
 
  /***
   Joe Stein
   Founder, Principal Consultant
   Big Data Open Source Security LLC
   http://www.stealth.ly
   Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop
  /
 
 
  On Mon, Dec 2, 2013 at 11:00 AM, David Arthur mum...@gmail.com wrote:
 
   Seems like most people are verifying the src, so I'll pick on the
  binaries
   and Maven stuff ;)
  
   A few problems I see:
  
   There are some vestigial Git files in the src download: an empty .git
 and
   .gitignore
  
 
  Ok, I can do a 

Re: Review Request 15711: Patch for KAFKA-930

2013-12-02 Thread Neha Narkhede

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15711/#review29597
---



core/src/main/scala/kafka/controller/KafkaController.scala
https://reviews.apache.org/r/15711/#comment56993

instead of hardcoding this to 5 seconds, how about delaying it by 
leaderImbalanceCheckIntervalSeconds?



core/src/main/scala/kafka/controller/KafkaController.scala
https://reviews.apache.org/r/15711/#comment57002

this API is now a little awkward due to the updateZK parameter. Do we 
really need it? Another way is for the partition-rebalance-thread to always 
ensure creating the path and let this API delete it. This will keep the API 
clean.



core/src/main/scala/kafka/controller/KafkaController.scala
https://reviews.apache.org/r/15711/#comment57001

it seems we only need the preferred replica per partition, not the entire 
set of replicas right? In that case, we can simplify 
preferredReplicasForTopicsByBrokers to Map[Int, Map[TopicAndPartition, Int]] 
and call it preferredReplicaForPartitionsByBrokers
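
A small sketch of the suggested shape (the assignment source used here is an 
assumption for illustration):

// Preferred replica = first entry in the assigned replica list, grouped by that broker.
val preferredReplicaForPartitionsByBrokers: Map[Int, Map[TopicAndPartition, Int]] =
  controllerContext.partitionReplicaAssignment.toMap
    .map { case (topicAndPartition, replicas) => topicAndPartition -> replicas.head }
    .groupBy { case (_, preferredReplica) => preferredReplica }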



core/src/main/scala/kafka/controller/KafkaController.scala
https://reviews.apache.org/r/15711/#comment56995

It seems we don't need the brokerIds variable since it is never reused 
beyond the check in the if statement



core/src/main/scala/kafka/controller/KafkaController.scala
https://reviews.apache.org/r/15711/#comment56999

we also don't have semicolons as a coding convention. Difficult to switch 
between java and scala, eh? :)



core/src/main/scala/kafka/controller/KafkaController.scala
https://reviews.apache.org/r/15711/#comment56997

trigger a leader rebalance for partitions that should have a leader on 
this broker ?



core/src/main/scala/kafka/controller/KafkaController.scala
https://reviews.apache.org/r/15711/#comment57000

could we rename topicPartition to replicasPerPartition?



core/src/main/scala/kafka/server/KafkaConfig.scala
https://reviews.apache.org/r/15711/#comment56992

do we need this config option? It seems that the same could be achieved by 
setting a very high value for leader.imbalance.check.interval.seconds.


- Neha Narkhede


On Nov. 21, 2013, 5:42 p.m., Sriram Subramanian wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/15711/
 ---
 
 (Updated Nov. 21, 2013, 5:42 p.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-930
 https://issues.apache.org/jira/browse/KAFKA-930
 
 
 Repository: kafka
 
 
 Description
 ---
 
 Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
 trunk
 
 
 commit missing code
 
 
 some more changes
 
 
 fix merge conflicts
 
 
 Add auto leader rebalance support
 
 
 Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
 trunk
 
 
 Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
 trunk
 
 Conflicts:
   core/src/main/scala/kafka/admin/AdminUtils.scala
   core/src/main/scala/kafka/admin/TopicCommand.scala
 
 change comments
 
 
 commit the remaining changes
 
 
 Move AddPartitions into TopicCommand
 
 
 Diffs
 -
 
   core/src/main/scala/kafka/controller/KafkaController.scala 
 4c319aba97655e7c4ec97fac2e34de4e28c9f5d3 
   core/src/main/scala/kafka/server/KafkaConfig.scala 
 b324344d0a383398db8bfe2cbeec2c1378fe13c9 
 
 Diff: https://reviews.apache.org/r/15711/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Sriram Subramanian
 




Re: [VOTE] Apache Kafka Release 0.8.0 - Candidate 5

2013-12-02 Thread David Arthur

Inline:

On 12/2/13 11:59 AM, Joe Stein wrote:

General future thought comment first: lets be careful please to raising
issues as show stoppers that have been there previously (especially if
greater than one version previous release back also has the problem) and
can get fixed in a subsequent release and is only now more pressing because
we know about them... seeing something should not necessarily always create
priority (sometimes sure, of course but not always that is not the best way
to manage changes).  The VOTE thread should be to artifacts and what we are
releasing as proper and correct per Apache guidelines... and to make sure
that the person doing the release doesn't do something incorrect ... like
using the wrong version of JDK to build =8^/.  If we are not happy with
release as ready to ship then lets not call a VOTE and save the prolonged
weeks that drag out with so many release candidates.  The community suffers
from this.
+1 If we can get most of this release preparation stuff automated, then 
we can iterate on it in a release branch before tagging and voting.

ok, now on to RC5 ...lets extend the vote until 12pm PT tomorrow ...
hopefully a few more hours for other folks to comment and discuss the
issues you raised with my $0.02852425 included below and follow-ups as they
become necessary... I am also out of pocket in a few hours until tomorrow
morning so if it passed I would not be able to publish and announce or if
failed look towards RC6 anyways =8^)

/***
  Joe Stein
  Founder, Principal Consultant
  Big Data Open Source Security LLC
  http://www.stealth.ly
  Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop
/


On Mon, Dec 2, 2013 at 11:00 AM, David Arthur mum...@gmail.com wrote:


Seems like most people are verifying the src, so I'll pick on the binaries
and Maven stuff ;)

A few problems I see:

There are some vestigial Git files in the src download: an empty .git and
.gitignore


Ok, I can do a better job with 0.8.1 but I am not sure this is very
different than beta1 and not necessarily a show stopper for 0.8.0 requiring
another release candidate, is it?  I think updating the release docs and
rmdir .git after the rm -fr and rm .gitignore moving forward makes sense.

Agreed, not a show stopper.




In the source download, I see the SBT license in LICENSE which seems
correct (since we distribute an SBT binary), but in the binary download I
see the same license. Don't we need the Scala license (
http://www.scala-lang.org/license.html) in the binary distribution?


I fixed this already not only in the binary release
https://issues.apache.org/jira/browse/KAFKA-1131 but also in the JAR files
that are published to Maven
https://issues.apache.org/jira/browse/KAFKA-1133are you checking from
http://people.apache.org/~joestein/kafka-0.8.0-candidate5/ because I just
downloaded again and it looks alright to me.  If not then definitely this
RC should be shot down because it does not do what we are saying it is
doing.. but if it is wrong can you be more specific and create a JIRA with
the fix because I thought I got it right already... but if not then lets
get it right because that is why we pulled the release in RC3
The LICENSE file in both the src and binary downloads includes the SBT 
LICENSE at the end. I could be wrong, but I think the src download 
should include the SBT license and the binary download should include 
the Scala license. Since we have released in the past without proper 
licensing, it's probably not a huge deal to do it again (but we should 
fix it).



I create a simple Ant+Ivy project to test resolving the artifacts
published to Apache staging repo: https://github.com/mumrah/kafka-ivy.
This will fetch Kafka libs from the Apache staging area and other things
from Maven Central. It will fetch the jars into lib/ivy/{conf} and generate
a report of the dependencies, conflicts, and licenses into ivy-report.
Notice I had to add three exclusions to get things working. Maybe we should
add these to our pom?


I don't think this is a showstopper is it?  can't this wait for 0.8.1 and
not hold up the 0.8.0 release?
No I don't think it's a show stopper. But to Neha's point, a painless 
Maven/Ivy/SBT/Gradle integration is important since this is how most 
users interface with Kafka. That said, ZooKeeper is what's pulling in 
these troublesome deps and it doesn't stop people from using ZooKeeper. 
I can live with this.


I didn't have this issue with java maven pom or scala sbt so maybe
something more ivy ant specific causing this?
No clue... maybe? I run into these deps all the time when dealing with 
ZooKeeper.

folks use gradle too so I
expect some feedback at some point to that working or not perhaps in 0.8.1
or even 0.9 we can try to cover every way everyone uses and make sure they
are all good to go moving forward... perhaps even some vagrant, docker,
puppet and chef love too (which I can 

[jira] [Created] (KAFKA-1155) Kafka server can miss zookeeper watches during long zkclient callbacks

2013-12-02 Thread Neha Narkhede (JIRA)
Neha Narkhede created KAFKA-1155:


 Summary: Kafka server can miss zookeeper watches during long 
zkclient callbacks
 Key: KAFKA-1155
 URL: https://issues.apache.org/jira/browse/KAFKA-1155
 Project: Kafka
  Issue Type: Bug
  Components: controller
Affects Versions: 0.8, 0.8.1
Reporter: Neha Narkhede
Assignee: Neha Narkhede
Priority: Critical


On getting a zookeeper watch, zkclient invokes the blocking user callback and 
only re-registers the watch after the callback returns. This leaves a possibly 
large window of time when Kafka has not registered for watches on the desired 
zookeeper paths and hence can miss important state changes (on the controller). 
In any case, it is worth noting that even though zookeeper has a 
read-and-set-watch API, there can always be a window of time between the watch 
being fired, the callback and the read-and-set-watch API call. Due to the 
zkclient wrapper, it is difficult to handle this properly in the Kafka code 
unless we directly use the zookeeper client. One way of getting around this 
issue is to use timestamps on the paths and when a watch fires, check if the 
timestamp in zk is different from the one in the callback handler.
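
A rough sketch of the timestamp idea using zkclient's standard listener 
interface (the path contents and handler are assumptions):

import org.I0Itec.zkclient.{IZkDataListener, ZkClient}

// After handling a change, re-read the path; if the stored value (e.g. a timestamp)
// moved on while the callback was running, a change was missed, so catch up explicitly.
class CatchUpListener(zkClient: ZkClient, handle: String => Unit) extends IZkDataListener {
  override def handleDataChange(dataPath: String, data: Object) {
    handle(data.toString)
    val latest: String = zkClient.readData[String](dataPath)
    if (latest != data.toString)
      handle(latest)      // the path changed again during the (long) callback
  }
  override def handleDataDeleted(dataPath: String) {}
}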



--
This message was sent by Atlassian JIRA
(v6.1#6144)


Re: Review Request 15711: Patch for KAFKA-930

2013-12-02 Thread Neha Narkhede

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15711/#review29606
---



core/src/main/scala/kafka/server/KafkaConfig.scala
https://reviews.apache.org/r/15711/#comment57005

can we disable this feature until 
https://issues.apache.org/jira/browse/KAFKA-1155 is solved?


- Neha Narkhede


On Nov. 21, 2013, 5:42 p.m., Sriram Subramanian wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/15711/
 ---
 
 (Updated Nov. 21, 2013, 5:42 p.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-930
 https://issues.apache.org/jira/browse/KAFKA-930
 
 
 Repository: kafka
 
 
 Description
 ---
 
 Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
 trunk
 
 
 commit missing code
 
 
 some more changes
 
 
 fix merge conflicts
 
 
 Add auto leader rebalance support
 
 
 Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
 trunk
 
 
 Merge branch 'trunk' of http://git-wip-us.apache.org/repos/asf/kafka into 
 trunk
 
 Conflicts:
   core/src/main/scala/kafka/admin/AdminUtils.scala
   core/src/main/scala/kafka/admin/TopicCommand.scala
 
 change comments
 
 
 commit the remaining changes
 
 
 Move AddPartitions into TopicCommand
 
 
 Diffs
 -
 
   core/src/main/scala/kafka/controller/KafkaController.scala 
 4c319aba97655e7c4ec97fac2e34de4e28c9f5d3 
   core/src/main/scala/kafka/server/KafkaConfig.scala 
 b324344d0a383398db8bfe2cbeec2c1378fe13c9 
 
 Diff: https://reviews.apache.org/r/15711/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Sriram Subramanian
 




Re: [VOTE] Apache Kafka Release 0.8.0 - Candidate 5

2013-12-02 Thread Neha Narkhede
Thanks for creating
https://cwiki.apache.org/confluence/display/KAFKA/Release+Process. That was
what I was looking for. It will be worth updating it right after the 0.8
release and keep it updated as we change the guidelines. Thanks again!

-Neha


On Mon, Dec 2, 2013 at 10:19 AM, David Arthur mum...@gmail.com wrote:

 Inline:


 On 12/2/13 11:59 AM, Joe Stein wrote:

 General future thought comment first: lets be careful please to raising
 issues as show stoppers that have been there previously (especially if
 greater than one version previous release back also has the problem) and
 can get fixed in a subsequent release and is only now more pressing
 because
 we know about them... seeing something should not necessarily always
 create
 priority (sometimes sure, of course but not always that is not the best
 way
 to manage changes).  The VOTE thread should be to artifacts and what we
 are
 releasing as proper and correct per Apache guidelines... and to make sure
 that the person doing the release doesn't do something incorrect ... like
 using the wrong version of JDK to build =8^/.  If we are not happy with
 release as ready to ship then lets not call a VOTE and save the prolonged
 weeks that drag out with so many release candidates.  The community
 suffers
 from this.

 +1 If we can get most of this release preparation stuff automated, then we
 can iterate on it in a release branch before tagging and voting.

  ok, now on to RC5 ...lets extend the vote until 12pm PT tomorrow ...
 hopefully a few more hours for other folks to comment and discuss the
 issues you raised with my $0.02852425 included below and follow-ups as
 they
 become necessary... I am also out of pocket in a few hours until tomorrow
 morning so if it passed I would not be able to publish and announce or if
 failed look towards RC6 anyways =8^)

 /***
   Joe Stein
   Founder, Principal Consultant
   Big Data Open Source Security LLC
   http://www.stealth.ly
   Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop
 /


 On Mon, Dec 2, 2013 at 11:00 AM, David Arthur mum...@gmail.com wrote:

  Seems like most people are verifying the src, so I'll pick on the
 binaries
 and Maven stuff ;)

 A few problems I see:

 There are some vestigial Git files in the src download: an empty .git and
 .gitignore

  Ok, I can do a better job with 0.8.1 but I am not sure this is very
 different than beta1 and not necessarily a show stopper for 0.8.0
 requiring
 another release candidate, is it?  I think updating the release docs and
 rmdir .git after the rm -fr and rm .gitignore moving forward makes sense.

 Agreed, not a show stopper.



  In the source download, I see the SBT license in LICENSE which seems
 correct (since we distribute an SBT binary), but in the binary download I
 see the same license. Don't we need the Scala license (
 http://www.scala-lang.org/license.html) in the binary distribution?

  I fixed this already not only in the binary release
 https://issues.apache.org/jira/browse/KAFKA-1131 but also in the JAR
 files
 that are published to Maven
 https://issues.apache.org/jira/browse/KAFKA-1133are you checking from
 http://people.apache.org/~joestein/kafka-0.8.0-candidate5/ because I just
 downloaded again and it looks alright to me.  If not then definitely this
 RC should be shot down because it does not do what we are saying it is
 doing.. but if it is wrong can you be more specific and create a JIRA with
 the fix because I thought I got it right already... but if not then lets
 get it right because that is why we pulled the release in RC3

 The LICENSE file in both the src and binary downloads includes SBT
 LICENSE at the end. I could be wrong, but I think the src download should
 include the SBT licnese and the binary download should include the Scala
 license. Since we have released in the past without proper licensing, it's
 probably not a huge deal to do it again (but we should fix it).


  I create a simple Ant+Ivy project to test resolving the artifacts
 published to Apache staging repo: https://github.com/mumrah/kafka-ivy.
 This will fetch Kafka libs from the Apache staging area and other things
 from Maven Central. It will fetch the jars into lib/ivy/{conf} and
 generate
 a report of the dependencies, conflicts, and licenses into ivy-report.
 Notice I had to add three exclusions to get things working. Maybe we
 should
 add these to our pom?

  I don't think this is a showstopper is it?  can't this wait for 0.8.1
 and
 not hold up the 0.8.0 release?

 No I don't think it's a show stopper. But to Neha's point, a painless
 Maven/Ivy/SBT/Gradle integration is important since this is how most users
 interface with Kafka. That said, ZooKeeper is what's pulling in these
 troublesome deps and it doesn't stop people from using ZooKeeper. I can
 live with this.


 I didn't have this issue with java maven pom or scala sbt so maybe
 something more 

[jira] [Updated] (KAFKA-1074) Reassign partitions should delete the old replicas from disk

2013-12-02 Thread Neha Narkhede (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neha Narkhede updated KAFKA-1074:
-

Assignee: Jun Rao

 Reassign partitions should delete the old replicas from disk
 

 Key: KAFKA-1074
 URL: https://issues.apache.org/jira/browse/KAFKA-1074
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.8
Reporter: Jun Rao
Assignee: Jun Rao
 Fix For: 0.8.1

 Attachments: KAFKA-1074.patch


 Currently, after reassigning replicas to other brokers, the old replicas are 
 not removed from disk and have to be deleted manually.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


Re: [VOTE] Apache Kafka Release 0.8.0 - Candidate 5

2013-12-02 Thread David Arthur

Joe, another thing I noticed in the staging repo:

The POM for 2.8.2, 2.9.1, 2.9.2, and 2.10 include scala-compiler and 
some other stuff that is not included in 2.8.0


2.8.0 
https://repository.apache.org/content/groups/staging/org/apache/kafka/kafka_2.8.0/0.8.0/kafka_2.8.0-0.8.0.pom
2.8.2 
https://repository.apache.org/content/groups/staging/org/apache/kafka/kafka_2.8.2/0.8.0/kafka_2.8.2-0.8.0.pom


Here's a diff of those two: 
https://gist.github.com/mumrah/7bd6bd8e2805210d5d9d/revisions


I think maybe the 2.8.0 POM is missing some stuff it needs (zkclient, 
snappy, yammer metrics). And there is a duplicate ZK entry in the POMs 
other than 2.8.0.


-David



On 12/2/13 12:57 PM, Joe Stein wrote:

Neha, as far as the release process is this what you had in mind
https://cwiki.apache.org/confluence/display/KAFKA/Release+Process or
different content or more of something or such?

Per the POM, I was able to use the artifacts from the maven repository
without having to-do anything more than just specifying the artifacts with
sbt.

resolvers += "Apache Staging" at "https://repository.apache.org/content/groups/staging/"

libraryDependencies ++= Seq(
  ...,
  "org.apache.kafka" % "kafka_2.10" % "0.8.0"
)

and on the pure maven side
<repositories>
  <repository>
    <id>ApacheStaging</id>
    <url>https://repository.apache.org/content/groups/staging/</url>
  </repository>
...
<dependency>
  <groupId>org.apache.kafka</groupId>
  <artifactId>kafka_2.9.2</artifactId>
  <version>0.8.0</version>
  <exclusions>
    <exclusion>
      <groupId>log4j</groupId>
      <artifactId>log4j</artifactId>
    </exclusion>
  </exclusions>
</dependency>

which very closely mirrors what David was talking about with ivy as well...
I didn't really think much of it just a matter of XML we can document
(there is actually no using maven documentation on the site at all we
should correct that in any case TBD post release) but if folks find it to
be a pain then we should definitely fix it for sure.  off the top of my
head I don't see how to-do that in the Build.scala but I really don't
expect it to be too difficult to figure out... the question is do we hold
it off for 0.8.1 since technically nothing is breaking (like the null
pointer exceptions we had for the bonked pom in beta1 that I shipped to
maven central).

Before canceling the vote can we at least get consensus to what we are
canceling and exactly what fixes should be in RC6 or ... agree to ship RC5
and hold whatever is left for 0.8.1

I am totally fine with working on RC6 (I actually just cancelled my plans
for the evening because of a whole slew of client work that hit my plate),
but I want to make sure we have everything covered that everyone who is
voting expects to be in there.

David, a few items below don't make sense; I sent another email on the
thread in regards to the LICENSE.


/***
  Joe Stein
  Founder, Principal Consultant
  Big Data Open Source Security LLC
  http://www.stealth.ly
  Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop
/


On Mon, Dec 2, 2013 at 12:19 PM, Neha Narkhede neha.narkh...@gmail.comwrote:


I think we should maintain a wiki describing the release process in detail,
so we save turnaround time on a release. We can have a VOTE thread to
agree on the release guidelines and follow them. Having said that, it is
worth having a correct .pom file at the very least, since the release is
not very useful if people cannot consume it without pain.

Thanks,
Neha


On Mon, Dec 2, 2013 at 8:59 AM, Joe Stein joe.st...@stealth.ly wrote:


General future thought comment first: let's please be careful about raising
issues as show stoppers when they have been there previously (especially if
more than one prior release also has the problem) and can get fixed in a
subsequent release, and are only now more pressing because we know about
them... seeing something should not necessarily always create priority
(sometimes sure, of course, but not always; that is not the best way to
manage changes).  The VOTE thread should be about the artifacts and whether
what we are releasing is proper and correct per Apache guidelines... and to
make sure that the person doing the release doesn't do something incorrect
... like using the wrong version of the JDK to build =8^/.  If we are not
happy with the release as ready to ship then let's not call a VOTE and save
the prolonged weeks that drag out with so many release candidates.  The
community suffers from this.

ok, now on to RC5 ... let's extend the vote until 12pm PT tomorrow ...
hopefully a few more hours for other folks to comment and discuss the
issues you raised, with my $0.02852425 included below and follow-ups as
they become necessary... I am also out of pocket in a few hours until
tomorrow morning, so if it passed I would not be able to publish and
announce, or if it failed, look towards RC6 anyways =8^)

Re: Review Request 15938: replicas may not have consistent data after becoming follower

2013-12-02 Thread Jun Rao


 On Dec. 2, 2013, 4:55 p.m., Neha Narkhede wrote:
  core/src/main/scala/kafka/server/ReplicaManager.scala, line 329
  https://reviews.apache.org/r/15938/diff/1/?file=392700#file392700line329
 
  Do we still need this TODO: the above may need to be fixed later ?

Yes, this can be removed.


 On Dec. 2, 2013, 4:55 p.m., Neha Narkhede wrote:
  core/src/main/scala/kafka/server/KafkaApis.scala, line 396
  https://reviews.apache.org/r/15938/diff/1/?file=392699#file392699line396
 
  We had added the ability for a special consumer to read the replica log 
  for troubleshooting. This patch takes that convenience away. We should 
  probably look for another way to prevent the replica verification tool from 
  giving false negatives. Can it use a different consumer id?

We could add another debugging consumer mode so that it can read beyond HW. 
This will complicate the broker side logic a bit though. Also, reading beyond 
HW always has the danger that the fetched data is garbage since it's truncated. 
Perhaps we can wait and see if this new mode is really needed?
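
For context, a fetch from the client side with the 0.8 SimpleConsumer API looks
roughly like the sketch below (host, port, topic and client id are placeholders,
not values from the patch). As discussed above, a fetch from anything other than
a follower replica is served only up to the HW:

import kafka.consumer.SimpleConsumer
import kafka.api.FetchRequestBuilder

// Placeholders throughout; illustrative only.
val consumer = new SimpleConsumer("broker-host", 9092, 10000, 64 * 1024, "debug-client")
val request = new FetchRequestBuilder()
  .clientId("debug-client")
  .addFetch("my-topic", 0, 0L, 100000)  // topic, partition, start offset, max bytes
  .build()
val response = consumer.fetch(request)
// For a non-replica client, the returned offsets stop at the high watermark.
response.messageSet("my-topic", 0).foreach(mo => println(mo.offset))
consumer.close()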


- Jun


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15938/#review29586
---


On Dec. 1, 2013, 11:33 p.m., Jun Rao wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/15938/
 ---
 
 (Updated Dec. 1, 2013, 11:33 p.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1154
 https://issues.apache.org/jira/browse/KAFKA-1154
 
 
 Repository: kafka
 
 
 Description
 ---
 
 kafka-1154; fix 1
 
 
 Diffs
 -
 
   core/src/main/scala/kafka/api/FetchRequest.scala 
 fb2a2306003ac64a8a3b2fc5fc999e0be273f48d 
   core/src/main/scala/kafka/api/RequestOrResponse.scala 
 b62330be6241c8ff4acd21f0fa7e80b7636e0d42 
   core/src/main/scala/kafka/server/KafkaApis.scala 
 80a70f1e5e3a7670b2238fe63b8d9e0eac6b46ac 
   core/src/main/scala/kafka/server/ReplicaManager.scala 
 54f6e1674255f62eba9d90aab0db371c82baf749 
   core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala 
 f1f139e485d98e42be17cdcc327961420cd8c012 
 
 Diff: https://reviews.apache.org/r/15938/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jun Rao
 




Re: Review Request 15938: replicas may not have consistent data after becoming follower

2013-12-02 Thread Neha Narkhede


 On Dec. 2, 2013, 4:55 p.m., Neha Narkhede wrote:
  core/src/main/scala/kafka/server/KafkaApis.scala, line 396
  https://reviews.apache.org/r/15938/diff/1/?file=392699#file392699line396
 
  We had added the ability for a special consumer to read the replica log 
  for troubleshooting. This patch takes that convenience away. We should 
  probably look for another way to prevent the replica verification tool from 
  giving false negatives. Can it use a different consumer id?
 
 Jun Rao wrote:
 We could add another debugging consumer mode so that it can read beyond 
 HW. This will complicate the broker side logic a bit though. Also, reading 
 beyond HW always has the danger that the fetched data is garbage since it's 
 truncated. Perhaps we can wait and see if this new mode is really needed?

Yes, we can probably wait. So, if the debugging consumer also reads up to the 
HW, just like a normal consumer, do we need to have a special debugging 
consumer?


- Neha


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15938/#review29586
---


On Dec. 1, 2013, 11:33 p.m., Jun Rao wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/15938/
 ---
 
 (Updated Dec. 1, 2013, 11:33 p.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1154
 https://issues.apache.org/jira/browse/KAFKA-1154
 
 
 Repository: kafka
 
 
 Description
 ---
 
 kafka-1154; fix 1
 
 
 Diffs
 -
 
   core/src/main/scala/kafka/api/FetchRequest.scala 
 fb2a2306003ac64a8a3b2fc5fc999e0be273f48d 
   core/src/main/scala/kafka/api/RequestOrResponse.scala 
 b62330be6241c8ff4acd21f0fa7e80b7636e0d42 
   core/src/main/scala/kafka/server/KafkaApis.scala 
 80a70f1e5e3a7670b2238fe63b8d9e0eac6b46ac 
   core/src/main/scala/kafka/server/ReplicaManager.scala 
 54f6e1674255f62eba9d90aab0db371c82baf749 
   core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala 
 f1f139e485d98e42be17cdcc327961420cd8c012 
 
 Diff: https://reviews.apache.org/r/15938/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jun Rao
 




[jira] [Assigned] (KAFKA-1050) Support for no data loss mode

2013-12-02 Thread Neha Narkhede (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neha Narkhede reassigned KAFKA-1050:


Assignee: Neha Narkhede

 Support for no data loss mode
 ---

 Key: KAFKA-1050
 URL: https://issues.apache.org/jira/browse/KAFKA-1050
 Project: Kafka
  Issue Type: Task
Reporter: Justin SB
Assignee: Neha Narkhede

 I'd love to use Apache Kafka, but for my application data loss is not 
 acceptable.  Even at the expense of availability (i.e. I need C not A in CAP).
 I think there are two things that I need to change to get a quorum model:
 1) Make sure I set request.required.acks to 2 (for a 3 node cluster) or 3 
 (for a 5 node cluster) on every request, so that I can only write if a quorum 
 is active.
 2) Prevent the behaviour where a non-ISR can become the leader if all ISRs 
 die.  I think this is as easy as tweaking 
 core/src/main/scala/kafka/controller/PartitionLeaderSelector.scala, 
 essentially to throw an exception around line 64 in the data loss case.
 I haven't yet implemented / tested this.  I'd love to get some input from the 
 Kafka-experts on whether my plan is:
  (a) correct - will this work?
  (b) complete - have I missed any cases?
  (c) recommended - is this a terrible idea :-)
 Thanks for any pointers!
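
 For illustration, point (1) above with the 0.8 Scala producer might look like 
 the following (a sketch; the broker list and topic are placeholders):

 import java.util.Properties
 import kafka.producer.{KeyedMessage, Producer, ProducerConfig}

 val props = new Properties()
 props.put("metadata.broker.list", "broker1:9092,broker2:9092,broker3:9092") // placeholder brokers
 props.put("serializer.class", "kafka.serializer.StringEncoder")
 // Require acks from 2 replicas before a write succeeds, i.e. the quorum
 // suggested above for a 3-node cluster.
 props.put("request.required.acks", "2")

 val producer = new Producer[String, String](new ProducerConfig(props))
 producer.send(new KeyedMessage[String, String]("my-topic", "key", "value"))
 producer.close()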



--
This message was sent by Atlassian JIRA
(v6.1#6144)


Re: Review Request 15674: Reassign partitions should delete the old replicas from disk

2013-12-02 Thread Neha Narkhede


 On Nov. 19, 2013, 6:40 p.m., Jay Kreps wrote:
  What happens if I am doing a read or write concurrently with a delete?
  
  Would it be simpler just to have the log delete work like the segment 
  delete, where rather than trying to lock we remove it from the segment list 
  and then just enqueue a delete in 60 seconds? My concern is just that 
  reasoning about the various locking strategies in the log is getting 
  increasingly difficult.
 
 Jun Rao wrote:
 Yes, we could try deleting the log asynchronously. The issues there are:
 
 1. The same partition could be moved back to this broker during the 
 delayed window.
 2. It's not clear if 60 secs (or any value) is good enough since the time 
 that an ongoing scheduled flush takes is unbounded.
 
 The following is how this patch handles outstanding reads/writes on the 
 deleted data.
 
 1. All read operations are ok since we already handle unexpected 
 exceptions in KafkaApi. The caller will get an error.
 2. Currently, if we hit an IOException while writing to the log by the 
 producer request, the replica fetcher or the log flusher, we halt the broker. 
 We need to make sure that the deletion of a log doesn't cause the halt. This 
 is achieved by preventing those operations on the log once it's deleted.
 2.1 For producer requests, the delete partition operation will 
 synchronize on the leaderAndIsrUpdate lock.
 2.2 For replica fetcher, this is already handled since the fetcher is 
 removed before the log is deleted.
 2.3 For log flusher, the flush and the delete will now synchronize on a 
 delete lock.
 
 I agree that this approach uses more locks, which potentially makes the 
 code harder to understand. However, my feeling is that this is probably a 
 less hacky approach than the async delete one.

At least until the various locks are cleaned up, the current approach used in 
the patch seems safer compared to an async delete. Will take a closer look at 
the patch sometime today.
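
For reference only, a rough sketch of the asynchronous-delete alternative
discussed above (this is not what the patch does; the rename suffix and the
60-second delay are illustrative assumptions):

import java.io.File
import java.util.concurrent.{Executors, TimeUnit}

object AsyncDeleteSketch {
  private val scheduler = Executors.newSingleThreadScheduledExecutor()

  // Take the partition's log directory out of active use immediately, then delete
  // the files after a delay, so the read/write path needs no extra locking.
  def asyncDelete(logDir: File, delaySecs: Long = 60): Unit = {
    val renamed = new File(logDir.getParent, logDir.getName + ".deleted")
    logDir.renameTo(renamed)
    scheduler.schedule(new Runnable {
      def run(): Unit = {
        Option(renamed.listFiles()).foreach(_.foreach(_.delete()))
        renamed.delete()
      }
    }, delaySecs, TimeUnit.SECONDS)
  }
}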


- Neha


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15674/#review29123
---


On Nov. 19, 2013, 4:28 p.m., Jun Rao wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/15674/
 ---
 
 (Updated Nov. 19, 2013, 4:28 p.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1074
 https://issues.apache.org/jira/browse/KAFKA-1074
 
 
 Repository: kafka
 
 
 Description
 ---
 
 kafka-1074; fix 3
 
 
 kafka-1074; fix 2
 
 
 kafka-1074
 
 
 Diffs
 -
 
   core/src/main/scala/kafka/cluster/Partition.scala 
 02ccc17c79b6d44c75f9bb6ca7cda8c51ae6f6fb 
   core/src/main/scala/kafka/log/Log.scala 
 1883a53de112ad08449dc73a2ca08208c11a2537 
   core/src/main/scala/kafka/log/LogManager.scala 
 81be88aa618ed5614703d45a0556b77c97290085 
   core/src/main/scala/kafka/log/LogSegment.scala 
 0d6926ea105a99c9ff2cfc9ea6440f2f2d37bde8 
   core/src/main/scala/kafka/server/ReplicaManager.scala 
 161f58134f20f9335dbd2bee6ac3f71897cbef7c 
   core/src/test/scala/unit/kafka/admin/AdminTest.scala 
 c30069e837e54fb91bf1d5b75b133282a28dedf8 
 
 Diff: https://reviews.apache.org/r/15674/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jun Rao
 




[jira] Subscription: outstanding kafka patches

2013-12-02 Thread jira
Issue Subscription
Filter: outstanding kafka patches (75 issues)
The list of outstanding kafka patches
Subscriber: kafka-mailing-list

Key Summary
KAFKA-1154  replicas may not have consistent data after becoming follower
https://issues.apache.org/jira/browse/KAFKA-1154
KAFKA-1151  The Hadoop consumer API doc is not referencing the contrib consumer
https://issues.apache.org/jira/browse/KAFKA-1151
KAFKA-1145  Broker fail to sync after restart
https://issues.apache.org/jira/browse/KAFKA-1145
KAFKA-1144  commitOffsets can be passed the offsets to commit
https://issues.apache.org/jira/browse/KAFKA-1144
KAFKA-1142  Patch review tool should take diff with origin from last divergent 
point
https://issues.apache.org/jira/browse/KAFKA-1142
KAFKA-1130  log.dirs is a confusing property name
https://issues.apache.org/jira/browse/KAFKA-1130
KAFKA-1116  Need to upgrade sbt-assembly to compile on scala 2.10.2
https://issues.apache.org/jira/browse/KAFKA-1116
KAFKA-1110  Unable to produce messages with snappy/gzip compression
https://issues.apache.org/jira/browse/KAFKA-1110
KAFKA-1109  Need to fix GC log configuration code, not able to override 
KAFKA_GC_LOG_OPTS
https://issues.apache.org/jira/browse/KAFKA-1109
KAFKA-1106  HighwaterMarkCheckpoint failure puting broker into a bad state
https://issues.apache.org/jira/browse/KAFKA-1106
KAFKA-1093  Log.getOffsetsBefore(t, …) does not return the last confirmed 
offset before t
https://issues.apache.org/jira/browse/KAFKA-1093
KAFKA-1086  Improve GetOffsetShell to find metadata automatically
https://issues.apache.org/jira/browse/KAFKA-1086
KAFKA-1082  zkclient dies after UnknownHostException in zk reconnect
https://issues.apache.org/jira/browse/KAFKA-1082
KAFKA-1079  Liars in PrimitiveApiTest that promise to test api in compression 
mode, but don't do this actually
https://issues.apache.org/jira/browse/KAFKA-1079
KAFKA-1074  Reassign partitions should delete the old replicas from disk
https://issues.apache.org/jira/browse/KAFKA-1074
KAFKA-1049  Encoder implementations are required to provide an undocumented 
constructor.
https://issues.apache.org/jira/browse/KAFKA-1049
KAFKA-1032  Messages sent to the old leader will be lost on broker GC resulted 
failure
https://issues.apache.org/jira/browse/KAFKA-1032
KAFKA-1020  Remove getAllReplicasOnBroker from KafkaController
https://issues.apache.org/jira/browse/KAFKA-1020
KAFKA-1012  Implement an Offset Manager and hook offset requests to it
https://issues.apache.org/jira/browse/KAFKA-1012
KAFKA-1011  Decompression and re-compression on MirrorMaker could result in 
messages being dropped in the pipeline
https://issues.apache.org/jira/browse/KAFKA-1011
KAFKA-1005  kafka.perf.ConsumerPerformance not shutting down consumer
https://issues.apache.org/jira/browse/KAFKA-1005
KAFKA-998   Producer should not retry on non-recoverable error codes
https://issues.apache.org/jira/browse/KAFKA-998
KAFKA-997   Provide a strict verification mode when reading configuration 
properties
https://issues.apache.org/jira/browse/KAFKA-997
KAFKA-996   Capitalize first letter for log entries
https://issues.apache.org/jira/browse/KAFKA-996
KAFKA-984   Avoid a full rebalance in cases when a new topic is discovered but 
container/broker set stay the same
https://issues.apache.org/jira/browse/KAFKA-984
KAFKA-976   Order-Preserving Mirror Maker Testcase
https://issues.apache.org/jira/browse/KAFKA-976
KAFKA-967   Use key range in ProducerPerformance
https://issues.apache.org/jira/browse/KAFKA-967
KAFKA-917   Expose zk.session.timeout.ms in console consumer
https://issues.apache.org/jira/browse/KAFKA-917
KAFKA-885   sbt package builds two kafka jars
https://issues.apache.org/jira/browse/KAFKA-885
KAFKA-881   Kafka broker not respecting log.roll.hours
https://issues.apache.org/jira/browse/KAFKA-881
KAFKA-873   Consider replacing zkclient with curator (with zkclient-bridge)
https://issues.apache.org/jira/browse/KAFKA-873
KAFKA-868   System Test - add test case for rolling controlled shutdown
https://issues.apache.org/jira/browse/KAFKA-868
KAFKA-863   System Test - update 0.7 version of kafka-run-class.sh for 
Migration Tool test cases
https://issues.apache.org/jira/browse/KAFKA-863
KAFKA-859   support basic auth protection of mx4j console
https://issues.apache.org/jira/browse/KAFKA-859
KAFKA-855   Ant+Ivy build for Kafka
https://issues.apache.org/jira/browse/KAFKA-855
KAFKA-854   Upgrade dependencies for 0.8
https://issues.apache.org/jira/browse/KAFKA-854
KAFKA-815   Improve SimpleConsumerShell to take in a max messages config option
 

Re: Review Request 15938: replicas may not have consistent data after becoming follower

2013-12-02 Thread Neha Narkhede


 On Dec. 2, 2013, 4:55 p.m., Neha Narkhede wrote:
  core/src/main/scala/kafka/server/KafkaApis.scala, line 396
  https://reviews.apache.org/r/15938/diff/1/?file=392699#file392699line396
 
  We had added the ability for a special consumer to read the replica log 
  for troubleshooting. This patch takes that convenience away. We should 
  probably look for another way to prevent the replica verification tool from 
  giving false negatives. Can it use a different consumer id?
 
 Jun Rao wrote:
 We could add another debugging consumer mode so that it can read beyond 
 HW. This will complicate the broker side logic a bit though. Also, reading 
 beyond HW always has the danger that the fetched data is garbage since it's 
 truncated. Perhaps we can wait and see if this new mode is really needed?
 
 Neha Narkhede wrote:
 Yes, we can probably wait. So, if the debugging consumer also reads upto 
 the HW, just like a normal consumer, do we need to have a special debugging 
 consumer ?

Hmm.. so the debugging consumer will be useful for reading from replicas, which 
ordinary consumers can't do. We can probably address the debugging consumer 
properly in the future if/when we find a use for reading beyond the HW. The rest 
of the patch looks good.


- Neha


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15938/#review29586
---


On Dec. 1, 2013, 11:33 p.m., Jun Rao wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/15938/
 ---
 
 (Updated Dec. 1, 2013, 11:33 p.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1154
 https://issues.apache.org/jira/browse/KAFKA-1154
 
 
 Repository: kafka
 
 
 Description
 ---
 
 kafka-1154; fix 1
 
 
 Diffs
 -
 
   core/src/main/scala/kafka/api/FetchRequest.scala 
 fb2a2306003ac64a8a3b2fc5fc999e0be273f48d 
   core/src/main/scala/kafka/api/RequestOrResponse.scala 
 b62330be6241c8ff4acd21f0fa7e80b7636e0d42 
   core/src/main/scala/kafka/server/KafkaApis.scala 
 80a70f1e5e3a7670b2238fe63b8d9e0eac6b46ac 
   core/src/main/scala/kafka/server/ReplicaManager.scala 
 54f6e1674255f62eba9d90aab0db371c82baf749 
   core/src/main/scala/kafka/tools/ReplicaVerificationTool.scala 
 f1f139e485d98e42be17cdcc327961420cd8c012 
 
 Diff: https://reviews.apache.org/r/15938/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Jun Rao
 




[jira] [Resolved] (KAFKA-1154) replicas may not have consistent data after becoming follower

2013-12-02 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao resolved KAFKA-1154.


Resolution: Fixed

Thanks for the review. Committed to trunk after addressing the minor review 
comments.

 replicas may not have consistent data after becoming follower
 -

 Key: KAFKA-1154
 URL: https://issues.apache.org/jira/browse/KAFKA-1154
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.8.1
Reporter: Jun Rao
Assignee: Jun Rao
 Fix For: 0.8.1

 Attachments: KAFKA-1154.patch


 This is an issued introduced in KAFKA-1001. The issue is that in 
 ReplicaManager.makeFollowers(), we truncate the log before marking the 
 replica as the follower. New messages from the producer can still be added to 
 the log after the log is truncated, but before the replica is marked as the 
 follower. Those newly produced messages can actually be committed, which 
 implies those truncated messages are also committed. However, the new leader 
 is not guaranteed to have those truncated messages.
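
 In pseudocode terms, the fix amounts to swapping the ordering; the names below 
 are illustrative stand-ins, not the actual ReplicaManager/Partition API:

 trait Replica {
   def markAsFollower(): Unit            // after this, producer appends are no longer accepted
   def truncateLogTo(offset: Long): Unit
 }

 // Become a follower first, then truncate, so no newly produced messages can
 // land in the log between the truncation and the role change.
 def makeFollower(replica: Replica, lastCommittedOffset: Long): Unit = {
   replica.markAsFollower()
   replica.truncateLogTo(lastCommittedOffset)
 }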



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (KAFKA-1156) Improve reassignment tool to output the existing assignment to facilitate rollbacks

2013-12-02 Thread Neha Narkhede (JIRA)
Neha Narkhede created KAFKA-1156:


 Summary: Improve reassignment tool to output the existing 
assignment to facilitate rollbacks
 Key: KAFKA-1156
 URL: https://issues.apache.org/jira/browse/KAFKA-1156
 Project: Kafka
  Issue Type: Bug
  Components: tools
Affects Versions: 0.8.1
Reporter: Neha Narkhede
Assignee: Neha Narkhede
Priority: Critical


It is useful for the partition reassignment tool to output the current 
partition assignment as part of the dry run. This will make rollbacks easier if 
the reassignment does not work out.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Created] (KAFKA-1157) Clean up Per-topic Configuration from Kafka properties

2013-12-02 Thread Guozhang Wang (JIRA)
Guozhang Wang created KAFKA-1157:


 Summary: Clean up Per-topic Configuration from Kafka properties
 Key: KAFKA-1157
 URL: https://issues.apache.org/jira/browse/KAFKA-1157
 Project: Kafka
  Issue Type: Bug
Reporter: Guozhang Wang
Assignee: Guozhang Wang


After KAFKA-554, per-topic configurations could be removed from kafka 
properties.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


Review Request 15950: Patch for KAFKA-1157

2013-12-02 Thread Guozhang Wang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15950/
---

Review request for kafka.


Bugs: KAFKA-1157
https://issues.apache.org/jira/browse/KAFKA-1157


Repository: kafka


Description
---

KAFKA-1157.v1


Diffs
-

  core/src/main/scala/kafka/server/KafkaConfig.scala 
b324344d0a383398db8bfe2cbeec2c1378fe13c9 

Diff: https://reviews.apache.org/r/15950/diff/


Testing
---


Thanks,

Guozhang Wang



[jira] [Updated] (KAFKA-1134) onControllerFailover function should be synchronized with other functions

2013-12-02 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-1134:
-

Attachment: KAFKA-1134.patch

 onControllerFailover function should be synchronized with other functions
 -

 Key: KAFKA-1134
 URL: https://issues.apache.org/jira/browse/KAFKA-1134
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8, 0.8.1
Reporter: Guozhang Wang
 Attachments: KAFKA-1134.patch


 Otherwise race conditions could happen. For example, handleNewSession will 
 close all sockets with brokers while the handleStateChange in 
 onControllerFailover tries to send requests to them.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (KAFKA-1134) onControllerFailover function should be synchronized with other functions

2013-12-02 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13837172#comment-13837172
 ] 

Guozhang Wang commented on KAFKA-1134:
--

Created reviewboard https://reviews.apache.org/r/15953/
 against branch origin/trunk

 onControllerFailover function should be synchronized with other functions
 -

 Key: KAFKA-1134
 URL: https://issues.apache.org/jira/browse/KAFKA-1134
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8, 0.8.1
Reporter: Guozhang Wang
 Attachments: KAFKA-1134.patch


 Otherwise race conditions could happen. For example, handleNewSession will 
 close all sockets with brokers while the handleStateChange in 
 onControllerFailover tries to send requests to them.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


Re: Review Request 15953: Patch for KAFKA-1134

2013-12-02 Thread Neha Narkhede

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15953/#review29636
---



core/src/main/scala/kafka/controller/KafkaController.scala
https://reviews.apache.org/r/15953/#comment57045

it seems that onControllerFailover is already protected by the 
controllerLock. The elect() API of ZookeeperLeaderElector is invoked in 3 
places and each of those acquires the controllerLock


- Neha Narkhede


On Dec. 3, 2013, 12:58 a.m., Guozhang Wang wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/15953/
 ---
 
 (Updated Dec. 3, 2013, 12:58 a.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1134
 https://issues.apache.org/jira/browse/KAFKA-1134
 
 
 Repository: kafka
 
 
 Description
 ---
 
 KAFKA-1134.v1
 
 
 Diffs
 -
 
   core/src/main/scala/kafka/controller/KafkaController.scala 
 4c319aba97655e7c4ec97fac2e34de4e28c9f5d3 
 
 Diff: https://reviews.apache.org/r/15953/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Guozhang Wang
 




Re: Review Request 15953: Patch for KAFKA-1134

2013-12-02 Thread Guozhang Wang


 On Dec. 3, 2013, 1:30 a.m., Neha Narkhede wrote:
  core/src/main/scala/kafka/controller/KafkaController.scala, line 235
  https://reviews.apache.org/r/15953/diff/1/?file=392920#file392920line235
 
  it seems that onControllerFailover is already protected by the 
  controllerLock. The elect() API of ZookeeperLeaderElector is invoked in 3 
  places and each of those acquires the controllerLock

You are right. The real issue is not that onControllerFailover is not 
synchronized, but that sendRequest is asynchronous. Hence onControllerFailover 
just puts the request on the queue, and by the time the send thread wakes up to 
send the message, the channel may have already been closed by the 
handleNewSession procedure.

I think the correct fix should be that, in 
ControllerChannelManager.removeExistingBroker, we should also clear the request 
queue.
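
A rough sketch of that suggestion (the queue structure below is a stand-in, not 
the actual ControllerChannelManager internals):

import java.util.concurrent.LinkedBlockingQueue
import scala.collection.mutable

class BrokerRequestQueues {
  private val queues = mutable.Map[Int, LinkedBlockingQueue[AnyRef]]()

  def removeExistingBroker(brokerId: Int): Unit = {
    // Clear anything still queued for the broker before its send thread and socket
    // go away, so a stale send cannot race with channels closed in handleNewSession.
    queues.get(brokerId).foreach(_.clear())
    queues -= brokerId
  }
}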


- Guozhang


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15953/#review29636
---


On Dec. 3, 2013, 12:58 a.m., Guozhang Wang wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/15953/
 ---
 
 (Updated Dec. 3, 2013, 12:58 a.m.)
 
 
 Review request for kafka.
 
 
 Bugs: KAFKA-1134
 https://issues.apache.org/jira/browse/KAFKA-1134
 
 
 Repository: kafka
 
 
 Description
 ---
 
 KAFKA-1134.v1
 
 
 Diffs
 -
 
   core/src/main/scala/kafka/controller/KafkaController.scala 
 4c319aba97655e7c4ec97fac2e34de4e28c9f5d3 
 
 Diff: https://reviews.apache.org/r/15953/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Guozhang Wang
 




[jira] [Resolved] (KAFKA-1157) Clean up Per-topic Configuration from Kafka properties

2013-12-02 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao resolved KAFKA-1157.


   Resolution: Fixed
Fix Version/s: 0.8.1

Thanks for the patch. +1 and committed to trunk.

 Clean up Per-topic Configuration from Kafka properties
 --

 Key: KAFKA-1157
 URL: https://issues.apache.org/jira/browse/KAFKA-1157
 Project: Kafka
  Issue Type: Bug
Reporter: Guozhang Wang
Assignee: Guozhang Wang
 Fix For: 0.8.1

 Attachments: KAFKA-1157.patch


 After KAFKA-554, per-topic configurations could be removed from kafka 
 properties.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


Re: [VOTE] Apache Kafka Release 0.8.0 - Candidate 5

2013-12-02 Thread Jun Rao
The release voting is based on lazy majority (
https://cwiki.apache.org/confluence/display/KAFKA/Bylaws#Bylaws-Voting). So
a -1 doesn't kill the release. The question is whether those issues are
really show stoppers.

Thanks,

Jun




On Mon, Dec 2, 2013 at 10:19 AM, David Arthur mum...@gmail.com wrote:

 Inline:


 On 12/2/13 11:59 AM, Joe Stein wrote:

 General future thought comment first: let's please be careful about raising
 issues as show stoppers when they have been there previously (especially if
 more than one prior release also has the problem) and can get fixed in a
 subsequent release, and are only now more pressing because we know about
 them... seeing something should not necessarily always create priority
 (sometimes sure, of course, but not always; that is not the best way to
 manage changes).  The VOTE thread should be about the artifacts and whether
 what we are releasing is proper and correct per Apache guidelines... and to
 make sure that the person doing the release doesn't do something incorrect
 ... like using the wrong version of the JDK to build =8^/.  If we are not
 happy with the release as ready to ship then let's not call a VOTE and save
 the prolonged weeks that drag out with so many release candidates.  The
 community suffers from this.

 +1 If we can get most of this release preparation stuff automated, then we
 can iterate on it in a release branch before tagging and voting.

  ok, now on to RC5 ...lets extend the vote until 12pm PT tomorrow ...
 hopefully a few more hours for other folks to comment and discuss the
 issues you raised with my $0.02852425 included below and follow-ups as
 they
 become necessary... I am also out of pocket in a few hours until tomorrow
 morning so if it passed I would not be able to publish and announce or if
 failed look towards RC6 anyways =8^)

 /***
   Joe Stein
   Founder, Principal Consultant
   Big Data Open Source Security LLC
   http://www.stealth.ly
   Twitter: @allthingshadoop http://www.twitter.com/allthingshadoop
 /


 On Mon, Dec 2, 2013 at 11:00 AM, David Arthur mum...@gmail.com wrote:

  Seems like most people are verifying the src, so I'll pick on the
 binaries
 and Maven stuff ;)

 A few problems I see:

 There are some vestigial Git files in the src download: an empty .git and
 .gitignore

  Ok, I can do a better job with 0.8.1 but I am not sure this is very
 different than beta1 and not necessarily a show stopper for 0.8.0
 requiring
 another release candidate, is it?  I think updating the release docs and
 rmdir .git after the rm -fr and rm .gitignore moving forward makes sense.

 Agreed, not a show stopper.



  In the source download, I see the SBT license in LICENSE which seems
 correct (since we distribute an SBT binary), but in the binary download I
 see the same license. Don't we need the Scala license (
 http://www.scala-lang.org/license.html) in the binary distribution?

 I fixed this already, not only in the binary release
 (https://issues.apache.org/jira/browse/KAFKA-1131) but also in the JAR files
 that are published to Maven
 (https://issues.apache.org/jira/browse/KAFKA-1133). Are you checking from
 http://people.apache.org/~joestein/kafka-0.8.0-candidate5/ ? Because I just
 downloaded it again and it looks alright to me.  If not then definitely this
 RC should be shot down because it does not do what we are saying it is
 doing... but if it is wrong can you be more specific and create a JIRA with
 the fix, because I thought I got it right already... but if not then let's
 get it right, because that is why we pulled the release in RC3.

 The LICENSE file in both the src and binary downloads includes SBT
 LICENSE at the end. I could be wrong, but I think the src download should
 include the SBT license and the binary download should include the Scala
 license. Since we have released in the past without proper licensing, it's
 probably not a huge deal to do it again (but we should fix it).


  I created a simple Ant+Ivy project to test resolving the artifacts
 published to Apache staging repo: https://github.com/mumrah/kafka-ivy.
 This will fetch Kafka libs from the Apache staging area and other things
 from Maven Central. It will fetch the jars into lib/ivy/{conf} and
 generate
 a report of the dependencies, conflicts, and licenses into ivy-report.
 Notice I had to add three exclusions to get things working. Maybe we
 should
 add these to our pom?

  I don't think this is a showstopper, is it?  Can't this wait for 0.8.1 and
 not hold up the 0.8.0 release?

 No I don't think it's a show stopper. But to Neha's point, a painless
 Maven/Ivy/SBT/Gradle integration is important since this is how most users
 interface with Kafka. That said, ZooKeeper is what's pulling in these
 troublesome deps and it doesn't stop people from using ZooKeeper. I can
 live with this.


 I didn't have this issue with the Java Maven POM or Scala SBT, so maybe
 something more ivy ant