[jira] [Created] (KAFKA-1006) Mirror maker loses messages of a new topic

2013-08-09 Thread Swapnil Ghike (JIRA)
Swapnil Ghike created KAFKA-1006:


 Summary: Mirror maker loses messages of a new topic
 Key: KAFKA-1006
 URL: https://issues.apache.org/jira/browse/KAFKA-1006
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8
Reporter: Swapnil Ghike


Mirror maker currently uses auto.offset.reset = largest on the consumer side by 
default. If a new topic is created, the consumer's topic watcher fires. The 
consumer first finishes partition reassignment as part of the rebalance and then 
starts consuming from the tail of each partition. While the partition 
reassignment is in progress, the server may have already appended new messages 
to the new topic, and mirror maker will not consume them. Thus, multiple batches 
of messages may be lost when a topic is newly created.

The fix is to start consuming from the earliest offset for newly created topics.
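
As a minimal, illustrative sketch (not mirror maker's actual code), the knob 
involved is the consumer-side auto.offset.reset property; the zookeeper address 
and group id below are placeholders:

import java.util.Properties
import kafka.consumer.ConsumerConfig

val props = new Properties()
props.put("zookeeper.connect", "localhost:2181")  // placeholder address
props.put("group.id", "mirror-maker-group")       // placeholder group id
// Default is "largest" (start from the tail); "smallest" starts from the
// earliest available offset, so a newly created topic is read from offset 0.
props.put("auto.offset.reset", "smallest")
val consumerConfig = new ConsumerConfig(props)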

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (KAFKA-992) Double Check on Broker Registration to Avoid False NodeExist Exception

2013-08-09 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13734903#comment-13734903
 ] 

Jun Rao commented on KAFKA-992:
---

Thanks for the patch. It doesn't seem to apply cleanly for me. Do you need to 
rebase? Just one quick comment: there seems to be common code in the 
re-registration logic of the broker, controller and consumer. Instead of 
duplicating the code, could we create a common util to share it?

> Double Check on Broker Registration to Avoid False NodeExist Exception
> --
>
> Key: KAFKA-992
> URL: https://issues.apache.org/jira/browse/KAFKA-992
> Project: Kafka
>  Issue Type: Bug
>Reporter: Neha Narkhede
>Assignee: Guozhang Wang
> Attachments: KAFKA-992.v1.patch, KAFKA-992.v2.patch, 
> KAFKA-992.v3.patch, KAFKA-992.v4.patch, KAFKA-992.v5.patch, 
> KAFKA-992.v6.patch, KAFKA-992.v7.patch
>
>
> The current behavior of zookeeper for ephemeral nodes is that session 
> expiration and ephemeral node deletion is not an atomic operation. 
> The side-effect of the above zookeeper behavior in Kafka, for certain corner 
> cases, is that ephemeral nodes can be lost even if the session is not 
> expired. The sequence of events that can lead to lossy ephemeral nodes is as 
> follows -
> 1. The session expires on the client, it assumes the ephemeral nodes are 
> deleted, so it establishes a new session with zookeeper and tries to 
> re-create the ephemeral nodes. 
> 2. However, when it tries to re-create the ephemeral node, zookeeper throws 
> back a NodeExists error code. Now this is legitimate during a session 
> disconnect event (since zkclient automatically retries the
> operation and raises a NodeExists error). Also by design, Kafka server 
> doesn't have multiple zookeeper clients create the same ephemeral node, so 
> Kafka server assumes the NodeExists is normal. 
> 3. However, after a few seconds zookeeper deletes that ephemeral node. So 
> from the client's perspective, even though the client has a new valid 
> session, its ephemeral node is gone.
> This behavior is triggered due to very long fsync operations on the zookeeper 
> leader. When the leader wakes up from such a long fsync operation, it has 
> several sessions to expire. And the time between the session expiration and 
> the ephemeral node deletion is magnified. Between these 2 operations, a 
> zookeeper client can issue an ephemeral node creation operation that could've 
> appeared to have succeeded, but the leader later deletes the ephemeral node 
> leading to permanent ephemeral node loss from the client's perspective. 
> Thread from zookeeper mailing list: 
> http://zookeeper.markmail.org/search/?q=Zookeeper+3.3.4#query:Zookeeper%203.3.4%20date%3A201307%20+page:1+mid:zma242a2qgp6gxvx+state:results
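
The race described above is what the "double check" in the title targets: on a 
NodeExists error, verify whether the existing node is actually our own stale 
registration before deciding how to react. The following is only a hedged sketch 
of that idea built on zkclient, not the actual KAFKA-992 patch:

import org.I0Itec.zkclient.ZkClient
import org.I0Itec.zkclient.exception.ZkNodeExistsException

def registerEphemeral(zk: ZkClient, path: String, data: String, backoffMs: Long = 1000): Unit = {
  var registered = false
  while (!registered) {
    try {
      zk.createEphemeral(path, data)
      registered = true
    } catch {
      case e: ZkNodeExistsException =>
        // Second check: read the node that supposedly exists.
        val existing: String = zk.readData(path, true) // returns null if it vanished meanwhile
        if (existing == null) {
          // the stale node was just deleted; retry the create
        } else if (existing == data) {
          // our own registration from the previous session; wait for zookeeper to expire it
          Thread.sleep(backoffMs)
        } else {
          throw e // some other client genuinely owns this path
        }
    }
  }
}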

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (KAFKA-347) change number of partitions of a topic online

2013-08-09 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13734908#comment-13734908
 ] 

Jun Rao commented on KAFKA-347:
---

You just need to run bin/kafka-add-partitions.sh. We will add the docs when 0.8 
final is released. 

> change number of partitions of a topic online
> -
>
> Key: KAFKA-347
> URL: https://issues.apache.org/jira/browse/KAFKA-347
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 0.8
>Reporter: Jun Rao
>Assignee: Sriram Subramanian
>  Labels: features
> Fix For: 0.8.1
>
> Attachments: kafka-347.patch, kafka-347-v2.patch, 
> KAFKA-347-v2-rebased.patch, KAFKA-347-v3.patch, KAFKA-347-v4.patch, 
> KAFKA-347-v5.patch
>
>
> We will need an admin tool to change the number of partitions of a topic 
> online.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (KAFKA-1007) Document new tools before 0.8 release

2013-08-09 Thread Sriram Subramanian (JIRA)
Sriram Subramanian created KAFKA-1007:
-

 Summary: Document new tools before 0.8 release
 Key: KAFKA-1007
 URL: https://issues.apache.org/jira/browse/KAFKA-1007
 Project: Kafka
  Issue Type: Bug
Reporter: Sriram Subramanian


We need to document the following tools before 0.8 release
1. Add partition tool
2. ReassignPartition tool

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: Kafka/Hadoop consumers and producers

2013-08-09 Thread Andrew Psaltis
Dibyendu,
According to the pull request (https://github.com/linkedin/camus/pull/15), it
was merged into the camus-kafka-0.8 branch. I have not checked whether the code
was subsequently removed; however, at least one of the important files
from this patch
(camus-api/src/main/java/com/linkedin/camus/etl/RecordWriterProvider.java)
is still present.

Thanks,
Andrew


On Fri, Aug 9, 2013 at 9:39 AM,  wrote:

> Hi Ken,
>
> I am also working on making Camus fit for non-Avro messages for our
> requirements.
>
> I see you mentioned this patch (
> https://github.com/linkedin/camus/commit/87917a2aea46da9d21c8f67129f6463af52f7aa8),
> which supports a custom data writer for Camus. But this patch has not been
> pulled into the camus-kafka-0.8 branch. Is there a plan to do so?
>
> Regards,
> Dibyendu
>
> --
> You received this message because you are subscribed to a topic in the
> Google Groups "Camus - Kafka ETL for Hadoop" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/camus_etl/KKS6t5-O-Ng/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to
> camus_etl+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.
>


[jira] [Commented] (KAFKA-992) Double Check on Broker Registration to Avoid False NodeExist Exception

2013-08-09 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13734994#comment-13734994
 ] 

Guozhang Wang commented on KAFKA-992:
-

Thanks for the comments, Jun. I think the re-registration logic is slightly 
different for the broker, controller and consumer:

Broker: needs to check hostname + port
Controller: only needs to check brokerId
Consumer: needs to check nothing, since consumer info such as the hostname and 
port is encoded in the ZkPath.

So I think it is hard to unify the consumer's logic with the broker's and 
controller's; it is possible to unify the broker's and controller's logic, 
though, by passing the list of JSON fields that we need to check. But I am not 
sure it is worth the effort.
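
Purely as an illustration of the unification being discussed (and explicitly not 
proposed code), a shared check for the broker and controller could be 
parameterized by the JSON fields to compare; the helper name and field handling 
below are hypothetical, and the standard-library parser stands in for whatever 
JSON helper the codebase uses:

import org.I0Itec.zkclient.ZkClient
import scala.util.parsing.json.JSON

// Hypothetical shared helper: the broker might pass Seq("host", "port"),
// the controller just its broker id field.
def matchesExistingNode(zk: ZkClient,
                        path: String,
                        expected: Map[String, String],
                        fieldsToCheck: Seq[String]): Boolean = {
  val raw: String = zk.readData(path, true) // null if the node is already gone
  if (raw == null) false
  else JSON.parseFull(raw) match {
    case Some(fields: Map[String, Any] @unchecked) =>
      fieldsToCheck.forall(f => fields.get(f).map(_.toString) == expected.get(f))
    case _ => false
  }
}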

> Double Check on Broker Registration to Avoid False NodeExist Exception
> --
>
> Key: KAFKA-992
> URL: https://issues.apache.org/jira/browse/KAFKA-992
> Project: Kafka
>  Issue Type: Bug
>Reporter: Neha Narkhede
>Assignee: Guozhang Wang
> Attachments: KAFKA-992.v1.patch, KAFKA-992.v2.patch, 
> KAFKA-992.v3.patch, KAFKA-992.v4.patch, KAFKA-992.v5.patch, 
> KAFKA-992.v6.patch, KAFKA-992.v7.patch
>
>
> The current behavior of zookeeper for ephemeral nodes is that session 
> expiration and ephemeral node deletion is not an atomic operation. 
> The side-effect of the above zookeeper behavior in Kafka, for certain corner 
> cases, is that ephemeral nodes can be lost even if the session is not 
> expired. The sequence of events that can lead to lossy ephemeral nodes is as 
> follows -
> 1. The session expires on the client, it assumes the ephemeral nodes are 
> deleted, so it establishes a new session with zookeeper and tries to 
> re-create the ephemeral nodes. 
> 2. However, when it tries to re-create the ephemeral node, zookeeper throws 
> back a NodeExists error code. Now this is legitimate during a session 
> disconnect event (since zkclient automatically retries the
> operation and raises a NodeExists error). Also by design, Kafka server 
> doesn't have multiple zookeeper clients create the same ephemeral node, so 
> Kafka server assumes the NodeExists is normal. 
> 3. However, after a few seconds zookeeper deletes that ephemeral node. So 
> from the client's perspective, even though the client has a new valid 
> session, its ephemeral node is gone.
> This behavior is triggered due to very long fsync operations on the zookeeper 
> leader. When the leader wakes up from such a long fsync operation, it has 
> several sessions to expire. And the time between the session expiration and 
> the ephemeral node deletion is magnified. Between these 2 operations, a 
> zookeeper client can issue an ephemeral node creation operation that could've 
> appeared to have succeeded, but the leader later deletes the ephemeral node 
> leading to permanent ephemeral node loss from the client's perspective. 
> Thread from zookeeper mailing list: 
> http://zookeeper.markmail.org/search/?q=Zookeeper+3.3.4#query:Zookeeper%203.3.4%20date%3A201307%20+page:1+mid:zma242a2qgp6gxvx+state:results

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (KAFKA-992) Double Check on Broker Registration to Avoid False NodeExist Exception

2013-08-09 Thread Neha Narkhede (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13734999#comment-13734999
 ] 

Neha Narkhede commented on KAFKA-992:
-

I agree with Guozhang that the logic for getting around the de-registration 
issue is very specific to each path and its semantics. 

+1 on the latest patch.

> Double Check on Broker Registration to Avoid False NodeExist Exception
> --
>
> Key: KAFKA-992
> URL: https://issues.apache.org/jira/browse/KAFKA-992
> Project: Kafka
>  Issue Type: Bug
>Reporter: Neha Narkhede
>Assignee: Guozhang Wang
> Attachments: KAFKA-992.v1.patch, KAFKA-992.v2.patch, 
> KAFKA-992.v3.patch, KAFKA-992.v4.patch, KAFKA-992.v5.patch, 
> KAFKA-992.v6.patch, KAFKA-992.v7.patch
>
>
> The current behavior of zookeeper for ephemeral nodes is that session 
> expiration and ephemeral node deletion is not an atomic operation. 
> The side-effect of the above zookeeper behavior in Kafka, for certain corner 
> cases, is that ephemeral nodes can be lost even if the session is not 
> expired. The sequence of events that can lead to lossy ephemeral nodes is as 
> follows -
> 1. The session expires on the client, it assumes the ephemeral nodes are 
> deleted, so it establishes a new session with zookeeper and tries to 
> re-create the ephemeral nodes. 
> 2. However, when it tries to re-create the ephemeral node, zookeeper throws 
> back a NodeExists error code. Now this is legitimate during a session 
> disconnect event (since zkclient automatically retries the
> operation and raises a NodeExists error). Also by design, Kafka server 
> doesn't have multiple zookeeper clients create the same ephemeral node, so 
> Kafka server assumes the NodeExists is normal. 
> 3. However, after a few seconds zookeeper deletes that ephemeral node. So 
> from the client's perspective, even though the client has a new valid 
> session, its ephemeral node is gone.
> This behavior is triggered due to very long fsync operations on the zookeeper 
> leader. When the leader wakes up from such a long fsync operation, it has 
> several sessions to expire. And the time between the session expiration and 
> the ephemeral node deletion is magnified. Between these 2 operations, a 
> zookeeper client can issue an ephemeral node creation operation that could've 
> appeared to have succeeded, but the leader later deletes the ephemeral node 
> leading to permanent ephemeral node loss from the client's perspective. 
> Thread from zookeeper mailing list: 
> http://zookeeper.markmail.org/search/?q=Zookeeper+3.3.4#query:Zookeeper%203.3.4%20date%3A201307%20+page:1+mid:zma242a2qgp6gxvx+state:results

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (KAFKA-990) Fix ReassignPartitionCommand and improve usability

2013-08-09 Thread Joel Koshy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735120#comment-13735120
 ] 

Joel Koshy commented on KAFKA-990:
--

Can you elaborate on the change to shutdownBroker in KafkaController? I
think we need to include brokers that are already shutting down, because the
previous shutdown attempt may have been incomplete (no other broker in the ISR
for some partition would have prevented leader movement), and subsequent
attempts would now be rejected.

Good catches on the controller failover. I agree with Neha that #2 is not a
problem for replicas that are in the ISR; however, we do need to re-register
the ISR change listener for those replicas.

Finally, we should probably open a separate jira to implement a feature to
cancel an ongoing reassignment, given that it is a long-running operation.
The dry-run option reduces the need for this, but I still think it's a
good feature to support in the future.
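
On the usability side, item 3 of the description below asks the tool to take a 
list of topics and target brokers and generate the plan itself. A simplified, 
purely illustrative round-robin generator (not the tool's or AdminUtils' actual 
assignment logic) might look like this:

// Spread each topic's partitions and their replicas round-robin over the
// target brokers. Hypothetical sketch only.
def generatePlan(partitionsPerTopic: Map[String, Int],
                 targetBrokers: Seq[Int],
                 replicationFactor: Int): Map[(String, Int), Seq[Int]] = {
  require(replicationFactor <= targetBrokers.size, "not enough brokers for the replication factor")
  (for {
    (topic, numPartitions) <- partitionsPerTopic.toSeq
    partition <- 0 until numPartitions
  } yield {
    val start = partition % targetBrokers.size
    // consecutive brokers starting at 'start', wrapping around the broker list
    val replicas = (0 until replicationFactor).map(i => targetBrokers((start + i) % targetBrokers.size))
    (topic, partition) -> replicas
  }).toMap
}

// e.g. generatePlan(Map("clicks" -> 4, "views" -> 2), Seq(1, 2, 3), 2)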



> Fix ReassignPartitionCommand and improve usability
> --
>
> Key: KAFKA-990
> URL: https://issues.apache.org/jira/browse/KAFKA-990
> Project: Kafka
>  Issue Type: Bug
>Reporter: Sriram Subramanian
>Assignee: Sriram Subramanian
> Attachments: KAFKA-990-v1.patch, KAFKA-990-v1-rebased.patch
>
>
> 1. The tool does not register for IsrChangeListener on controller failover.
> 2. There is a race condition where the previous listener can fire on 
> controller failover and the replicas can be in ISR. Even after re-registering 
> the ISR listener after failover, it will never be triggered.
> 3. The input to the tool is a static list, which is very hard to use. To improve 
> this, as a first step the tool needs to take a list of topics and a list of 
> brokers to assign to, and then generate the reassignment plan.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: Kafka/Hadoop consumers and producers

2013-08-09 Thread Ken Goodhope
I just checked and that patch is in the 0.8 branch.   Thanks for working on
backporting it, Andrew.  We'd be happy to commit that work to master.

As for the kafka contrib project vs Camus, they are similar but not quite
identical.  Camus is intended to be a high-throughput ETL for bulk
ingestion of Kafka data into HDFS, whereas what we have in contrib is
more of a simple KafkaInputFormat.  Neither can really replace the other.
If you had a complex hadoop workflow and wanted to introduce some Kafka
data into that workflow, using Camus would be gigantic overkill and a
pain to set up.  On the flip side, if what you want is frequent, reliable
ingest of Kafka data into HDFS, a simple InputFormat doesn't provide you
with that.

I think it would be preferable to simplify the existing contrib
Input/OutputFormats by refactoring them to use the more stable, higher-level
Kafka APIs.  Currently they use the lower-level APIs.  This should make
them easier to maintain, and user-friendly enough to avoid the need for
extensive documentation.
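
For reference, consuming through the 0.8 high-level consumer, the kind of API a 
simplified InputFormat could sit on top of, looks roughly like the sketch below; 
the zookeeper address, group id and topic name are placeholders:

import java.util.Properties
import kafka.consumer.{Consumer, ConsumerConfig}

val props = new Properties()
props.put("zookeeper.connect", "localhost:2181")  // placeholder address
props.put("group.id", "contrib-input-format")     // placeholder group id
props.put("auto.offset.reset", "smallest")
val connector = Consumer.create(new ConsumerConfig(props))

// One stream for the topic; the connector tracks brokers, partitions, offsets
// and rebalances instead of the caller managing SimpleConsumer fetches.
val streams = connector.createMessageStreams(Map("my-topic" -> 1))
for (msg <- streams("my-topic")(0)) {       // blocks waiting for new messages
  val payload: Array[Byte] = msg.message    // hand this to the record reader / mapper
}
connector.shutdown()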

Ken


On Fri, Aug 9, 2013 at 8:52 AM, Andrew Psaltis wrote:

> Dibyendu,
> According to the pull request (https://github.com/linkedin/camus/pull/15), it
> was merged into the camus-kafka-0.8
> branch. I have not checked whether the code was subsequently removed; however,
> at least one of the important files from this patch
> (camus-api/src/main/java/com/linkedin/camus/etl/RecordWriterProvider.java)
> is still present.
>
> Thanks,
> Andrew
>
>
> On Fri, Aug 9, 2013 at 9:39 AM,  wrote:
>
>> Hi Ken,
>>
>> I am also working on making Camus fit for non-Avro messages for our
>> requirements.
>>
>> I see you mentioned this patch (
>> https://github.com/linkedin/camus/commit/87917a2aea46da9d21c8f67129f6463af52f7aa8),
>> which supports a custom data writer for Camus. But this patch has not been
>> pulled into the camus-kafka-0.8 branch. Is there a plan to do so?
>>
>> Regards,
>> Dibyendu
>>
>> --
>> You received this message because you are subscribed to a topic in the
>> Google Groups "Camus - Kafka ETL for Hadoop" group.
>> To unsubscribe from this topic, visit
>> https://groups.google.com/d/topic/camus_etl/KKS6t5-O-Ng/unsubscribe.
>> To unsubscribe from this group and all its topics, send an email to
>> camus_etl+unsubscr...@googlegroups.com.
>> For more options, visit https://groups.google.com/groups/opt_out.
>>
>
>  --
> You received this message because you are subscribed to the Google Groups
> "Camus - Kafka ETL for Hadoop" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to camus_etl+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.
>
>
>


[jira] [Updated] (KAFKA-992) Double Check on Broker Registration to Avoid False NodeExist Exception

2013-08-09 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-992:


Attachment: KAFKA-992.v8.patch

Incremental patch uploaded. Fixed a small issue where Json.parse does not throw 
an exception but instead returns None. System test passed.
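
For context on the kind of handling that implies, here is a hedged sketch only, 
using the standard-library parser as a stand-in for the project's JSON helper 
and a purely illustrative "brokerid" field name: when parsing returns an Option 
instead of throwing, callers deal with None explicitly rather than catching an 
exception.

import scala.util.parsing.json.JSON

// Hypothetical caller: extract a broker id from registration data that may be
// malformed or empty.
def brokerIdFrom(registration: String): Option[Int] =
  JSON.parseFull(registration) match {
    case Some(fields: Map[String, Any] @unchecked) =>
      fields.get("brokerid").map(_.toString.toDouble.toInt) // JSON numbers come back as Double
    case _ =>
      None // parse failure: the caller decides how to react, nothing is thrown
  }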

> Double Check on Broker Registration to Avoid False NodeExist Exception
> --
>
> Key: KAFKA-992
> URL: https://issues.apache.org/jira/browse/KAFKA-992
> Project: Kafka
>  Issue Type: Bug
>Reporter: Neha Narkhede
>Assignee: Guozhang Wang
> Attachments: KAFKA-992.v1.patch, KAFKA-992.v2.patch, 
> KAFKA-992.v3.patch, KAFKA-992.v4.patch, KAFKA-992.v5.patch, 
> KAFKA-992.v6.patch, KAFKA-992.v7.patch, KAFKA-992.v8.patch
>
>
> The current behavior of zookeeper for ephemeral nodes is that session 
> expiration and ephemeral node deletion is not an atomic operation. 
> The side-effect of the above zookeeper behavior in Kafka, for certain corner 
> cases, is that ephemeral nodes can be lost even if the session is not 
> expired. The sequence of events that can lead to lossy ephemeral nodes is as 
> follows -
> 1. The session expires on the client, it assumes the ephemeral nodes are 
> deleted, so it establishes a new session with zookeeper and tries to 
> re-create the ephemeral nodes. 
> 2. However, when it tries to re-create the ephemeral node, zookeeper throws 
> back a NodeExists error code. Now this is legitimate during a session 
> disconnect event (since zkclient automatically retries the
> operation and raises a NodeExists error). Also by design, Kafka server 
> doesn't have multiple zookeeper clients create the same ephemeral node, so 
> Kafka server assumes the NodeExists is normal. 
> 3. However, after a few seconds zookeeper deletes that ephemeral node. So 
> from the client's perspective, even though the client has a new valid 
> session, its ephemeral node is gone.
> This behavior is triggered due to very long fsync operations on the zookeeper 
> leader. When the leader wakes up from such a long fsync operation, it has 
> several sessions to expire. And the time between the session expiration and 
> the ephemeral node deletion is magnified. Between these 2 operations, a 
> zookeeper client can issue an ephemeral node creation operation that could've 
> appeared to have succeeded, but the leader later deletes the ephemeral node 
> leading to permanent ephemeral node loss from the client's perspective. 
> Thread from zookeeper mailing list: 
> http://zookeeper.markmail.org/search/?q=Zookeeper+3.3.4#query:Zookeeper%203.3.4%20date%3A201307%20+page:1+mid:zma242a2qgp6gxvx+state:results

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (KAFKA-992) Double Check on Broker Registration to Avoid False NodeExist Exception

2013-08-09 Thread Neha Narkhede (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735437#comment-13735437
 ] 

Neha Narkhede commented on KAFKA-992:
-

+1 on v8. Good catch!

> Double Check on Broker Registration to Avoid False NodeExist Exception
> --
>
> Key: KAFKA-992
> URL: https://issues.apache.org/jira/browse/KAFKA-992
> Project: Kafka
>  Issue Type: Bug
>Reporter: Neha Narkhede
>Assignee: Guozhang Wang
> Attachments: KAFKA-992.v1.patch, KAFKA-992.v2.patch, 
> KAFKA-992.v3.patch, KAFKA-992.v4.patch, KAFKA-992.v5.patch, 
> KAFKA-992.v6.patch, KAFKA-992.v7.patch, KAFKA-992.v8.patch
>
>
> The current behavior of zookeeper for ephemeral nodes is that session 
> expiration and ephemeral node deletion is not an atomic operation. 
> The side-effect of the above zookeeper behavior in Kafka, for certain corner 
> cases, is that ephemeral nodes can be lost even if the session is not 
> expired. The sequence of events that can lead to lossy ephemeral nodes is as 
> follows -
> 1. The session expires on the client, it assumes the ephemeral nodes are 
> deleted, so it establishes a new session with zookeeper and tries to 
> re-create the ephemeral nodes. 
> 2. However, when it tries to re-create the ephemeral node, zookeeper throws 
> back a NodeExists error code. Now this is legitimate during a session 
> disconnect event (since zkclient automatically retries the
> operation and raises a NodeExists error). Also by design, Kafka server 
> doesn't have multiple zookeeper clients create the same ephemeral node, so 
> Kafka server assumes the NodeExists is normal. 
> 3. However, after a few seconds zookeeper deletes that ephemeral node. So 
> from the client's perspective, even though the client has a new valid 
> session, its ephemeral node is gone.
> This behavior is triggered due to very long fsync operations on the zookeeper 
> leader. When the leader wakes up from such a long fsync operation, it has 
> several sessions to expire. And the time between the session expiration and 
> the ephemeral node deletion is magnified. Between these 2 operations, a 
> zookeeper client can issue an ephemeral node creation operation that could've 
> appeared to have succeeded, but the leader later deletes the ephemeral node 
> leading to permanent ephemeral node loss from the client's perspective. 
> Thread from zookeeper mailing list: 
> http://zookeeper.markmail.org/search/?q=Zookeeper+3.3.4#query:Zookeeper%203.3.4%20date%3A201307%20+page:1+mid:zma242a2qgp6gxvx+state:results

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Closed] (KAFKA-992) Double Check on Broker Registration to Avoid False NodeExist Exception

2013-08-09 Thread Neha Narkhede (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neha Narkhede closed KAFKA-992.
---


> Double Check on Broker Registration to Avoid False NodeExist Exception
> --
>
> Key: KAFKA-992
> URL: https://issues.apache.org/jira/browse/KAFKA-992
> Project: Kafka
>  Issue Type: Bug
>Reporter: Neha Narkhede
>Assignee: Guozhang Wang
> Attachments: KAFKA-992.v1.patch, KAFKA-992.v2.patch, 
> KAFKA-992.v3.patch, KAFKA-992.v4.patch, KAFKA-992.v5.patch, 
> KAFKA-992.v6.patch, KAFKA-992.v7.patch, KAFKA-992.v8.patch
>
>
> The current behavior of zookeeper for ephemeral nodes is that session 
> expiration and ephemeral node deletion is not an atomic operation. 
> The side-effect of the above zookeeper behavior in Kafka, for certain corner 
> cases, is that ephemeral nodes can be lost even if the session is not 
> expired. The sequence of events that can lead to lossy ephemeral nodes is as 
> follows -
> 1. The session expires on the client, it assumes the ephemeral nodes are 
> deleted, so it establishes a new session with zookeeper and tries to 
> re-create the ephemeral nodes. 
> 2. However, when it tries to re-create the ephemeral node, zookeeper throws 
> back a NodeExists error code. Now this is legitimate during a session 
> disconnect event (since zkclient automatically retries the
> operation and raises a NodeExists error). Also by design, Kafka server 
> doesn't have multiple zookeeper clients create the same ephemeral node, so 
> Kafka server assumes the NodeExists is normal. 
> 3. However, after a few seconds zookeeper deletes that ephemeral node. So 
> from the client's perspective, even though the client has a new valid 
> session, its ephemeral node is gone.
> This behavior is triggered due to very long fsync operations on the zookeeper 
> leader. When the leader wakes up from such a long fsync operation, it has 
> several sessions to expire. And the time between the session expiration and 
> the ephemeral node deletion is magnified. Between these 2 operations, a 
> zookeeper client can issue an ephemeral node creation operation that could've 
> appeared to have succeeded, but the leader later deletes the ephemeral node 
> leading to permanent ephemeral node loss from the client's perspective. 
> Thread from zookeeper mailing list: 
> http://zookeeper.markmail.org/search/?q=Zookeeper+3.3.4#query:Zookeeper%203.3.4%20date%3A201307%20+page:1+mid:zma242a2qgp6gxvx+state:results

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (KAFKA-990) Fix ReassignPartitionCommand and improve usability

2013-08-09 Thread Joel Koshy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735653#comment-13735653
 ] 

Joel Koshy commented on KAFKA-990:
--

Looks like I might have looked at the wrong patch. I'll review this again this 
weekend.

> Fix ReassignPartitionCommand and improve usability
> --
>
> Key: KAFKA-990
> URL: https://issues.apache.org/jira/browse/KAFKA-990
> Project: Kafka
>  Issue Type: Bug
>Reporter: Sriram Subramanian
>Assignee: Sriram Subramanian
> Attachments: KAFKA-990-v1.patch, KAFKA-990-v1-rebased.patch
>
>
> 1. The tool does not register for IsrChangeListener on controller failover.
> 2. There is a race condition where the previous listener can fire on 
> controller failover and the replicas can be in ISR. Even after re-registering 
> the ISR listener after failover, it will never be triggered.
> 3. The input to the tool is a static list, which is very hard to use. To improve 
> this, as a first step the tool needs to take a list of topics and a list of 
> brokers to assign to, and then generate the reassignment plan.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: Kafka/Hadoop consumers and producers

2013-08-09 Thread dibyendu . bhattacharya
Hi Ken, 

I am also working on making Camus fit for non-Avro messages for our 
requirements. 

I see you mentioned this patch 
(https://github.com/linkedin/camus/commit/87917a2aea46da9d21c8f67129f6463af52f7aa8),
which supports a custom data writer for Camus. But this patch has not been pulled 
into the camus-kafka-0.8 branch. Is there a plan to do so?

Regards, 
Dibyendu