librdkafka 0.8.0 released

2013-11-22 Thread Magnus Edenhill
This announces the 0.8.0 release of librdkafka - The Apache Kafka client C
library - now with 0.8 protocol support.

Features:
* Producer (~800K msgs/s)
* Consumer  (~3M msgs/s)
* Compression (Snappy, gzip)
* Proper failover and leader re-election support - no message is ever lost.
* Configuration properties compatible with official Apache Kafka.
* Stabilized ABI-safe API
* Mainline Debian package submitted
* Production quality


Home:
https://github.com/edenhill/librdkafka

Introduction and performance numbers:
https://github.com/edenhill/librdkafka/blob/master/INTRODUCTION.md

Have fun.

Regards,
Magnus

P.S.
Check out Wikimedia Foundation's varnishkafka daemon for a use case -
varnish log forwarding over Kafka:
https://github.com/wikimedia/varnishkafka


Re: librdkafka 0.8.0 released

2013-11-22 Thread Neha Narkhede
Thanks for sharing this! What is the message size for the throughput
numbers stated below?

Thanks,
Neha
On Nov 22, 2013 6:59 AM, "Magnus Edenhill"  wrote:

> This announces the 0.8.0 release of librdkafka - The Apache Kafka client C
> library - now with 0.8 protocol support.
>
> Features:
> * Producer (~800K msgs/s)
> * Consumer  (~3M msgs/s)
> * Compression (Snappy, gzip)
> * Proper failover and leader re-election support - no message is ever lost.
> * Configuration properties compatible with official Apache Kafka.
> * Stabilized ABI-safe API
> * Mainline Debian package submitted
> * Production quality
>
>
> Home:
> https://github.com/edenhill/librdkafka
>
> Introduction and performance numbers:
> https://github.com/edenhill/librdkafka/blob/master/INTRODUCTION.md
>
> Have fun.
>
> Regards,
> Magnus
>
> P.S.
> Check out Wikimedia Foundation's varnishkafka daemon for a use case -
> varnish log forwarding over Kafka:
> https://github.com/wikimedia/varnishkafka
>


[jira] [Updated] (KAFKA-1136) Add subAppend in Log4jAppender for generic usage

2013-11-22 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao updated KAFKA-1136:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks for the patch. +1. Committed to trunk.

> Add subAppend in Log4jAppender for generic usage
> 
>
> Key: KAFKA-1136
> URL: https://issues.apache.org/jira/browse/KAFKA-1136
> Project: Kafka
>  Issue Type: Improvement
>  Components: producer 
>Reporter: Jie Huang
>Assignee: Jun Rao
>Priority: Trivial
> Fix For: 0.8.1
>
> Attachments: KAFKA-1136.diff
>
>
> KafkaLog4jAppender makes it easy for us to send our log4j logs to Kafka. 
> However, in our experience it is not so convenient to customize the message 
> content before it is emitted. Sometimes we need to decorate the message, for 
> example by adding more system-level information, before passing it to the 
> producer. I wonder if it is possible to add a subAppend() function, as 
> org.apache.log4j.WriterAppender does. The end user could then customize the 
> message by overriding subAppend() in their own subclass and reuse the rest 
> as-is.
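For illustration, a minimal sketch of how such a hook could be used once it
exists. The subclass name is made up and the subAppend() signature is an
assumption modeled on org.apache.log4j.WriterAppender; the method actually
added by KAFKA-1136 may differ.

import org.apache.log4j.spi.LoggingEvent;
import kafka.producer.KafkaLog4jAppender;

// Hypothetical subclass: decorate every message with system-level information
// before it is handed to the Kafka producer. Assumes the parent class exposes
// a protected String subAppend(LoggingEvent) hook, as proposed in this issue.
public class DecoratingKafkaAppender extends KafkaLog4jAppender {
    @Override
    protected String subAppend(LoggingEvent event) {
        // Example decoration: prepend the host name to the formatted message.
        String host = System.getProperty("host.name", "unknown-host");
        String body = (layout != null) ? layout.format(event) : event.getRenderedMessage();
        return host + " | " + body;
    }
}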





[jira] [Resolved] (KAFKA-1103) Consumer uses two zkclients

2013-11-22 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao resolved KAFKA-1103.


Resolution: Fixed

Thanks for the patch. +1. Committed to trunk after removing the unnecessary 
change to the log4j property file.

> Consumer uses two zkclients
> ---
>
> Key: KAFKA-1103
> URL: https://issues.apache.org/jira/browse/KAFKA-1103
> Project: Kafka
>  Issue Type: Bug
>Reporter: Joel Koshy
>Assignee: Guozhang Wang
> Fix For: 0.8.1
>
> Attachments: KAFKA-1103.patch, KAFKA-1103_2013-11-20_12:59:09.patch, 
> KAFKA-1103_2013-11-21_11:22:04.patch
>
>
> .. which is very confusing when debugging consumer logs. I don't remember any 
> good reason for this, and we should get rid of the one instantiated in 
> ZookeeperTopicEventWatcher if possible.





[jira] [Commented] (KAFKA-1103) Consumer uses two zkclients

2013-11-22 Thread Joel Koshy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830186#comment-13830186
 ] 

Joel Koshy commented on KAFKA-1103:
---

I think the issue is that the patch-review tool diffs the feature branch against 
the origin branch at its current HEAD. We should instead have it take the diff 
from the last rebase point (i.e., the merge-base with origin). I'll file a JIRA 
and provide a patch for that.
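As a rough illustration of "take the diff from the last rebase point", the
sketch below computes the merge-base with origin/trunk and diffs from there,
so only the feature branch's own commits end up in the patch. It is written in
Java only for consistency with the other examples in this digest;
kafka-patch-review.py itself is a Python script, and the branch name here is
just an assumption.

import java.io.BufferedReader;
import java.io.InputStreamReader;

public class DiffFromDivergence {
    // Run a command and capture its combined stdout/stderr.
    static String run(String... cmd) throws Exception {
        Process p = new ProcessBuilder(cmd).redirectErrorStream(true).start();
        StringBuilder out = new StringBuilder();
        try (BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) out.append(line).append('\n');
        }
        p.waitFor();
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Last point where this branch diverged from origin/trunk.
        String base = run("git", "merge-base", "origin/trunk", "HEAD").trim();
        // Diff only what was committed on this branch since that point.
        System.out.println(run("git", "diff", base, "HEAD"));
    }
}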

> Consumer uses two zkclients
> ---
>
> Key: KAFKA-1103
> URL: https://issues.apache.org/jira/browse/KAFKA-1103
> Project: Kafka
>  Issue Type: Bug
>Reporter: Joel Koshy
>Assignee: Guozhang Wang
> Fix For: 0.8.1
>
> Attachments: KAFKA-1103.patch, KAFKA-1103_2013-11-20_12:59:09.patch, 
> KAFKA-1103_2013-11-21_11:22:04.patch
>
>
> .. which is very confusing when debugging consumer logs. I don't remember any 
> good reason for this, and we should get rid of the one instantiated in 
> ZookeeperTopicEventWatcher if possible.





[jira] [Created] (KAFKA-1142) Patch review tool should take diff with origin from last divergent point

2013-11-22 Thread Joel Koshy (JIRA)
Joel Koshy created KAFKA-1142:
-

 Summary: Patch review tool should take diff with origin from last 
divergent point
 Key: KAFKA-1142
 URL: https://issues.apache.org/jira/browse/KAFKA-1142
 Project: Kafka
  Issue Type: Bug
Reporter: Joel Koshy
Assignee: Joel Koshy








Review Request 15793: Patch for KAFKA-1142

2013-11-22 Thread joel koshy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15793/
---

Review request for kafka.


Bugs: KAFKA-1142
https://issues.apache.org/jira/browse/KAFKA-1142


Repository: kafka


Description
---

Take diff from last divergent point


Diffs
-

  kafka-patch-review.py 7fa6cb5165d0d497ec3004dc2c98b60fb8d0436d 

Diff: https://reviews.apache.org/r/15793/diff/


Testing
---


Thanks,

joel koshy



[jira] [Commented] (KAFKA-1142) Patch review tool should take diff with origin from last divergent point

2013-11-22 Thread Joel Koshy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830191#comment-13830191
 ] 

Joel Koshy commented on KAFKA-1142:
---

Created reviewboard  against branch origin/trunk

> Patch review tool should take diff with origin from last divergent point
> 
>
> Key: KAFKA-1142
> URL: https://issues.apache.org/jira/browse/KAFKA-1142
> Project: Kafka
>  Issue Type: Bug
>Reporter: Joel Koshy
>Assignee: Joel Koshy
> Attachments: KAFKA-1142.patch
>
>






[jira] [Updated] (KAFKA-1142) Patch review tool should take diff with origin from last divergent point

2013-11-22 Thread Joel Koshy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Koshy updated KAFKA-1142:
--

Attachment: KAFKA-1142.patch

> Patch review tool should take diff with origin from last divergent point
> 
>
> Key: KAFKA-1142
> URL: https://issues.apache.org/jira/browse/KAFKA-1142
> Project: Kafka
>  Issue Type: Bug
>Reporter: Joel Koshy
>Assignee: Joel Koshy
> Attachments: KAFKA-1142.patch
>
>






[jira] [Commented] (KAFKA-1142) Patch review tool should take diff with origin from last divergent point

2013-11-22 Thread Joel Koshy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830193#comment-13830193
 ] 

Joel Koshy commented on KAFKA-1142:
---

I got: rbtools.api.errors.APIError: HTTP 502
(from my laptop)
Can someone else try the above patch?

> Patch review tool should take diff with origin from last divergent point
> 
>
> Key: KAFKA-1142
> URL: https://issues.apache.org/jira/browse/KAFKA-1142
> Project: Kafka
>  Issue Type: Bug
>Reporter: Joel Koshy
>Assignee: Joel Koshy
> Attachments: KAFKA-1142.patch
>
>






[jira] [Commented] (KAFKA-1142) Patch review tool should take diff with origin from last divergent point

2013-11-22 Thread Joel Koshy (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830200#comment-13830200
 ] 

Joel Koshy commented on KAFKA-1142:
---

Actually, it did post a diff:
https://reviews.apache.org/r/15793/diff/
Not sure why it did not succeed in posting the link to the JIRA.

I used this jira to test:
- Checked out a much older git hash
- Applied the above patch
- The patch review tool now only shows the diff from that older git hash (i.e., 
the divergent point)

I think it is good to do this on 0.8 as well - just to avoid patches that 
accidentally delete old commits.


> Patch review tool should take diff with origin from last divergent point
> 
>
> Key: KAFKA-1142
> URL: https://issues.apache.org/jira/browse/KAFKA-1142
> Project: Kafka
>  Issue Type: Bug
>Reporter: Joel Koshy
>Assignee: Joel Koshy
> Attachments: KAFKA-1142.patch
>
>






[jira] [Created] (KAFKA-1143) Consumer should cache topic partition info

2013-11-22 Thread Guozhang Wang (JIRA)
Guozhang Wang created KAFKA-1143:


 Summary: Consumer should cache topic partition info
 Key: KAFKA-1143
 URL: https://issues.apache.org/jira/browse/KAFKA-1143
 Project: Kafka
  Issue Type: Bug
Reporter: Guozhang Wang
Assignee: Guozhang Wang
 Fix For: 0.8.1


So that:

1. It can check whether a rebalance is actually necessary when the topic/partition 
watcher fires (watchers can fire on a state-change event even when the data has 
not changed at all).

2. A rebalance does not need to read from ZK again.
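A minimal sketch of the caching idea (illustrative only; the class and method
names are made up and this is not the actual consumer code):

import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

public class TopicPartitionCache {
    private final Map<String, List<Integer>> partitionsByTopic = new ConcurrentHashMap<>();

    // Called when the topic/partition watcher fires: store the new view and
    // report whether it actually differs from the cached one, so a pure
    // state-change event (same data) does not trigger a rebalance.
    public boolean updateAndCheckChanged(String topic, List<Integer> partitionsFromZk) {
        List<Integer> previous = partitionsByTopic.put(topic, partitionsFromZk);
        return !Objects.equals(previous, partitionsFromZk);
    }

    // Served from the cache during a rebalance instead of re-reading ZooKeeper.
    public List<Integer> cachedPartitions(String topic) {
        return partitionsByTopic.get(topic);
    }
}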





Review Request 15805: Patch for KAFKA-1140

2013-11-22 Thread Guozhang Wang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15805/
---

Review request for kafka.


Bugs: KAFKA-1140
https://issues.apache.org/jira/browse/KAFKA-1140


Repository: kafka


Description
---

KAFKA-1140.v1


Dummy


Diffs
-

  core/src/main/scala/kafka/consumer/ConsumerIterator.scala 
a4227a49684c7de08e07cb1f3a10d2f76ba28da7 

Diff: https://reviews.apache.org/r/15805/diff/


Testing
---


Thanks,

Guozhang Wang



[jira] [Updated] (KAFKA-1140) Move the decoding logic from ConsumerIterator.makeNext to next

2013-11-22 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-1140:
-

Attachment: KAFKA-1140.patch

> Move the decoding logic from ConsumerIterator.makeNext to next
> --
>
> Key: KAFKA-1140
> URL: https://issues.apache.org/jira/browse/KAFKA-1140
> Project: Kafka
>  Issue Type: Bug
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
> Fix For: 0.8.1
>
> Attachments: KAFKA-1140.patch
>
>
> Usually people write code around the consumer like:
> while (iter.hasNext()) {
>   try {
>     msg = iter.next();
>     // do something
>   } catch (Exception e) {
>     // handle the failure
>   }
> }
> 
> However, the iter.hasNext() call itself can throw exceptions due to decoding 
> failures. It would be better to move the decoding into the next() call.
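For context, a sketch of the consumption pattern this change enables; the
stream type, the byte[] payload, and the process() helper are placeholders,
not part of the patch:

import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.message.MessageAndMetadata;

public class SafeConsumeLoop {
    public static void consume(KafkaStream<byte[], byte[]> stream) {
        ConsumerIterator<byte[], byte[]> iter = stream.iterator();
        while (iter.hasNext()) {
            try {
                // Once decoding happens in next(), a malformed message surfaces
                // here, inside the try, so the loop can skip it and keep going
                // instead of dying in hasNext().
                MessageAndMetadata<byte[], byte[]> msg = iter.next();
                process(msg.message());
            } catch (RuntimeException e) {
                System.err.println("Skipping bad message: " + e);
            }
        }
    }

    private static void process(byte[] payload) {
        // application-specific handling
    }
}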





[jira] [Commented] (KAFKA-1140) Move the decoding logic from ConsumerIterator.makeNext to next

2013-11-22 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830373#comment-13830373
 ] 

Guozhang Wang commented on KAFKA-1140:
--

Created reviewboard https://reviews.apache.org/r/15805/
 against branch origin/trunk

> Move the decoding logic from ConsumerIterator.makeNext to next
> --
>
> Key: KAFKA-1140
> URL: https://issues.apache.org/jira/browse/KAFKA-1140
> Project: Kafka
>  Issue Type: Bug
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
> Fix For: 0.8.1
>
> Attachments: KAFKA-1140.patch
>
>
> Usually people write code around the consumer like:
> while (iter.hasNext()) {
>   try {
>     msg = iter.next();
>     // do something
>   } catch (Exception e) {
>     // handle the failure
>   }
> }
> 
> However, the iter.hasNext() call itself can throw exceptions due to decoding 
> failures. It would be better to move the decoding into the next() call.





Re: Kafka/Hadoop consumers and producers

2013-11-22 Thread Abhi Basu
I agree with you. We are looking for a simple solution to get data from Kafka 
into Hadoop. I tried Camus earlier (non-Avro), but the documentation is too 
sparse to get it working correctly, and we would rather not introduce another 
component into the solution. In the meantime, can the Kafka Hadoop 
consumer/producer be documented well enough that we can try it out ASAP? :) Thanks.

On Friday, August 9, 2013 12:27:12 PM UTC-7, Ken Goodhope wrote:
>
> I just checked and that patch is in the 0.8 branch. Thanks for working on 
> backporting it, Andrew. We'd be happy to commit that work to master.
>
> As for the kafka contrib project vs. Camus, they are similar but not quite 
> identical. Camus is intended to be a high-throughput ETL for bulk ingestion 
> of Kafka data into HDFS, whereas what we have in contrib is more of a simple 
> KafkaInputFormat. Neither can really replace the other. If you had a complex 
> Hadoop workflow and wanted to introduce some Kafka data into that workflow, 
> using Camus would be gigantic overkill and a pain to set up. On the flip 
> side, if what you want is frequent, reliable ingest of Kafka data into HDFS, 
> a simple InputFormat doesn't provide you with that.
>
> I think it would be preferable to simplify the existing contrib 
> Input/OutputFormats by refactoring them to use the more stable, higher-level 
> Kafka APIs. Currently they use the lower-level APIs. This should make them 
> easier to maintain, and user-friendly enough to avoid the need for extensive 
> documentation.
>
> Ken
>
>
> On Fri, Aug 9, 2013 at 8:52 AM, Andrew Psaltis wrote:
>
>> Dibyendu,
>> According to the pull request 
>> https://github.com/linkedin/camus/pull/15 
>> it was merged into the camus-kafka-0.8 branch. I have not checked whether 
>> the code was subsequently removed; however, at least one of the important 
>> files from this patch 
>> (camus-api/src/main/java/com/linkedin/camus/etl/RecordWriterProvider.java) 
>> is still present.
>>
>> Thanks,
>> Andrew
>>
>>
>> On Fri, Aug 9, 2013 at 9:39 AM, Dibyendu wrote:
>>
>>>  Hi Ken,
>>>
>>> I am also working on making Camus fit non-Avro messages for our 
>>> requirements.
>>>
>>> I see you mentioned this patch (
>>> https://github.com/linkedin/camus/commit/87917a2aea46da9d21c8f67129f6463af52f7aa8)
>>> which supports a custom data writer for Camus. But this patch has not been 
>>> pulled into the camus-kafka-0.8 branch. Is there any plan to do so?
>>>
>>> Regards,
>>> Dibyendu
>>>
>>>
>>
>
>
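To illustrate the "more stable, higher-level Kafka APIs" Ken mentions above,
here is a minimal sketch of the 0.8 high-level consumer used from Java, the
kind of thing a simplified contrib KafkaInputFormat could wrap. The ZooKeeper
address, group id, and topic name are placeholders.

import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

public class HighLevelConsumerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "localhost:2181"); // placeholder
        props.put("group.id", "hadoop-loader");           // placeholder
        props.put("auto.offset.reset", "smallest");

        // The high-level consumer tracks brokers, partitions and offsets itself,
        // so the caller mostly just drains the returned streams.
        ConsumerConnector connector =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
        Map<String, List<KafkaStream<byte[], byte[]>>> streams =
                connector.createMessageStreams(Collections.singletonMap("page_views", 1));

        for (KafkaStream<byte[], byte[]> stream : streams.get("page_views")) {
            // iterate stream.iterator() and write each record out (e.g., to HDFS)
        }

        connector.shutdown();
    }
}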