Re: Unable to build kafka.

2014-06-05 Thread Abhishek Bhattacharjee
Thanks for the reply. I rebuilt kafka. And now when I am running the server it stops, saying it cannot find the class Kafka.Kafka. I don't know what went wrong; most probably it's an sbt problem. Any help is welcome. Thanks :-) On Jun 5, 2014 6:05 AM, "Joe Stein" wrote: > The build looks fine your err

Re: Unable to build kafka.

2014-06-05 Thread Joe Stein
Can you try the latest stable branch, please? We moved to gradle https://github.com/apache/kafka/blob/0.8.1/README.md and also distribute binaries as artifacts for download https://kafka.apache.org/downloads.html /*** Joe Stein Founder, Principal Consulta

Re: Unable to build kafka.

2014-06-05 Thread Abhishek Bhattacharjee
I have a project running on kafka 0.8.0. Are there any changes in the Producer and Consumer APIs in the new version? Thanks. On Jun 5, 2014 3:53 PM, "Joe Stein" wrote: > Can you try the latest stable branch, please? > > We moved to gradle https://github.com/apache/kafka/blob/0.8.1/README.md > an

Re: Unable to build kafka.

2014-06-05 Thread Joe Stein
There are overloaded changes. You don't have to use them (e.g. key is now used in compaction with partKey for the partition key https://github.com/apache/kafka/commit/d285e263bf403f3db27f5d138594c395643a2284#diff-9df10051217ceff96ecce28d087cf0bb for new compaction feature https://cwiki.apache.org/c
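A minimal sketch of the overload being referenced, assuming the 0.8.1 kafka.producer.KeyedMessage constructors from the linked commit (topic, key, and payload values below are placeholders):

    import kafka.producer.KeyedMessage;

    public class KeyedMessageOverloads {
        public static void main(String[] args) {
            // Pre-0.8.1 style: the key drives partitioning and, with log compaction
            // enabled, is also the compaction key.
            KeyedMessage<String, String> withKeyOnly =
                new KeyedMessage<String, String>("my-topic", "some-key", "payload");

            // 0.8.1 overload: key remains the compaction key, while partKey (third
            // argument) is used only to pick the partition.
            KeyedMessage<String, String> withPartKey =
                new KeyedMessage<String, String>("my-topic", "some-key", "partition-key", "payload");
        }
    }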

RE: question about synchronous producer

2014-06-05 Thread Libo Yu
I want to know why there will be message loss when brokers are down for too long. I've noticed message loss when brokers are restarted during publishing. It is a sync producer with request.required.acks set to 1. Libo > Date: Thu, 29 May 2014 20:11:48 -0700 > Subject: Re: question about synchro

Re: Hadoop Summit Meetups

2014-06-05 Thread Jun Rao
It sounds like you want to write to a data store and a data pipe atomically. Since both the data store and the data pipe that you want to use are highly available, the only case that you want to protect against is the client failing between the two writes. One way to do that is to let the client publish t

Re: Hadoop Summit Meetups

2014-06-05 Thread Nagesh
As Jun Rao said, it is pretty much possible for multiple publishers to publish to a topic, and different groups of consumers can consume a message and apply group-specific logic, for example raw data processing, aggregation, etc. Each distinguished group will receive a copy. But the offset cannot be used UUI

Re: [DISCUSS] Kafka Security Specific Features

2014-06-05 Thread Rajasekar Elango
Hi Jay, Thanks for putting together a spec for security. Joe, it looks like the "Securing zookeeper.." part has been deleted from the assumptions section. Communication with zookeeper needs to be secured as well to make the entire kafka cluster secure. It may or may not require changes to kafka. But it's good to hav

Re: [DISCUSS] Kafka Security Specific Features

2014-06-05 Thread Joe Stein
Raja, you need to sign an ICLA http://www.apache.org/licenses/icla.txt once that is on file your user can get permed to contribute. I think securing communication to "offset & broker management source" which can be a zookeeper implementation is important. I will elaborate more on that with the ot

Re: question about synchronous producer

2014-06-05 Thread Guozhang Wang
When the producer has exhausted all the retries it will drop the message on the floor. So when the broker is down for too long there will be data loss. Guozhang On Thu, Jun 5, 2014 at 6:20 AM, Libo Yu wrote: > I want to know why there will be message loss when brokers are down for > too long. > I'

Re: Unable to build kafka.

2014-06-05 Thread Abhishek Bhattacharjee
Thanks a lot for your reply. I'll migrate later :-D . I'll use 0.8.0 for now. Thanks. On Jun 5, 2014 6:03 PM, "Joe Stein" wrote: > There are overloaded changes. You don't have to use them (e.g. key is now > used in compaction with partKey for the partition key > > https://github.com/apache

RE: question about synchronous producer

2014-06-05 Thread Libo Yu
When all the brokers are down the producer should retry a few times and then throw FailedToSendMessageException, and user code can catch the exception and retry after a backoff. However, in my tests, no exception was caught and the message was lost silently. My broker is 0.8.1.1 and my client is
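A minimal sketch of the setup being discussed, assuming the 0.8 sync producer API (broker list, topic, and payload are placeholders); FailedToSendMessageException only surfaces after message.send.max.retries is exhausted, and any further retrying is up to the caller:

    import java.util.Properties;

    import kafka.common.FailedToSendMessageException;
    import kafka.javaapi.producer.Producer;
    import kafka.producer.KeyedMessage;
    import kafka.producer.ProducerConfig;

    public class SyncProducerExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("metadata.broker.list", "broker1:9092,broker2:9092"); // placeholder brokers
            props.put("serializer.class", "kafka.serializer.StringEncoder");
            props.put("producer.type", "sync");
            props.put("request.required.acks", "1");
            props.put("message.send.max.retries", "3");  // send() retries internally this many times
            props.put("retry.backoff.ms", "100");

            Producer<String, String> producer =
                new Producer<String, String>(new ProducerConfig(props));
            try {
                producer.send(new KeyedMessage<String, String>("test-topic", "hello"));
            } catch (FailedToSendMessageException e) {
                // Raised only after all internal retries fail; the message is not kept
                // anywhere, so the caller has to back off and resend it explicitly.
                System.err.println("send failed after retries: " + e.getMessage());
            } finally {
                producer.close();
            }
        }
    }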

Re: Hadoop Summit Meetups

2014-06-05 Thread Neha Narkhede
Jonathan, A third last-resort pattern might be to go the CDC route with something like Databus. This would require implementing additional fetchers and relays to support Cassandra and MongoDB. Also the data will need to be transformed on the Hadoop/Spark side for virtually every learning applicatio

Re: question about synchronous producer

2014-06-05 Thread Guozhang Wang
Libo, For clarification, can you reproduce this issue with a sync producer? Guozhang On Thu, Jun 5, 2014 at 10:03 AM, Libo Yu wrote: > When all the brokers are down the producer should retry for a few times > and throw FailedToSendMessageException. And user code can catch the > exception and

Re: are consumer offsets stored in a log?

2014-06-05 Thread Dennis Haller
This will force a rewrite of those monitoring and UI tools that read offsets directly from Zookeeper to get lag information for reporting on consumer clients. It seems a good thing to know this is coming down the pipe. Dennis On Wed, Jun 4, 2014 at 6:50 PM, Neha Narkhede wrote:

RE: question about synchronous producer

2014-06-05 Thread Libo Yu
Yes. I used three sync producers with request.required.acks=1. I let them publish 2k short messages, and in the process I restarted all zookeeper and kafka processes (3 hosts in a cluster). Normally there will be message loss after 3 restarts. After 3 restarts, I use a consumer to retrieve the mes

Message details

2014-06-05 Thread Achanta Vamsi Subhash
Hi, We are experimenting with Kafka for an MQ use-case. We found it very useful but couldn't find the following info in the documentation: I have consumer logic which can tell that consumption of a message failed. Is there any way I can remove the message from that partition and put it in another topic?

Re: [DISCUSS] Kafka Security Specific Features

2014-06-05 Thread Jay Kreps
Hey Joe, I don't really understand the sections you added to the wiki. Can you clarify them? Is non-repudiation what SASL would call integrity checks? If so, don't SSL and many of the SASL schemes already support this, as well as on-the-wire encryption? Or are you proposing an on-disk encrypti

Re: [DISCUSS] Kafka Security Specific Features

2014-06-05 Thread Todd Palino
No, at-rest encryption is definitely important. When you start talking about data that is used for financial reporting, restricting access to it (both modification and visibility) is a critical component. -Todd On 6/5/14, 2:01 PM, "Jay Kreps" wrote: >Hey Joe, > >I don't really understand the s

Re: [DISCUSS] Kafka Security Specific Features

2014-06-05 Thread Jay Kreps
Hey Todd, Can you elaborate on this? Certainly restricting access to and modification of data is important. But this doesn't imply storing the data encrypted. Are we assuming the attacker can (1) get on the network, (2) get on the kafka server as a non-root and non-kafka user or (3) get root on th

Re: [DISCUSS] Kafka Security Specific Features

2014-06-05 Thread Todd Palino
My concern is specifically around the rules for SOX compliance, or rules around PII, PCI, or HIPAA compliance. The audits get very complicated, but my understanding is that the general rule is that sensitive data should be encrypted at rest and only decrypted when needed. And we don't just need to

Re: Message details

2014-06-05 Thread Guozhang Wang
Hi Achanta, Your use case is quite interesting. If I understand correctly, you want to use a transaction that atomically consumes a message from a partition and sends it to another partition, correct? I presume that by saying "remove the message from that partition" you actually mean to skip consum
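A rough sketch of the pattern being discussed, assuming the 0.8 high-level consumer and javaapi producer (topic names, connection strings, and the process() check are placeholders); note there is no atomicity between consuming the message and republishing it:

    import java.util.Collections;
    import java.util.List;
    import java.util.Map;
    import java.util.Properties;

    import kafka.consumer.Consumer;
    import kafka.consumer.ConsumerConfig;
    import kafka.consumer.KafkaStream;
    import kafka.javaapi.consumer.ConsumerConnector;
    import kafka.javaapi.producer.Producer;
    import kafka.message.MessageAndMetadata;
    import kafka.producer.KeyedMessage;
    import kafka.producer.ProducerConfig;

    public class FailedMessageForwarder {
        public static void main(String[] args) {
            Properties cProps = new Properties();
            cProps.put("zookeeper.connect", "zk1:2181");   // placeholder
            cProps.put("group.id", "mq-experiment");       // placeholder
            ConsumerConnector connector =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(cProps));

            Properties pProps = new Properties();
            pProps.put("metadata.broker.list", "broker1:9092"); // placeholder
            Producer<byte[], byte[]> producer =
                new Producer<byte[], byte[]>(new ProducerConfig(pProps));

            Map<String, List<KafkaStream<byte[], byte[]>>> streams =
                connector.createMessageStreams(Collections.singletonMap("main-topic", 1));
            KafkaStream<byte[], byte[]> stream = streams.get("main-topic").get(0);

            for (MessageAndMetadata<byte[], byte[]> record : stream) {
                if (!process(record.message())) {
                    // There is no atomic "move": publish a copy to the retry topic and
                    // keep going; the original stays in the source log until retention expires.
                    producer.send(new KeyedMessage<byte[], byte[]>("retry-topic", record.message()));
                }
            }
        }

        // Stand-in for the real consumption logic that decides success/failure.
        private static boolean process(byte[] payload) {
            return payload != null && payload.length > 0;
        }
    }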

Kafka-Storm Run-time Exception

2014-06-05 Thread Abhishek Bhattacharjee
I am using kafka with storm. I am using maven to build my topology and I am using scala 2.9.2, the same as kafka_2.9.2-0.8.1. The topology builds perfectly using maven. But when I submit the topology to storm I get the following Exception: java.lang.NoSuchMethodError: scala.Predef$.int2Integer(I)

Re: question about synchronous producer

2014-06-05 Thread Guozhang Wang
Libo, did you see any exception/error entries on the producer log? Guozhang On Thu, Jun 5, 2014 at 10:33 AM, Libo Yu wrote: > Yes. I used three sync producers with request.required.acks=1. I let them > publish 2k short messages and in the process I restart all zookeeper and > kafka processes (

Re: Kafka-Storm Run-time Exception

2014-06-05 Thread Andrew Neilson
It's possible you have some other dependency using an earlier version of Scala. A common one to check for when using Kafka is jline 0.9.94, which comes through the zookeeper 3.3.4 dependency included with kafka_2.9.2-0.8.1 and has more than one dependency that uses scala 2.8.x. If this is where it
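If the dependency tree does show a conflicting artifact coming in transitively, one way to act on this advice is a Maven exclusion on the kafka dependency; a sketch, assuming the org.apache.kafka:kafka_2.9.2:0.8.1 coordinates (adjust to whatever mvn dependency:tree actually reports):

    <dependency>
      <groupId>org.apache.kafka</groupId>
      <artifactId>kafka_2.9.2</artifactId>
      <version>0.8.1</version>
      <exclusions>
        <!-- drop the jline that arrives via zookeeper 3.3.4 so it cannot pull
             an older scala onto the Storm worker classpath -->
        <exclusion>
          <groupId>jline</groupId>
          <artifactId>jline</artifactId>
        </exclusion>
      </exclusions>
    </dependency>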

RE: question about synchronous producer

2014-06-05 Thread Libo Yu
Not really. The issue was reported by a client. I added a lot of logging to make sure no exception was thrown from send() when the message was lost. It is not hard to reproduce. This is a critical issue for operation. It may not be possible for brokers and producers to be restarted at the same

Re: Hadoop Summit Meetups

2014-06-05 Thread Jun Rao
The offset of a message in Kafka never changes. Thanks, Jun On Thu, Jun 5, 2014 at 8:27 AM, Nagesh wrote: > As Junn Rao said, it is pretty much possible multiple publishers publishes > to a topic and different group of consumers can consume a message and apply > group specific logic example r

Re: Message details

2014-06-05 Thread Nagesh
Hi, As per AMQP standards 0.9/1.0, any messaging system for that matter is just a pipe (or pipes) that allows multiple producers to publish messages and allows multiple pointers (a pointer per group) to consume messages. It is up to the messaging system to discard the message on expiry. As the message is share

Trouble with snappy and SimpleConsumer

2014-06-05 Thread Vinay Gupta
Hi, I am using kafka_2.9.2-0.8.1 and snappy-java-1.1.0.1.jar. I have been able to successfully use gzip with the same library; however “snappy” doesn’t work on the consumer side. The producer is able to send snappy messages to the broker though. I have made sure that the snappy java lib is the same on both
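For context, a minimal sketch of how the producer side would be configured for snappy with the 0.8 producer API (broker address and topic are placeholders); the consumer JVM also needs snappy-java on its classpath, since fetched message sets are decompressed in the client:

    import java.util.Properties;

    import kafka.javaapi.producer.Producer;
    import kafka.producer.KeyedMessage;
    import kafka.producer.ProducerConfig;

    public class SnappyProducerExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("metadata.broker.list", "broker1:9092");               // placeholder broker
            props.put("serializer.class", "kafka.serializer.StringEncoder");
            props.put("compression.codec", "snappy");                        // "gzip" is configured the same way

            Producer<String, String> producer =
                new Producer<String, String>(new ProducerConfig(props));
            producer.send(new KeyedMessage<String, String>("test-topic", "snappy-compressed payload"));
            producer.close();
        }
    }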

Re: Kafka-Storm Run-time Exception

2014-06-05 Thread Abhishek Bhattacharjee
Hi, thanks for the reply. I tried the solution, but it is not working; I am getting the same Exception. Here's the output of: >> mvn dependency:tree | grep "scala" [INFO] | +- org.scala-lang:scala-compiler:jar:2.9.2:compile [INFO] +- org.scala-lang:scala-library:jar:2.9.2:compile I don't