Application Logic: In Kafka, Storm or Redis?
I have an application that receives time-series data. I feed the data to Kafka, and Kafka in turn passes it to Storm for real-time processing. One of my use cases is that the data may arrive with some lag. For example, I might not get all the data for 2:00:00 PM at once, and the application has to wait for all of it to arrive before it can run certain analytics. Say I get 990 points at 2:00:00 PM and the remaining 10 points arrive at 2:00:40 PM (I know beforehand that there will be 1000 points of data per millisecond). I then have to wait for all the data to arrive before performing the analytics.

Where should I place my application logic: (1) in Kafka, (2) in Storm, or (3) should I use something like Redis to collect all the timestamped data and hand it to Kafka/Storm only once I have all the points for a particular time?

I am confused :) Any help would be appreciated. Sorry for any grammatical errors; I was thinking aloud while jotting down my question.

Regards,
Yavar
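The "wait until every point for a timestamp has arrived" logic can be sketched independently of where it ends up running. The following is a minimal sketch in plain Python, not tied to any Kafka, Storm or Redis API; the class name CompletenessBuffer and its methods are hypothetical, and it assumes the expected number of points per timestamp is known in advance, as stated in the question. It does not handle batches that never complete, so a real version would also need a timeout.

```python
from collections import defaultdict


class CompletenessBuffer:
    """Holds points per timestamp and releases a batch once it is complete."""

    def __init__(self, expected_points: int):
        # Number of points that make a timestamp "complete" (assumed known).
        self.expected_points = expected_points
        self._pending = defaultdict(list)

    def add(self, timestamp: str, point: float):
        """Store one point; return the full batch when complete, else None."""
        batch = self._pending[timestamp]
        batch.append(point)
        if len(batch) == self.expected_points:
            # All points arrived: hand the batch on and free the memory.
            return self._pending.pop(timestamp)
        return None

    def pending_timestamps(self):
        """Timestamps that are still waiting for late points."""
        return sorted(self._pending)
```

Each incoming point is passed to add(); the caller runs the analytics only when add() returns a batch. Late points for an older timestamp simply complete that timestamp's batch whenever they show up.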
Sending Data from more than one producer
How can I have multiple producers write data? I have written a producer that produces data for 15 seconds on a single-machine setup. When I run another instance of the same producer, it reports that the port is in use (which seems natural, as I think the first producer is sending data over TCP), so the first producer effectively blocks the second. How can I start multiple producers and send data from them at the same time? Note that this is a vanilla setup with one broker on a single machine. I don't need any synchronization, and the data from the two producers can be sent in any order.
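The exact error text is not shown above, so its cause is not certain. As a general TCP fact, an "address already in use" error comes from binding a local port, while clients that only connect out are given a fresh source port by the operating system each time. The following is a minimal sketch using only Python's standard library and a stand-in listener rather than a real Kafka broker (run_broker, produce and demo are hypothetical names). It shows two producers sending to the same listening port at the same time.

```python
import socket
import threading


def run_broker(server: socket.socket, expected: int, received: list) -> None:
    """Stand-in broker: accept `expected` connections, read each until EOF."""
    for _ in range(expected):
        conn, _addr = server.accept()
        with conn:
            chunks = []
            while True:
                data = conn.recv(1024)
                if not data:
                    break
                chunks.append(data)
            received.append(b"".join(chunks).decode())


def produce(port: int, message: str) -> None:
    """Stand-in producer: connect out; the OS picks a free source port."""
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(message.encode())


def demo() -> list:
    received = []
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # only the listener binds a port
    server.listen()
    port = server.getsockname()[1]
    broker = threading.Thread(target=run_broker, args=(server, 2, received))
    broker.start()
    producers = [
        threading.Thread(target=produce, args=(port, f"producer-{i}"))
        for i in range(2)
    ]
    for p in producers:
        p.start()
    for p in producers:
        p.join()
    broker.join()
    server.close()
    return sorted(received)
```

If a second producer instance fails with a port conflict, it may be worth checking whether something in the producer process or its start script binds a fixed local port, since the outbound connection by itself would not normally collide.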
Re: Apache Kafka Question
Millions of messages per day (with each message being a few bytes) is not really 'Big Data'. Kafka has been tested at a million messages per second.

The answer to all your questions, IMO, is "It depends". You can start with a single instance (single-machine installation). Let your producer send messages. Keep one broker. Increase to N brokers. When you touch the upper limit, add a server and repeat the whole exercise. Benchmarking and scalability are aspects you should explore on your own by playing with Kafka. Every use case is different, so the performance metric of one is not a global answer.

For your question on topic or queue, please read up on distributed-computing pub/sub, message queues and other patterns; these are generic concepts and have nothing to do with Kafka. It again depends on your use case. Please read what topics in Kafka are. If you just go through the definition of topics, you will answer your own question within a minute.

Replication and the rest would be the next steps once you are done with a single running instance of Kafka. So go ahead and get your hands dirty. You will love Kafka :)

And yes, the most important thing: please read the documentation first (a bit of theory) and then dive in. There is no silver bullet.

Cheers,
Yavar
http://lnkd.in/GRrrDJ

On Mon, Jul 22, 2013 at 4:27 PM, wrote:
> Hi,
>
> I am planning to use Apache Kafka 0.8 to handle millions of messages per
> day. Now I need to form the environment, like
>
> (i) How many Topics to be created?
> (ii) How many partitions/replications to be created?
> (iii) How many Brokers to be created?
> (iv) How many consumer instances in consumer group?
> (v) Topic or Queue? If topic whether we need to create multiple group Id
> as supposed to single one?
>
> How we can go about it? Please clarify.
>
> Thanks & Regards,
> Anantha
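To put "millions of messages per day" next to the "million messages per second" figure mentioned in the reply, the daily volume can be converted to an average per-second rate. This is only a back-of-the-envelope sketch: the 10 million figure is a hypothetical example (the original message only says "millions"), the function name is made up for illustration, and an average says nothing about peak load.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rate_per_second(messages_per_day: int) -> float:
    """Average message rate, assuming traffic is spread evenly over the day."""
    return messages_per_day / SECONDS_PER_DAY


# Hypothetical volume: 10 million messages per day.
rate = average_rate_per_second(10_000_000)
print(f"{rate:.1f} messages/second on average")
```

At that hypothetical volume the average is on the order of a hundred messages per second, which is why starting with a single broker and measuring is a reasonable first step.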
Re: Kafka 0.7 Quickstart Errors
Perfect Jun! It works. Thanks a ton.

On Mon, Jul 8, 2013 at 9:00 AM, Jun Rao wrote:
> The following is the weird part. 0:0 is not a valid host and port. Could
> you take a look at the EC2 FAQ in
> https://cwiki.apache.org/confluence/display/KAFKA/FAQ? It's for the
> consumers, but may apply to the producers too.
>
> [2013-06-28 14:07:19,653] ERROR Connection attempt to 0:0 failed, next
> attempt in 1 ms (kafka.producer.SyncProducer)
> java.net.ConnectException: Connection refused
>
> Thanks,
>
> Jun
>
> On Sat, Jul 6, 2013 at 3:30 PM, Yavar Husain wrote:
>
> > Hi Jun
> >
> > I am still not able to run Kafka 0.7. and getting the same error as
> > described in my thread. As for Kafka Spout to work I need Kafka 0.7 so it
> > would be great if you could help me out with this. I did not understand
> > what you mentioned in your last message "wipe out both Zookeeper and Kafka
> > 0.8 data". I just changed the log data directories in both kafka and
> > zookeeper configs and still I am getting the same error. Isn't that
> > sufficient? What else do I need to do to wipe out the data? What
> > directories do I need to visit?
Re: Kafka 0.7 Quickstart Errors
Hi Jun

I am still not able to run Kafka 0.7 and am getting the same error as described in my thread. For Kafka Spout to work I need Kafka 0.7, so it would be great if you could help me out with this. I did not understand what you mentioned in your last message: "wipe out both Zookeeper and Kafka 0.8 data". I just changed the log data directories in both the Kafka and Zookeeper configs, and I am still getting the same error. Isn't that sufficient? What else do I need to do to wipe out the data? What directories do I need to visit?

Will the above be the reason for getting the following error:

> > [2013-06-28 14:06:05,606] INFO Creating async producer for broker id =
> > 0 at 0:0 (kafka.producer.ProducerPool)
> > 5) Time to send some messages & oops I get this error:
> > [2013-06-28 14:07:19,650] INFO Disconnecting from 0:0
> > (kafka.producer.SyncProducer)
> > [2013-06-28 14:07:19,653] ERROR Connection attempt to 0:0 failed, next
> > attempt in 1 ms (kafka.producer.SyncProducer)
> > java.net.ConnectException: Connection refused
> >     at sun.nio.ch.Net.connect0(Native Method)
> >     at sun.nio.ch.Net.connect(Net.java:364)
> >     at sun.nio.ch.Net.connect(Net.java:356)
> >     at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:623)
> >     at kafka.producer.SyncProducer.connect(SyncProducer.scala:173)
> >     at kafka.producer.SyncProducer.getOrMakeConnection(SyncProducer.scala:196)
> >     at kafka.producer.SyncProducer.send(SyncProducer.scala:92)
> >     at kafka.producer.SyncProducer.multiSend(SyncProducer.scala:135)
> >     at kafka.producer.async.DefaultEventHandler.send(DefaultEventHandler.scala:58)
> >     at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:44)
> >     at kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:116)
> >     at scala.collection.immutable.Stream.foreach(Stream.scala:254)
> >     at kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:70)
> >     at kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:41)

Regards,
Yavar

On Thu, Jul 4, 2013 at 4:53 PM, Yavar Husain wrote:
> Hey Jun
>
> Thanks for your prompt response. I don't really get "wipe out both
> Zookeeper and Kafka 0.8 data". I just changed the log data directories in
> both kafka and zookeeper configs and still I am getting the same error.
> Isn't that sufficient? What else do I need to do to wipe out the data? What
> directories do I need to visit?
>
> Thanks,
> Yavar
>
> On Mon, Jul 1, 2013 at 9:13 PM, Jun Rao wrote:
>
>> You need to wipe out both the ZK data and the Kafka data from 0.8, in
>> order to try 0.7.
>>
>> Thanks,
>>
>> Jun
>>
>> On Sun, Jun 30, 2013 at 11:28 PM, Yavar Husain wrote:
>>
>> > Kafka 0.8 works great. I am able to use CLI as well as write my own
>> > producers/consumers!
>> >
>> > Checking Zookeeper... and I see all the topics and partitions created
>> > successfully for 0.8.
>> >
>> > Kafka 0.7 does not work!
>> >
>> > Why Kafka 0.7? I am using Kafka Spout from Storm which is made for Kafka
>> > 0.7.
>> >
>> > First I just want to run CLI based producer/consumer for Kafka 0.7,
>> > which I am unable to. I carry out the following steps:
>> >
>> > 1) I delete all the topics/partitions etc. in Zookeeper that were
>> > created from my Kafka 0.8
>> > 2) I change the dataDir in zoo.cfg to point to different location.
>> > 3) Now I start the kafka server 0.7. It starts successfully. However
>> > I don’t know why it again registers the broker topics I deleted?
Re: Kafka 0.7 Quickstart Errors
Hey Jun

Thanks for your prompt response. I don't really get "wipe out both Zookeeper and Kafka 0.8 data". I just changed the log data directories in both kafka and zookeeper configs and still I am getting the same error. Isn't that sufficient? What else do I need to do to wipe out the data? What directories do I need to visit?

Thanks,
Yavar

On Mon, Jul 1, 2013 at 9:13 PM, Jun Rao wrote:
> You need to wipe out both the ZK data and the Kafka data from 0.8, in order
> to try 0.7.
>
> Thanks,
>
> Jun
>
> On Sun, Jun 30, 2013 at 11:28 PM, Yavar Husain wrote:
>
> > Kafka 0.8 works great. I am able to use CLI as well as write my own
> > producers/consumers!
> >
> > Checking Zookeeper... and I see all the topics and partitions created
> > successfully for 0.8.
> >
> > Kafka 0.7 does not work!
> >
> > Why Kafka 0.7? I am using Kafka Spout from Storm which is made for Kafka
> > 0.7.
> >
> > First I just want to run CLI based producer/consumer for Kafka 0.7,
> > which I am unable to. I carry out the following steps:
> >
> > 1) I delete all the topics/partitions etc. in Zookeeper that were
> > created from my Kafka 0.8
> > 2) I change the dataDir in zoo.cfg to point to different location.
> > 3) Now I start the kafka server 0.7. It starts successfully. However
> > I don’t know why it again registers the broker topics I deleted?
Kafka 0.7 Quickstart Errors
Kafka 0.8 works great. I am able to use CLI as well as write my own producers/consumers!

Checking Zookeeper... and I see all the topics and partitions created successfully for 0.8.

Kafka 0.7 does not work!

Why Kafka 0.7? I am using Kafka Spout from Storm which is made for Kafka 0.7.

First I just want to run CLI based producer/consumer for Kafka 0.7, which I am unable to. I carry out the following steps:

1) I delete all the topics/partitions etc. in Zookeeper that were created from my Kafka 0.8
2) I change the dataDir in zoo.cfg to point to different location.
3) Now I start the kafka server 0.7. It starts successfully. However I don’t know why it again registers the broker topics I deleted?
4) Now I start the Kafka Producer :

bin/kafka-console-producer.sh --zookeeper localhost:2181 --topic topicime

& it starts successfully:

[2013-06-28 14:06:05,521] INFO zookeeper state changed (SyncConnected) (org.I0Itec.zkclient.ZkClient)
[2013-06-28 14:06:05,606] INFO Creating async producer for broker id = 0 at 0:0 (kafka.producer.ProducerPool)

5) Time to send some messages & oops I get this error:

[2013-06-28 14:07:19,650] INFO Disconnecting from 0:0 (kafka.producer.SyncProducer)
[2013-06-28 14:07:19,653] ERROR Connection attempt to 0:0 failed, next attempt in 1 ms (kafka.producer.SyncProducer)
java.net.ConnectException: Connection refused
    at sun.nio.ch.Net.connect0(Native Method)
    at sun.nio.ch.Net.connect(Net.java:364)
    at sun.nio.ch.Net.connect(Net.java:356)
    at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:623)
    at kafka.producer.SyncProducer.connect(SyncProducer.scala:173)
    at kafka.producer.SyncProducer.getOrMakeConnection(SyncProducer.scala:196)
    at kafka.producer.SyncProducer.send(SyncProducer.scala:92)
    at kafka.producer.SyncProducer.multiSend(SyncProducer.scala:135)
    at kafka.producer.async.DefaultEventHandler.send(DefaultEventHandler.scala:58)
    at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:44)
    at kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:116)
    at scala.collection.immutable.Stream.foreach(Stream.scala:254)
    at kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:70)
    at kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:41)

Note that Zookeeper is already running.

Any help would really be appreciated.

*EDIT:*

I don't even see the topic being created in zookeeper. I am running the following command:

bin/kafka-console-producer.sh --zookeeper localhost:2181 --topic topicime

After the command everything is fine & I get the following message:

[2013-06-28 14:30:17,614] INFO Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x13f805c6673004b, negotiated timeout = 6000 (org.apache.zookeeper.ClientCnxn)
[2013-06-28 14:30:17,615] INFO zookeeper state changed (SyncConnected) (org.I0Itec.zkclient.ZkClient)
[2013-06-28 14:30:17,700] INFO Creating async producer for broker id = 0 at 0:0 (kafka.producer.ProducerPool)

However now when I type a string to send I get the above error (Connection refused!)
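The producer log above reports the broker at 0:0, and a connection to port 0 cannot succeed because port 0 is not a usable destination port. The following is a minimal sketch of a sanity check for a host:port string read from a log or a registry. It is plain Python, not part of any Kafka API; the function names are hypothetical, and treating the hosts "0" and "0.0.0.0" as unusable destinations is an assumption made for this illustration.

```python
def parse_endpoint(endpoint: str):
    """Split 'host:port' into (host, port); raise ValueError if malformed."""
    host, sep, port_text = endpoint.rpartition(":")
    if not sep or not host or not port_text.isdecimal():
        raise ValueError(f"not a host:port string: {endpoint!r}")
    return host, int(port_text)


def is_connectable_endpoint(endpoint: str) -> bool:
    """True if the endpoint looks like something a client could connect to."""
    try:
        host, port = parse_endpoint(endpoint)
    except ValueError:
        return False
    # Port 0 is reserved and never a valid destination port.
    if not 1 <= port <= 65535:
        return False
    # Assumption: a wildcard/zero host is treated as a misregistration.
    return host not in {"0", "0.0.0.0"}
```

A check like this only flags an address as implausible; it does not say where the bad value came from, which in this thread still has to be traced back to what the broker registered.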