Kafka stream or ksql design question

2018-07-16 Thread Will Du
Hi folks, As far as I know, Kafka Streams runs as a separate process that reads data from a topic, transforms it, and writes to another topic if needed. In this case, how does this process support high-throughput streams as well as load balancing in terms of message traffic and computing resources for stream

Re: Kafka as a data ingest

2017-01-10 Thread Will Du
In terms of big files, which are quite common in HDFS, do connect tasks process the same file in parallel, the way MR deals with split files? I do not think so. In this case, the Kafka connect implementation has no advantage in reading a single big file unless you also use mapreduce. Sent from my iPhone On Jan

Kafka connect distribute start failed

2016-12-05 Thread Will Du
Hi folks, I am trying to start kafka connect in distributed mode as follows. It fails with the error below. Standalone mode is fine. It happens on the 3.0.1 and 3.1 versions of confluent kafka. Does anyone know the cause of this error? Thanks, Will security.protocol = PLAINTEXT

How to collect connect metrics

2016-12-03 Thread Will Du
Hi folks, How can I collect Kafka connect metrics from Confluent? Is there any API to use? In addition, if one file is very big, can multiple tasks work on the same file simultaneously? Thanks, Will

Re: Link read avro from Kafka Connect Issue

2016-11-02 Thread Will Du
The target is to get Flink to consume avro data produced by Kafka connect > On Nov 2, 2016, at 7:36 PM, Will Du <will...@gmail.com> wrote: > > > On Nov 2, 2016, at 7:31 PM, Will Du <will...@gmail.com> wrote: > > Hi folks, > I

Link read avro from Kafka Connect Issue

2016-11-02 Thread Will Du
On Nov 2, 2016, at 7:31 PM, Will Du <will...@gmail.com> wrote: Hi folks, I am trying to consume avro data from Kafka in Flink. The data is produced by Kafka connect using AvroConverter. I have created an AvroDeserializationSchema.java <https://gist.github.com/d

connection time out

2015-11-29 Thread Yuheng Du
Hi guys, I am running a single-node broker in one cluster. When I run the producer in another cluster, I get a connection timeout error. I can reach port 9092 and other ports on the broker machine from the producer. I just can't publish any messages. The command I used to run the producer

Re: connection time out

2015-11-29 Thread Yuheng Du
Also, I can see the topic "speedx2" being created in the broker, but no message data is coming through. On Sun, Nov 29, 2015 at 7:00 PM, Yuheng Du <yuheng.du.h...@gmail.com> wrote: > Hi guys, > > I was running a single node broker in a cluster. And when I run the >

Re: producer api

2015-09-14 Thread Yuheng Du
o connect for publishing. Kafka will > tell the client about all the other brokers. But best practices state > including all of them is best. > -Erik > > On 9/14/15, 2:46 PM, "Yuheng Du" <yuheng.du.h...@gmail.com> wrote: > > >I am writing a kafka producer applicatio

producer api

2015-09-14 Thread Yuheng Du
I am writing a kafka producer application in java. I want the producer to publish data to a cluster of 6 brokers. Is there a way to specify only the load-balancing node rather than the full broker list? For example, as in the kafka benchmarking commands: bin/kafka-run-class.sh

Re: latency test

2015-09-09 Thread Yuheng Du
ere was a burst of > slower messages which caused this behavior, or if it was a consistent > issue with that node. > -Erik > > > On 9/9/15, 2:24 PM, "Yuheng Du" <yuheng.du.h...@gmail.com> wrote: > > >So are you suggesting that the long delays happened in %

Re: latency test

2015-09-09 Thread Yuheng Du
at least in my > case, one of my brokers is further than the others. > -Erik > > On 9/4/15, 1:06 PM, "Yuheng Du" <yuheng.du.h...@gmail.com> wrote: > > >No problem. Thanks for your advice. I think it would be fun to explore. I > >only know how to program in ja

When a message is exposed to the consumer

2015-09-04 Thread Yuheng Du
According to section 3.1 of the paper "Kafka: a Distributed Messaging System for Log Processing", "a message is only exposed to the consumers after it is flushed". Is that still true in current kafka? That is, does a message only become available after it is flushed to disk? Thanks.

Re: latency test

2015-09-04 Thread Yuheng Du
When I use 32 partitions, the 4-broker latency becomes larger than the 8-broker latency. So is it always true that using more brokers gives lower latency when the number of partitions is at least the number of brokers? Thanks. On Thu, Sep 3, 2015 at 10:45 PM, Yuheng Du <yuheng.d

Re: latency test

2015-09-04 Thread Yuheng Du
roughput first and low latency second. And > it does a really good job at both. > > Disclaimer: I might not like linear algebra, but I do like statistics. > Let me know if there are topics that need more explanation above that > aren't covered by Gil's lecture. > -Erik > > On 9/

Re: When a message is exposed to the consumer

2015-09-04 Thread Yuheng Du
Can't read it. Sorry On Fri, Sep 4, 2015 at 12:08 PM, Roman Shramkov <roman_shram...@epam.com> wrote: > Её ай н Анны уйг > > sent from a mobile device, please excuse brevity and typos > > > ----User Yuheng Du wrote > > According to the s

Re: latency test

2015-09-04 Thread Yuheng Du
ts will be > this slow or faster”, or for values that are high like 99.9%’ile, “0.1% of > all events will be slower than this”. > -Erik > > On 9/4/15, 12:05 PM, "Yuheng Du" <yuheng.du.h...@gmail.com> wrote: > > >Thank you Erik! That's helpful! > > > &
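
The percentile reading quoted above ("0.1% of all events will be slower than this" for the 99.9th percentile) can be illustrated with a small sketch. This is only an illustration under stated assumptions: the nearest-rank definition is one common choice, not necessarily what any Kafka tool uses, and the latency samples are made up.

```python
import math
from fractions import Fraction


def percentile(latencies_ms, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not latencies_ms:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(latencies_ms)
    # Fraction avoids floating-point rounding surprises for values like 99.9.
    rank = math.ceil(Fraction(str(pct)) * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical data: 1000 sends, 998 fast ones and two slow outliers.
samples = [2.0] * 998 + [250.0, 900.0]

p50 = percentile(samples, 50)
p999 = percentile(samples, 99.9)
mean = sum(samples) / len(samples)
slower_than_p999 = sum(1 for s in samples if s > p999)

print(f"mean={mean} p50={p50} p99.9={p999} slower={slower_than_p999}")
```

With these made-up samples the mean stays close to the typical value, while the 99.9th percentile exposes the outliers: exactly one sample in a thousand is slower than it.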

Re: latency test

2015-09-04 Thread Yuheng Du
o it might be a while… > -Erik > > > On 9/4/15, 12:55 PM, "Yuheng Du" <yuheng.du.h...@gmail.com> wrote: > > >Thanks for your reply Erik. I am running some more tests according to your > >suggestions now and I will share with my results here. Is it necessary

latency test

2015-09-03 Thread Yuheng Du
I am running a producer latency test. When using 92 producers on 92 physical nodes publishing to 4 brokers, the latency is slightly lower than when using 8 brokers. I am using 8 partitions for the topic. I have rerun the test and it gives me the same result: the 4-broker scenario still has lower

Re: Reduce latency

2015-08-18 Thread Yuheng Du
Also, when I set the target throughput to 1 record/s, the actual test results show an average of 579.86 records per second among all my producers. How did that happen? Why is this number not 1 then? Thanks. On Tue, Aug 18, 2015 at 10:03 AM, Yuheng Du yuheng.du.h...@gmail.com

Re: Reduce latency

2015-08-18 Thread Yuheng Du
and your setup. -Tao On Tue, Aug 18, 2015 at 11:34 AM, Yuheng Du yuheng.du.h...@gmail.com wrote: Also, When I set the target throughput to be 1 records/s, The actual test results show I got an average of 579.86 records per second among all my producers. How did that happen? Why this number

Re: Reduce latency

2015-08-18 Thread Yuheng Du
latency will become meaningless for a latency-purpose test. On Tue, Aug 18, 2015 at 11:48 AM, Yuheng Du yuheng.du.h...@gmail.com wrote: I see. Thank you Tao. But now I don't get it what Jay said that my latency test only makes sense if I set a fixed throughput. Why do I need to set a fixed

Re: Reduce latency

2015-08-18 Thread Yuheng Du
records/sec). -Jay On Thu, Aug 13, 2015 at 12:18 PM, Yuheng Du yuheng.du.h...@gmail.com wrote: Thank you Alvaro, How to use sync producers? I am running the standard ProducerPerformance test from kafka to measure the latency of each message to send from producer to broker only

Re: Reduce latency

2015-08-17 Thread Yuheng Du
unnecessarily). Also, maybe you want to increase batch.size further; you will get even better throughput with more or less the same latency (as there is no shortage of events in the test program). On Thu, Aug 13, 2015 at 1:13 PM Yuheng Du yuheng.du.h...@gmail.com wrote: Yes there is. But if we

Re: use page cache as much as possible

2015-08-14 Thread Yuheng Du
on log.flush.interval.messages and log.flush.interval.ms, if the segment file is in the pagecache, the consumers will still benefit from that pagecache and OS wouldn't read it again from disk. On Thu, Aug 13, 2015 at 2:54 PM Yuheng Du yuheng.du.h...@gmail.com wrote: Hi

Reduce latency

2015-08-13 Thread Yuheng Du
I am running an experiment where 92 producers are publishing data into 6 brokers and 10 consumers are reading online data simultaneously. What should I do to reduce the latency? Currently when I run the producer performance test the average latency is around 10s. Should I disable log.flush? How to

Re: Reduce latency

2015-08-13 Thread Yuheng Du
Also, the latency results show no major difference when using ack=0 or ack=1. Why is that? On Thu, Aug 13, 2015 at 11:51 AM, Yuheng Du yuheng.du.h...@gmail.com wrote: I am running an experiment where 92 producers is publishing data into 6 brokers and 10 consumer are reading online data

use page cache as much as possible

2015-08-13 Thread Yuheng Du
Hi, As I understand it, kafka brokers will store the incoming messages in the pagecache as much as possible and then flush them to disk, right? But in my experiment where 90 producers are publishing data into 6 brokers, I see that the log directory on disk where the broker stores the data is

Variation of producer latency in ProducerPerformance test

2015-08-11 Thread Yuheng Du
Hi, I am running a test in which 92 producers each publish 53000 records of size 254 bytes to 2 brokers. The average latency reported by each producer varies widely. For some producers, the average latency is as low as 38ms to send the 53000 records; but for some producers, the average latency is

Kafka vs RabbitMQ latency

2015-08-04 Thread Yuheng Du
Hi guys, I was reading a paper today in which the latency of kafka and rabbitmq is compared: http://downloads.hindawi.com/journals/js/2015/468047.pdf To my surprise, kafka has shown some large variations of latency as the number of records per second increases. So I am curious about why is

multiple producer throughput

2015-07-27 Thread Yuheng Du
Hi, I am running 40 producers on a 40-node cluster. The messages are sent to 6 brokers in another cluster. The producers are running the ProducerPerformance test. When 20 nodes are running, the throughput is around 13MB/s and when running 40 nodes, the throughput is around 9MB/s. I have set

Re: deleting data automatically

2015-07-27 Thread Yuheng Du
, Yuheng Du yuheng.du.h...@gmail.com wrote: Thank you! what performance impacts will it be if I change log.segment.bytes? Thanks. On Mon, Jul 27, 2015 at 1:25 PM, Ewen Cheslack-Postava e...@confluent.io wrote: I think log.cleanup.interval.mins was removed in the first 0.8 release

Re: deleting data automatically

2015-07-27 Thread Yuheng Du
log.segment.bytes/log.roll.{ms,hours} and log.retention.check.interval.ms. On Fri, Jul 24, 2015 at 12:49 PM, Yuheng Du yuheng.du.h...@gmail.com wrote: Hi, I am testing the kafka producer performance. So I created a queue and writes a large amount of data to that queue

Re: deleting data automatically

2015-07-27 Thread Yuheng Du
, 2015 at 10:03 AM, Yuheng Du yuheng.du.h...@gmail.com wrote: If I want to get higher throughput, should I increase the log.segment.bytes? I don't see log.retention.check.interval.ms, but there is log.cleanup.interval.mins, is that what you mean? If I set log.roll.ms

Re: multiple producer throughput

2015-07-27 Thread Yuheng Du
prabhbha...@gmail.com wrote: Hi, Have you tried with acks=1 and -1 as well? Please share the numbers and the message size Regards, Prabcs On Jul 27, 2015 10:24 PM, Yuheng Du yuheng.du.h...@gmail.com wrote: Hi, I am running 40 producers on 40 nodes cluster. The messages are sent to 6

Re: properducertest on multiple nodes

2015-07-24 Thread Yuheng Du
I deleted the queue and recreated it before I ran the test. Things are working after restarting the broker cluster, thanks! On Fri, Jul 24, 2015 at 12:06 PM, Gwen Shapira gshap...@cloudera.com wrote: Does topic speedx1 exist? On Fri, Jul 24, 2015 at 7:09 AM, Yuheng Du yuheng.du.h...@gmail.com

properducertest on multiple nodes

2015-07-24 Thread Yuheng Du
Hi, I am trying to run 20 performance tests on 10 nodes using pbsdsh. The messages are sent to a 6-broker cluster. It seems to work for a while. When I delete the test queue and rerun the test, the broker does not seem to process incoming messages: [yuhengd@node1739 kafka_2.10-0.8.2.1]$

deleting data automatically

2015-07-24 Thread Yuheng Du
Hi, I am testing the kafka producer performance. So I created a queue and wrote a large amount of data to that queue. Is there a way to delete the data automatically after some time? Say, whenever the data size reaches 50GB or the retention time exceeds 10 seconds, it will be deleted so my disk

Re: broker data directory

2015-07-21 Thread Yuheng Du
PM, Yuheng Du yuheng.du.h...@gmail.com wrote: Just wanna make sure, in server.properties, the configuration log.dirs=/tmp/kafka-logs specifies the directory of where the log (data) stores, right? If I want the data to be saved elsewhere, this is the configuration I need to change

broker data directory

2015-07-21 Thread Yuheng Du
Just wanna make sure: in server.properties, the configuration log.dirs=/tmp/kafka-logs specifies the directory where the log (data) is stored, right? If I want the data to be saved elsewhere, this is the configuration I need to change, right? Thanks for answering. best,

Re: latency performance test

2015-07-16 Thread Yuheng Du
*every* record waits that long. Of course, these numbers are estimates, depend on my having used 1ms, but hopefully should make it clear why you can see relatively large latencies. -Ewen On Wed, Jul 15, 2015 at 1:38 AM, Yuheng Du yuheng.du.h...@gmail.com wrote: Hi, I have run the end

Re: kafka benchmark tests

2015-07-15 Thread Yuheng Du
of insight into the issue. Though it is understandable that your specific results need to be verified, it seems that the KIP-25 patch is functional and I can use it for my own benchmarking purposes? Is that correct? Thanks again! On Tue, Jul 14, 2015 at 8:22 AM, Yuheng Du yuheng.du.h

Re: Latency test

2015-07-15 Thread Yuheng Du
( http://kafka.apache.org/documentation.html#consumerconfigs). The default value listed at document is 100(ms). To add java heap space to jvm, put -Xmx$Size(max heap size) for your jvm option. On Wed, Jul 15, 2015 at 12:29 AM, Yuheng Du yuheng.du.h...@gmail.com wrote: Tao, Thanks

Re: Latency test

2015-07-15 Thread Yuheng Du
at kafka.tools.TestEndToEndLatency$.main(TestEndToEndLatency.scala:69) at kafka.tools.TestEndToEndLatency.main(TestEndToEndLatency.scala) What command should I use to add java heap space to the jvm? Thanks! Yuheng On Wed, Jul 15, 2015 at 3:29 AM, Yuheng Du yuheng.du.h...@gmail.com wrote: Tao

latency performance test

2015-07-15 Thread Yuheng Du
Hi, I have run the end-to-end latency test and the ProducerPerformance test on my kafka cluster according to https://gist.github.com/jkreps/c7ddb4041ef62a900e6c In the end-to-end latency test, the latency was around 2ms. In the ProducerPerformance test, if I use batch size 8196 to send 50,000,000 records:

Re: Latency test

2015-07-15 Thread Yuheng Du
be put in consumer_fetch_max_wait? Thanks. On Tue, Jul 14, 2015 at 5:21 PM, Tao Feng fengta...@gmail.com wrote: I think ProducerPerformance microbenchmark only measure between client to brokers(producer to brokers) and provide latency information. On Tue, Jul 14, 2015 at 11:05 AM, Yuheng Du

Re: Latency test

2015-07-15 Thread Yuheng Du
delay, and what other components? Thanks. best, Yuheng On Wed, Jul 15, 2015 at 3:51 AM, Yuheng Du yuheng.du.h...@gmail.com wrote: Tao, If I am running on the command line the following command bin/kafka-run-class.sh kafka.tools.TestEndToEndLatency 192.168.1.3:9092 192.168.1.1:2181

kafka TestEndtoEndLatency

2015-07-15 Thread Yuheng Du
In kafka performance tests https://gist.github.com/jkreps/c7ddb4041ef62a900e6c the TestEndtoEndLatency results are typically around 2ms, while ProducerPerformance normally has an average latency of around several hundred ms when using batch size 8196. Are both results talking about end to end

Re: kafka benchmark tests

2015-07-15 Thread Yuheng Du
kafkatest/tests/benchmark_test.py Definitely keep us posted about which parts are difficult, annoying, or confusing about this process and we'll do our best to help. Thanks, Geoff On Wed, Jul 15, 2015 at 12:49 AM, Yuheng Du yuheng.du.h...@gmail.com wrote: Jiefu, Have you tried to run

Re: kafka TestEndtoEndLatency

2015-07-15 Thread Yuheng Du
. Guozhang On Wed, Jul 15, 2015 at 11:36 AM, Yuheng Du yuheng.du.h...@gmail.com wrote: In kafka performance tests https://gist.github.com/jkreps/c7ddb4041ef62a900e6c The TestEndtoEndLatency results are typically around 2ms, while the ProducerPerformance normally has average

Re: kafka TestEndtoEndLatency

2015-07-15 Thread Yuheng Du
from producer to broker, then to consumer. I cannot remember the details now, but I think the EndtoEndLatency test records the latency as an average, hence it is small. Guozhang On Wed, Jul 15, 2015 at 12:28 PM, Yuheng Du yuheng.du.h...@gmail.com wrote: Guozhang, Thank you for explaining

Re: kafka benchmark tests

2015-07-15 Thread Yuheng Du
/tests). The tool we're using to bring up the slave virtual machines is called vagrant, so the vagrant steps in the quickstart are really telling you how to install the virtual machines. Hope that helps! Cheers, Geoff On Wed, Jul 15, 2015 at 12:13 PM, Yuheng Du yuheng.du.h...@gmail.com

Re: Latency test

2015-07-15 Thread Yuheng Du
/trunk/bin/kafka-run-class.sh KAFKA_JVM_PERFORMANCE_OPTS. On Wed, Jul 15, 2015 at 12:51 AM, Yuheng Du yuheng.du.h...@gmail.com wrote: Tao, If I am running on the command line the following command bin/kafka-run-class.sh kafka.tools.TestEndToEndLatency 192.168.1.3:9092

Re: kafka benchmark tests

2015-07-14 Thread Yuheng Du
to the Kafka cluster https://kafka.apache.org/documentation.html#newproducerconfigs On Tue, Jul 14, 2015 at 7:29 PM, Yuheng Du yuheng.du.h...@gmail.com wrote: Does anyone know what is bootstrap.servers= esv4-hcl198.grid.linkedin.com:9092 means in the following test command: bin/kafka

Re: kafka benchmark tests

2015-07-14 Thread Yuheng Du
Also, I guess setting the target throughput to -1 means let it be as high as possible? On Tue, Jul 14, 2015 at 10:36 AM, Yuheng Du yuheng.du.h...@gmail.com wrote: Thanks. If I set the acks=1 in the producer config options in bin/kafka-run-class.sh

Re: kafka benchmark tests

2015-07-14 Thread Yuheng Du
Does anyone know what bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 means in the following test command: bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test7 5000 100 -1 acks=1 bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864

Re: How to run the three producers test

2015-07-14 Thread Yuheng Du
pointed out. Do any of your brokers fall out of the ISR when sending messages? It seems like your setup should be fine, so I'm not entirely sure. On Tue, Jul 14, 2015 at 1:31 PM, Yuheng Du yuheng.du.h...@gmail.com wrote: Jiefu, I am performing these tests on a 6 nodes cluster in cloudlab

Re: How to run the three producers test

2015-07-14 Thread Yuheng Du
) at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221) at sun.nio.cs.StreamEncoder.implFlushBuffe (END) On Tue, Jul 14, 2015 at 5:24 PM, Yuheng Du yuheng.du.h...@gmail.com wrote: Hi Jiefu, Gwen, I am running the Throughput versus stored data test: bin/kafka-run-class.sh

Re: How to run the three producers test

2015-07-14 Thread Yuheng Du
:12 PM, Yuheng Du yuheng.du.h...@gmail.com wrote: I checked the logs on the brokers, it seems that the zookeeper or the kafka server process is not running on this broker...Thank you guys. I will see if it happens again. On Tue, Jul 14, 2015 at 4:53 PM, JIEFU GONG jg...@berkeley.edu wrote: Hmm

Re: How to run the three producers test

2015-07-14 Thread Yuheng Du
But is there a way to let kafka overwrite the old data if the disk is full? Or is it not necessary to use this figure? Thanks. On Tue, Jul 14, 2015 at 10:14 PM, Yuheng Du yuheng.du.h...@gmail.com wrote: Jiefu, I agree with you. I checked the hardware specs of my machines, each one of them

Re: How to run the three producers test

2015-07-14 Thread Yuheng Du
to write data? On Tue, Jul 14, 2015 at 2:27 PM, Yuheng Du yuheng.du.h...@gmail.com wrote: Also, the log in another broker (not the bootstrap) says: [2015-07-14 15:18:41,220] FATAL [Replica Manager on Broker 1]: Error writing to highwatermark file: (kafka.server.ReplicaManager) [2015-07

Re: How to run the three producers test

2015-07-14 Thread Yuheng Du
:48,737] INFO [Kafka Server 1], shutting down (kafka.server.KafkaServer) I have checked that the zookeeper is running fine. Can anyone help why I got the error? Thanks. On Tue, Jul 14, 2015 at 10:24 PM, Yuheng Du yuheng.du.h...@gmail.com wrote: But is there a way to let kafka override the old data

How to run the three producers test

2015-07-14 Thread Yuheng Du
Hi, I am running the performance test for kafka. https://gist.github.com/jkreps/c7ddb4041ef62a900e6c For the Three Producers, 3x async replication scenario, the command is the same as one producer: bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test 5000 100 -1

Re: How to run the three producers test

2015-07-14 Thread Yuheng Du
: Yuheng, Yes, if you read the blog post it specifies that he's using three separate machines. There's no reason the producers cannot be started at the same time, I believe. On Tue, Jul 14, 2015 at 11:42 AM, Yuheng Du yuheng.du.h...@gmail.com wrote: Hi, I am running the performance test

Latency test

2015-07-14 Thread Yuheng Du
Currently, the latency test from kafka tests the end-to-end latency between producers and consumers. Is there a way to test the producer-to-broker and broker-to-consumer delays separately? Thanks.

Re: performance benchmarking of kafka

2015-07-13 Thread Yuheng Du
org.apache.kafka.clients.tools.ProducerPerformance topic_name num_records record_size target_records_sec [prop_name=prop_value]* On Tue, 14 Jul 2015 at 05:08 Yuheng Du yuheng.du.h...@gmail.com wrote: I am using the binaries of kafka_2.10-0.8.2.1. Could that be the problem? Should I use the source of kafka
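
The positional layout quoted above (topic_name num_records record_size target_records_sec [prop_name=prop_value]*) can be sketched as a small helper that assembles the argument list. This is a minimal sketch, not part of Kafka: the function name producer_perf_command and the host broker1.example:9092 are hypothetical, and the other values are the ones that appear in the commands quoted elsewhere in this listing. One poster guesses that a target of -1 means "as high as possible"; the sketch simply passes the value through.

```python
def producer_perf_command(topic_name, num_records, record_size,
                          target_records_sec, props=None):
    """Assemble the argument list in the order shown in the usage string:
    topic_name num_records record_size target_records_sec [prop=value]*"""
    if num_records <= 0 or record_size <= 0:
        raise ValueError("num_records and record_size must be positive")
    args = [
        "bin/kafka-run-class.sh",
        "org.apache.kafka.clients.tools.ProducerPerformance",
        topic_name,
        str(num_records),
        str(record_size),
        str(target_records_sec),
    ]
    # Producer properties follow as name=value pairs with no spaces
    # around the equals sign.
    for name, value in (props or {}).items():
        args.append(f"{name}={value}")
    return args


# broker1.example:9092 is a placeholder host, not taken from the threads.
cmd = producer_perf_command(
    "test7", 5000, 100, -1,
    {
        "acks": 1,
        "bootstrap.servers": "broker1.example:9092",
        "buffer.memory": 67108864,
    },
)
print(" ".join(cmd))
```

The resulting list can be handed to subprocess.run from the Kafka installation directory, or joined with spaces and pasted into a shell.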

performance benchmarking of kafka

2015-07-13 Thread Yuheng Du
Hi guys, I am trying to replicate the test of benchmarking kafka at http://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines . When I run bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test7 5000 100 -1 acks=1

Re: performance benchmarking of kafka

2015-07-13 Thread Yuheng Du
directory is the ProducerPerformance class resides? Thanks. On Mon, Jul 13, 2015 at 4:37 PM, JIEFU GONG jg...@berkeley.edu wrote: You may need to open up your run-class.sh in a text editor and modify the classpath -- I believe I had a similar error before. On Mon, Jul 13, 2015 at 1:16 PM, Yuheng Du

Re: performance benchmarking of kafka

2015-07-13 Thread Yuheng Du
-class.sh in a text editor and modify the classpath -- I believe I had a similar error before. On Mon, Jul 13, 2015 at 1:16 PM, Yuheng Du yuheng.du.h...@gmail.com wrote: Hi guys, I am trying to replicate the test of benchmarking kafka at http://engineering.linkedin.com/kafka/benchmarking

Re: A kafka web monitor

2015-03-27 Thread Yuheng Du
Hi Wan, I tried to install DCMonitor, but when I try to clone the project, it gives me "Permission denied" and "the remote end hung up unexpectedly". Can you provide any suggestions for this issue? Thanks. best, Yuheng On Mon, Mar 23, 2015 at 8:54 AM, Wan Wei flowbeha...@gmail.com wrote: We

kafka topic information

2015-03-09 Thread Yuheng Du
I am wondering where a kafka cluster keeps the topic metadata (name, partition, replication, etc.). How does a server recover the topic's metadata and messages after a restart, and what data will be lost? Thanks to anyone who can answer my questions. best, Yuheng

Re: kafka topic information

2015-03-09 Thread Yuheng Du
with topic metadata as well. You can use zookeeper-shell.sh or zkCli.sh to check zk nodes, /brokers/topics will give you the list of topics . -- Harsha On March 9, 2015 at 8:20:59 AM, Yuheng Du (yuheng.du.h...@gmail.com) wrote: I am wondering where does kafka cluster keep the topic metadata

Re: kafka topic information

2015-03-09 Thread Yuheng Du
://zookeeper.apache.org/doc/r3.4.6/zookeeperAdmin.html -- Harsha On March 9, 2015 at 8:39:00 AM, Yuheng Du (yuheng.du.h...@gmail.com) wrote: Harsha, Thanks for reply. So what if the zookeeper cluster fails? Will the topics information be lost? What fault-tolerant mechanism does zookeeper offer? best

Re: Set up kafka cluster

2015-03-05 Thread Yuheng Du
cluster. Good luck! On Thu, Mar 5, 2015 at 12:30 PM, Yuheng Du yuheng.du.h...@gmail.com wrote: Thank you Gwen, I also need the kafka cluster continue to provide message brokering service to a Storm cluster after the benchmarking. I am fairly new to cluster setups. So

Re: Set up kafka cluster

2015-03-05 Thread Yuheng Du
with the results: https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines Gwen On Thu, Mar 5, 2015 at 12:16 PM, Yuheng Du yuheng.du.h...@gmail.com wrote: Hi everyone, I am trying to set up a kafka cluster consisting of three machines. I wanna

Set up kafka cluster

2015-03-05 Thread Yuheng Du
Hi everyone, I am trying to set up a kafka cluster consisting of three machines. I wanna run a benchmarking program in them. Can anyone recommend a step by step tutorial/instruction of how I can do it? Thanks. best, Yuheng