Thank you so much, Hans, for your explanation; it is definitely a great help to me as a newcomer.
So for my case, what are the right options to combine when running the producer and consumer commands, respectively?

Thanks.

*------------------------------------------------*
*Sincerely yours,*
*Raymond*

On Sat, May 26, 2018 at 4:26 PM, Hans Jespersen <h...@confluent.io> wrote:

> There are two concepts in Kafka that are not always familiar to people who
> have used other pub/sub systems.
>
> 1) partitions:
>
> Kafka topics are partitioned, which means a single topic is sharded into
> multiple pieces that are distributed across multiple brokers in the cluster
> for parallel processing.
>
> Order is guaranteed per partition (not per topic).
>
> You can think of each Kafka topic partition like an exclusive queue in
> traditional messaging systems; order is not guaranteed when the data is
> spread out across multiple queues in traditional messaging either.
>
> 2) keys
>
> Kafka messages have keys in addition to the value (i.e. body) and the header.
> When messages are published with the same key they will all be sent in
> order to the same partition.
>
> If messages are published with a "null" key then they will be spread out
> round robin across all partitions (which is what you have done).
>
>
> Conclusion
>
> You will see ordered delivery if you either use a key when you publish or
> create a topic with one partition.
>
>
> -hans
>
> On May 26, 2018, at 7:59 AM, Raymond Xie <xie3208...@gmail.com> wrote:
>
> Thanks. By default, can you explain to me why I received the messages in the
> wrong order? Note there are only 9 lines from 1 to 9, but on the consumer
> side their original order becomes messed up.
>
> ~~~sent from my cell phone, sorry if there is any typo
>
> Hans Jespersen <h...@confluent.io> wrote on Sat, May 26, 2018 at 12:16 AM:
>
>> If you create a topic with one partition they will be in order.
>>
>> Alternatively if you publish with the same key for every message they
>> will be in the same order even if your topic has more than 1 partition.
>>
>> Either way above will work for Kafka.
>>
>> -hans
>>
>> > On May 25, 2018, at 8:56 PM, Raymond Xie <xie3208...@gmail.com> wrote:
>> >
>> > Hello,
>> >
>> > I just started learning Kafka and have the environment set up on my
>> > Hortonworks sandbox in VMware at home.
>> >
>> > test.csv is what I want the producer to send out:
>> >
>> > more test1.csv | ./kafka-console-producer.sh --broker-list
>> > sandbox.hortonworks.com:6667 --topic kafka-topic2
>> >
>> > 1, abc
>> > 2, def
>> > ...
>> > 8, vwx
>> > 9, zzz
>> >
>> > What I received is all the content of test.csv, but not in the
>> > original order:
>> >
>> > kafka-console-consumer.sh --zookeeper 192.168.112.129:2181 --topic
>> > kafka-topic2
>> >
>> > 2, def
>> > 1, abc
>> > ...
>> > 9, zzz
>> > 8, vwx
>> >
>> >
>> > I read on Google that partitioning could be a feasible solution;
>> > however, my questions are:
>> >
>> > 1. For small files like this one, should I really do the partitioning?
>> > How small would a partition need to be to ensure the sequence?
>> > 2. For big files, each partition could still contain multiple lines. How
>> > do I ensure the lines in each partition won't get messed up on the
>> > consumer side?
>> >
>> >
>> > I also want to know what the best practice is for processing a large
>> > volume of data through Kafka. There should be a better way than console
>> > commands.
>> >
>> > Thank you very much.
>> >
>> >
>> >
>> > *------------------------------------------------*
>> > *Sincerely yours,*
>> >
>> >
>> > *Raymond*
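
To make Hans's point about partitions and keys concrete, here is a minimal sketch in plain Python. It is a toy in-memory model, not the Kafka client: the functions assign_partition and publish are hypothetical names invented for this illustration, and CRC32 stands in for the murmur2 hash that Kafka's default partitioner applies to keys. It only shows why null keys scatter lines across partitions while a shared key (or a single partition) keeps them together in send order.

```python
from collections import defaultdict
from itertools import count
import zlib


def assign_partition(key, num_partitions, round_robin):
    """Pick a partition: hash of the key if present, otherwise round robin."""
    if key is None:
        # Null key: spread messages across partitions in turn.
        return next(round_robin) % num_partitions
    # Assumption: CRC32 as a stand-in hash; Kafka itself uses murmur2.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def publish(messages, num_partitions):
    """Append (key, value) messages to in-memory partitions in send order."""
    partitions = defaultdict(list)
    round_robin = count()
    for key, value in messages:
        target = assign_partition(key, num_partitions, round_robin)
        partitions[target].append(value)
    return dict(partitions)


lines = ["1, abc", "2, def", "8, vwx", "9, zzz"]

# Null keys on a 2-partition topic: lines alternate between partitions,
# so a consumer reading both has no guaranteed overall order.
unkeyed = publish([(None, line) for line in lines], num_partitions=2)

# Same key for every line: everything lands in one partition, in order.
keyed = publish([("test.csv", line) for line in lines], num_partitions=2)

# One partition: order is preserved even with null keys.
single = publish([(None, line) for line in lines], num_partitions=1)

print("unkeyed:", unkeyed)
print("keyed:  ", keyed)
print("single: ", single)
```

In this model each partition's list is always in send order; only the unkeyed, multi-partition case splits the file across lists, which matches the out-of-order output seen on the consumer side.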