Thank you so much, Hans, for the explanation; it is very helpful to me as a
beginner.

So in my case, which options should I combine when running the producer and
consumer commands, respectively?

Thanks.


*------------------------------------------------*
*Sincerely yours,*


*Raymond*

On Sat, May 26, 2018 at 4:26 PM, Hans Jespersen <h...@confluent.io> wrote:

> There are two concepts in Kafka that are not always familiar to people who
> have used other pub/sub systems.
>
> 1) partitions:
>
> Kafka topics are partitioned, which means a single topic is sharded into
> multiple pieces that are distributed across multiple brokers in the cluster
> for parallel processing.
>
> Order is guaranteed per partition (not per topic).
>
> You can think of each Kafka topic partition as an exclusive queue in a
> traditional messaging system. Order is not guaranteed when the data is
> spread out across multiple queues in traditional messaging either.
>
> 2) keys
>
> Kafka messages have keys in addition to the value (i.e. the body) and the
> headers. When messages are published with the same key, they will all be
> sent in order to the same partition.
>
> If messages are published with a “null” key then they will be spread out
> round robin across all partitions (which is what you have done).
>
>
> Conclusion
>
> You will see ordered delivery if you either publish every message with the
> same key or create a topic with one partition.
>
>
> -hans
>
> On May 26, 2018, at 7:59 AM, Raymond Xie <xie3208...@gmail.com> wrote:
>
> Thanks. Can you explain to me why, by default, I received the messages in
> the wrong order? Note there are only 9 lines, numbered 1 to 9, but on the
> consumer side their original order is scrambled.
>
> ~~~sent from my cell phone, sorry if there is any typo
>
> On Sat, May 26, 2018 at 12:16 AM, Hans Jespersen <h...@confluent.io> wrote:
>
>> If you create a topic with one partition they will be in order.
>>
>> Alternatively if you publish with the same key for every message they
>> will be in the same order even if your topic has more than 1 partition.
>>
>> Either approach will work for Kafka.
>>
>> -hans
>>
>> > On May 25, 2018, at 8:56 PM, Raymond Xie <xie3208...@gmail.com> wrote:
>> >
>> > Hello,
>> >
>> > I just started learning Kafka and have the environment set up on my
>> > Hortonworks sandbox, running in VMware at home.
>> >
>> > test.csv is what I want the producer to send out:
>> >
>> > more test1.csv | ./kafka-console-producer.sh --broker-list
>> > sandbox.hortonworks.com:6667 --topic kafka-topic2
>> >
>> > 1, abc
>> > 2, def
>> > ...
>> > 8, vwx
>> > 9, zzz
>> >
>> > I received all the contents of test.csv, but not in their original
>> > order:
>> >
>> > kafka-console-consumer.sh --zookeeper 192.168.112.129:2181 --topic
>> > kafka-topic2
>> >
>> > 2, def
>> > 1, abc
>> > ...
>> > 9, zzz
>> > 8, vwx
>> >
>> >
>> > I read on Google that partitioning could be a feasible solution;
>> > however, my questions are:
>> >
>> > 1. For small files like this one, should I really partition? How small
>> > must a partition be to ensure the sequence?
>> > 2. For big files, each partition could still contain multiple lines; how
>> > do I ensure the lines in each partition won't get scrambled on the
>> > consumer side?
>> >
>> >
>> > I also want to know the best practice for processing large volumes of
>> > data through Kafka. There should be a better way than console commands.
>> >
>> > Thank you very much.
>> >
>> >
>> >
>> > *------------------------------------------------*
>> > *Sincerely yours,*
>> >
>> >
>> > *Raymond*
>>
>
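
As a rough illustration of the partition and key behaviour described in the
quoted reply above, here is a small Python sketch. It does not talk to Kafka
at all: assign_partition, publish, and consume are hypothetical helpers, the
CRC32 hash is only a stand-in for Kafka's real partitioner, the row contents
are placeholder data, and reading the partitions one after another is a
simplification of how a real consumer fetches.

```python
import zlib
from itertools import count


def assign_partition(key, num_partitions, round_robin):
    """Pick a partition: hash of the key if present, else round robin."""
    if key is None:
        return next(round_robin) % num_partitions
    # Stand-in hash (assumption); Kafka's own partitioner uses a different one.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def publish(lines, num_partitions, key=None):
    """Append each line to its partition and return the list of partitions."""
    partitions = [[] for _ in range(num_partitions)]
    round_robin = count()
    for line in lines:
        index = assign_partition(key, num_partitions, round_robin)
        partitions[index].append(line)
    return partitions


def consume(partitions):
    """Read partition by partition; order holds only inside each partition."""
    return [line for partition in partitions for line in partition]


lines = [f"{i}, row{i}" for i in range(1, 10)]  # placeholder data, 9 lines

null_key_order = consume(publish(lines, num_partitions=3))
same_key_order = consume(publish(lines, num_partitions=3, key="test1.csv"))
one_partition_order = consume(publish(lines, num_partitions=1))

print(null_key_order == lines)       # False: lines were spread over 3 partitions
print(same_key_order == lines)       # True: one key, so one partition
print(one_partition_order == lines)  # True: a single partition keeps order
```

With no key and three partitions, every line still arrives, but the overall
order is lost because it is only kept inside each partition. Using one key
for every message, or a topic with one partition, keeps the original order.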
