Hey Anh,

Kafka topics will automatically be created when they're read from or
written to, but they'll be created with the default partition count. If
you want a custom partition count (and it sounds like you do), you'll need
to run the scripts to create the topics beforehand, as you described.

Cheers,
Chris

On 2/28/14 1:51 AM, "Anh Thu Vu" <[email protected]> wrote:

>Thank you, Chris!
>
>So if I want to have a list of streams/topics with different number of
>partitions, I should write a script to create all the topics before
>starting my samza jobs? There is no way to set that "automatically" from
>my
>samza jobs?
>
>Casey
>
>
>On Fri, Feb 28, 2014 at 12:45 AM, Chris Riccomini
><[email protected]>wrote:
>
>> Hey Casey,
>>
>> Garry is right. By default Kafka ships with 2 partitions per topic. This
>> can be configured exactly as Garry describes.
>>
>> To create a new topic with a custom partition count, use:
>>
>>   bin/kafka-create-topic.sh --zookeeper localhost:2181 --replica 1
>> --partition 1 --topic test
>>
>> See http://kafka.apache.org/08/quickstart.html for details.
>>
>> To expand a topic's partition count for a topic that already exists,
>>run:
>>
>>   bin/kafka-add-partitions.sh
>>
>> See https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools
>> for details.
>>
>> You can't shrink a topic's partition count for a topic that already
>>exists.
>>
>> You can send a message to all partitions manually, as you describe.
>> Something like:
>>
>>   For(int partition: partitions) {
>>     collector.send(OUTGOING_STREAM, partition, null, outgoingMap)
>>   }
>>
>> This should result in the message being sent to each partition. If you
>> have a kafka-console-consumer.sh running against the stream, you should
>> see the message once for each partition.
>>
>> Cheers,
>> Chris
>>
>> On 2/27/14 9:21 AM, "Anh Thu Vu" <[email protected]> wrote:
>>
>> >Hi Garry,
>> >
>> >I really owe you a lot today :)
>> >
>> >About the broadcasting, I think both cases will serve my purpose.
>> >Basically, for some particular streams, I want to broadcast every
>>messages
>> >to all the replicas of the recipient StreamTask.
>> >
>> >Casey
>> >
>> >
>> >On Thu, Feb 27, 2014 at 6:00 PM, Garry Turkington <
>> >[email protected]> wrote:
>> >
>> >> Hi Casey,
>> >>
>> >> On the first question I *believe* you see 2 partitions on the created
>> >> output streams because they are created automatically by Kafka
>> >> (auto.create.topics.enabled set to true) and if so then it seems to
>> >>default
>> >> to 2 and not 1 partitions.
>> >>
>> >> For the rest I am completely out of my depth and will wait for one of
>> >>the
>> >> other guys to jump in. :)
>> >>
>> >> On the second question what are you trying to achieve here? Do you
>>want
>> >> the mechanism for a single message to go to all partitions of a
>>stream
>> >>or
>> >> do you want every client to read every message across all the
>> >>partitions?
>> >> The two are different though I'm not sure I've made that clear. I ask
>> >> because there was a discussion earlier this week about a possible
>> >>addition
>> >> of broadcast streams which are more in the second case.
>> >>
>> >> Regards
>> >> Garry
>> >>
>> >> -----Original Message-----
>> >> From: Anh Thu Vu [mailto:[email protected]]
>> >> Sent: 27 February 2014 15:14
>> >> To: [email protected]
>> >> Subject: Stream partition
>> >>
>> >> Hi all,
>> >>
>> >> It's me again :)
>> >>
>> >> I have some questions regarding partitioning the stream.
>> >> Consider the wikipedia feed task in hello-samza, when I run it, I
>>saw 2
>> >> partitions when I list kafka topics:
>> >> topic: wikipedia-raw    partition: 0    leader: 0    replicas: 0
>> >>isr: 0
>> >> topic: wikipedia-raw    partition: 1    leader: 0    replicas: 0
>> >>isr: 0
>> >>
>> >> My first question is how to increase the number of partitions? I was
>> >> playing around with the OutgoingMessageEnvelope (OUTPUT_STREAM is the
>> >> kafka.wikipedia-raw
>> >> SystemStream):
>> >> 1)
>> >> collector.send(new OutgoingMessageEnvelope(OUTPUT_STREAM,
>>outgoingMap));
>> >>
>> >> 2)
>> >> collector.send(new OutgoingMessageEnvelope(new
>> >> SystemStreamPartition(OUTPUT_STREAM, new Partition(0)),
>>outgoingMap));
>> >> collector.send(new OutgoingMessageEnvelope(new
>> >> SystemStreamPartition(OUTPUT_STREAM, new Partition(1)),
>>outgoingMap));
>> >> collector.send(new OutgoingMessageEnvelope(new
>> >> SystemStreamPartition(OUTPUT_STREAM, new Partition(2)),
>>outgoingMap));
>> >>
>> >> 3)
>> >> collector.send(new OutgoingMessageEnvelope(OUTPUT_STREAM,0, null,
>> >> outgoingMap)); collector.send(new
>> >>OutgoingMessageEnvelope(OUTPUT_STREAM,1,
>> >> null, outgoingMap)); collector.send(new
>> >> OutgoingMessageEnvelope(OUTPUT_STREAM,2, null, outgoingMap));
>> >>
>> >> but in all cases above, I always got 2 partitions (and why 2 but not
>>1?)
>> >>
>> >> Then, when I try this:
>> >> collector.send(new OutgoingMessageEnvelope(OUTPUT_STREAM,0, 0,
>> >> outgoingMap)); collector.send(new
>> >>OutgoingMessageEnvelope(OUTPUT_STREAM,1,
>> >> 1, outgoingMap)); collector.send(new
>> >> OutgoingMessageEnvelope(OUTPUT_STREAM,2, 2, outgoingMap)); I don't
>> >>receive
>> >> anything on the OUTPUT_STREAM.
>> >>
>> >> Does it has anything to do with the SinglePartitionSystemAdmin in
>> >> WikipediaSystemFactory?
>> >>
>> >> The second question is whether it's possible to send a message to all
>> >>the
>> >> partitions of a stream? (maybe, by sending a message multiple times,
>> >>each
>> >> specifying a partition?)
>> >>
>> >> Thanks in advance,
>> >> Casey
>> >>
>> >> -----
>> >> No virus found in this message.
>> >> Checked by AVG - www.avg.com
>> >> Version: 2014.0.4259 / Virus Database: 3705/7127 - Release Date:
>> >>02/26/14
>> >>
>>
>>

Reply via email to