Re: Recommended log level in prod environment.

2013-07-22 Thread Jun Rao
Yes, the kafka-request log logs every request (at TRACE). It's mostly for
debugging purposes. Other than that, there is no harm in turning it off.

Thanks,

Jun


On Mon, Jul 22, 2013 at 7:59 PM, Calvin Lei  wrote:

> Nah. We just changed it to INFO and will monitor the log. We had GBs of
> logs when it was at TRACE level; the kafka-request log was going crazy.
>
>
> On Jul 22, 2013, at 10:54 PM, Jay Kreps  wrote:
>
> > We run at info too except when debugging stuff. Are you saying that info
> is
> > too verbose?
> >
> > -Jay
> >
> >
> > On Mon, Jul 22, 2013 at 6:43 PM, Calvin Lei  wrote:
> >
> >> The beta release comes with mostly trace-level logging. Is this
> >> recommended? I noticed our cluster produces way too many logs. I set all
> >> the levels to INFO currently.
> >>
>
>
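A log4j.properties sketch for quieting the kafka-request log, assuming the stock 0.8 config where that log is driven by the kafka.request.logger category (shipped at TRACE in the beta; the appender name follows the shipped config):

# config/log4j.properties: stop per-request logging; WARN effectively turns it off
log4j.logger.kafka.request.logger=WARN, requestAppender
log4j.additivity.kafka.request.logger=false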


Re: Replacing brokers in a cluster (0.8)

2013-07-22 Thread Jun Rao
You can try kafka-reassign-partitions now. You do have to specify the new
replica assignment manually. We are improving that tool to make it more
automatic.

Thanks,

Jun


On Mon, Jul 22, 2013 at 10:40 AM, Jason Rosenberg  wrote:

> Is the kafka-reassign-partitions tool something I can experiment with now
> (this will only be staging data, in the first go-round)?  How does it work?
>  Do I manually have to specify each replica I want to move?  This would be
> cumbersome, as I have on the order of 100's of topicsOr does the tool
> have the ability to specify all replicas on a particular broker?  How can I
> easily check whether a partition has all its replicas in the ISR?
>
> For some reason, I had thought there would be a default behavior, whereby a
> replica could automatically be declared dead after a configurable timeout
> period.
>
> Re-assigning broker id's would not be ideal, since I have a scheme
> currently whereby broker id's are auto-generated, from a hostname/ip, etc.
>  I could make it work, but it's not my preference to override that!
>
> Jason
>
>
> On Mon, Jul 22, 2013 at 11:50 AM, Jun Rao  wrote:
>
> > A replica's data won't be automatically moved to another broker where
> there
> > are failures. This is because we don't know if the failure is transient
> or
> > permanent. The right tool to use is the kafka-reassign-partitions tool.
> It
> > hasn't been thoroughly tested though. We hope to harden it in the final
> > 0.8.0 release.
> >
> > You can also replace a broker with a new server by keeping the same
> broker
> > id. When the new server starts up, it will replicate data from the leader.
> > You know the data is fully replicated when both replicas are in ISR.
> >
> > Thanks,
> >
> > Jun
> >
> >
> > On Mon, Jul 22, 2013 at 2:14 AM, Jason Rosenberg 
> wrote:
> >
> > > I'm planning to upgrade a 0.8 cluster from 2 old nodes, to 3 new ones
> > > (better hardware).  I'm using a replication factor of 2.
> > >
> > > I'm thinking the plan should be to spin up the 3 new nodes, and operate
> > as
> > > a 5 node cluster for a while.  Then first remove 1 of the old nodes,
> and
> > > wait for the partitions on the removed node to get replicated to the
> > other
> > > nodes.  Then, do the same for the other old node.
> > >
> > > Does this sound sensible?
> > >
> > > How does the cluster decide when to re-replicate partitions that are
> on a
> > > node that is no longer available?  Does it only happen if/when new
> > messages
> > > arrive for that partition?  Is it on a partition by partition basis?
> > >
> > > Or is it a cluster-level decision that a broker is no longer valid, in
> > > which case all affected partitions would immediately get replicated to
> > new
> > > brokers as needed?
> > >
> > > I'm just wondering how I will know when it will be safe to take down my
> > > second old node, after the first one is removed, etc.
> > >
> > > Thanks,
> > >
> > > Jason
> > >
> >
>
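For reference, a minimal reassignment JSON of the kind the tool consumes, with hypothetical topic, partition, and broker ids (the same format Scott Clasen's script further down generates):

{"partitions":
  [{"topic": "mytopic", "partition": 0, "replicas": [3, 4]},
   {"topic": "mytopic", "partition": 1, "replicas": [4, 5]}]}

Invoked as elsewhere in this thread (ZooKeeper address illustrative):

bin/kafka-reassign-partitions.sh --zookeeper zk1:2181 --path-to-json-file /tmp/reassign.json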


Re: Recommended log level in prod environment.

2013-07-22 Thread Calvin Lei
Nah. We just changed it to INFO and will monitor the log. We had GBs of logs
when it was at TRACE level; the kafka-request log was going crazy.


On Jul 22, 2013, at 10:54 PM, Jay Kreps  wrote:

> We run at info too except when debugging stuff. Are you saying that info is
> too verbose?
> 
> -Jay
> 
> 
> On Mon, Jul 22, 2013 at 6:43 PM, Calvin Lei  wrote:
> 
>> The beta release comes with mostly trace-level logging. Is this
>> recommended? I noticed our cluster produces way too many logs. I set all the
>> levels to INFO currently.
>> 



Re: Recommended log level in prod environment.

2013-07-22 Thread Jay Kreps
We run at info too except when debugging stuff. Are you saying that info is
too verbose?

-Jay


On Mon, Jul 22, 2013 at 6:43 PM, Calvin Lei  wrote:

> The beta release comes with mostly trace-level logging. Is this
> recommended? I noticed our cluster produces way too many logs. I set all the
> levels to INFO currently.
>


Recommended log level in prod environment.

2013-07-22 Thread Calvin Lei
The beta release comes with mostly trace-level logging. Is this
recommended? I noticed our cluster produces way too many logs. I set all the
levels to INFO currently.


Re: Messages TTL setting

2013-07-22 Thread Jay Kreps
Yes, all configuration changes should be possible to do one node at a time.

-Jay


On Mon, Jul 22, 2013 at 2:03 PM, arathi maddula wrote:

> Hi,
>
> We have a 3 node Kafka cluster. We want to increase the maximum amount of
> time for which messages are saved in Kafka data logs.
> Can we change the configuration on one node, stop it and start it and then
> change the configuration of the next node?
> Or should we stop all 3 nodes at a time, make configuration changes and
> then restart all 3? Please suggest.
>
> Thanks,
> Arathi
>
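For reference, the retention knob in question is log.retention.hours in server.properties (a sketch assuming the 0.8 property names; the value is illustrative). Change it on one broker, restart that broker, confirm it rejoins, then move to the next:

# config/server.properties: keep messages for 14 days instead of the 168-hour default
log.retention.hours=336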


Messages TTL setting

2013-07-22 Thread arathi maddula
Hi,

We have a 3 node Kafka cluster. We want to increase the maximum amount of
time for which messages are saved in Kafka data logs.
Can we change the configuration on one node, stop it and start it and then
change the configuration of the next node?
Or should we stop all 3 nodes at a time, make configuration changes and
then restart all 3? Please suggest.

Thanks,
Arathi


Re: Logo

2013-07-22 Thread David Arthur

  
  
I actually did this the last time a logo was discussed :)

https://docs.google.com/drawings/d/11WHfjkRGbSiZK6rRkedCrgmgFoP_vQ-QuWNENd4u7UY/edit

As it turns out, it was a dung beetle in the book (I thought it was
a roach as well).

-David

On 7/22/13 2:59 PM, David Harris wrote:

> It should be a roach in honor of Franz Kafka's Metamorphosis.
>
> On 7/22/2013 2:55 PM, S Ahmed wrote:
>
> > Similar, yet different.  I like it!
> >
> > On Mon, Jul 22, 2013 at 1:25 PM, Jay Kreps  wrote:
> >
> > > Yeah, good point. I hadn't seen that before.
> > >
> > > -Jay
> > >
> > > On Mon, Jul 22, 2013 at 10:20 AM, Radek Gruchalski <
> > > radek.gruchal...@portico.io> wrote:
> > >
> > > > 296 looks familiar: https://www.nodejitsu.com/
> > > >
> > > > Kind regards,
> > > > Radek Gruchalski
> > > > radek.gruchal...@technicolor.com | radek.gruchal...@portico.io |
> > > > ra...@gruchalski.com
> > > > 00447889948663
> > > >
> > > > On Monday, 22 July 2013 at 18:51, Jay Kreps wrote:
> > > >
> > > > > Hey guys,
> > > > >
> > > > > We need a logo!
> > > > >
> > > > > I got a few designs from a 99 designs contest that I would like to put
> > > > > forward:
> > > > > https://issues.apache.org/jira/browse/KAFKA-982
> > > > >
> > > > > If anyone else would like to submit a design that would be great.
> > > > >
> > > > > Let's do a vote to choose one.
> > > > >
> > > > > -Jay


Re: Logo

2013-07-22 Thread David Harris

  
  
It should be a roach in honor of Franz Kafka's Metamorphosis.

On 7/22/2013 2:55 PM, S Ahmed wrote:

> Similar, yet different.  I like it!
>
> On Mon, Jul 22, 2013 at 1:25 PM, Jay Kreps  wrote:
>
> > Yeah, good point. I hadn't seen that before.
> >
> > -Jay
> >
> > On Mon, Jul 22, 2013 at 10:20 AM, Radek Gruchalski <
> > radek.gruchal...@portico.io> wrote:
> >
> > > 296 looks familiar: https://www.nodejitsu.com/
> > >
> > > Kind regards,
> > > Radek Gruchalski
> > > radek.gruchal...@technicolor.com | radek.gruchal...@portico.io |
> > > ra...@gruchalski.com
> > > 00447889948663
> > >
> > > On Monday, 22 July 2013 at 18:51, Jay Kreps wrote:
> > >
> > > > Hey guys,
> > > >
> > > > We need a logo!
> > > >
> > > > I got a few designs from a 99 designs contest that I would like to put
> > > > forward:
> > > > https://issues.apache.org/jira/browse/KAFKA-982
> > > >
> > > > If anyone else would like to submit a design that would be great.
> > > >
> > > > Let's do a vote to choose one.
> > > >
> > > > -Jay

--
David Harris
Bridge Interactive Group
email: dhar...@big-llc.com
cell: 404-831-7015
office: 888-901-0150

Bridge Software Products:
www.big-llc.com
www.realvaluator.com
www.rvleadgen.com
  



Re: Logo

2013-07-22 Thread S Ahmed
Similar, yet different.  I like it!


On Mon, Jul 22, 2013 at 1:25 PM, Jay Kreps  wrote:

> Yeah, good point. I hadn't seen that before.
>
> -Jay
>
>
> On Mon, Jul 22, 2013 at 10:20 AM, Radek Gruchalski <
> radek.gruchal...@portico.io> wrote:
>
> > 296 looks familiar: https://www.nodejitsu.com/
> >
> > Kind regards,
> > Radek Gruchalski
> > radek.gruchal...@technicolor.com | radek.gruchal...@portico.io |
> > ra...@gruchalski.com
> > 00447889948663
> >
> >
> > On Monday, 22 July 2013 at 18:51, Jay Kreps wrote:
> >
> > > Hey guys,
> > >
> > > We need a logo!
> > >
> > > I got a few designs from a 99 designs contest that I would like to put
> > > forward:
> > > https://issues.apache.org/jira/browse/KAFKA-982
> > >
> > > If anyone else would like to submit a design that would be great.
> > >
> > > Let's do a vote to choose one.
> > >
> > > -Jay
> >
> >
>


Re: Replacing brokers in a cluster (0.8)

2013-07-22 Thread Scott Clasen
Here's a Ruby CLI that you can use to replace brokers. It shells out to
the kafka-reassign-partitions.sh tool after figuring out broker lists from
ZooKeeper. Hope it's useful.


#!/usr/bin/env ruby

require 'json'
require 'zookeeper'

# Swap broker id `o` for `n` wherever it appears in a replica list.
def replace(arr, o, n)
  arr.map { |v| v == o ? n : v }
end

if ARGV.length != 4
  puts "Usage: bundle exec bin/replace-instance zkstr topic-name old-broker-id new-broker-id"
else
  zkstr = ARGV[0]
  zk = Zookeeper.new(zkstr)
  topic = ARGV[1]
  old = ARGV[2].to_i
  new = ARGV[3].to_i
  puts "Replacing broker #{old} with #{new} on all partitions of topic #{topic}"

  # Read the current partition -> replica-list assignment from ZooKeeper.
  current = JSON.parse(zk.get(:path => "/brokers/topics/#{topic}")[:data])
  replacements_array = []
  replacements = {"partitions" => replacements_array}
  current["partitions"].each { |partition, brokers|
    replacements_array.push({"topic" => topic, "partition" => partition.to_i,
                             "replicas" => replace(brokers, old, new)}) }

  replacement_json = JSON.generate(replacements)

  # Write the reassignment JSON to a temp file for the Kafka tool to consume.
  file = "/tmp/replace-#{topic}-#{old}-#{new}"
  File.delete(file) if File.exist?(file)
  File.open(file, 'w') { |f| f.write(replacement_json) }

  cmd = "./bin/kafka-reassign-partitions.sh --zookeeper #{zkstr} --path-to-json-file #{file}"
  puts cmd
  system cmd
end
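
For example, replacing broker 3 with broker 7 on topic mytopic (hypothetical ZooKeeper address, topic, and broker ids):

bundle exec bin/replace-instance zk1:2181 mytopic 3 7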





On Mon, Jul 22, 2013 at 10:40 AM, Jason Rosenberg  wrote:

> Is the kafka-reassign-partitions tool something I can experiment with now
> (this will only be staging data, in the first go-round)?  How does it work?
>  Do I manually have to specify each replica I want to move?  This would be
> cumbersome, as I have on the order of 100's of topicsOr does the tool
> have the ability to specify all replicas on a particular broker?  How can I
> easily check whether a partition has all its replicas in the ISR?
>
> For some reason, I had thought there would be a default behavior, whereby a
> replica could automatically be declared dead after a configurable timeout
> period.
>
> Re-assigning broker id's would not be ideal, since I have a scheme
> currently whereby broker id's are auto-generated, from a hostname/ip, etc.
>  I could make it work, but it's not my preference to override that!
>
> Jason
>
>
> On Mon, Jul 22, 2013 at 11:50 AM, Jun Rao  wrote:
>
> > A replica's data won't be automatically moved to another broker where
> there
> > are failures. This is because we don't know if the failure is transient
> or
> > permanent. The right tool to use is the kafka-reassign-partitions tool.
> It
> > hasn't been thoroughly tested though. We hope to harden it in the final
> > 0.8.0 release.
> >
> > You can also replace a broker with a new server by keeping the same
> broker
> > id. When the new server starts up, it will replicate data from the leader.
> > You know the data is fully replicated when both replicas are in ISR.
> >
> > Thanks,
> >
> > Jun
> >
> >
> > On Mon, Jul 22, 2013 at 2:14 AM, Jason Rosenberg 
> wrote:
> >
> > > I'm planning to upgrade a 0.8 cluster from 2 old nodes, to 3 new ones
> > > (better hardware).  I'm using a replication factor of 2.
> > >
> > > I'm thinking the plan should be to spin up the 3 new nodes, and operate
> > as
> > > a 5 node cluster for a while.  Then first remove 1 of the old nodes,
> and
> > > wait for the partitions on the removed node to get replicated to the
> > other
> > > nodes.  Then, do the same for the other old node.
> > >
> > > Does this sound sensible?
> > >
> > > How does the cluster decide when to re-replicate partitions that are
> on a
> > > node that is no longer available?  Does it only happen if/when new
> > messages
> > > arrive for that partition?  Is it on a partition by partition basis?
> > >
> > > Or is it a cluster-level decision that a broker is no longer valid, in
> > > which case all affected partitions would immediately get replicated to
> > new
> > > brokers as needed?
> > >
> > > I'm just wondering how I will know when it will be safe to take down my
> > > second old node, after the first one is removed, etc.
> > >
> > > Thanks,
> > >
> > > Jason
> > >
> >
>


Re: Replacing brokers in a cluster (0.8)

2013-07-22 Thread Jason Rosenberg
Is the kafka-reassign-partitions tool something I can experiment with now
(this will only be staging data, in the first go-round)?  How does it work?
 Do I manually have to specify each replica I want to move?  This would be
cumbersome, as I have on the order of 100's of topicsOr does the tool
have the ability to specify all replicas on a particular broker?  How can I
easily check whether a partition has all its replicas in the ISR?

For some reason, I had thought there would be a default behavior, whereby a
replica could automatically be declared dead after a configurable timeout
period.

Re-assigning broker id's would not be ideal, since I have a scheme
currently whereby broker id's are auto-generated, from a hostname/ip, etc.
 I could make it work, but it's not my preference to override that!

Jason


On Mon, Jul 22, 2013 at 11:50 AM, Jun Rao  wrote:

> A replica's data won't be automatically moved to another broker where there
> are failures. This is because we don't know if the failure is transient or
> permanent. The right tool to use is the kafka-reassign-partitions tool. It
> hasn't been thoroughly tested though. We hope to harden it in the final
> 0.8.0 release.
>
> You can also replace a broker with a new server by keeping the same broker
> id. When the new server starts up, it will replicate data from the leader.
> You know the data is fully replicated when both replicas are in ISR.
>
> Thanks,
>
> Jun
>
>
> On Mon, Jul 22, 2013 at 2:14 AM, Jason Rosenberg  wrote:
>
> > I'm planning to upgrade a 0.8 cluster from 2 old nodes, to 3 new ones
> > (better hardware).  I'm using a replication factor of 2.
> >
> > I'm thinking the plan should be to spin up the 3 new nodes, and operate
> as
> > a 5 node cluster for a while.  Then first remove 1 of the old nodes, and
> > wait for the partitions on the removed node to get replicated to the
> other
> > nodes.  Then, do the same for the other old node.
> >
> > Does this sound sensible?
> >
> > How does the cluster decide when to re-replicate partitions that are on a
> > node that is no longer available?  Does it only happen if/when new
> messages
> > arrive for that partition?  Is it on a partition by partition basis?
> >
> > Or is it a cluster-level decision that a broker is no longer valid, in
> > which case all affected partitions would immediately get replicated to
> new
> > brokers as needed?
> >
> > I'm just wondering how I will know when it will be safe to take down my
> > second old node, after the first one is removed, etc.
> >
> > Thanks,
> >
> > Jason
> >
>


Re: Logo

2013-07-22 Thread Radek Gruchalski
296 looks familiar: https://www.nodejitsu.com/  

Kind regards,

Radek Gruchalski
radek.gruchal...@technicolor.com | radek.gruchal...@portico.io | ra...@gruchalski.com
00447889948663







On Monday, 22 July 2013 at 18:51, Jay Kreps wrote:

> Hey guys,
>  
> We need a logo!
>  
> I got a few designs from a 99 designs contest that I would like to put
> forward:
> https://issues.apache.org/jira/browse/KAFKA-982
>  
> If anyone else would like to submit a design that would be great.
>  
> Let's do a vote to choose one.
>  
> -Jay  



Re: Logo

2013-07-22 Thread Jay Kreps
Yeah, good point. I hadn't seen that before.

-Jay


On Mon, Jul 22, 2013 at 10:20 AM, Radek Gruchalski <
radek.gruchal...@portico.io> wrote:

> 296 looks familiar: https://www.nodejitsu.com/
>
> Kind regards,
> Radek Gruchalski
> radek.gruchal...@technicolor.com | radek.gruchal...@portico.io |
> ra...@gruchalski.com
> 00447889948663
>
>
>
> On Monday, 22 July 2013 at 18:51, Jay Kreps wrote:
>
> > Hey guys,
> >
> > We need a logo!
> >
> > I got a few designs from a 99 designs contest that I would like to put
> > forward:
> > https://issues.apache.org/jira/browse/KAFKA-982
> >
> > If anyone else would like to submit a design that would be great.
> >
> > Let's do a vote to choose one.
> >
> > -Jay
>
>


Re: Replacing brokers in a cluster (0.8)

2013-07-22 Thread Glenn Nethercutt
This seems like the type of behavior I'd ultimately want from the
controlled shutdown tool.


Currently, I believe the ShutdownBroker causes new leaders to be 
selected for any partition the dying node was leading, but I don't think 
it explicitly forces a rebalance for topics in which the dying node was 
just an ISR (in-sync replica set) member. Ostensibly, leadership 
elections are what we want to avoid, due to the Zookeeper chattiness 
that would ensue for ensembles with lots of partitions, but I'd wager 
we'd benefit from a reduction in rebalances too.  The preferred 
replica election tool also seems to have a similar level of
control (manual selection of the preferred replicas), but still doesn't 
let you add/remove brokers from the ISR directly.  I know the 
kafka-reassign-partitions tool lets you specify a full list of 
partitions and replica assignment, but I don't know how easily 
integrated that will be with the lifecycle you described.


Anyone know if controlled shutdown is the right tool for this? Our 
devops team will certainly be interested in the canonical answer as well.


--glenn

On 07/22/2013 05:14 AM, Jason Rosenberg wrote:

I'm planning to upgrade a 0.8 cluster from 2 old nodes, to 3 new ones
(better hardware).  I'm using a replication factor of 2.

I'm thinking the plan should be to spin up the 3 new nodes, and operate as
a 5 node cluster for a while.  Then first remove 1 of the old nodes, and
wait for the partitions on the removed node to get replicated to the other
nodes.  Then, do the same for the other old node.

Does this sound sensible?

How does the cluster decide when to re-replicate partitions that are on a
node that is no longer available?  Does it only happen if/when new messages
arrive for that partition?  Is it on a partition by partition basis?

Or is it a cluster-level decision that a broker is no longer valid, in
which case all affected partitions would immediately get replicated to new
brokers as needed?

I'm just wondering how I will know when it will be safe to take down my
second old node, after the first one is removed, etc.

Thanks,

Jason
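
For reference, the controlled shutdown tool Glenn mentions is invoked roughly like this in 0.8 (a sketch; the ZooKeeper address, broker id, and retry settings are illustrative):

bin/kafka-run-class.sh kafka.admin.ShutdownBroker --zookeeper zk1:2181 --broker 3 --num.retries 3 --retry.interval.ms 1000

As discussed above, it moves leadership off the given broker before you stop it; it does not reassign replicas, which is what kafka-reassign-partitions is for.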





Logo

2013-07-22 Thread Jay Kreps
Hey guys,

We need a logo!

I got a few designs from a 99 designs contest that I would like to put
forward:
https://issues.apache.org/jira/browse/KAFKA-982

If anyone else would like to submit a design that would be great.

Let's do a vote to choose one.

-Jay


Re: Replacing brokers in a cluster (0.8)

2013-07-22 Thread Jun Rao
A replica's data won't be automatically moved to another broker where there
are failures. This is because we don't know if the failure is transient or
permanent. The right tool to use is the kafka-reassign-partitions tool. It
hasn't been thoroughly tested though. We hope to harden it in the final
0.8.0 release.

You can also replace a broker with a new server by keeping the same broker
id. When the new server starts up, it will replicate data from the leader.
You know the data is fully replicated when both replicas are in ISR.

Thanks,

Jun


On Mon, Jul 22, 2013 at 2:14 AM, Jason Rosenberg  wrote:

> I'm planning to upgrade a 0.8 cluster from 2 old nodes, to 3 new ones
> (better hardware).  I'm using a replication factor of 2.
>
> I'm thinking the plan should be to spin up the 3 new nodes, and operate as
> a 5 node cluster for a while.  Then first remove 1 of the old nodes, and
> wait for the partitions on the removed node to get replicated to the other
> nodes.  Then, do the same for the other old node.
>
> Does this sound sensible?
>
> How does the cluster decide when to re-replicate partitions that are on a
> node that is no longer available?  Does it only happen if/when new messages
> arrive for that partition?  Is it on a partition by partition basis?
>
> Or is it a cluster-level decision that a broker is no longer valid, in
> which case all affected partitions would immediately get replicated to new
> brokers as needed?
>
> I'm just wondering how I will know when it will be safe to take down my
> second old node, after the first one is removed, etc.
>
> Thanks,
>
> Jason
>
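On verifying that both replicas are in ISR: the 0.8 distribution ships a list-topic tool that prints per-partition leader, replica, and ISR assignments; a broker's replica is caught up when its id appears in the isr list. A sketch (topic name and output are illustrative):

bin/kafka-list-topic.sh --zookeeper zk1:2181 --topic mytopic
# sample output line:
# topic: mytopic  partition: 0  leader: 1  replicas: 1,2  isr: 1,2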


Re: Apache Kafka Question

2013-07-22 Thread Yavar Husain
Millions of messages per day (with each message being a few bytes) is not
really 'Big Data'. Kafka has been benchmarked at a million messages per second.

The answer to all of your questions, IMO, is "it depends".

You can start with a single instance (single-machine installation). Let
your producer send messages. Keep one broker, then increase to N brokers. When
you hit the upper limit, add a server and repeat.

Benchmarking and scalability are things you should try on your own
by playing with Kafka. Every use case is different, so performance metrics
from one are not a global answer.

For your question on Topic vs. Queue, please read up on distributed
pub/sub, message queues, and other messaging patterns; these are generic
concepts and have nothing to do with Kafka specifically. It again depends on
your use case.

Please read up on what topics in Kafka are. If you just go through the
definition of a topic, you will answer your own question within a
minute.

Replication and the rest are next steps once you have a single
running instance of Kafka. So go ahead and get your hands dirty. You will
love Kafka :)

And yes, the most important thing: please read the documentation first (a bit
of theory) and then dive in. There is no silver bullet.

Cheers,
Yavar
http://lnkd.in/GRrrDJ

On Mon, Jul 22, 2013 at 4:27 PM,  wrote:

> Hi,
>
>
>
> I am planning to use Apache Kafka 0.8 to handle millions of messages per
> day. Now I need to set up the environment, and have questions like:
>
>
>
> (i) How many Topics to be created?
> (ii) How many partitions/replications to be created?
> (iii) How many Brokers to be created?
> (iv) How many consumer instances in consumer group?
>
> (v) Topic or Queue? If topic, do we need to create multiple group ids as
> opposed to a single one?
>
>
>
> How can we go about this? Please clarify.
>
> Thanks & Regards,
> Anantha
>
>
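To get hands dirty along the lines Yavar suggests, a single-broker sandbox can be stood up with the 0.8 quickstart commands (a sketch; paths assume the extracted release directory and the topic name is illustrative):

# start a local ZooKeeper and one broker
bin/zookeeper-server-start.sh config/zookeeper.properties
bin/kafka-server-start.sh config/server.properties

# create a topic with one partition and one replica, then exercise it end to end
bin/kafka-create-topic.sh --zookeeper localhost:2181 --replica 1 --partition 1 --topic test
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning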


Apache Kafka Question

2013-07-22 Thread anantha.murugan
Hi,



I am planning to use Apache Kafka 0.8 to handle millions of messages per day.
Now I need to set up the environment, and have questions like:



(i) How many Topics to be created?
(ii) How many partitions/replications to be created?
(iii) How many Brokers to be created?
(iv) How many consumer instances in consumer group?

(v) Topic or Queue? If topic, do we need to create multiple group ids as
opposed to a single one?



How can we go about this? Please clarify.

Thanks & Regards,
Anantha



Replacing brokers in a cluster (0.8)

2013-07-22 Thread Jason Rosenberg
I'm planning to upgrade a 0.8 cluster from 2 old nodes, to 3 new ones
(better hardware).  I'm using a replication factor of 2.

I'm thinking the plan should be to spin up the 3 new nodes, and operate as
a 5 node cluster for a while.  Then first remove 1 of the old nodes, and
wait for the partitions on the removed node to get replicated to the other
nodes.  Then, do the same for the other old node.

Does this sound sensible?

How does the cluster decide when to re-replicate partitions that are on a
node that is no longer available?  Does it only happen if/when new messages
arrive for that partition?  Is it on a partition by partition basis?

Or is it a cluster-level decision that a broker is no longer valid, in
which case all affected partitions would immediately get replicated to new
brokers as needed?

I'm just wondering how I will know when it will be safe to take down my
second old node, after the first one is removed, etc.

Thanks,

Jason