Re: G1 tuning

2015-10-14 Thread Scott Clasen
You can also use -Xmn with that gc to size the new gen such that those
buffers don't get tenured

I don't think that's an option with G1

On Wednesday, October 14, 2015, Cory Kolbeck  wrote:

> I'm not sure that will help here, you'll likely have the same
> medium-lifetime buffers getting into the tenured generation and forcing
> large collections.
>
> On Wed, Oct 14, 2015 at 10:00 AM, Gerrit Jansen van Vuuren <
> gerrit...@gmail.com > wrote:
>
> > Hi,
> >
> > I've seen pauses using G1 in other applications and have found that
> > -XX:+UseParallelGC
> > -XX:+UseParallelOldGC  works best if you're having GC issues in general
> on
> > the JVM.
> >
> >
> > Regards,
> >  Gerrit
> >
> > On Wed, Oct 14, 2015 at 4:28 PM, Cory Kolbeck  > wrote:
> >
> > > Hi folks,
> > >
> > > I'm a bit new to the operational side of G1, but pretty familiar with
> its
> > > basic concept. We recently set up a Kafka cluster to support a new
> > product,
> > > and are seeing some suboptimal GC performance. We're using the
> parameters
> > > suggested in the docs, except for having switched to java 1.8.0_40 in
> order
> > > to get better memory debugging. Even though the cluster is handling
> only
> > > 2-3k messages per second per node, we see periodic 11-18 second
> > > stop-the-world pauses on a roughly hourly cadence. I've turned on
> > > additional GC logging, and see no humongous allocations, it all seems
> to
> > be
> > > buffers making it into the tenured gen. They appear to be collectable,
> as
> > > the collection triggered by dumping the heap collects them all. Ideas
> for
> > > additional diagnosis or tuning very welcome.
> > >
> > > --Cory
> > >
> >
>
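
One cheap way to see which collector is eating the wall-clock time between the hourly pauses is to read the GC MXBeans straight off the broker over JMX. A minimal sketch (the host, port, and the assumption that remote JMX is enabled are placeholders, not part of the setup described above):

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Sketch: read the broker's GC MXBeans over JMX and print cumulative
// collection counts/times. Sampling this periodically and diffing the numbers
// shows whether the long pauses line up with old-gen/mixed collections
// (bean names are typically "G1 Young Generation" / "G1 Old Generation" for
// G1, and "PS Scavenge" / "PS MarkSweep" for the parallel collectors).
public class GcProbe {
    public static void main(String[] args) throws Exception {
        // Assumes the broker was started with remote JMX enabled on port 9999.
        JMXServiceURL url =
                new JMXServiceURL("service:jmx:rmi:///jndi/rmi://broker-host:9999/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            for (ObjectName name :
                    mbs.queryNames(new ObjectName("java.lang:type=GarbageCollector,*"), null)) {
                GarbageCollectorMXBean gc = ManagementFactory.newPlatformMXBeanProxy(
                        mbs, name.toString(), GarbageCollectorMXBean.class);
                System.out.printf("%s: count=%d timeMs=%d%n",
                        gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
            }
        } finally {
            connector.close();
        }
    }
}

Correlating these counters with the existing GC log output makes it easier to tell whether the pauses come from the tenured buffers described above.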


Re: Inconsistency with Zookeeper

2015-08-07 Thread Scott Clasen
each zk needs a myid file in the data dir, with a different number 1,2,3

http://zookeeper.apache.org/doc/r3.3.3/zookeeperAdmin.html

You can find the meanings of these and other configuration settings in the
section Configuration Parameters.
A word though about a few here:

Every machine that is part of the ZooKeeper ensemble should know about
every other machine in the ensemble. You accomplish this with the series of
lines of the form *server.id=host:port:port*. The
parameters *host* and *port* are straightforward. You attribute the server
id to each machine by creating a file named myid, one for each server,
which resides in that server's data directory, as specified by the
configuration file parameter *dataDir*.

On Fri, Aug 7, 2015 at 5:59 PM, Hemanth Abbina 
wrote:

> Yes. I have set the unique broker ids (as 0, 1, 2) in the
> server.properties file
>
> I did not get the "set broker id in zookeeper data directory" part. We don't
> set any broker id in zookeeper.
>
> We provide only "zookeeper.connect=ZK:2181,ZK2:2181,ZK3:2181" in
> server.properties file. Right ?
>
> -Original Message-
> From: Prabhjot Bharaj [mailto:prabhbha...@gmail.com]
> Sent: Friday, August 7, 2015 8:44 PM
> To: users@kafka.apache.org
> Subject: Re: Inconsistency with Zookeeper
>
> Have you set broker id  in zookeeper data directory and some unique broker
> Id in server.properties ?
>
> Regards,
> Prabhjot
> On Aug 6, 2015 1:43 PM, "Hemanth Abbina"  wrote:
>
> > Hi,
> >
> > I am running a Kafka POC with below details
> >
> > * 3 Node cluster (4 Core, 16 GB RAM each) running Kafka 0.8.2.1.
> >
> > * Each node running Kafka & Zookeeper instances. (So total of 3
> > Kafka brokers & 3 zookeepers)
> >
> > When I tried to create a topic using kafka-topics.sh, I observed
> > the below exceptions on the console. If I try multiple times, the
> > topic is getting created at some time. Please suggest a workaround to
> > make this work every time.
> >
> > Thanks in advance.
> >
> > [ec2-user@ip-10-20-0-196 kafka_2.10-0.8.2.1]$ ./bin/kafka-topics.sh
> > --zookeeper 10.20.0.196:2181,10.20.0.197:2181,10.20.0.198:2181 --topic
> > reportspoc4 --create --partitions 1 --replication-factor 2 Error while
> > executing topic command replication factor: 2 larger than available
> > brokers: 1
> > kafka.admin.AdminOperationException: replication factor: 2 larger than
> > available brokers: 1
> > at
> > kafka.admin.AdminUtils$.assignReplicasToBrokers(AdminUtils.scala:70)
> > at kafka.admin.AdminUtils$.createTopic(AdminUtils.scala:171)
> > at kafka.admin.TopicCommand$.createTopic(TopicCommand.scala:93)
> > at kafka.admin.TopicCommand$.main(TopicCommand.scala:55)
> > at kafka.admin.TopicCommand.main(TopicCommand.scala)
> >
> >
> > [ec2-user@ip-10-20-0-196 kafka_2.10-0.8.2.1]$ ./bin/kafka-topics.sh
> > --zookeeper 10.20.0.196:2181,10.20.0.197:2181,10.20.0.198:2181 --topic
> > reportspoc4 --create --partitions 1 --replication-factor 2 Error while
> > executing topic command
> > org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode
> > = NoNode for /brokers/ids
> > org.I0Itec.zkclient.exception.ZkNoNodeException:
> > org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode
> > = NoNode for /brokers/ids
> > at
> > org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:47)
> > at
> > org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:685)
> > at org.I0Itec.zkclient.ZkClient.getChildren(ZkClient.java:413)
> > at org.I0Itec.zkclient.ZkClient.getChildren(ZkClient.java:409)
> > at kafka.utils.ZkUtils$.getChildren(ZkUtils.scala:462)
> > at kafka.utils.ZkUtils$.getSortedBrokerList(ZkUtils.scala:78)
> > at kafka.admin.AdminUtils$.createTopic(AdminUtils.scala:170)
> > at kafka.admin.TopicCommand$.createTopic(TopicCommand.scala:93)
> > at kafka.admin.TopicCommand$.main(TopicCommand.scala:55)
> > at kafka.admin.TopicCommand.main(TopicCommand.scala)
> > Caused by: org.apache.zookeeper.KeeperException$NoNodeException:
> > KeeperErrorCode = NoNode for /brokers/ids
> > at
> > org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
> > at
> > org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> > at
> org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1472)
> > at
> org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1500)
> > at
> > org.I0Itec.zkclient.ZkConnection.getChildren(ZkConnection.java:99)
> > at org.I0Itec.zkclient.ZkClient$2.call(ZkClient.java:416)
> > at org.I0Itec.zkclient.ZkClient$2.call(ZkClient.java:413)
> > at
> > org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675)
> > ... 8 more
> >
> > --regards
> > Hemanth
> >
>
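
The NoNodeException for /brokers/ids above means no broker had registered with that ZooKeeper ensemble when the command ran, which is exactly what a missing or duplicated myid tends to cause. A minimal sketch for checking what is actually registered, using the plain ZooKeeper client that ships with Kafka (the connect string and session timeout are assumptions):

import java.util.List;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

// Sketch: list the broker ids registered in ZooKeeper. If /brokers/ids is
// missing or empty, the brokers never connected to this ensemble (wrong
// zookeeper.connect, myid/ensemble misconfiguration, etc.).
public class BrokerRegistrationCheck {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("10.20.0.196:2181,10.20.0.197:2181,10.20.0.198:2181",
                30000, new Watcher() {
                    public void process(WatchedEvent event) { /* one-off check, ignore */ }
                });
        try {
            List<String> ids = zk.getChildren("/brokers/ids", false);
            System.out.println("registered broker ids: " + ids);
            for (String id : ids) {
                byte[] data = zk.getData("/brokers/ids/" + id, false, null);
                System.out.println(id + " -> " + new String(data, "UTF-8"));
            }
        } finally {
            zk.close();
        }
    }
}

If the list is empty or the znode is missing, fix the ensemble configuration (myid files, server.N lines) and the brokers' zookeeper.connect before retrying topic creation.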


Re: ping kafka server

2015-02-09 Thread Scott Clasen
I have used Nagios in this manner with Kafka before and it worked fine.

On Mon, Feb 9, 2015 at 2:48 PM, Koert Kuipers  wrote:

> i would like to be able to ping kafka servers from nagios to confirm they
> are alive. Since kafka servers don't run an HTTP server (web UI), I am not
> sure how to do this.
>
> is it safe to establish a "test" tcp connection (so connect and immediately
> disconnect using telnet or netcat or something like that) to the kafka
> server on port 9092 to confirm it's alive?
>
> thanks
>
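
A bare TCP connect to 9092 is safe and does confirm the process is listening, but a slightly stronger probe is to request topic metadata, which exercises the broker's request path end to end. A minimal sketch with the 0.8 Java client (host, port, timeouts, and the exit codes are assumptions for a Nagios-style check):

import java.util.Collections;
import kafka.javaapi.TopicMetadataRequest;
import kafka.javaapi.TopicMetadataResponse;
import kafka.javaapi.consumer.SimpleConsumer;

// Sketch: a check that asks a broker for metadata instead of just opening a
// socket. Exits 0 if the broker answers, non-zero otherwise.
public class KafkaAliveCheck {
    public static void main(String[] args) {
        SimpleConsumer consumer =
                new SimpleConsumer("broker-host", 9092, 5000, 64 * 1024, "nagios-check");
        int exitCode;
        try {
            // An empty topic list asks for metadata on all topics; any answer
            // means the broker is up and serving requests.
            TopicMetadataResponse resp =
                    consumer.send(new TopicMetadataRequest(Collections.<String>emptyList()));
            System.out.println("broker ok, topics: " + resp.topicsMetadata().size());
            exitCode = 0;
        } catch (Exception e) {
            System.err.println("broker check failed: " + e);
            exitCode = 2;
        } finally {
            consumer.close();
        }
        System.exit(exitCode);
    }
}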


Re: kafka consumer to write into DB

2014-12-05 Thread Scott Clasen
if you are using scala/akka this will handle the batching and acks for you.

https://github.com/sclasen/akka-kafka#akkabatchconsumer

On Fri, Dec 5, 2014 at 9:21 AM, Sa Li  wrote:

> Thank you very much for the reply, Neha. I have a question about the consumer:
> I consume the data from kafka and write it into the DB. Of course I have to create
> a hash map in memory, load data into memory, and bulk copy to the DB instead of
> inserting into the DB line by line. Does it mean I need to ack each message while
> loading it into memory?
>
> thanks
>
>
>
> On Thu, Dec 4, 2014 at 1:21 PM, Neha Narkhede  wrote:
>
> > This is specific for pentaho but may be useful -
> > https://github.com/RuckusWirelessIL/pentaho-kafka-consumer
> >
> > On Thu, Dec 4, 2014 at 12:58 PM, Sa Li  wrote:
> >
> > > Hello, all
> > >
> > > I never developed a kafka consumer, I want to be able to make an
> advanced
> > > kafka consumer in java to consume the data and continuously write the
> > data
> > > into postgresql DB. I am thinking to create a map in memory and
> getting a
> > > predefined number of messages in memory then write into DB in batch, is
> > > there an API or sample code to allow me to do this?
> > >
> > >
> > > thanks
> > >
> > >
> > > --
> > >
> > > Alec Li
> > >
> >
> >
> >
> > --
> > Thanks,
> > Neha
> >
>
>
>
> --
>
> Alec Li
>
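
For reference, a minimal sketch of the batching pattern discussed above, using the 0.8 high-level consumer with auto-commit turned off; the topic, group, batch size, and bulkInsert helper are placeholders for application code:

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Properties;
import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

// Sketch: buffer N messages in memory, bulk-write them to the DB, then commit
// offsets. If the process dies before commitOffsets(), the batch is redelivered,
// so the DB write should be idempotent (e.g. an upsert keyed on a message id).
public class BatchingDbConsumer {
    static final int BATCH_SIZE = 1000;

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "zk1:2181");
        props.put("group.id", "db-loader");
        props.put("auto.commit.enable", "false"); // commit only after the DB write

        ConsumerConnector connector =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
        KafkaStream<byte[], byte[]> stream =
                connector.createMessageStreams(Collections.singletonMap("events", 1))
                         .get("events").get(0);

        List<byte[]> batch = new ArrayList<byte[]>(BATCH_SIZE);
        ConsumerIterator<byte[], byte[]> it = stream.iterator();
        while (it.hasNext()) {
            batch.add(it.next().message());
            if (batch.size() >= BATCH_SIZE) {
                bulkInsert(batch);          // e.g. JDBC batch insert or COPY into postgres
                connector.commitOffsets();  // ack the whole batch at once
                batch.clear();
            }
        }
    }

    static void bulkInsert(List<byte[]> rows) {
        // placeholder: application-specific bulk write
    }
}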


Re: Announcing Confluent

2014-11-06 Thread Scott Clasen
Awesome. Congrats to all of you!

On Thu, Nov 6, 2014 at 10:28 AM, Jay Kreps  wrote:

> Hey all,
>
> I’m happy to announce that Jun Rao, Neha Narkhede and I are creating a
> company around Kafka called Confluent. We are planning on productizing the
> kind of Kafka-based real-time data platform we built out at LinkedIn. We
> are doing this because we think this is a really powerful idea and we felt
> there was a lot to do to make this idea really take root. We wanted to make
> that our full time mission and focus.
>
> There is a blog post that goes into a little more depth here:
> http://blog.confluent.io/
>
> LinkedIn will remain a heavy Kafka user and contributor. Combined with our
> additional resources from the funding of the company this should be a
> really good thing for the Kafka development effort. Especially when
> combined with the increasing contributions from the rest of the development
> community. This is great news, as there is a lot of work to do. We'll need
> to really focus on scaling this distributed development in a healthy way.
>
> One thing I do want to emphasize is that the addition of a company in the
> Kafka ecosystem won’t mean meddling with open source. Kafka will remain
> 100% open source and community focused, as of course is true of any Apache
> project. I have been doing open source for a long time and strongly believe
> it is the right model for infrastructure software development.
>
> Confluent is just getting off the ground now. We left LinkedIn, raised some
> money, and we have an office (but no furniture yet!). Nonetheless, if you
> are interested in finding out more about the company and either getting
> help with your Kafka usage or joining us to help build all this, by all
> means reach out to us, we’d love to talk.
>
> Wish us luck!
>
> -Jay
>


Re: how to ensure strong consistency with reasonable availability

2014-07-24 Thread Scott Clasen
Thanks for the Jira info.

Just to clarify, in the case we are outlining above, would the producer
have received an ack on m2 (with acks = -1) or not?  If not, then I
have no concerns, if so, then how would the producer know to re-publish?


On Thu, Jul 24, 2014 at 9:38 AM, Jun Rao  wrote:

> About re-publishing m2, it seems it's better to let the producer choose
> whether to do this or not.
>
> There is another known bug KAFKA-1211 that's not fixed yet. The situation
> when this can happen is relatively rare and the fix is slightly involved.
> So, it may not be addressed in 0.8.2.
>
> Thanks,
>
> Jun
>
>
> On Tue, Jul 22, 2014 at 8:36 PM, scott@heroku  wrote:
>
> > Thanks so much for the detailed explanation Jun, it pretty much lines up
> > with my understanding.
> >
> > In the case below, if we didn't particularly care about ordering and
> > re-produced m2, it would then become m5, and in many use cases this would
> > be ok?
> >
> > Perhaps a more direct question would be, once 0.8.2 is out and I have a
> > topic with unclean leader election disabled, and produce with acks = -1,
> >  are there any known series of events (other than disk failures on all
> > brokers) that would cause the loss of messages that a producer has
> received
> > an ack for?
> >
> >
> >
> >
> >
> > Sent from my iPhone
> >
> > > On Jul 22, 2014, at 8:17 PM, Jun Rao  wrote:
> > >
> > > The key point is that we have to keep all replicas consistent with
> each
> > > other such that no matter which replica a consumer reads from, it
> always
> > > reads the same data on a given offset.  The following is an example.
> > >
> > > Suppose we have 3 brokers A, B and C. Let's assume A is the leader and
> at
> > > some point, we have the following offsets and messages in each replica.
> > >
> > > offset   A    B    C
> > > 1        m1   m1   m1
> > > 2        m2
> > >
> > > Let's assume that message m1 is committed and message m2 is not. At
> > exactly
> > > this moment, replica A dies. After a new leader is elected, say B,  new
> > > messages can be committed with just replica B and C. Some point later
> if
> > we
> > > commit two more messages m3 and m4, we will have the following.
> > >
> > > offset   A    B    C
> > > 1        m1   m1   m1
> > > 2        m2   m3   m3
> > > 3             m4   m4
> > >
> > > Now A comes back. For consistency, it's important for A's log to be
> > > identical to B and C. So, we have to remove m2 from A's log and add m3
> > and
> > > m4. As you can see, whether you want to republish m2 or not, m2 cannot
> > stay
> > > in its current offset, since in other replicas, that offset is already
> > > taken by other messages. Therefore, a truncation of replica A's log is
> > > needed to keep the replicas consistent. Currently, we don't republish
> > > messages like m2 since (1) it's not necessary since it's never
> considered
> > > committed; (2) it will make our protocol more complicated.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > >
> > >
> > >
> > >> On Tue, Jul 22, 2014 at 3:40 PM, scott@heroku 
> wrote:
> > >>
> > >> Thanks Jun
> > >>
> > >> Can you explain a little more about what an uncommitted message means?
> > >> The messages are in the log, so presumably they have been acked at
> least
> > >> by the local broker.
> > >>
> > >> I guess I am hoping for some intuition around why 'replaying' the
> > messages
> > >> in question would cause bad things.
> > >>
> > >> Thanks!
> > >>
> > >>
> > >> Sent from my iPhone
> > >>> On Jul 22, 2014, at 3:06 PM, Jun Rao  wrote:
> > >>>
> > >>> Scott,
> > >>>
> > >>> The reason for truncation is that the broker that comes back may have
> > >> some
> > >>> un-committed messages. Those messages shouldn't be exposed to the
> > >> consumer
> > >>> and therefore need to be removed from the log. So, on broker startup,
> > we
> > >>> first truncate the log to a safe point before which we know all
> > messages
> > >>> are committed. This broker will then sync up with the current leader
> to
> > >> get
>
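
For the producer side of this question, a minimal sketch of what "letting the producer choose" looks like with the 0.8 API: acks=-1 plus a synchronous send makes a failed ack visible to the application, which can then decide whether to re-publish. The broker list, topic, and the single retry are assumptions, and per-topic control of unclean leader election only arrives in 0.8.2:

import java.util.Properties;
import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

// Sketch: a sync producer that requires acks from all in-sync replicas.
// A failed send surfaces as an exception, so re-publishing is an explicit
// application decision rather than something the broker does on recovery.
public class AckedPublisher {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("metadata.broker.list", "broker1:9092,broker2:9092");
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        props.put("request.required.acks", "-1"); // wait for all replicas in the ISR
        props.put("producer.type", "sync");       // surface failures to the caller

        Producer<String, String> producer =
                new Producer<String, String>(new ProducerConfig(props));
        try {
            producer.send(new KeyedMessage<String, String>("my-topic", "key", "value"));
        } catch (Exception e) {
            // Not acked: the message may or may not be on the topic, so a retry
            // here gives at-least-once delivery and consumers must de-dupe.
            producer.send(new KeyedMessage<String, String>("my-topic", "key", "value"));
        } finally {
            producer.close();
        }
    }
}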

Re: how to ensure strong consistency with reasonable availability

2014-07-22 Thread Scott Clasen
Ahh, yes that message loss case. I've wondered about that myself.

I guess I don't really understand why truncating messages is ever the right
thing to do. As kafka is an 'at least once' system (send a message, get
no ack, it still might be on the topic), consumers that care will have to
de-dupe anyhow.

To the kafka designers: is there anything preventing implementation of
alternatives to truncation? When a broker comes back online and needs to
truncate, can't it fire up a producer and take the extra messages and send
them back to the original topic, or alternatively to an error topic?

Would love to understand the rationale for the current design, as my
perspective is doubtfully as clear as the designers'




On Tue, Jul 22, 2014 at 6:21 AM, Jiang Wu (Pricehistory) (BLOOMBERG/ 731
LEX -)  wrote:

> kafka-1028 addressed another unclean leader election problem. It prevents
> a broker not in ISR from becoming a leader. The problem we are facing is
> that a broker in ISR but without complete messages may become a leader.
> It's also a kind of unclean leader election, but not the one that
> kafka-1028 addressed.
>
> Here I'm trying to give a proof that current kafka doesn't achieve the
> requirement (no message loss, no blocking when 1 broker down) due to its
> two behaviors:
> 1. when choosing a new leader from 2 followers in ISR, the one with less
> messages may be chosen as the leader
> 2. even when replica.lag.max.messages=0, a follower can stay in ISR when
> it has less messages than the leader.
>
> We consider a cluster with 3 brokers and a topic with 3 replicas. We
> analyze different cases according to the value of request.required.acks
> (acks for short). For each case and it subcases, we find situations that
> either message loss or service blocking happens. We assume that at the
> beginning, all 3 replicas, leader A, followers B and C, are in sync, i.e.,
> they have the same messages and are all in ISR.
>
> 1. acks=0, 1, 3. Obviously these settings do not satisfy the requirement.
> 2. acks=2. Producer sends a message m. It's acknowledged by A and B. At
> this time, although C hasn't received m, it's still in ISR. If A is killed,
> C can be elected as the new leader, and consumers will miss m.
> 3. acks=-1. Suppose replica.lag.max.messages=M. There are two sub-cases:
> 3.1 M>0. Suppose C is killed. C will be out of ISR after
> replica.lag.time.max.ms. Then the producer publishes M messages to A and
> B. C restarts. C will join in ISR since it is M messages behind A and B.
> Before C replicates all messages, A is killed, and C becomes leader, then
> message loss happens.
> 3.2 M=0. In this case, when the producer publishes at a high speed, B and
> C will fail out of ISR, only A keeps receiving messages. Then A is killed.
> Either message loss or service blocking will happen, depending on whether
> unclean leader election is enabled.
>
>
> From: users@kafka.apache.org At: Jul 21 2014 22:28:18
> To: JIANG WU (PRICEHISTORY) (BLOOMBERG/ 731 LEX -), users@kafka.apache.org
> Subject: Re: how to ensure strong consistency with reasonable availability
>
> You will probably need 0.8.2  which gives
> https://issues.apache.org/jira/browse/KAFKA-1028
>
>
> On Mon, Jul 21, 2014 at 6:37 PM, Jiang Wu (Pricehistory) (BLOOMBERG/ 731
> LEX -)  wrote:
>
> > Hi everyone,
> >
> > With a cluster of 3 brokers and a topic of 3 replicas, we want to achieve
> > the following two properties:
> > 1. when only one broker is down, there's no message loss, and
> > producers/consumers are not blocked.
> > 2. in other more serious problems, for example, one broker is restarted
> > twice in a short period or two brokers are down at the same time,
> > producers/consumers can be blocked, but no message loss is allowed.
> >
> > We haven't found any producer/broker parameter combinations that achieve
> > this. If you know or think some configurations will work, please post
> > details. We have a test bed to verify any given configurations.
> >
> > In addition, I'm wondering if it's necessary to open a jira to require
> the
> > above feature?
> >
> > Thanks,
> > Jiang
>
>
>


Re: how to ensure strong consistency with reasonable availability

2014-07-21 Thread Scott Clasen
You will probably need 0.8.2  which gives
https://issues.apache.org/jira/browse/KAFKA-1028


On Mon, Jul 21, 2014 at 6:37 PM, Jiang Wu (Pricehistory) (BLOOMBERG/ 731
LEX -)  wrote:

> Hi everyone,
>
> With a cluster of 3 brokers and a topic of 3 replicas, we want to achieve
> the following two properties:
> 1. when only one broker is down, there's no message loss, and
> producers/consumers are not blocked.
> 2. in other more serious problems, for example, one broker is restarted
> twice in a short period or two brokers are down at the same time,
> producers/consumers can be blocked, but no message loss is allowed.
>
> We haven't found any producer/broker parameter combinations that achieve
> this. If you know or think some configurations will work, please post
> details. We have a test bed to verify any given configurations.
>
> In addition, I'm wondering if it's necessary to open a jira to require the
> above feature?
>
> Thanks,
> Jiang


event-shuttle

2014-06-03 Thread Scott Clasen
Thought I'd post this here, as I've seen questions here around "What do I do
if kafka is down?"

here is one possibility.

https://github.com/sclasen/event-shuttle

this is a go service meant to run as a unix daemon, and apps running on the
same box can just post messages over http to a service on localhost. The
service journals messages through boltdb and delivers them to kafka when
its available.

It also works around the problem of producing to kafka from languages that
have *ahem* less robust client implementations, as hopefully any lang can
post over http.

Most appropriate for emitters of higher-value, lower-volume events.

Thoughts?
SC


Re: Reaction to Reactive Streams?

2014-04-17 Thread Scott Clasen
Would be great to get a reactive consumer api (in 0.9 ??)


On Thu, Apr 17, 2014 at 11:58 AM, Richard Rodseth wrote:

> Announcement from Typesafe et al:
>
> http://www.reactive-streams.org/
> https://typesafe.com/blog/typesafe-announces-akka-streams
>


Re: New Producer Public API

2014-01-28 Thread Scott Clasen
+1  to zk bootstrap + close as an option at least


On Tue, Jan 28, 2014 at 10:09 AM, Neha Narkhede wrote:

> >> The producer since 0.8 is actually zookeeper free, so this is not new to
> this client it is true for the current client as well. Our experience was
> that direct zookeeper connections from zillions of producers wasn't a good
> idea for a number of reasons.
>
> The problem with several thousand connections to zookeeper is mainly the
> long lived sessions causing overhead on zookeeper.
> This further degrades zookeeper performance causing it to be flaky and
> expire sessions/disconnect clients and so on. That being said,
> I don't see why we can't use zookeeper *just* for the bootstrap on client
> startup and close the connection right after the bootstrap is done.
> IMO, this is more intuitive and convenient as it will allow users to use the
> same "bootstrap config" across producers, consumers and brokers and
> will not cause any performance/operational issues on zookeeper. This is
> assuming that all the zillion clients don't bootstrap at the same time,
> which is rare in practice.
>
> Thanks,
> Neha
>
>
> On Tue, Jan 28, 2014 at 8:02 AM, Mattijs Ugen (DT)  >wrote:
>
> > Sorry to tune in a bit late, but here goes.
> >
> > > 1. The producer since 0.8 is actually zookeeper free, so this is not
> new
> > to
> > > this client it is true for the current client as well. Our experience
> was
> > > that direct zookeeper connections from zillions of producers wasn't a
> > good
> > > idea for a number of reasons. Our intention is to remove this
> dependency
> > > from the consumer as well. The configuration in the producer doesn't
> need
> > > the full set of brokers, though, just one or two machines to bootstrap
> > the
> > > state of the cluster from--in other words it isn't like you need to
> > > reconfigure your clients every time you add some servers. This is
> exactly
> > > how zookeeper works too--if we used zookeeper you would need to give a
> > list
> > > of zk urls in case a particular zk server was down. Basically either
> way
> > > you need a few statically configured nodes to go to discover the full
> > state
> > > of the cluster. For people who don't like hard coding hosts you can
> use a
> > > VIP or dns or something instead.
> > In our configuration, the zookeeper quorum is actually one of the few
> > stable (in the sense of host names / ip addresses) pillars of the
> > complete ecosystem: every distributed service uses zookeeper to
> > coordinate the hosts that make up the service as a whole. Considering
> > that the kafka cluster will save the information needed for this
> > bootstrap to zookeeper anyhow, having clients (either producers or
> > consumers) retrieve this information at first use makes sense to me.
> >
> > We could create routine that retrieves a list of brokers from zookeeper
> > before initializing a Producer, but that feels more like a workaround
> > for a feature that in my humble opinion could well be part of the kafka
> > client library. That said, I realise that having two options for
> > connection bootstrapping (assuming that hardcoding a list of brokers is
> > here to stay) could be confusing for new users, but bypassing zookeeper
> > for this was rather confusing for me when I first came across it :)
> >
> > So, in short, I'd love it if the option to bootstrap the broker list
> > from zookeeper was there, rather than requiring to configure additional
> > (moving) virtual hostnames or fixed ip addresses for producers in our
> > cluster setup. I've been baffled a few times by this option not being
> > available for a distributed service that coordinates itself through
> > zookeeper.
> >
> > Just my two cents :)
> >
> > Mattijs
> >
>
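
The "use zookeeper just for the bootstrap and then close the connection" idea is straightforward to do outside the client today. A rough sketch (the connect string and the crude JSON handling are assumptions; in 0.8 each child of /brokers/ids holds a small JSON blob with host and port):

import java.util.ArrayList;
import java.util.List;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

// Sketch: read the live broker list from ZooKeeper once, close the session,
// and use the result as the producer's broker list. No long-lived zookeeper
// session is held by the client.
public class ZkBootstrap {
    public static List<String> brokerList(String zkConnect) throws Exception {
        ZooKeeper zk = new ZooKeeper(zkConnect, 30000, new Watcher() {
            public void process(WatchedEvent event) { /* bootstrap only, ignore */ }
        });
        try {
            List<String> endpoints = new ArrayList<String>();
            for (String id : zk.getChildren("/brokers/ids", false)) {
                String json = new String(zk.getData("/brokers/ids/" + id, false, null), "UTF-8");
                // Crude extraction to keep the sketch dependency-free; use a real
                // JSON parser in practice. The blob looks like
                // {"host":"broker1","port":9092,...}
                String host = json.replaceAll(".*\"host\":\"([^\"]+)\".*", "$1");
                String port = json.replaceAll(".*\"port\":(\\d+).*", "$1");
                endpoints.add(host + ":" + port);
            }
            return endpoints;
        } finally {
            zk.close(); // zookeeper is only touched during bootstrap
        }
    }
}

The returned list can be fed into metadata.broker.list (or bootstrap.servers in the new producer), so only the short-lived bootstrap session ever hits zookeeper.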


Re: Securing kafka

2013-08-30 Thread Scott Clasen
Please contribute that back! It would potentially be huge for mirroring
clusters across Amazon Regions, for instance.


On Thu, Aug 29, 2013 at 8:22 PM, Rajasekar Elango wrote:

> We have made changes to kafka code to support certificate based mutual SSL
> authentication. So the clients and broker will exchange trusted
> certificates for successful communication. This provides both
> authentication and ssl encryption. Planning to contribute that code back to
> kafka soon.
>
> Thanks,
> Raja.
>
>
> On Thu, Aug 29, 2013 at 11:16 PM, Joe Stein  wrote:
>
> > One use case I have been discussing recently with a few clients is
> > verifying the digital signature of a message as part of the acceptance
> > criteria of it being committed to the log and/or when it is consumed.
> >
> > I would be very interested in discussing different scenarios such as
> Kafka
> > as a service, privacy at rest as well as authorization and authentication
> > (if required).
> >
> > Hit me up
> >
> > /***
> >  Joe Stein
> >  Founder, Principal Consultant
> >  Big Data Open Source Security LLC
> >  http://www.stealth.ly
> >  Twitter: @allthingshadoop 
> > /
> >
> >
> > On Thu, Aug 29, 2013 at 8:13 PM, Jay Kreps  wrote:
> >
> > > +1
> > >
> > > We don't have any application-level security at this time so the answer
> > is
> > > whatever you can do at the network/system level.
> > >
> > > -Jay
> > >
> > >
> > > On Thu, Aug 29, 2013 at 10:09 AM, Benjamin Black  wrote:
> > >
> > > > IP filters on the hosts.
> > > > On Aug 29, 2013 10:03 AM, "Calvin Lei"  wrote:
> > > >
> > > > > Is there a way to stop a malicious user from connecting directly to a
> > kafka
> > > > > broker and sending any messages? Could we have the brokers accept a
> > > > message
> > > > > only from a list of known IPs?
> > > > >
> > > >
> > >
> >
>
>
>
> --
> Thanks,
> Raja.
>


Re: message loss

2013-08-22 Thread Scott Clasen
+1 for that knob on a per-topic basis; choosing consistency over availability
would open kafka to more use cases, no?

Sent from my iPhone

On Aug 22, 2013, at 1:59 PM, Neha Narkhede  wrote:

> Scott,
> 
> Kafka replication aims to guarantee that committed writes are not lost. In
> other words, as long as leader can be transitioned to a broker that was in
> the ISR, no data will be lost. For increased availability, if there are no
> other brokers in the ISR, we fall back to electing a broker that is not
> caught up with the current leader, as the new leader. IMO, this is the real
> problem that the post is complaining about.
> 
> Let me explain his test in more detail-
> 
> 1. The first part of the test partitions the leader (n1) from other brokers
> (n2-n5). The leader shrinks the ISR to just itself and ends up taking n
> writes. This is not a problem all by itself. Once the partition is
> resolved, n2-n5 would catch up from the leader and no writes will be lost,
> since n1 would continue to serve as the leader.
> 2. The problem starts in the second part of the test where it partitions
> the leader (n1) from zookeeper. This causes the unclean leader election
> (mentioned above), which causes Kafka to lose data.
> 
> We thought about this while designing replication, but never ended up
> including the feature that would allow some applications to pick
> consistency over availability. Basically, we could let applications pick
> some topics for which the controller will never attempt unclean leader
> election. The result is that Kafka would reject writes and mark the
> partition offline, instead of moving leadership to a broker that is not in
> ISR, and losing the writes.
> 
> I think if we included this knob, the tests that aphyr (jepsen) ran, would
> make more sense.
> 
> Thanks,
> Neha
> 
> 
> On Thu, Aug 22, 2013 at 12:50 PM, Scott Clasen  wrote:
> 
>> So it looks like there is a Jepsen post coming on kafka 0.8 replication, based
>> on this that's circulating on twitter. https://www.refheap.com/17932/raw
>> 
>> Understanding that kafka isn't designed particularly to be partition
>> tolerant, the result is not completely surprising.
>> 
>> But my question is, is there something that can be done about the lost
>> messages?
>> 
>> From my understanding when broker n1 comes back on line, currently what
>> will happen is that the messages that were only on n1 will be
>> truncated/tossed while n1 is coming back to ISR. Please correct me if this
>> is not accurate.
>> 
>> Would it instead be possible to do something else with them, like sending
>> them to an internal lost messages topic, or log file where some manual
>> intervenion could be done on them, or a configuration property like
>> replay.truncated.messages=true could be set where the broker would send the
>> lost messages back onto the topic after ISR?
>> 


message loss

2013-08-22 Thread Scott Clasen
So it looks like there is a Jepsen post coming on kafka 0.8 replication, based
on this that's circulating on twitter. https://www.refheap.com/17932/raw

Understanding that kafka isn't designed particularly to be partition
tolerant, the result is not completely surprising.

But my question is, is there something that can be done about the lost
messages?

From my understanding when broker n1 comes back on line, currently what
will happen is that the messages that were only on n1 will be
truncated/tossed while n1 is coming back to ISR. Please correct me if this
is not accurate.

Would it instead be possible to do something else with them, like sending
them to an internal lost messages topic, or log file where some manual
intervention could be done on them, or a configuration property like
replay.truncated.messages=true could be set where the broker would send the
lost messages back onto the topic after ISR?


Re: Kafka 08 clients

2013-08-10 Thread Scott Clasen
bpot/poseidon on github is a ruby 0.8 client, works fine for me

Sent from my iPhone

On Aug 10, 2013, at 3:08 PM, Timothy Chen  wrote:

> That definitely means it's not up to date with the protocol. I tried the
> java client and it was working with the latest 0.8 api.
> 
> Not sure about any other languages.
> 
> Tim
> 
> 
> On Sat, Aug 10, 2013 at 2:55 PM, Mark  wrote:
> 
>> Are all Kafka clients working with the latest version of Kafka?
>> 
>> I tried the kafka-rb client and a simple example listed in the README but
>> I keep getting a nasty error
>> require 'kafka'
>> producer = Kafka::Producer.new
>> message = Kafka::Message.new("some random message content")
>> producer.push(message)
>> 
>> [2013-08-10 14:49:52,166] ERROR Closing socket for /127.0.0.1 because of
>> error (kafka.network.Processor)
>> java.nio.BufferUnderflowException
>>at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:127)
>>at java.nio.ByteBuffer.get(ByteBuffer.java:675)
>>at kafka.api.ApiUtils$.readShortString(ApiUtils.scala:38)
>>at
>> kafka.api.ProducerRequest$$anonfun$1.apply(ProducerRequest.scala:40)
>>at
>> kafka.api.ProducerRequest$$anonfun$1.apply(ProducerRequest.scala:38)
>>at
>> scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:227)
>>at
>> scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:227)
>>at
>> scala.collection.immutable.Range$ByOne$class.foreach(Range.scala:282)
>>at
>> scala.collection.immutable.Range$$anon$1.foreach(Range.scala:274)
>>at
>> scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:227)
>>at scala.collection.immutable.Range.flatMap(Range.scala:39)
>>at kafka.api.ProducerRequest$.readFrom(ProducerRequest.scala:38)
>>at kafka.api.RequestKeys$$anonfun$1.apply(RequestKeys.scala:34)
>>at kafka.api.RequestKeys$$anonfun$1.apply(RequestKeys.scala:34)
>>at
>> kafka.network.RequestChannel$Request.<init>(RequestChannel.scala:49)
>>at kafka.network.Processor.read(SocketServer.scala:345)
>>at kafka.network.Processor.run(SocketServer.scala:245)
>>at java.lang.Thread.run(Thread.java:680)
>> 
>> 


Re: Metrics: via Broker vs. Producer vs. Consumer

2013-07-24 Thread Scott Clasen
Valid assumptions IMO 

Sent from my iPhone

On Jul 24, 2013, at 8:22 AM, Jay Kreps  wrote:

> Yeah all our monitoring is just of the local process (basically just a
> counter exposed through yammer metrics which support jmx and other
> outputs). If I understand what you want instead of having a counter that
> tracks, say, produce requests per second for a single broker you want one
> that covers the whole cluster. Obviously this would require collecting the
> local count and aggregating across all the brokers.
> 
> Our assumption is that you already have a separate monitoring system which
> can slurp all these up, aggregate them, graph them, and alert off them.
> There are a number of open source thingies like this and I think most
> bigger shops have something they use. Our assumption is that trying to do a
> kafka-specific monitoring system wouldn't work for most people because they
> are wedded to their current setup and just want to integrate with that.
> 
> I'm not sure how valid any of those assumptions actually are.
> 
> -Jay
> 
> 
> On Wed, Jul 24, 2013 at 7:29 AM, Otis Gospodnetic <
> otis.gospodne...@gmail.com> wrote:
> 
>> Hi,
>> 
>> I was looking at
>> 
>> https://cwiki.apache.org/confluence/display/KAFKA/Operations#Operations-Monitoring
>> and noticed there is no information about which metrics are available
>> in which process/JVM/JMX.
>> 
>> Some are available in the Broker process, but some are only available
>> from the JVM running Consumer and some only from the JVM running
>> Producer.  And yet some Producer and Consumer metrics are, I *believe*
>> available from Broker's JMX.
>> 
>> Would it be possible for somebody in the know to mark the metrics in
>> 
>> https://cwiki.apache.org/confluence/display/KAFKA/Operations#Operations-Monitoring
>> so one can tell where to get it?
>> 
>> Also, why is it that the Broker process doesn't have *all* metrics,
>> including Producer and Consumer one?  Is that because there can be N
>> Brokers and each P or C talk to one Broker at a time and thus there is
>> no single process/JMX that can know *all* stats for *all* Brokers and
>> for *all* Ps and Cs?
>> 
>> Thank you!
>> Otis
>> --
>> Performance Monitoring -- http://sematext.com/spm
>> Solr & ElasticSearch Support -- http://sematext.com/
>> 


Re: Replacing brokers in a cluster (0.8)

2013-07-22 Thread Scott Clasen
Here's a ruby cli that you can use to replace brokers...it shells out to
the kafka-reassign-partitions.sh tool after figuring out broker lists from
zk. Hope it's useful.


#!/usr/bin/env ruby

require 'excon'
require 'json'
require 'zookeeper'

# Swap broker id `o` for `n` in a replica list, leaving other ids untouched.
def replace(arr, o, n)
  arr.map { |v| v == o ? n : v }
end

if ARGV.length != 4
  puts "Usage: bundle exec bin/replace-instance zkstr topic-name old-broker-id new-broker-id"
else
  zkstr = ARGV[0]
  zk = Zookeeper.new(zkstr)
  topic = ARGV[1]
  old = ARGV[2].to_i
  new = ARGV[3].to_i
  puts "Replacing broker #{old} with #{new} on all partitions of topic #{topic}"

  # Current assignment lives at /brokers/topics/<topic> as JSON, e.g.
  # {"partitions" => {"0" => [1, 2], "1" => [2, 3], ...}}
  current = JSON.parse(zk.get(:path => "/brokers/topics/#{topic}")[:data])
  replacements_array = []
  replacements = {"partitions" => replacements_array}
  current["partitions"].each do |partition, brokers|
    replacements_array.push({"topic" => topic, "partition" => partition.to_i,
                             "replicas" => replace(brokers, old, new)})
  end

  replacement_json = JSON.generate(replacements)

  # Write the reassignment plan to a temp file and hand it to the Kafka tool.
  file = "/tmp/replace-#{topic}-#{old}-#{new}"
  File.delete(file) if File.exist?(file)
  File.open(file, 'w') { |f| f.write(replacement_json) }

  puts "./bin/kafka-reassign-partitions.sh --zookeeper #{zkstr} --path-to-json-file #{file}"
  system "./bin/kafka-reassign-partitions.sh --zookeeper #{zkstr} --path-to-json-file #{file}"
end





On Mon, Jul 22, 2013 at 10:40 AM, Jason Rosenberg  wrote:

> Is the kafka-reassign-partitions tool something I can experiment with now
> (this will only be staging data, in the first go-round).  How does it work?
>  Do I manually have to specify each replica I want to move?  This would be
> cumbersome, as I have on the order of 100's of topics. Or does the tool
> have the ability to specify all replicas on a particular broker?  How can I
> easily check whether a partition has all its replicas in the ISR?
>
> For some reason, I had thought there would be a default behavior, whereby a
> replica could automatically be declared dead after a configurable timeout
> period.
>
> Re-assigning broker id's would not be ideal, since I have a scheme
> currently whereby broker id's are auto-generated, from a hostname/ip, etc.
>  I could make it work, but it's not my preference to override that!
>
> Jason
>
>
> On Mon, Jul 22, 2013 at 11:50 AM, Jun Rao  wrote:
>
> > A replica's data won't be automatically moved to another broker where
> there
> > are failures. This is because we don't know if the failure is transient
> or
> > permanent. The right tool to use is the kafka-reassign-partitions tool.
> It
> > hasn't been thoroughly tested though. We hope to harden it in the final
> > 0.8.0 release.
> >
> > You can also replace a broker with a new server by keeping the same
> broker
> > id. When the new server starts up, it will replicate data from the leader.
> > You know the data is fully replicated when both replicas are in ISR.
> >
> > Thanks,
> >
> > Jun
> >
> >
> > On Mon, Jul 22, 2013 at 2:14 AM, Jason Rosenberg 
> wrote:
> >
> > > I'm planning to upgrade a 0.8 cluster from 2 old nodes, to 3 new ones
> > > (better hardware).  I'm using a replication factor of 2.
> > >
> > > I'm thinking the plan should be to spin up the 3 new nodes, and operate
> > as
> > > a 5 node cluster for a while.  Then first remove 1 of the old nodes,
> and
> > > wait for the partitions on the removed node to get replicated to the
> > other
> > > nodes.  Then, do the same for the other old node.
> > >
> > > Does this sound sensible?
> > >
> > > How does the cluster decide when to re-replicate partitions that are
> on a
> > > node that is no longer available?  Does it only happen if/when new
> > messages
> > > arrive for that partition?  Is it on a partition by partition basis?
> > >
> > > Or is it a cluster-level decision that a broker is no longer valid, in
> > > which case all affected partitions would immediately get replicated to
> > new
> > > brokers as needed?
> > >
> > > I'm just wondering how I will know when it will be safe to take down my
> > > second old node, after the first one is removed, etc.
> > >
> > > Thanks,
> > >
> > > Jason
> > >
> >
>
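
On the question of how to tell when a partition has all of its replicas back in the ISR: the controller keeps each partition's leader and ISR in zookeeper, so one option is to read that state and compare it with the assignment. A rough sketch (0.8 znode paths; the connect string and the eyeball-the-JSON approach are assumptions):

import java.util.List;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

// Sketch: print the leader/ISR state znode for every partition of a topic.
// A partition is fully replicated when its "isr" array contains every broker
// in the assignment stored at /brokers/topics/<topic>.
public class IsrCheck {
    public static void main(String[] args) throws Exception {
        String topic = args[0];
        ZooKeeper zk = new ZooKeeper("zk1:2181", 30000, new Watcher() {
            public void process(WatchedEvent event) { /* one-off check, ignore */ }
        });
        try {
            String assignment =
                    new String(zk.getData("/brokers/topics/" + topic, false, null), "UTF-8");
            System.out.println("assignment: " + assignment);

            String partitionsPath = "/brokers/topics/" + topic + "/partitions";
            List<String> partitions = zk.getChildren(partitionsPath, false);
            for (String p : partitions) {
                byte[] state = zk.getData(partitionsPath + "/" + p + "/state", false, null);
                // state JSON includes "leader" and "isr", e.g. {"leader":1,"isr":[1,2],...}
                System.out.println("partition " + p + ": " + new String(state, "UTF-8"));
            }
        } finally {
            zk.close();
        }
    }
}

Once every partition's isr array contains all of its assigned replicas, it should be safe to take the next old broker down.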


0.8 backup strategy anyone?

2013-06-14 Thread Scott Clasen
So despite 0.8 being a release that will give much higher availability, do
people do anything at all to back up the data?

For instance if any of you are running on EC2 and using ephemeral disk for
perf reasons, what do you do about messages that you absolutely cant afford
to lose.

Basically looking to avoid EBS since many of the  major AWS outages seem to
have EBS as a  cause.


Re: Monitor Kafka Stats

2013-06-06 Thread Scott Clasen
Hi Jun

Have you looked at Kafka-898 in jira?

It has a patch for reporting metrics to librato, which would make it very easy 
for people to get started with Kafka metrics.

Sent from my iPhone

On Jun 6, 2013, at 8:55 AM, Jun Rao  wrote:

> Those stats on the broker are available through JMX. In 0.8, since we use
> metrics, you can potentially use different reporters to get the stats (see
> http://metrics.codahale.com/getting-started/#reporting-via-http).
> 
> Thanks,
> 
> Jun
> 
> 
> On Thu, Jun 6, 2013 at 4:50 AM, Hanish Bansal <
> hanish.bansal.agar...@gmail.com> wrote:
> 
>> Hi
>> 
>> I am trying to get some parameters from kafka like Data produced per
>> second,
>> Data Consumed per second etc.
>> 
>> There is a java api in Kafka "kafka.network.SocketServerStats" that i am
>> using
>> for my purpose :
>> 
>> socketServerStats.getNumProduceRequests());
>> socketServerStats.getNumFetchRequests();
>> socketServerStats.getProduceRequestsPerSecond();
>> 
>> But its not giving any values.
>> 
>> Any suggestion will be helpful for me.
>> 
>> --
>> *Thanks & Regards*
>> *Hanish Bansal*
>> 


Re: Apache Kafka in AWS

2013-05-22 Thread Scott Clasen
Thanks.  FWIW  this one has been fine so far

java version "1.7.0_13"
OpenJDK Runtime Environment (IcedTea7 2.3.6) (Ubuntu build 1.7.0_13-b20)
OpenJDK 64-Bit Server VM (build 23.7-b01, mixed mode)

though not running at the load in your tests.


On Wed, May 22, 2013 at 4:51 PM, Jason Weiss  wrote:

> [ec2-user@ip-10-194-5-76 ~]$ java -version
> java version "1.6.0_24"
> OpenJDK Runtime Environment (IcedTea6 1.11.11)
> (amazon-61.1.11.11.53.amzn1-x86_64)
> OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode)
>
>
> Yes, as soon as I put it under heavy load, it would buckle almost
> consistently. I knew it was JDK related because I temporarily gave up on
> AWS, but I was able to run the same code on my MacBook Pro without issue.
> That's when I upgraded AWS to Oracle Java 7 64-bit and all my crashes
> disappeared under load.
>
> Jason
>
>
> 
> From: Scott Clasen [sc...@heroku.com]
> Sent: Wednesday, May 22, 2013 19:27
> To: users
> Subject: Re: Apache Kafka in AWS
>
> Hey Jason,
>
> question: what openjdk version did you have issues with? I'm running kafka
> on it now and it has been ok. Was it a crash only at load?
>
> Thanks
> SC
>
>
> On Wed, May 22, 2013 at 1:42 PM, Jason Weiss 
> wrote:
>
> > All,
> >
> > I asked a number of questions of the group over the last week, and I'm
> > happy to report that I've had great success getting Kafka up and running
> in
> > AWS. I am using 3 EC2 instances, each of which is a M2 High-Memory
> > Quadruple Extra Large with 8 cores and 58.4 GiB of memory according to
> the
> > AWS specs. I have co-located Zookeeper instances next to Kafka on each
> > machine.
> >
> > I am able to publish in a repeatable fashion 273,000 events per second,
> > with each event payload consisting of a fixed size of 2048 bytes! This
> > represents the maximum throughput possible on this configuration, as the
> > servers became CPU constrained, averaging 97% utilization in a relatively
> > flat line. This isn't a "burst" speed – it represents a sustained
> > throughput from 20 M1 Large EC2 Kafka multi-threaded producers. Putting
> > this into perspective, if my log retention period was a month, I'd be
> > aggregating 1.3 petabytes of data on my disk drives. Suffice to say, I
> > don't see us retaining data for more than a few hours!
> >
> > Here were the keys to tuning for future folks to consider:
> >
> > First and foremost, be sure to configure your Java heap size accordingly
> > when you launch Kafka. The default is like 512MB, which in my case left
> > virtually all of my RAM inaccessible to Kafka.
> > Second, stay away from OpenJDK. No, seriously – this was a huge thorn in
> > my side, and I almost gave up on Kafka because of the problems I
> > encountered. The OpenJDK NIO functions repeatedly resulted in Kafka
> > crashing and burning in dramatic fashion. The moment I switched over to
> > Oracle's JDK for linux, Kafka didn't puke once- I mean, like not even a
> > hiccup.
> > Third, know your message size. In my opinion, the more you understand
> about
> > your event payload characteristics, the better you can tune the system.
> The
> > two knobs to really turn are the log.flush.interval and
> > log.default.flush.interval.ms. The values here are intrinsically
> > connected to the types of payloads you are putting through the system.
> > Fourth and finally, to maximize throughput you have to code against the
> > async paradigm, and be prepared to tweak the batch size, queue
> properties,
> > and compression codec (wait for it…) in a way that matches the message
> > payload you are putting through the system and the capabilities of the
> > producer system itself.
> >
> >
> > Jason
> >
> >
> >
> >
> >
> > This electronic message contains information which may be confidential or
> > privileged. The information is intended for the use of the individual or
> > entity named above. If you are not the intended recipient, be aware that
> > any disclosure, copying, distribution or use of the contents of this
> > information is prohibited. If you have received this electronic
> > transmission in error, please notify us by e-mail at (
> > postmas...@rapid7.com) immediately.
> >
> This electronic message contains information which may be confidential or
> privileged. The information is intended for the use of the individual or
> entity named above. If you are not the intended recipient, be aware that
> any disclosure, copying, distribution or use of the contents of this
> information is prohibited. If you have received this electronic
> transmission in error, please notify us by e-mail at (
> postmas...@rapid7.com) immediately.
>
>


Re: Apache Kafka in AWS

2013-05-22 Thread Scott Clasen
Hey Jason,

 question: what openjdk version did you have issues with? I'm running kafka
on it now and it has been ok. Was it a crash only at load?

Thanks
SC


On Wed, May 22, 2013 at 1:42 PM, Jason Weiss  wrote:

> All,
>
> I asked a number of questions of the group over the last week, and I'm
> happy to report that I've had great success getting Kafka up and running in
> AWS. I am using 3 EC2 instances, each of which is a M2 High-Memory
> Quadruple Extra Large with 8 cores and 58.4 GiB of memory according to the
> AWS specs. I have co-located Zookeeper instances next to Kafka on each
> machine.
>
> I am able to publish in a repeatable fashion 273,000 events per second,
> with each event payload consisting of a fixed size of 2048 bytes! This
> represents the maximum throughput possible on this configuration, as the
> servers became CPU constrained, averaging 97% utilization in a relatively
> flat line. This isn't a "burst" speed – it represents a sustained
> throughput from 20 M1 Large EC2 Kafka multi-threaded producers. Putting
> this into perspective, if my log retention period was a month, I'd be
> aggregating 1.3 petabytes of data on my disk drives. Suffice to say, I
> don't see us retaining data for more than a few hours!
>
> Here were the keys to tuning for future folks to consider:
>
> First and foremost, be sure to configure your Java heap size accordingly
> when you launch Kafka. The default is like 512MB, which in my case left
> virtually all of my RAM inaccessible to Kafka.
> Second, stay away from OpenJDK. No, seriously – this was a huge thorn in
> my side, and I almost gave up on Kafka because of the problems I
> encountered. The OpenJDK NIO functions repeatedly resulted in Kafka
> crashing and burning in dramatic fashion. The moment I switched over to
> Oracle's JDK for linux, Kafka didn't puke once- I mean, like not even a
> hiccup.
> Third, know your message size. In my opinion, the more you understand about
> your event payload characteristics, the better you can tune the system. The
> two knobs to really turn are the log.flush.interval and
> log.default.flush.interval.ms. The values here are intrinsically
> connected to the types of payloads you are putting through the system.
> Fourth and finally, to maximize throughput you have to code against the
> async paradigm, and be prepared to tweak the batch size, queue properties,
> and compression codec (wait for it…) in a way that matches the message
> payload you are putting through the system and the capabilities of the
> producer system itself.
>
>
> Jason
>
>
>
>
>
> This electronic message contains information which may be confidential or
> privileged. The information is intended for the use of the individual or
> entity named above. If you are not the intended recipient, be aware that
> any disclosure, copying, distribution or use of the contents of this
> information is prohibited. If you have received this electronic
> transmission in error, please notify us by e-mail at (
> postmas...@rapid7.com) immediately.
>
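
The producer-side knobs mentioned above (async mode, batch size, queue buffering, compression codec) are plain producer config properties. A minimal sketch using the 0.8-style property names; the values are illustrative rather than recommendations, and older clients use different names:

import java.util.Properties;
import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

// Sketch: an async producer tuned along the axes mentioned above (batching,
// queue buffering, compression). The broker list, topic, and all values are
// placeholders to adjust against your own payload size and hardware.
public class AsyncTunedProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("metadata.broker.list", "broker1:9092,broker2:9092");
        props.put("serializer.class", "kafka.serializer.DefaultEncoder"); // byte[] payloads
        props.put("producer.type", "async");
        props.put("batch.num.messages", "200");              // messages per batch
        props.put("queue.buffering.max.messages", "10000");  // in-flight buffer
        props.put("queue.buffering.max.ms", "100");          // max time to fill a batch
        props.put("compression.codec", "snappy");            // trade CPU for network/disk

        Producer<byte[], byte[]> producer =
                new Producer<byte[], byte[]>(new ProducerConfig(props));
        byte[] payload = new byte[2048]; // fixed 2K payload, as in the tests above
        for (int i = 0; i < 100000; i++) {
            producer.send(new KeyedMessage<byte[], byte[]>("load-test", payload));
        }
        producer.close();
    }
}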


Re: Relationship between Zookeeper and Kafka

2013-05-20 Thread Scott Clasen
Ahh, yeah, piops is definitely faster than standard EBS, but still much
slower than local disk.

you could try benchmarking local disk to see what the instances you are
using are capable of, then try tweaking iops etc to see where you get.

  M1.Larges aren't super fast, so your MacBook beating them isn't surprising
to me.


On Mon, May 20, 2013 at 10:01 AM, Jason Weiss wrote:

> Hi Scott.
>
> I'm using Kafka 0.7.2. I am using the default replication factor, since I
> don't recall changing that configuration at all.
>
> I'm using provisioned IOPS, which from attending the AWS event in NYC a
> few weeks ago was presented as the "fastest storage option" for EC2. A
> number of partners presented success stories in terms of throughput with
> provisioned IOPS. I've tried to follow that model.
>
>
> Jason
>
> On 5/20/13 12:56 PM, "Scott Clasen"  wrote:
>
> >My guess, EBS is likely your bottleneck.  Try running on instance local
> >disks, and compare your results.  Is this 0.8? What replication factor are
> >you using?
> >
> >
> >On Mon, May 20, 2013 at 8:11 AM, Jason Weiss 
> >wrote:
> >
> >> I'm trying to maximize my throughput and seem to have hit a ceiling.
> >> Everything described below is running in AWS.
> >>
> >> I have configured a Kafka cluster with 5 machines, M1.Large, with 600
> >> provisioned IOPS storage for each EC2 instance. I have a Zookeeper
> >>server
> >> (we aren't in production yet, so I didn't take the time to setup a ZK
> >> cluster). Publishing to a single topic from 7 different clients, I seem
> >>to
> >> max out at around 20,000 eps with a fixed 2K message size. Each broker
> >> defines 10 file segments, with a 25000 message / 5 second flush
> >> configuration in server.properties. I have stuck with 8 threads. My
> >> producers (Java) are configured with batch.num.messages at 50, and
> >> queue.buffering.max.messages at 100.
> >>
> >> When I went from 4 servers in the cluster to 5 servers, I only saw an
> >> increase of about 500 events per second in throughput. In sharp
> >>contrast,
> >> when I run a complete environment on my MacBook Pro, tuned as described
> >> above but with a single ZK and a single Kafka broker, I am seeing 61,000
> >> events per second. I don't think I'm network constrained in the AWS
> >> environment (producer side) because when I add one more client, my
> >>MacBook
> >> Pro, I see a proportionate decrease in EC2 client throughput, and the
> >>net
> >> result is an identical 20,000 eps. Stated differently, my EC2 instance
> >>give
> >> up throughput when my local MacBook Pro joins the array of producers
> >>such
> >> that the throughput is exactly the same.
> >>
> >> Does anyone have any additional suggestions on what else I could tune to
> >> try and hit our goal, 50,000 eps with a 5 machine cluster? Based on the
> >> whitepapers published, LinkedIn describes a peak of 170,000 events per
> >> second across their cluster. My 20,000 seems so far away from their
> >> production figures.
> >>
> >> What is the relationship, in terms of performance, between ZK and Kafka?
> >> Do I need to have a more performant ZK cluster, the same, or does it
> >>really
> >> not matter in terms of maximizing throughput.
> >>
> >> Thanks for any suggestions – I've been pulling knobs and turning levers
> >>on
> >> this for several days now.
> >>
> >>
> >> Jason
> >>
> >> This electronic message contains information which may be confidential
> >>or
> >> privileged. The information is intended for the use of the individual or
> >> entity named above. If you are not the intended recipient, be aware that
> >> any disclosure, copying, distribution or use of the contents of this
> >> information is prohibited. If you have received this electronic
> >> transmission in error, please notify us by e-mail at (
> >> postmas...@rapid7.com) immediately.
> >>
>
> This electronic message contains information which may be confidential or
> privileged. The information is intended for the use of the individual or
> entity named above. If you are not the intended recipient, be aware that
> any disclosure, copying, distribution or use of the contents of this
> information is prohibited. If you have received this electronic
> transmission in error, please notify us by e-mail at (
> postmas...@rapid7.com) immediately.
>
>


Re: Relationship between Zookeeper and Kafka

2013-05-20 Thread Scott Clasen
My guess, EBS is likely your bottleneck.  Try running on instance local
disks, and compare your results.  Is this 0.8? What replication factor are
you using?


On Mon, May 20, 2013 at 8:11 AM, Jason Weiss  wrote:

> I'm trying to maximize my throughput and seem to have hit a ceiling.
> Everything described below is running in AWS.
>
> I have configured a Kafka cluster with 5 machines, M1.Large, with 600
> provisioned IOPS storage for each EC2 instance. I have a Zookeeper server
> (we aren't in production yet, so I didn't take the time to setup a ZK
> cluster). Publishing to a single topic from 7 different clients, I seem to
> max out at around 20,000 eps with a fixed 2K message size. Each broker
> defines 10 file segments, with a 25000 message / 5 second flush
> configuration in server.properties. I have stuck with 8 threads. My
> producers (Java) are configured with batch.num.messages at 50, and
> queue.buffering.max.messages at 100.
>
> When I went from 4 servers in the cluster to 5 servers, I only saw an
> increase of about 500 events per second in throughput. In sharp contrast,
> when I run a complete environment on my MacBook Pro, tuned as described
> above but with a single ZK and a single Kafka broker, I am seeing 61,000
> events per second. I don't think I'm network constrained in the AWS
> environment (producer side) because when I add one more client, my MacBook
> Pro, I see a proportionate decrease in EC2 client throughput, and the net
> result is an identical 20,000 eps. Stated differently, my EC2 instance give
> up throughput when my local MacBook Pro joins the array of producers such
> that the throughput is exactly the same.
>
> Does anyone have any additional suggestions on what else I could tune to
> try and hit our goal, 50,000 eps with a 5 machine cluster? Based on the
> whitepapers published, LinkedIn describes a peak of 170,000 events per
> second across their cluster. My 20,000 seems so far away from their
> production figures.
>
> What is the relationship, in terms of performance, between ZK and Kafka?
> Do I need to have a more performant ZK cluster, the same, or does it really
> not matter in terms of maximizing throughput.
>
> Thanks for any suggestions – I've been pulling knobs and turning levers on
> this for several days now.
>
>
> Jason
>
> This electronic message contains information which may be confidential or
> privileged. The information is intended for the use of the individual or
> entity named above. If you are not the intended recipient, be aware that
> any disclosure, copying, distribution or use of the contents of this
> information is prohibited. If you have received this electronic
> transmission in error, please notify us by e-mail at (
> postmas...@rapid7.com) immediately.
>


Re: Update: RE: are commitOffsets batched to zookeeper?

2013-05-17 Thread Scott Clasen
To clarify, I meant that Robert could/should store the offsets in a faster
store, not that kafka should default to that :)

Thanks Neha


On Fri, May 17, 2013 at 2:22 PM, Neha Narkhede wrote:

> There is no particular need for storing the offsets in zookeeper. In fact
> with Kafka 0.8, since partitions will be highly available, offsets could be
> stored in Kafka topics. However, we haven't ironed out the design for this
> yet.
>
> Thanks,
> Neha
>
>
> On Fri, May 17, 2013 at 2:19 PM, Scott Clasen  wrote:
>
> > afaik you don't 'have' to store the consumed offsets in zk, right? This is
> > only automatic with some of the clients?
> >
> > why not store them in a data store that can write at the rate that you
> > require?
> >
> >
> > On Fri, May 17, 2013 at 2:15 PM, Withers, Robert <
> robert.with...@dish.com
> > >wrote:
> >
> > > Update from our OPS team, regarding zookeeper 3.4.x.  Given stability,
> > > adoption of offset batching would be the only remaining bit of work to
> > do.
> > >  Still, I totally understand the restraint for 0.8...
> > >
> > >
> > > "As exercise in upgradability of zookeeper, I did a "out-of-the"box"
> > > upgrade on Zookeeper. I downloaded a generic distribution of Apache
> > > Zookeeper and used it for the upgrade.
> > >
> > > Kafka included version of Zookeeper 3.3.3.
> > > Out of the box Apache Zookeeper 3.4.5 (which I upgraded to)
> > >
> > > Running, working great. I did *not* have to wipe out the zookeeper
> > > databases. All data stayed intact.
> > >
> > > I got a new feature, which allows auto-purging of logs. This keeps OPS
> > > maintenance to a minimum."
> > >
> > >
> > > thanks,
> > > rob
> > >
> > >
> > > -Original Message-
> > > From: Withers, Robert [mailto:robert.with...@dish.com]
> > > Sent: Friday, May 17, 2013 7:38 AM
> > > To: users@kafka.apache.org
> > > Subject: RE: are commitOffsets botched to zookeeper?
> > >
> > > Fair enough, this is something to look forward to.  I appreciate the
> > > restraint you show to stay out of troubled waters.  :)
> > >
> > > thanks,
> > > rob
> > >
> > > 
> > > From: Neha Narkhede [neha.narkh...@gmail.com]
> > > Sent: Friday, May 17, 2013 7:35 AM
> > > To: users@kafka.apache.org
> > > Subject: RE: are commitOffsets botched to zookeeper?
> > >
> > > Upgrading to a new zookeeper version is not an easy change. Also
> > zookeeper
> > > 3.3.4 is much more stable compared to 3.4.x. We think it is better not
> to
> > > club 2 big changes together. So most likely this will be a post 08 item
> > for
> > > stability purposes.
> > >
> > > Thanks,
> > > Neha
> > > On May 17, 2013 6:31 AM, "Withers, Robert" 
> > > wrote:
> > >
> > > > Awesome!  Thanks for the clarification.  I would like to offer my
> > > > strong vote that this get tackled before a beta, to get it firmly
> into
> > > 0.8.
> > > > Stabilize everything else to the existing use, but make offset
> updates
> > > > batched.
> > > >
> > > > thanks,
> > > > rob
> > > > 
> > > > From: Neha Narkhede [neha.narkh...@gmail.com]
> > > > Sent: Friday, May 17, 2013 7:17 AM
> > > > To: users@kafka.apache.org
> > > > Subject: RE: are commitOffsets botched to zookeeper?
> > > >
> > > > Sorry I wasn't clear. Zookeeper 3.4.x has this feature. As soon as 08
> > > > is stable and released it will be worth looking into when we can use
> > > > zookeeper 3.4.x.
> > > >
> > > > Thanks,
> > > > Neha
> > > > On May 16, 2013 10:32 PM, "Rob Withers"  wrote:
> > > >
> > > > > Can a request be made to zookeeper for this feature?
> > > > >
> > > > > Thanks,
> > > > > rob
> > > > >
> > > > > > -Original Message-
> > > > > > From: Neha Narkhede [mailto:neha.narkh...@gmail.com]
> > > > > > Sent: Thursday, May 16, 2013 9:53 PM
> > > > > > To: users@kafka.apache.org
> > > > > > Subject: Re: are commitOffsets botched to zookeeper?
> > > > > >
> > > > > > Currently Kafka depends on zookeeper 3.3.4 that doesn't have a
> > > > > > batch
> > > > > write
> > > > > > api. So if you commit after every message at a high rate, it will
> > > > > > be
> > > > slow
> > > > > and
> > > > > > inefficient. Besides it will cause zookeeper performance to
> > degrade.
> > > > > >
> > > > > > Thanks,
> > > > > > Neha
> > > > > > On May 16, 2013 6:54 PM, "Rob Withers" 
> > wrote:
> > > > > >
> > > > > > > We are calling commitOffsets after every message consumption.
> > > > > > > It looks to be ~60% slower, with 29 partitions.  If a single
> > > > > > > KafkaStream thread is from a connector, and there are 29
> > > > > > > partitions, then commitOffsets sends 29 offset updates,
> correct?
> > > > > > > Are these offset updates batched in one send to zookeeper?
> > > > > > >
> > > > > > > thanks,
> > > > > > > rob
> > > > >
> > > > >
> > > >
> > >
> >
>
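
As a footnote to the thread above: a minimal sketch of the batching idea, committing offsets once every N messages instead of after every message, assuming the 0.8 high-level Java consumer (the hosts, group, topic, and N below are placeholders):

import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;
import kafka.message.MessageAndMetadata;

public class BatchedOffsetCommits {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put("zookeeper.connect", "zk1:2181,zk2:2181,zk3:2181"); // placeholder hosts
    props.put("group.id", "example-group");                       // placeholder group
    props.put("auto.commit.enable", "false");                     // commit manually below

    ConsumerConnector connector =
        Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
    Map<String, List<KafkaStream<byte[], byte[]>>> streams =
        connector.createMessageStreams(Collections.singletonMap("example-topic", 1));
    ConsumerIterator<byte[], byte[]> it =
        streams.get("example-topic").get(0).iterator();

    final int commitEvery = 1000; // one zookeeper write per 1000 messages, not per message
    int uncommitted = 0;
    while (it.hasNext()) {                      // blocks until a message arrives
      MessageAndMetadata<byte[], byte[]> msg = it.next();
      // ... process msg.message() ...
      if (++uncommitted >= commitEvery) {
        connector.commitOffsets();              // commits offsets for all owned partitions
        uncommitted = 0;
      }
    }
  }
}

The trade-off is the usual one: the larger N is, the fewer ZooKeeper writes, but the more messages may be re-delivered after a consumer crash.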


Re: Update: RE: are commitOffsets botched to zookeeper?

2013-05-17 Thread Scott Clasen
afaik you dont 'have' to store the consumed offsets in zk right, this is
only automatic with some of the clients?

why not store them in a data store that can write at the rate that you
require?


On Fri, May 17, 2013 at 2:15 PM, Withers, Robert wrote:

> Update from our OPS team, regarding zookeeper 3.4.x.  Given stability,
> adoption of offset batching would be the only remaining bit of work to do.
>  Still, I totally understand the restraint for 0.8...
>
>
> "As exercise in upgradability of zookeeper, I did a "out-of-the"box"
> upgrade on Zookeeper. I downloaded a generic distribution of Apache
> Zookeeper and used it for the upgrade.
>
> Kafka included version of Zookeeper 3.3.3.
> Out of the box Apache Zookeeper 3.4.5 (which I upgraded to)
>
> Running, working great. I did *not* have to wipe out the zookeeper
> databases. All data stayed intact.
>
> I got a new feature, which allows auto-purging of logs. This keeps OPS
> maintenance to a minimum."
>
>
> thanks,
> rob
>
>
> -Original Message-
> From: Withers, Robert [mailto:robert.with...@dish.com]
> Sent: Friday, May 17, 2013 7:38 AM
> To: users@kafka.apache.org
> Subject: RE: are commitOffsets botched to zookeeper?
>
> Fair enough, this is something to look forward to.  I appreciate the
> restraint you show to stay out of troubled waters.  :)
>
> thanks,
> rob
>
> 
> From: Neha Narkhede [neha.narkh...@gmail.com]
> Sent: Friday, May 17, 2013 7:35 AM
> To: users@kafka.apache.org
> Subject: RE: are commitOffsets botched to zookeeper?
>
> Upgrading to a new zookeeper version is not an easy change. Also zookeeper
> 3.3.4 is much more stable compared to 3.4.x. We think it is better not to
> club 2 big changes together. So most likely this will be a post 08 item for
> stability purposes.
>
> Thanks,
> Neha
> On May 17, 2013 6:31 AM, "Withers, Robert" 
> wrote:
>
> > Awesome!  Thanks for the clarification.  I would like to offer my
> > strong vote that this get tackled before a beta, to get it firmly into
> 0.8.
> > Stabilize everything else to the existing use, but make offset updates
> > batched.
> >
> > thanks,
> > rob
> > 
> > From: Neha Narkhede [neha.narkh...@gmail.com]
> > Sent: Friday, May 17, 2013 7:17 AM
> > To: users@kafka.apache.org
> > Subject: RE: are commitOffsets botched to zookeeper?
> >
> > Sorry I wasn't clear. Zookeeper 3.4.x has this feature. As soon as 08
> > is stable and released it will be worth looking into when we can use
> > zookeeper 3.4.x.
> >
> > Thanks,
> > Neha
> > On May 16, 2013 10:32 PM, "Rob Withers"  wrote:
> >
> > > Can a request be made to zookeeper for this feature?
> > >
> > > Thanks,
> > > rob
> > >
> > > > -Original Message-
> > > > From: Neha Narkhede [mailto:neha.narkh...@gmail.com]
> > > > Sent: Thursday, May 16, 2013 9:53 PM
> > > > To: users@kafka.apache.org
> > > > Subject: Re: are commitOffsets botched to zookeeper?
> > > >
> > > > Currently Kafka depends on zookeeper 3.3.4 that doesn't have a
> > > > batch
> > > write
> > > > api. So if you commit after every message at a high rate, it will
> > > > be
> > slow
> > > and
> > > > inefficient. Besides it will cause zookeeper performance to degrade.
> > > >
> > > > Thanks,
> > > > Neha
> > > > On May 16, 2013 6:54 PM, "Rob Withers"  wrote:
> > > >
> > > > > We are calling commitOffsets after every message consumption.
> > > > > It looks to be ~60% slower, with 29 partitions.  If a single
> > > > > KafkaStream thread is from a connector, and there are 29
> > > > > partitions, then commitOffsets sends 29 offset updates, correct?
> > > > > Are these offset updates batched in one send to zookeeper?
> > > > >
> > > > > thanks,
> > > > > rob
> > >
> > >
> >
>


Re: Using Stunnel to encrypt/authenticate Kafka producers and consumers...

2013-04-22 Thread Scott Clasen
I think you are right, even if you did put an ELB in front of kafka, it
would only be used for getting the initial broker list afaik. Producers and
consumers need to be able to talk to each broker directly, and consumers also
need to be able to talk to ZooKeeper to store offsets.

Probably have to stunnel all the things. I'd be interested in hearing how
it works out. IMO this would be a great thing to have in kafka-contrib.
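
For what it's worth, a rough sketch of what the stunnel side could look like; hostnames, ports, and certificate paths are placeholders, verification options are omitted, and it assumes one tunnel per broker (plus similar tunnels for each ZooKeeper node) since clients address brokers individually:

; stunnel.conf on each broker host: terminate TLS in front of the plaintext Kafka port
cert = /etc/stunnel/broker.pem
[kafka]
accept  = 9093
connect = 127.0.0.1:9092

; stunnel.conf on each producer/consumer host: one client-mode section per broker,
; pointing a local plaintext port at that broker's remote stunnel
client = yes
cert = /etc/stunnel/client.pem
[kafka-broker1]
accept  = 127.0.0.1:19092
connect = broker1.example.com:9093

The clients would then be configured to talk to the local 127.0.0.1 ports instead of the brokers' real addresses.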



On Mon, Apr 22, 2013 at 11:31 AM, Matt Wise  wrote:

> Hi there... we're currently looking into using Kafka as a pipeline for
> passing around log messages. We like its use of Zookeeper for coordination
> (as we already make heavy use of Zookeeper at Nextdoor), but I'm running
> into one big problem. Everything we do is a) in the cloud, b) secure, and
> c) cross-region/datacenter/cloud-provider.
>
> We make use of SSL for both encryption and authentication of most of our
> services. My understanding is that Kafka 0.7.x producers and consumers
> connect to Zookeeper to retrieve a list of the current Kafka servers, and
> then make direct TCP connections to the individual servers that they need
> to in order to publish or subscribe to a stream. In 0.8.x that's changed, so now
> clients can connect to a single Kafka server and get a list of these
> servers via an API?
>
> What I'm wondering is whether we can actually put an ELB in front of *all*
> of our Kafka servers, throw stunnel on them, and give our producers and
> clients a single endpoint to connect to (through the ELB) rather than
> having them connect directly to the individual Kafka servers. This would
> provide us both encryption of the data during transport, as well as
> authentication of the producers and subscribers. Lastly, if it works, it
> would provide  these features without impacting our ability to use existing
> kafka producer/consumers that people have written.
>
> My concern is that the Kafka clients (producers or consumers?) would
> connect once through the ELB, then get the list of servers via the API, and
> finally try to connect directly to one of those Kafka servers rather than
> just leveraging the existing connection through the ELB.
>
> Thoughts?
>
> --Matt


Re: How to create the initial zookeeper chroot path for zk.connect?

2013-04-21 Thread Scott Clasen
Since there is only one chroot for a ZK cluster, if you specified it for each server
there would be the potential for error/mismatch.

Things would probably go really bad if you had mismatched chroots :)

Sent from my iPhone

On Apr 21, 2013, at 1:34 AM, Ryan Chan  wrote:

> Thanks, this solved the problem.
> 
> But the connection string, "zk1:2181,zk2:2181,zk3:2181/kafka", seems
> unintuitive?
> 
> 
> On Sun, Apr 21, 2013 at 2:29 AM, Scott Clasen  wrote:
> 
>> Afaik you only put the chroot on the end of the zk conn str...
>> 
>> zk1:2181,zk2:2181,zk3:2181/kafka
>> 
>> Not
>> 
>> zk1:2181/kafka,zk2:2181/kafka,zk3:2181/kafka
>> 
>> 
>> Sent from my iPhone
>> 
>> On Apr 20, 2013, at 9:03 AM, Neha Narkhede 
>> wrote:
>> 
>>> Please file a bug and mention the Kafka and zookeeper versions used for
>> the
>>> test.
>>> 
>>> Thanks,
>>> Neha
>>> 
>>> On Saturday, April 20, 2013, Ryan Chan wrote:
>>> 
>>>> Hello,
>>>> 
>>>> Tried, still the same...
>>>> 
>>>> 
>>>> bin/zkCli.sh -server zookeeper1:2181,zookeeper2:2181,zookeeper3:2181
>>>> [zk: zookeeper1,zookeeper2,zookeeper3(CONNECTED) 0]  ls /
>>>> [testkafka, consumers, brokers, zookeeper]
>>>> [zk: zookeeper1,zookeeper2,zookeeper3(CONNECTED) 1] rmr /testkafka
>>>> [zk: zookeeper1,zookeeper2,zookeeper3(CONNECTED) 2] create /testkafka ''
>>>> [zk: zookeeper1,zookeeper2,zookeeper3(CONNECTED) 3] ls /
>>>> [testkafka, consumers, brokers, zookeeper]
>>>> 
>>>> 
>>>> 
>>>> And restart Kafka
>>>> 
>>>> [2013-04-20 09:20:58,336] FATAL Fatal error during KafkaServerStable
>>>> startup. Prepare to shutdown (kafka.server.KafkaServerStartable)
>>>> java.lang.IllegalArgumentException: Path length must be > 0
>>>> at org.apache.zookeeper.common.PathUtils.validatePath(PathUtils.java:48)
>>>> at org.apache.zookeeper.common.PathUtils.validatePath(PathUtils.java:35)
>>>> at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:626)
>>>> at org.I0Itec.zkclient.ZkConnection.create(ZkConnection.java:87)
>>>> at org.I0Itec.zkclient.ZkClient$1.call(ZkClient.java:308)
>>>> at org.I0Itec.zkclient.ZkClient$1.call(ZkClient.java:304)
>>>> at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675)
>>>> at org.I0Itec.zkclient.ZkClient.create(ZkClient.java:304)
>>>> at org.I0Itec.zkclient.ZkClient.createPersistent(ZkClient.java:213)
>>>> at org.I0Itec.zkclient.ZkClient.createPersistent(ZkClient.java:223)
>>>> at org.I0Itec.zkclient.ZkClient.createPersistent(ZkClient.java:223)
>>>> at kafka.utils.ZkUtils$.createParentPath(ZkUtils.scala:47)
>>>> at kafka.utils.ZkUtils$.createEphemeralPath(ZkUtils.scala:59)
>>>> at
>> kafka.utils.ZkUtils$.createEphemeralPathExpectConflict(ZkUtils.scala:71)
>>>> at
>> kafka.server.KafkaZooKeeper.registerBrokerInZk(KafkaZooKeeper.scala:54)
>>>> at kafka.log.LogManager.startup(LogManager.scala:130)
>>>> at kafka.server.KafkaServer.startup(KafkaServer.scala:81)
>>>> at
>> kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:34)
>>>> at kafka.Kafka$.main(Kafka.scala:47)
>>>> at kafka.Kafka.main(Kafka.scala)
>>>> 
>>>> 
>>>> 
>>>> Maybe I should report a bug?
>>>> (I posted here first just to know if I have done sth stupid)
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On Sat, Apr 20, 2013 at 1:02 PM, Neha Narkhede > 
>>>>> wrote:
>>>> 
>>>>> Hmm, so if you use all 3 zookeeper servers will creating and reading
>>>>> the node, do you still see the problem ?
>>>>> 
>>>>> zkCli.sh -server zookeeper1:2181,zookeeper2:2181,zookeeper3:2181
>>>>> create /testkafka
>>>>> ls /
>>>>> 
>>>>> Thanks
>>>>> Neha
>>>>> 
>>>>> On Fri, Apr 19, 2013 at 8:55 PM, Ryan Chan 
>>>> wrote:
>>>>>> Hi,
>>>>>> 
>>>>>> Actually I followed the above link to setup my zookeeper1 to
>>>> zookeeper3.
>>>>>> 
>>>>>> They are in the same quorum, as you can see in my above example that
>>>>> when I
>>>>>> created the /testkafka path in zook

Re: How to create the initial zookeeper chroot path for zk.connect?

2013-04-20 Thread Scott Clasen
Afaik you only put the chroot on the end of the zk conn str...

zk1:2181,zk2:2181,zk3:2181/kafka

Not

zk1:2181/kafka,zk2:2181/kafka,zk3:2181/kafka


Sent from my iPhone
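
To make the format concrete, a sketch with placeholder hostnames and a /kafka chroot (the thread below covers creating the chroot node by hand with zkCli):

In zkCli.sh, create the chroot node once:
    create /kafka ''

In server.properties (and any consumer config), the chroot appears once, at the end of the connection string:
    zk.connect=zk1:2181,zk2:2181,zk3:2181/kafka
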

On Apr 20, 2013, at 9:03 AM, Neha Narkhede  wrote:

> Please file a bug and mention the Kafka and zookeeper versions used for the
> test.
> 
> Thanks,
> Neha
> 
> On Saturday, April 20, 2013, Ryan Chan wrote:
> 
>> Hello,
>> 
>> Tried, still the same...
>> 
>> 
>> bin/zkCli.sh -server zookeeper1:2181,zookeeper2:2181,zookeeper3:2181
>> [zk: zookeeper1,zookeeper2,zookeeper3(CONNECTED) 0]  ls /
>> [testkafka, consumers, brokers, zookeeper]
>> [zk: zookeeper1,zookeeper2,zookeeper3(CONNECTED) 1] rmr /testkafka
>> [zk: zookeeper1,zookeeper2,zookeeper3(CONNECTED) 2] create /testkafka ''
>> [zk: zookeeper1,zookeeper2,zookeeper3(CONNECTED) 3] ls /
>> [testkafka, consumers, brokers, zookeeper]
>> 
>> 
>> 
>> And restart Kafka
>> 
>> [2013-04-20 09:20:58,336] FATAL Fatal error during KafkaServerStable
>> startup. Prepare to shutdown (kafka.server.KafkaServerStartable)
>> java.lang.IllegalArgumentException: Path length must be > 0
>> at org.apache.zookeeper.common.PathUtils.validatePath(PathUtils.java:48)
>> at org.apache.zookeeper.common.PathUtils.validatePath(PathUtils.java:35)
>> at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:626)
>> at org.I0Itec.zkclient.ZkConnection.create(ZkConnection.java:87)
>> at org.I0Itec.zkclient.ZkClient$1.call(ZkClient.java:308)
>> at org.I0Itec.zkclient.ZkClient$1.call(ZkClient.java:304)
>> at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675)
>> at org.I0Itec.zkclient.ZkClient.create(ZkClient.java:304)
>> at org.I0Itec.zkclient.ZkClient.createPersistent(ZkClient.java:213)
>> at org.I0Itec.zkclient.ZkClient.createPersistent(ZkClient.java:223)
>> at org.I0Itec.zkclient.ZkClient.createPersistent(ZkClient.java:223)
>> at kafka.utils.ZkUtils$.createParentPath(ZkUtils.scala:47)
>> at kafka.utils.ZkUtils$.createEphemeralPath(ZkUtils.scala:59)
>> at kafka.utils.ZkUtils$.createEphemeralPathExpectConflict(ZkUtils.scala:71)
>> at kafka.server.KafkaZooKeeper.registerBrokerInZk(KafkaZooKeeper.scala:54)
>> at kafka.log.LogManager.startup(LogManager.scala:130)
>> at kafka.server.KafkaServer.startup(KafkaServer.scala:81)
>> at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:34)
>> at kafka.Kafka$.main(Kafka.scala:47)
>> at kafka.Kafka.main(Kafka.scala)
>> 
>> 
>> 
>> Maybe I should report a bug?
>> (I posted here first just to know if I have done sth stupid)
>> 
>> 
>> 
>> 
>> On Sat, Apr 20, 2013 at 1:02 PM, Neha Narkhede 
>> 
>>> wrote:
>> 
>>> Hmm, so if you use all 3 zookeeper servers will creating and reading
>>> the node, do you still see the problem ?
>>> 
>>> zkCli.sh -server zookeeper1:2181,zookeeper2:2181,zookeeper3:2181
>>> create /testkafka
>>> ls /
>>> 
>>> Thanks
>>> Neha
>>> 
>>> On Fri, Apr 19, 2013 at 8:55 PM, Ryan Chan 
>> wrote:
 Hi,
 
 Actually I followed the above link to setup my zookeeper1 to
>> zookeeper3.
 
 They are in the same quorum, as you can see in my above example that
>>> when I
 created the /testkafka path in zookeeper1, I can see list it in
>>> zookeeper2
 
 Thanks
 
 
 On Sat, Apr 20, 2013 at 12:08 AM, Neha Narkhede <
>> neha.narkh...@gmail.com
 wrote:
 
> I'm pretty sure the issue is that those 3 zookeepers are not part of
> the same quorum. Try following the quickstart for starting up a local
> zookeeper instance or follow
>> http://zookeeper.apache.org/doc/r3.3.3/zookeeperAdmin.html#sc_zkMulitServerSetup
> for a zookeeper cluster setup.
> 
> Thanks,
> Neha
> 
> On Thu, Apr 18, 2013 at 10:50 PM, Jonathan Creasy  wrote:
>> I made the patch to create the chroot and it doesn't handle multiple
>>> zk
>> addresses.
>> 
>> We fixed it but I guess that patch didn't get submitted. I will
>> make a
>> ticket here to get that done.
>> On Apr 18, 2013 10:47 PM, "Ryan Chan" 
>> wrote:
>> 
>>> Yes, using the latest Kafka 0.7.2, just tried to reproduce again
>>> 
>>> 1. Install a single node Kafka, three nodes zookeeper instances
>>> 
>>>kafka1
>>>zookeeper1
>>>zookeeper2
>>>zookeeper3
>>> 
>>> 2. Using a simple Kafka config, able to start without error in the
>>> log
>>> 
>>>brokerid=1
>>>log.dir=/data/kafka
>>>zk.connect=zookeeper1:2181,zookeeper2:2181,zookeeper3:2181
>>> 
>>> 3. Now, create a path in zookeeper1
>>> 
>>>zkCli.sh -server zookeeper1:2181
>>> 
>>>ls /
>>>[consumers, brokers, zookeeper]
>>>create /testkafka ''
>>>ls /
>>>[testkafka, consumers, brokers, zookeeper]
>>> 
>>> Quit & Done.
>>> 
>>> 4. Verify from zookeeper2
>>> 
>>>zkCli.sh -server zookeeper2:2181
>>>ls /
>>>[testkafka, consumers, brokers, zookeeper]
>>

Re: kafka 0.8 release schedule?

2013-04-03 Thread Scott Clasen
+1


On Wed, Apr 3, 2013 at 4:03 PM, Ryan LeCompte  wrote:

> A 2.10 version would be awesome! Please please :-)
>
>
> On Wed, Apr 3, 2013 at 4:02 PM, Soby Chacko  wrote:
>
> > Hello,
> >
> > I am sure this question has been asked before. But, can someone tell me
> the
> > tentative release schedule for Kafka 0.8? And also, when its released, is
> > it going to be based on Scala 2.9.2 or default to 2.8?
> >
> > Regards,
> > Soby Chacko.
> >
>


Re: 0.8 best practices for migrating / electing leaders in failure situations?

2013-03-25 Thread Scott Clasen
Jun, thanks. To clarify, do you mean that clients will have cached broker
lists or some other data that will make them ignore the new brokers?

Like so

topic-1 replication factor 3, on broker-ids 1,2,3
all brokers 1,2,3 die, and are never coming back.
delete all kafka data in zookeeper.
boot 4,5,6, create new topic called topic-1 repl factor 3, brokers 4,5,6

clients will/will not start sending to topic-1 on 4,5,6?



On Sun, Mar 24, 2013 at 4:01 PM, Jun Rao  wrote:

> If you bring up 3 new brokers with different broker ids, you won't be able
> to use them on existing topics until after you have run the partition
> reassignment tool.
>
> Thanks,
>
> Jun
>
> On Fri, Mar 22, 2013 at 9:23 PM, Scott Clasen  wrote:
>
> > Thanks!
> >
> >  Would there be any difference if I instead  deleted all the Kafka data
> > from zookeeper and booted 3 instances  with different broker id? clients
> > with cached broker id lists or any other issue?
> >
> > Sent from my iPhone
> >
> > On Mar 22, 2013, at 9:15 PM, Jun Rao  wrote:
> >
> > > In scenario 2, you can bring up 3 new brokers with the same broker id.
> > You
> > > won't get the data back. However, new data can be published to and
> > consumed
> > > from the new brokers.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Fri, Mar 22, 2013 at 2:17 PM, Scott Clasen 
> wrote:
> > >
> > >> Thanks Neha-
> > >>
> > >> To Clarify...
> > >>
> > >> *In scenario => 1 will the new broker get all messages on the other
> > brokers
> > >> replicated to it?
> > >>
> > >> *In Scenario 2 => it is clear that the data is gone, but I still need
> > >> producers to be able to send and consumers to receive on the same
> > topic. In
> > >> my testing today I was unable to do that as I kept getting errors...so
> > if i
> > >> was doing the correct steps it seems there is a bug here, basically
> the
> > >> "second-cluster-topic" topic is unusable after all 3 brokers crash,
> and
> > 3
> > >> more are booted to replace them.  Something not quite correct in
> > zookeeper?
> > >>
> > >> Like so
> > >>
> > >> ./bin/kafka-reassign-partitions.sh --zookeeper ... --path-to-json-file
> > >> reassign.json
> > >>
> > >> kafka.common.LeaderNotAvailableException: Leader not available for
> topic
> > >> second-cluster-topic partition 0
> > >> at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:120)
> > >> at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:103)
> > >> at
> > >>
> > >>
> >
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
> > >> at
> > >>
> > >>
> >
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
> > >> at
> > >>
> > >>
> >
> scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
> > >> at scala.collection.immutable.List.foreach(List.scala:45)
> > >> at
> scala.collection.TraversableLike$class.map(TraversableLike.scala:206)
> > >> at scala.collection.immutable.List.map(List.scala:45)
> > >> at
> > >>
> > >>
> >
> kafka.admin.AdminUtils$.kafka$admin$AdminUtils$$fetchTopicMetadataFromZk(AdminUtils.scala:103)
> > >> at
> kafka.admin.AdminUtils$.fetchTopicMetadataFromZk(AdminUtils.scala:92)
> > >> at kafka.admin.ListTopicCommand$.showTopic(ListTopicCommand.scala:80)
> > >> at
> > >>
> > >>
> >
> kafka.admin.ListTopicCommand$$anonfun$main$2.apply(ListTopicCommand.scala:66)
> > >> at
> > >>
> > >>
> >
> kafka.admin.ListTopicCommand$$anonfun$main$2.apply(ListTopicCommand.scala:65)
> > >> at
> > >>
> > >>
> >
> scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
> > >> at scala.collection.immutable.List.foreach(List.scala:45)
> > >> at kafka.admin.ListTopicCommand$.main(ListTopicCommand.scala:65)
> > >> at kafka.admin.ListTopicCommand.main(ListTopicCommand.scala)
> > >> Caused by: kafka.common.LeaderNotAvailableException: No leader exists
> > for
> > >> partition 0
> > >> at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:117)
> > >> ... 16 more
> > >> topic

Re: 0.8 best practices for migrating / electing leaders in failure situations?

2013-03-22 Thread Scott Clasen
Thanks! 

Would there be any difference if I instead deleted all the Kafka data from
zookeeper and booted 3 instances with different broker ids? Clients with cached
broker id lists or any other issue?

Sent from my iPhone

On Mar 22, 2013, at 9:15 PM, Jun Rao  wrote:

> In scenario 2, you can bring up 3 new brokers with the same broker id. You
> won't get the data back. However, new data can be published to and consumed
> from the new brokers.
> 
> Thanks,
> 
> Jun
> 
> On Fri, Mar 22, 2013 at 2:17 PM, Scott Clasen  wrote:
> 
>> Thanks Neha-
>> 
>> To Clarify...
>> 
>> *In scenario => 1 will the new broker get all messages on the other brokers
>> replicated to it?
>> 
>> *In Scenario 2 => it is clear that the data is gone, but I still need
>> producers to be able to send and consumers to receive on the same topic. In
>> my testing today I was unable to do that as I kept getting errors...so if i
>> was doing the correct steps it seems there is a bug here, basically the
>> "second-cluster-topic" topic is unusable after all 3 brokers crash, and 3
>> more are booted to replace them.  Something not quite correct in zookeeper?
>> 
>> Like so
>> 
>> ./bin/kafka-reassign-partitions.sh --zookeeper ... --path-to-json-file
>> reassign.json
>> 
>> kafka.common.LeaderNotAvailableException: Leader not available for topic
>> second-cluster-topic partition 0
>> at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:120)
>> at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:103)
>> at
>> 
>> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
>> at
>> 
>> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
>> at
>> 
>> scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
>> at scala.collection.immutable.List.foreach(List.scala:45)
>> at scala.collection.TraversableLike$class.map(TraversableLike.scala:206)
>> at scala.collection.immutable.List.map(List.scala:45)
>> at
>> 
>> kafka.admin.AdminUtils$.kafka$admin$AdminUtils$$fetchTopicMetadataFromZk(AdminUtils.scala:103)
>> at kafka.admin.AdminUtils$.fetchTopicMetadataFromZk(AdminUtils.scala:92)
>> at kafka.admin.ListTopicCommand$.showTopic(ListTopicCommand.scala:80)
>> at
>> 
>> kafka.admin.ListTopicCommand$$anonfun$main$2.apply(ListTopicCommand.scala:66)
>> at
>> 
>> kafka.admin.ListTopicCommand$$anonfun$main$2.apply(ListTopicCommand.scala:65)
>> at
>> 
>> scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
>> at scala.collection.immutable.List.foreach(List.scala:45)
>> at kafka.admin.ListTopicCommand$.main(ListTopicCommand.scala:65)
>> at kafka.admin.ListTopicCommand.main(ListTopicCommand.scala)
>> Caused by: kafka.common.LeaderNotAvailableException: No leader exists for
>> partition 0
>> at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:117)
>> ... 16 more
>> topic: second-cluster-topic
>> 
>> ./bin/kafka-preferred-replica-election.sh  --zookeeper...
>> --path-to-json-file elect.json
>> 
>> 
>> [2013-03-22 10:24:20,706] INFO Created preferred replica election path
>> with { "partitions":[ { "partition":0, "topic":"first-cluster-topic" }, {
>> "partition":0, "topic":"second-cluster-topic" } ], "version":1 }
>> (kafka.admin.PreferredReplicaLeaderElectionCommand$)
>> 
>> ./bin/kafka-list-topic.sh  --zookeeper ... --topic second-cluster-topic
>> 
>> 2013-03-22 10:24:30,869] ERROR Error while fetching metadata for partition
>> [second-cluster-topic,0] (kafka.admin.AdminUtils$)
>> kafka.common.LeaderNotAvailableException: Leader not available for topic
>> second-cluster-topic partition 0
>> at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:120)
>> at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:103)
>> at
>> 
>> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
>> at
>> 
>> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
>> at
>> 
>> scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
>> at scala.collection.immutable.List.foreach(List.scala:45)
>> at scala.collection.TraversableLike$class.map(TraversableLike.scala:206)
>> at scala.collection.immutable.List.map(List.scala:45)
>> at
>> 
>> kafka.admin.AdminUtils$.kafka$admin$AdminUtils$$fetc

Re: 0.8 best practices for migrating / electing leaders in failure situations?

2013-03-22 Thread Scott Clasen
Thanks Neha-

To Clarify...

*In Scenario 1 => will the new broker get all messages on the other brokers
replicated to it?

*In Scenario 2 => it is clear that the data is gone, but I still need
producers to be able to send and consumers to receive on the same topic. In
my testing today I was unable to do that as I kept getting errors...so if I
was doing the correct steps it seems there is a bug here, basically the
"second-cluster-topic" topic is unusable after all 3 brokers crash, and 3
more are booted to replace them.  Something not quite correct in zookeeper?

Like so

./bin/kafka-reassign-partitions.sh --zookeeper ... --path-to-json-file
reassign.json

kafka.common.LeaderNotAvailableException: Leader not available for topic
second-cluster-topic partition 0
at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:120)
at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:103)
at
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
at
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
at
scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
at scala.collection.immutable.List.foreach(List.scala:45)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:206)
at scala.collection.immutable.List.map(List.scala:45)
at
kafka.admin.AdminUtils$.kafka$admin$AdminUtils$$fetchTopicMetadataFromZk(AdminUtils.scala:103)
at kafka.admin.AdminUtils$.fetchTopicMetadataFromZk(AdminUtils.scala:92)
at kafka.admin.ListTopicCommand$.showTopic(ListTopicCommand.scala:80)
at
kafka.admin.ListTopicCommand$$anonfun$main$2.apply(ListTopicCommand.scala:66)
at
kafka.admin.ListTopicCommand$$anonfun$main$2.apply(ListTopicCommand.scala:65)
at
scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
at scala.collection.immutable.List.foreach(List.scala:45)
at kafka.admin.ListTopicCommand$.main(ListTopicCommand.scala:65)
at kafka.admin.ListTopicCommand.main(ListTopicCommand.scala)
Caused by: kafka.common.LeaderNotAvailableException: No leader exists for
partition 0
at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:117)
... 16 more
topic: second-cluster-topic

./bin/kafka-preferred-replica-election.sh  --zookeeper...
--path-to-json-file elect.json


[2013-03-22 10:24:20,706] INFO Created preferred replica election path
with { "partitions":[ { "partition":0, "topic":"first-cluster-topic" }, {
"partition":0, "topic":"second-cluster-topic" } ], "version":1 }
(kafka.admin.PreferredReplicaLeaderElectionCommand$)

./bin/kafka-list-topic.sh  --zookeeper ... --topic second-cluster-topic

2013-03-22 10:24:30,869] ERROR Error while fetching metadata for partition
[second-cluster-topic,0] (kafka.admin.AdminUtils$)
kafka.common.LeaderNotAvailableException: Leader not available for topic
second-cluster-topic partition 0
at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:120)
at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:103)
at
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
at
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:206)
at
scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
at scala.collection.immutable.List.foreach(List.scala:45)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:206)
at scala.collection.immutable.List.map(List.scala:45)
at
kafka.admin.AdminUtils$.kafka$admin$AdminUtils$$fetchTopicMetadataFromZk(AdminUtils.scala:103)
at kafka.admin.AdminUtils$.fetchTopicMetadataFromZk(AdminUtils.scala:92)
at kafka.admin.ListTopicCommand$.showTopic(ListTopicCommand.scala:80)
at
kafka.admin.ListTopicCommand$$anonfun$main$2.apply(ListTopicCommand.scala:66)
at
kafka.admin.ListTopicCommand$$anonfun$main$2.apply(ListTopicCommand.scala:65)
at
scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:61)
at scala.collection.immutable.List.foreach(List.scala:45)
at kafka.admin.ListTopicCommand$.main(ListTopicCommand.scala:65)
at kafka.admin.ListTopicCommand.main(ListTopicCommand.scala)
Caused by: kafka.common.LeaderNotAvailableException: No leader exists for
partition 0
at kafka.admin.AdminUtils$$anonfun$3.apply(AdminUtils.scala:117)
... 16 more





On Fri, Mar 22, 2013 at 1:12 PM, Neha Narkhede wrote:

> * Scenario 1:  BrokerID 1,2,3   Broker 2 dies.
>
> Here, you can use reassign partitions tool and for all partitions that
> had a replica on broker 2, move it to broker 4
>
> * Scenario 2: BrokerID 1,2,3 Catastrophic failure 1,2,3 die but ZK still
> there.
>
> There is no way to recover any data here since there is nothing
> available to consume data from.
>
> Thanks,
> Neha
>
> On Fri, Mar 22, 2013 at 10:46 AM, Scott Clasen  wrote:
> > What would the recommended practice be for the following scenarios?
> >
> > 

0.8 best practices for migrating / electing leaders in failure situations?

2013-03-22 Thread Scott Clasen
What would the recommended practice be for the following scenarios?

Running on EC2, ephemeral disks only for Kafka.

There are 3 kafka servers. The broker ids are always increasing. If a
broker dies it's never coming back.

All topics have a replication factor of 3.

* Scenario 1:  BrokerID 1,2,3   Broker 2 dies.

Recover by:

Boot another: BrokerID 4
?? run bin/kafka-reassign-partitions.sh   for any topic+partition and
replace brokerid 2 with brokerid 4
?? anything else to do to cause messages to be replicated to 4??

NOTE: This appears to work, but I'm not positive broker 4 got messages replicated to it.
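
For illustration, the Scenario 1 reassignment would use a JSON file in the same format shown elsewhere in these threads, with a placeholder topic name and broker 4 substituted for the dead broker 2:

 {"partitions":
  [{"topic": "my-topic", "partition": 0, "replicas": [1,3,4] }]
 }

./bin/kafka-reassign-partitions.sh --zookeeper zk1:2181 --path-to-json-file reassign.json

Broker 2 is simply replaced by broker 4 in the replica list of each affected topic/partition.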

* Scenario 2: BrokerID 1,2,3 Catastrophic failure 1,2,3 die but ZK still
there.

Messages obviously lost.
Recover to a functional state by:

Boot 3 more: 4,5 6
?? run bin/kafka-reassign-partitions.sh  for all topics/partitions, swap
1,2,3 for 4,5,6?
?? run bin/kafka-preferred-replica-election.sh for all topics/partitions
?? anything else to do to allow producers to start sending successfully??


NOTE: I had some trouble with scenario 2. Will try to reproduce and open a
ticket, if in fact my procedures for scenario 2 are correct, and I still
can't get to a good state.


Re: 0.8 reassignment issue (built from 7b14ebae3382427b44a928d0b186001735c15efb)

2013-03-21 Thread Scott Clasen
KAFKA-821   Cheers!


On Thu, Mar 21, 2013 at 1:58 PM, Neha Narkhede wrote:

> Please can you file a bug, this needs to be fixed.
>
> Thanks
> Neha
>
>
> On Thu, Mar 21, 2013 at 1:50 PM, Scott Clasen  wrote:
>
> > So, deleting /admin/reassign_partitions  from Zookeeper, which had an
> empty
> > array of partitions seemed to fix things. Probably isnt the preferred way
> > though :)
> >
> >
> > On Thu, Mar 21, 2013 at 12:40 PM, Scott Clasen  wrote:
> >
> > > Hi all, Im currently testing out 0.8 failure/recovery stuff. Having an
> > > issue with reassigning a partition,
> > > after making a mistake in my first attempt to reassign, seems like
> > > reassignment is hosed.
> > >
> > > Is this known or should I open a ticket?
> > >
> > >
> > > Have 3 brokers running. Ids 25,26,27
> > >
> > > ./bin/kafka-create-topic.sh --replica 3 --topic first-cluster-topic
> > > --zookeeper :2181/kafka
> > >
> > > Seems fine, can send/receive, etc..
> > >
> > > Kill 27, start 28. Try to reassign the single partition topic with the
> > > following json.
> > >
> > > Contains an error. partition should be 0 not 1.
> > >
> > >  {"partitions":
> > >  [{"topic": "first-cluster-topic", "partition": 1, "replicas":
> [25,26,28]
> > > }]
> > > }
> > >
> > > ./bin/kafka-reassign-partitions.sh  --zookeeper ... -path-to-json-file
> > > reassign.json
> > >
> > > 2013-03-21 12:14:46,170] INFO zookeeper state changed (SyncConnected)
> > > (org.I0Itec.zkclient.ZkClient)
> > > [2013-03-21 12:14:46,310] ERROR Skipping reassignment of partition
> > > [first-cluster-topic,1] since it doesn't exist
> > > (kafka.admin.ReassignPartitionsCommand)
> > > Successfully started reassignment of partitions
> > > Map([first-cluster-topic,1] -> List(25, 26, 28))
> > > [2013-03-21 12:14:46,665] INFO Terminate ZkClient event thread.
> > > (org.I0Itec.zkclient.ZkEventThread)
> > > [2013-03-21 12:14:46,780] INFO Session: 0x13d8a63a3760007 closed
> > > (org.apache.zookeeper.ZooKeeper)
> > > [2013-03-21 12:14:46,780] INFO EventThread shut down
> > > (org.apache.zookeeper.ClientCnxn)
> > >
> > > Ok, fix the JSON
> > >
> > >  {"partitions":
> > >  [{"topic": "first-cluster-topic", "partition": 0, "replicas":
> [25,26,28]
> > > }]
> > > }
> > >
> > > ./bin/kafka-reassign-partitions.sh  --zookeeper ... -path-to-json-file
> > > reassign.json
> > >
> > >
> > > [2013-03-21 12:17:34,367] INFO zookeeper state changed (SyncConnected)
> > > (org.I0Itec.zkclient.ZkClient)
> > > Partitions reassignment failed due to Partition reassignment currently
> in
> > > progress for Map(). Aborting operation
> > > kafka.common.AdminCommandFailedException: Partition reassignment
> > currently
> > > in progress for Map(). Aborting operation
> > >  at
> > >
> >
> kafka.admin.ReassignPartitionsCommand.reassignPartitions(ReassignPartitionsCommand.scala:91)
> > > at
> > >
> >
> kafka.admin.ReassignPartitionsCommand$.main(ReassignPartitionsCommand.scala:65)
> > >  at
> > >
> >
> kafka.admin.ReassignPartitionsCommand.main(ReassignPartitionsCommand.scala)
> > >
> > >  ./bin/kafka-check-reassignment-status.sh --zookeeper ...
> > > --path-to-json-file reassign.json
> > >
> > >
> > > [2013-03-21 12:20:40,607] INFO zookeeper state changed (SyncConnected)
> > > (org.I0Itec.zkclient.ZkClient)
> > > Exception in thread "main" java.lang.ClassCastException:
> > > scala.collection.immutable.Map$Map1 cannot be cast to
> > > [Lscala.collection.Map;
> > >  at
> > >
> >
> kafka.admin.CheckReassignmentStatus$.main(CheckReassignmentStatus.scala:44)
> > > at
> > kafka.admin.CheckReassignmentStatus.main(CheckReassignmentStatus.scala)
> > >
> > >
> > >
> > >
> > >
> >
>


Re: 0.8 reassignment issue (built from 7b14ebae3382427b44a928d0b186001735c15efb)

2013-03-21 Thread Scott Clasen
So, deleting /admin/reassign_partitions  from Zookeeper, which had an empty
array of partitions, seemed to fix things. Probably isn't the preferred way
though :)
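
For anyone who hits the same stuck state, roughly what that looks like with zkCli.sh (the /kafka chroot matches the --zookeeper argument used when the topic was created; adjust to your setup):

bin/zkCli.sh -server zk1:2181
    get /kafka/admin/reassign_partitions
    rmr /kafka/admin/reassign_partitions

The get shows the stuck (empty) reassignment list; the rmr removes the node so a new reassignment can be started.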


On Thu, Mar 21, 2013 at 12:40 PM, Scott Clasen  wrote:

> Hi all, Im currently testing out 0.8 failure/recovery stuff. Having an
> issue with reassigning a partition,
> after making a mistake in my first attempt to reassign, seems like
> reassignment is hosed.
>
> Is this known or should I open a ticket?
>
>
> Have 3 brokers running. Ids 25,26,27
>
> ./bin/kafka-create-topic.sh --replica 3 --topic first-cluster-topic
> --zookeeper :2181/kafka
>
> Seems fine, can send/receive, etc..
>
> Kill 27, start 28. Try to reassign the single partition topic with the
> following json.
>
> Contains an error. partition should be 0 not 1.
>
>  {"partitions":
>  [{"topic": "first-cluster-topic", "partition": 1, "replicas": [25,26,28]
> }]
> }
>
> ./bin/kafka-reassign-partitions.sh  --zookeeper ... -path-to-json-file
> reassign.json
>
> 2013-03-21 12:14:46,170] INFO zookeeper state changed (SyncConnected)
> (org.I0Itec.zkclient.ZkClient)
> [2013-03-21 12:14:46,310] ERROR Skipping reassignment of partition
> [first-cluster-topic,1] since it doesn't exist
> (kafka.admin.ReassignPartitionsCommand)
> Successfully started reassignment of partitions
> Map([first-cluster-topic,1] -> List(25, 26, 28))
> [2013-03-21 12:14:46,665] INFO Terminate ZkClient event thread.
> (org.I0Itec.zkclient.ZkEventThread)
> [2013-03-21 12:14:46,780] INFO Session: 0x13d8a63a3760007 closed
> (org.apache.zookeeper.ZooKeeper)
> [2013-03-21 12:14:46,780] INFO EventThread shut down
> (org.apache.zookeeper.ClientCnxn)
>
> Ok, fix the JSON
>
>  {"partitions":
>  [{"topic": "first-cluster-topic", "partition": 0, "replicas": [25,26,28]
> }]
> }
>
> ./bin/kafka-reassign-partitions.sh  --zookeeper ... -path-to-json-file
> reassign.json
>
>
> [2013-03-21 12:17:34,367] INFO zookeeper state changed (SyncConnected)
> (org.I0Itec.zkclient.ZkClient)
> Partitions reassignment failed due to Partition reassignment currently in
> progress for Map(). Aborting operation
> kafka.common.AdminCommandFailedException: Partition reassignment currently
> in progress for Map(). Aborting operation
>  at
> kafka.admin.ReassignPartitionsCommand.reassignPartitions(ReassignPartitionsCommand.scala:91)
> at
> kafka.admin.ReassignPartitionsCommand$.main(ReassignPartitionsCommand.scala:65)
>  at
> kafka.admin.ReassignPartitionsCommand.main(ReassignPartitionsCommand.scala)
>
>  ./bin/kafka-check-reassignment-status.sh --zookeeper ...
> --path-to-json-file reassign.json
>
>
> [2013-03-21 12:20:40,607] INFO zookeeper state changed (SyncConnected)
> (org.I0Itec.zkclient.ZkClient)
> Exception in thread "main" java.lang.ClassCastException:
> scala.collection.immutable.Map$Map1 cannot be cast to
> [Lscala.collection.Map;
>  at
> kafka.admin.CheckReassignmentStatus$.main(CheckReassignmentStatus.scala:44)
> at kafka.admin.CheckReassignmentStatus.main(CheckReassignmentStatus.scala)
>
>
>
>
>


Re: Not seeing any replies

2013-03-21 Thread Scott Clasen
testing 1 2 3 for Bob


On Thu, Mar 21, 2013 at 12:48 PM, Bob Jervis  wrote:

> How do I get to see replies from your e-mails???
>
>
> I have seen nothing either through my gmail account or through my company
> account.
>
> This mailing list is not working for me.
>
> How do I communicate with you guys???
>


0.8 reassignment issue (built from 7b14ebae3382427b44a928d0b186001735c15efb)

2013-03-21 Thread Scott Clasen
Hi all, I'm currently testing out 0.8 failure/recovery stuff and having an
issue with reassigning a partition:
after making a mistake in my first attempt to reassign, it seems like
reassignment is hosed.

Is this known or should I open a ticket?


Have 3 brokers running. Ids 25,26,27

./bin/kafka-create-topic.sh --replica 3 --topic first-cluster-topic
--zookeeper :2181/kafka

Seems fine, can send/receive, etc..

Kill 27, start 28. Try to reassign the single partition topic with the
following json.

It contains an error: the partition should be 0, not 1.

 {"partitions":
 [{"topic": "first-cluster-topic", "partition": 1, "replicas": [25,26,28] }]
}

./bin/kafka-reassign-partitions.sh  --zookeeper ... -path-to-json-file
reassign.json

2013-03-21 12:14:46,170] INFO zookeeper state changed (SyncConnected)
(org.I0Itec.zkclient.ZkClient)
[2013-03-21 12:14:46,310] ERROR Skipping reassignment of partition
[first-cluster-topic,1] since it doesn't exist
(kafka.admin.ReassignPartitionsCommand)
Successfully started reassignment of partitions Map([first-cluster-topic,1]
-> List(25, 26, 28))
[2013-03-21 12:14:46,665] INFO Terminate ZkClient event thread.
(org.I0Itec.zkclient.ZkEventThread)
[2013-03-21 12:14:46,780] INFO Session: 0x13d8a63a3760007 closed
(org.apache.zookeeper.ZooKeeper)
[2013-03-21 12:14:46,780] INFO EventThread shut down
(org.apache.zookeeper.ClientCnxn)

Ok, fix the JSON

 {"partitions":
 [{"topic": "first-cluster-topic", "partition": 0, "replicas": [25,26,28] }]
}

./bin/kafka-reassign-partitions.sh  --zookeeper ... -path-to-json-file
reassign.json


[2013-03-21 12:17:34,367] INFO zookeeper state changed (SyncConnected)
(org.I0Itec.zkclient.ZkClient)
Partitions reassignment failed due to Partition reassignment currently in
progress for Map(). Aborting operation
kafka.common.AdminCommandFailedException: Partition reassignment currently
in progress for Map(). Aborting operation
at
kafka.admin.ReassignPartitionsCommand.reassignPartitions(ReassignPartitionsCommand.scala:91)
at
kafka.admin.ReassignPartitionsCommand$.main(ReassignPartitionsCommand.scala:65)
at
kafka.admin.ReassignPartitionsCommand.main(ReassignPartitionsCommand.scala)

 ./bin/kafka-check-reassignment-status.sh --zookeeper ...
--path-to-json-file reassign.json


[2013-03-21 12:20:40,607] INFO zookeeper state changed (SyncConnected)
(org.I0Itec.zkclient.ZkClient)
Exception in thread "main" java.lang.ClassCastException:
scala.collection.immutable.Map$Map1 cannot be cast to
[Lscala.collection.Map;
at
kafka.admin.CheckReassignmentStatus$.main(CheckReassignmentStatus.scala:44)
at kafka.admin.CheckReassignmentStatus.main(CheckReassignmentStatus.scala)