Re: Testing broker failover

2016-08-08 Thread Alex Loddengaard
Hi Alper,

Thanks for sharing. I was particularly interested in seeing what *acks* was
set to. Since you haven't set it, its value is the default, *1*.

To detect and handle send failures, use the send() overload that takes a
Callback, and deal with any exception passed to that callback. Take a look
here for an example:

http://kafka.apache.org/0100/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html#send(org.apache.kafka.clients.producer.ProducerRecord,%20org.apache.kafka.clients.producer.Callback)
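
For example, here is a minimal sketch of that pattern (the topic name, the
key/value, and the choice of acks=all are illustrative assumptions, not taken
from your config); the Future returned by send() will also surface the error
if you call get() on it:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class CallbackProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        // Default is acks=1; acks=all waits for the full ISR to acknowledge.
        props.put("acks", "all");

        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        producer.send(new ProducerRecord<>("my-topic", "key", "value"),
            (metadata, exception) -> {
                if (exception != null) {
                    // The send failed (broker down, timeout, ...): log it,
                    // retry, or queue the message for offline processing.
                    System.err.println("Send failed: " + exception);
                } else {
                    System.out.println("Sent to partition " + metadata.partition()
                        + " at offset " + metadata.offset());
                }
            });
        producer.flush();
        producer.close();
    }
}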

Let us know if you have follow-up questions.

Alex

On Mon, Aug 8, 2016 at 11:24 AM, Alper Akture <al...@goldenratstudios.com>
wrote:

> Thanks Alex... using producer props:
>
> {timeout.ms=500, max.block.ms=500, request.timeout.ms=500,
> bootstrap.servers=localhost:9092,
> serializer.class=kafka.serializer.StringEncoder,
> value.serializer=org.apache.kafka.common.serialization.StringSerializer,
> metadata.fetch.timeout.ms=500,
> key.serializer=org.apache.kafka.common.serialization.StringSerializer}
>
>
>
>
> On Mon, Aug 8, 2016 at 9:21 AM, Alex Loddengaard <a...@confluent.io>
> wrote:
>
> > Hi Alper, can you share your producer config -- the Properties object? We
> > need to learn more to help you understand the behavior you're observing.
> >
> > Thanks,
> >
> > Alex
> >
> > On Fri, Aug 5, 2016 at 7:45 PM, Alper Akture <al...@goldenratstudios.com
> >
> > wrote:
> >
> > > I'm using 0.10.0.0 and testing some failover scenarios. For dev, I have a
> > > single kafka node and a zookeeper instance. While sending events to a
> > > topic, I shutdown the broker to see if my failover handling works.
> > However,
> > > I don't see any indication that the send failed, but I do see the
> > > connection refused errors logged at debug. What is the standard way to
> > > detect a message send failure, and handle it for offline processing
> > later?
> > >
> > > Here's the debug output I see:
> > >
> > > 19:20:00.906 [kafka-producer-network-thread | producer-1] DEBUG
> > > org.apache.kafka.clients.NetworkClient - Initialize connection to node
> > -1
> > > for sending metadata request
> > > 19:20:00.906 [kafka-producer-network-thread | producer-1] DEBUG
> > > org.apache.kafka.clients.NetworkClient - Initiating connection to node
> > -1
> > > at localhost:9092.
> > > 19:20:00.907 [kafka-producer-network-thread | producer-1] DEBUG
> > > org.apache.kafka.common.network.Selector - Connection with localhost/
> > > 127.0.0.1 disconnected
> > > java.net.ConnectException: Connection refused
> > > at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> > ~[?:1.8.0_66]
> > > at sun.nio.ch.SocketChannelImpl.finishConnect(
> > SocketChannelImpl.java:717)
> > > ~[?:1.8.0_66]
> > > at
> > > org.apache.kafka.common.network.PlaintextTransportLayer.finishConnect(
> > > PlaintextTransportLayer.java:51)
> > > ~[kafka-clients-0.10.0.0.jar:?]
> > > at
> > > org.apache.kafka.common.network.KafkaChannel.
> finishConnect(KafkaChannel.
> > > java:73)
> > > ~[kafka-clients-0.10.0.0.jar:?]
> > > at
> > > org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.
> > > java:309)
> > > [kafka-clients-0.10.0.0.jar:?]
> > > at org.apache.kafka.common.network.Selector.poll(Selector.java:283)
> > > [kafka-clients-0.10.0.0.jar:?]
> > > at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:260)
> > > [kafka-clients-0.10.0.0.jar:?]
> > > at org.apache.kafka.clients.producer.internals.Sender.run(
> > Sender.java:229)
> > > [kafka-clients-0.10.0.0.jar:?]
> > > at org.apache.kafka.clients.producer.internals.Sender.run(
> > Sender.java:134)
> > > [kafka-clients-0.10.0.0.jar:?]
> > > at java.lang.Thread.run(Thread.java:745) [?:1.8.0_66]
> > > 19:20:00.907 [kafka-producer-network-thread | producer-1] DEBUG
> > > org.apache.kafka.clients.NetworkClient - Node -1 disconnected.
> > > 19:20:00.907 [kafka-producer-network-thread | producer-1] DEBUG
> > > org.apache.kafka.clients.NetworkClient - Give up sending metadata
> > request
> > > since no node is available
> > >
> >
>


Re: Kafka cluster with a different version that the java API

2016-08-05 Thread Alex Loddengaard
Hi Sergio, clients have to be the same version or older than the brokers. A
newer client won't work with an older broker.

Alex

On Fri, Aug 5, 2016 at 7:37 AM, Sergio Gonzalez <
sgonza...@cecropiasolutions.com> wrote:

> Hi users,
>
> Is there some issue if I create the kafka cluster using the
> kafka_2.10-0.8.2.0 version  and I have my java producers and consumers with
> the 0.10.0.0 version?
>
> <dependency>
>   <groupId>org.apache.kafka</groupId>
>   <artifactId>kafka-clients</artifactId>
>   <version>0.10.0.0</version>
> </dependency>
> <dependency>
>   <groupId>org.apache.kafka</groupId>
>   <artifactId>kafka-streams</artifactId>
>   <version>0.10.0.0</version>
> </dependency>
>
> What are the repercussions that I could have if I use this environment?
>
> Thanks,
> Sergio GQ
>


Re: log.retention.bytes

2016-06-24 Thread Alex Loddengaard
Hi Dave,

log.retention.bytes is per partition. If you change it after the topic was
created, you'll see the behavior you expect -- namely that the new value is
used when the log is cleaned. The frequency that the log is cleaned is
controlled by log.retention.check.interval.ms, with a default value of 5
minutes.
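
For reference, a sketch of the relevant broker settings in server.properties
(the values are illustrative, not recommendations); there is also a per-topic
retention.bytes override if you only want to change a single topic:

# Applies per partition, not per topic
log.retention.bytes=1073741824
# How often the log is checked for segments to delete (default: 5 minutes)
log.retention.check.interval.ms=300000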

Hope this helps!

Alex

On Fri, Jun 24, 2016 at 9:19 AM, Tauzell, Dave  wrote:

> Is the log.retention.bytes setting per partition or for the whole topic?
>   If I change it after a topic has been created do the changes apply to the
> existing topics?
>
> Thanks,
>Dave
>
> This e-mail and any files transmitted with it are confidential, may
> contain sensitive information, and are intended solely for the use of the
> individual or entity to whom they are addressed. If you have received this
> e-mail in error, please notify the sender by reply e-mail immediately and
> destroy all copies of the e-mail and any attachments.
>


Re: How to delete data from topic with interval of two days

2016-06-22 Thread Alex Loddengaard
Hi Kotesh,

log.retention.hours sets how long messages are kept in the log, and
log.retention.check.interval.ms sets how often the log cleaner checks if
messages should be deleted based on the retention setting.
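
Both are broker settings, normally placed in config/server.properties on each
broker. A sketch for roughly two-day retention (values illustrative); if you
only need this for a single topic, the per-topic retention.ms override does
the same thing at the topic level:

# config/server.properties
log.retention.hours=48
# How often the cleaner checks for segments to delete (default: 5 minutes)
log.retention.check.interval.ms=300000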

I hope this helps.

Alex

On Wed, Jun 22, 2016 at 3:13 AM, kotesh banoth  wrote:

> Hi All,
>
> Where can I configure this? Please can someone suggest the
> configuration path in Kafka?
>
>
> Banoth Kotesh
> Computer Science and Engineering(2010-14),
> NIT Rourkela,
> +917338918143
>


Re: Question about heterogeneous brokers in a cluster

2016-06-09 Thread Alex Loddengaard
Hi Kevin,

If you keep the same configs on the new brokers with more storage capacity,
I don't foresee any issues. Although I haven't tried it myself.

What may introduce headaches is having different configuration options per
broker, or trying to assign more partitions to the newer brokers to use more
of their disk space.

Let's see if others notice anything I'm missing (again, I've never tried
this before). Hope this helps.

Alex

On Thu, Jun 9, 2016 at 10:27 AM, Kevin A  wrote:

> Hi there,
>
> I have a couple of Kafka brokers and thinking about adding a few more. The
> new broker machines would have a lot more storage available to them than
> the existing brokers. Am I setting myself up for operational headaches by
> deploying a heterogeneous (in terms of storage capacity) cluster?
>
> (Asked on IRC but thought I'd try here too.)
>
> Thanks!
> -Kevin
>


Re: Use Zookeeper for a Producer

2016-05-31 Thread Alex Loddengaard
Hi Igor, a change in the number of brokers generally doesn't require
configuration or code changes in producers and consumers. You will need to
change bootstrap.servers if its original value no longer contains an active
broker.
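
In practice that usually just means listing two or three brokers in
bootstrap.servers so the list stays usable even if one of them disappears; the
client discovers the rest of the cluster from whichever broker it reaches
first. A sketch (hostnames are made up):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;

public class BootstrapSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Any live broker in this list is enough to bootstrap; the full broker
        // list is then learned from the cluster metadata.
        props.put("bootstrap.servers", "broker1:9092,broker2:9092,broker3:9092");
        props.put("key.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        producer.close();
    }
}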

Alex

On Tue, May 31, 2016 at 12:44 PM, Igor Kravzov <igork.ine...@gmail.com>
wrote:

> What if number of brokers change? Does it mean I need to change
> configuration or potentially recompile my producer and consumer?
>
> On Tue, May 31, 2016 at 3:27 PM, Alex Loddengaard <a...@confluent.io>
> wrote:
>
> > The "old" consumer used ZooKeeper. The "new" consumer, introduced in 0.9,
> > doesn't use ZooKeeper. The producer doesn't use ZooKeeper, either.
> However,
> > brokers still use ZooKeeper.
> >
> > Alex
> >
> > On Tue, May 31, 2016 at 12:03 PM, Igor Kravzov <igork.ine...@gmail.com>
> > wrote:
> >
> > > When I look at code samples producers mostly write to brokers and
> > consumers
> > > use Zookeeper to consume from topics.
> > >
> > > Using Microsoft .net client (
> > > https://github.com/Microsoft/CSharpClient-for-Kafka)  I wrote a producer
> > > which uses Zookeeper and was able to write data successfully.
> > >
> > > Am I missing something if I use Zookeeper in producer?
> > >
> >
>


Re: Use Zookeeper for a Producer

2016-05-31 Thread Alex Loddengaard
The "old" consumer used ZooKeeper. The "new" consumer, introduced in 0.9,
doesn't use ZooKeeper. The producer doesn't use ZooKeeper, either. However,
brokers still use ZooKeeper.

Alex

On Tue, May 31, 2016 at 12:03 PM, Igor Kravzov 
wrote:

> When I look at code samples producers mostly write to brokers and consumers
> use Zookeeper to consume from topics.
>
> Using Microsoft .net client (
> https://github.com/Microsoft/CSharpClient-for-Kafka)  I wrote a producer
> which uses Zookeeper and was able to write data successfully.
>
> Am I missing something if I use Zookeeper in producer?
>


Re: Topics, partitions and keys

2016-05-31 Thread Alex Loddengaard
Hi Igor, see inline:

On Sat, May 28, 2016 at 8:14 AM, Igor Kravzov 
wrote:

> I need some clarification on subject.
> In Kafka documentations I found the following:
>
> Kafka only provides a total order over messages *within* a partition, not
> between different partitions in a topic. Per-partition ordering combined
> with the ability to partition data by key is sufficient for most
> applications. However, if you require a total order over messages this can
> be achieved with a topic that has only one partition, though this will mean
> only one consumer process per consumer group.
>
> So here are my questions:
> 1. Does it mean if i want to have more than 1 consumer (from the same
> group) reading from one topic I need to have more than 1 partition?
>

Yes.


>
> 2. Does it mean I need same amount of partitions as amount of consumers for
> the same group?
>

No. If you have more partitions than consumers, some consumers will consume
from more than one partition.


>
> 3. How many consumers can read from one partition?
>

Only one per consumer group (consumers in different groups can each read the
same partition).


>
> Also have some questions regarding relationship between keys and partitions
> with regard to API. I only looked at .net APIs (especially one from MS)
>  but it looks like they mimic the Java API.
> I see when using a producer to send a message to a topic there is a key
> parameter. But when consumer reads from a topic there is a partition
> number.
>
> 1. How are partitions numbered? Starting from 0 or 1?
> 2. What exactly relationship between a key and partition?
>

Using the default partitioner, and assuming you don't add new partitions,
all messages with the same key are guaranteed to land in the same partition.


> As I understand some function on key will determine a partition. is that
> correct?
> 3. If I have 2 partitions in a topic and want some particular messages go
> to one partition and other messages go to another I should use a specific
> key for one specific partition, and the rest for another?
>

If you always want one key to go to a specific partition, and another key to
go to a different partition, you can use a custom partitioner, or specify the
partition explicitly via the ProducerRecord constructor in the Java API:

http://kafka.apache.org/090/javadoc/org/apache/kafka/clients/producer/ProducerRecord.html
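
For example (a sketch, assuming a configured KafkaProducer<String, String>
named producer; the topic and keys are made up):

// Send to an explicit partition (partitions are numbered starting at 0):
producer.send(new ProducerRecord<String, String>("my-topic", 0, "key-a", "payload-1"));

// Let the default partitioner derive the partition from the key; the same key
// always maps to the same partition as long as the partition count is stable.
producer.send(new ProducerRecord<String, String>("my-topic", "key-b", "payload-2"));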


> 4. What if I have 3 partitions and one type of messages to one particular
> partition and the rest to other 2?
>

Same as above. Although, in most cases, all messages in a topic are the
same "type" of message. A topic is like a database table in this way.


> 5. How in general I send messages to a particular partition in order to
> know  for a consumer from where to read?
>

See above.


> Or I better off with multiple topics?
>

This will depend on your application.


>
> Thanks in advance.
>


Re: Kafka error: FATAL Fatal error during KafkaServerStartable startup.after enabling security

2016-05-31 Thread Alex Loddengaard
> ...org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to
> zookeeper server within timeout: 3
> at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:1232)
> at org.I0Itec.zkclient.ZkClient.(ZkClient.java:156)
> at org.I0Itec.zkclient.ZkClient.(ZkClient.java:130)
>
>
> Thanks,
> Kiran
>
> On Sat, May 28, 2016 at 4:15 AM, Alex Loddengaard <a...@confluent.io>
> wrote:
>
>> Hi Kiran,
>>
>> Can you attach your configuration files not in a .zip?
>>
>> Most likely your broker isn't using the correct hostname:port to connect
>> to
>> ZooKeeper. Although if you're using ZooKeeper SASL, you may have a SASL
>> misconfiguration. Set the `sun.security.krb5.debug` JVM property to `true`
>> to get SASL debug messages.
>>
>> Alex
>>
>> On Fri, May 27, 2016 at 5:38 AM, kiran kumar <kiran.cse...@gmail.com>
>> wrote:
>>
>> > Hello All,
>> >
>> > After enabling the security for the kafka brokers and zookeeper, I am
>> > unable to start the kafka brokers.
>> >
>> > I have attached all the required configuration files.
>> >
>> > Please let me know if you need more information.
>> >
>> > Below is the complete error:
>> >
>> > FATAL Fatal error during KafkaServer startup. Prepare to shutdown
>> > (kafka.server.KafkaServer)
>> > org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to
>> > zookeeper server within timeout: 3
>> > at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:1232)
>> > at org.I0Itec.zkclient.ZkClient.(ZkClient.java:156)
>> > at org.I0Itec.zkclient.ZkClient.(ZkClient.java:130)
>> > at kafka.utils.ZkUtils$.createZkClientAndConnection(ZkUtils.scala:75)
>> > at kafka.utils.ZkUtils$.apply(ZkUtils.scala:57)
>> > at kafka.server.KafkaServer.initZk(KafkaServer.scala:294)
>> > at kafka.server.KafkaServer.startup(KafkaServer.scala:180)
>> > at
>> kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:37)
>> > at kafka.Kafka$.main(Kafka.scala:67)
>> > at kafka.Kafka.main(Kafka.scala)
>> > [2016-05-27 07:55:36,450] INFO shutting down (kafka.server.KafkaServer)
>> > [2016-05-27 07:55:36,453] DEBUG Shutting down task scheduler.
>> > (kafka.utils.KafkaScheduler)
>> > [2016-05-27 07:55:36,456] INFO shut down completed
>> > (kafka.server.KafkaServer)
>> > [2016-05-27 07:55:36,458] FATAL Fatal error during KafkaServerStartable
>> > startup. Prepare to shutdown (kafka.server.KafkaServerStartable)
>> > org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to
>> > zookeeper server within timeout: 3
>> > at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:1232)
>> > at org.I0Itec.zkclient.ZkClient.(ZkClient.java:156)
>> > at org.I0Itec.zkclient.ZkClient.(ZkClient.java:130)
>> > at kafka.utils.ZkUtils$.createZkClientAndConnection(ZkUtils.scala:75)
>> > at kafka.utils.ZkUtils$.apply(ZkUtils.scala:57)
>> > at kafka.server.KafkaServer.initZk(KafkaServer.scala:294)
>> > at kafka.server.KafkaServer.startup(KafkaServer.scala:180)
>> > at
>> kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:37)
>> > at kafka.Kafka$.main(Kafka.scala:67)
>> > at kafka.Kafka.main(Kafka.scala)
>> >
>> > --
>> > G.Kiran Kumar
>> >
>>
>
>
>
> --
> G.Kiran Kumar
>


Re: Kafka error: FATAL Fatal error during KafkaServerStartable startup.after enabling security

2016-05-27 Thread Alex Loddengaard
Hi Kiran,

Can you attach your configuration files not in a .zip?

Most likely your broker isn't using the correct hostname:port to connect to
ZooKeeper. Although if you're using ZooKeeper SASL, you may have a SASL
misconfiguration. Set the `sun.security.krb5.debug` JVM property to `true`
to get SASL debug messages.

Alex

On Fri, May 27, 2016 at 5:38 AM, kiran kumar  wrote:

> Hello All,
>
> After enabling the security for the kafka brokers and zookeeper, I am
> unable to start the kafka brokers.
>
> I have attached all the required configuration files.
>
> Please let me know if you need more information.
>
> Below is the complete error:
>
> FATAL Fatal error during KafkaServer startup. Prepare to shutdown
> (kafka.server.KafkaServer)
> org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to
> zookeeper server within timeout: 3
> at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:1232)
> at org.I0Itec.zkclient.ZkClient.(ZkClient.java:156)
> at org.I0Itec.zkclient.ZkClient.(ZkClient.java:130)
> at kafka.utils.ZkUtils$.createZkClientAndConnection(ZkUtils.scala:75)
> at kafka.utils.ZkUtils$.apply(ZkUtils.scala:57)
> at kafka.server.KafkaServer.initZk(KafkaServer.scala:294)
> at kafka.server.KafkaServer.startup(KafkaServer.scala:180)
> at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:37)
> at kafka.Kafka$.main(Kafka.scala:67)
> at kafka.Kafka.main(Kafka.scala)
> [2016-05-27 07:55:36,450] INFO shutting down (kafka.server.KafkaServer)
> [2016-05-27 07:55:36,453] DEBUG Shutting down task scheduler.
> (kafka.utils.KafkaScheduler)
> [2016-05-27 07:55:36,456] INFO shut down completed
> (kafka.server.KafkaServer)
> [2016-05-27 07:55:36,458] FATAL Fatal error during KafkaServerStartable
> startup. Prepare to shutdown (kafka.server.KafkaServerStartable)
> org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to
> zookeeper server within timeout: 3
> at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:1232)
> at org.I0Itec.zkclient.ZkClient.(ZkClient.java:156)
> at org.I0Itec.zkclient.ZkClient.(ZkClient.java:130)
> at kafka.utils.ZkUtils$.createZkClientAndConnection(ZkUtils.scala:75)
> at kafka.utils.ZkUtils$.apply(ZkUtils.scala:57)
> at kafka.server.KafkaServer.initZk(KafkaServer.scala:294)
> at kafka.server.KafkaServer.startup(KafkaServer.scala:180)
> at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:37)
> at kafka.Kafka$.main(Kafka.scala:67)
> at kafka.Kafka.main(Kafka.scala)
>
> --
> G.Kiran Kumar
>


Re: Separating internal and external traffic

2016-05-25 Thread Alex Loddengaard
Hi there,

You can use the `listeners` config to tell Kafka which interfaces to listen
on. The `listeners` config also supports setting the port and protocol. You
may also want to set `advertised.listeners` if the `listeners` hostnames or
IPs aren't reachable by your clients.
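
For example, with the two interfaces you describe, a server.properties sketch
might look like the following (ports and the choice of security protocols are
assumptions; as far as I know, on current releases each listener must use a
distinct security protocol, so separating traffic across two listeners
typically means something like PLAINTEXT internally and SSL externally):

# server.properties -- illustrative only
listeners=PLAINTEXT://192.168.x.y:9092,SSL://10.x.y.z:9093
# Only needed if the addresses above aren't what clients should actually use
advertised.listeners=PLAINTEXT://192.168.x.y:9092,SSL://10.x.y.z:9093
# Keep replication traffic on the private listener
security.inter.broker.protocol=PLAINTEXT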

Alex

On Wed, May 25, 2016 at 11:41 AM, D C  wrote:

> I'm sure i can do this but I'm just not stumbling on the right
> documentation anywhere.  I have a handful of kafka servers that I am trying
> to get ready for production. I'm trying to separate the internal and external
> network traffic, but I don't see how to do it.
>
> Each host has two addresses.
> 10.x.y.z = default interface
> 192.168.x.y = private network seen only by the kafka nodes.
>
> How can I tell kafka to make use of this?
>


Re: Best monitoring tool for Kafka in production

2016-05-25 Thread Alex Loddengaard
Hi Hafsa,

We often see Grafana and Graphite, which are both free. Keep in mind you
should monitor the system's metrics and Kafka's JMX metrics.

Alex

On Wed, May 25, 2016 at 3:42 AM, Hafsa Asif 
wrote:

> Hello,
>
> What is the best monitoring tool for Kafka in production, preferable free
> tool? If there is no free tool, then please mention non-free efficient
> monitoring tools also.
>
> We are having a lot of problems without a monitoring tool. Sometimes brokers
> go down or a consumer stops working, and we are not informed.
>
> Best Regards,
> Hafsa
>


Re: ISR shrinking/expanding problem

2016-05-16 Thread Alex Loddengaard
Hi Russ,

They should eventually catch back up and rejoin the ISR. Did they not?

Alex

On Fri, May 13, 2016 at 6:33 PM, Russ Lavoie  wrote:

> Hello,
>
> I moved an entire topic from one set of brokers to another set of brokers.
> The network throughput was so high, that they fell behind the leaders and
> dropped out of the ISR set.  How can I recover from this?
>
> Thanks!
>


Re: New broker as follower for a topic replication

2016-05-10 Thread Alex Loddengaard
Hi Paolo,

The best way to do this would be to have broker3 start up with the same
broker id as the failed broker2. broker3 will then rejoin the cluster,
begin catching up with broker1, and eventually rejoin the ISR. If it starts
up with a new broker id, you'll need to run the partition reassignment tool
to manually assign partition replicas to the new broker.

Alex

On Mon, May 9, 2016 at 9:03 AM, Paolo Patierno  wrote:

> Hello,
>
> in order to try running Kafka in Kubernetes I'm facing the following
> problem ...
>
> Imagine that I start a cluster with a zookeeper instance and two kafka
> broker. I create a topic with one partition but replication factor 2 : I
> have broker1 as leader and broker2 as follower.
> All works great !
>
> Imagine that the pod where the broker2 is running stops to work.
> Kubernetes starts a new pod (let me name it broker3) but with a different
> IP.
> What's the right way to make the new broker instance as a follower for the
> broker1 in order to have it "in sync" and guarantee the replication factor
> 2 for the topic ?
>
> Thanks,
> Paolo.
>
> Paolo Patierno - Senior Software Engineer (IoT) @ Red Hat
> Microsoft MVP on Windows Embedded & IoT - Microsoft Azure Advisor
> Twitter : @ppatierno
> Linkedin : paolopatierno
> Blog : DevExperience


Re: Consumer stopped after reading some messages

2016-05-10 Thread Alex Loddengaard
Hi Sahitya,

I wonder if your consumers are experiencing soft failures because they're
busy processing a large collection of messages and not calling poll()
within session.timeout.ms? In this scenario, the group coordinator (a
broker) would not receive a heartbeat within session.timeout.ms and would
consider the consumer failed. The coordinator would then reassign the
"failed" consumer's partitions to other consumers in the same group. If all
consumers are experiencing soft failures, you may observe them all
"freezing" their consumption. I suggest checking the logs to see if your
consumer group is being rebalanced frequently.

If you are hitting the issue I've explained, check out this KIP:

https://cwiki.apache.org/confluence/display/KAFKA/KIP-41%3A+KafkaConsumer+Max+Records

Or consider increasing session.timeout.ms.
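
To make that concrete, here's a rough sketch of the knobs involved on the
consumer side (topic, group id, and values are illustrative; max.poll.records
only exists on clients that include KIP-41, and the broker's
group.max.session.timeout.ms caps how high session.timeout.ms can go):

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class SlowProcessingConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "my-group");
        props.put("key.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
        // With the 0.9 consumer, heartbeats are only sent from poll(), so this
        // is effectively the longest you can spend processing one batch before
        // the coordinator considers the consumer dead and rebalances. Default
        // is 30000; the broker must allow larger values if you raise it.
        props.put("session.timeout.ms", "30000");
        // 0.10+ only (KIP-41): cap how much work each poll() returns.
        // props.put("max.poll.records", "100");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList("my-topic"));
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(1000);
            for (ConsumerRecord<String, String> record : records) {
                // Keep per-batch processing well under session.timeout.ms.
                System.out.println(record.value());
            }
        }
    }
}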

Hope this helps.

Alex

On Thu, May 5, 2016 at 8:22 AM, John Bickerstaff 
wrote:

> This may or may not help.  I found it to be a clever workaround for some of
> the limitations in the 8.x version of the high level consumer...  I ended
> up writing code that "waited" a lot because I couldn't be sure how quickly
> Kafka would respond...
>
> Nothing ever took minutes however...  the waits were 30 seconds or so if I
> recall...
>
> In case it helps...
>
> http://ingest.tips/2014/10/12/kafka-high-level-consumer-frequently-missing-
> pieces/
> 
>
> On Thu, May 5, 2016 at 4:58 AM, sahitya agrawal 
> wrote:
>
> > I am using high level consumer API ( Kafka API version 0.9.0.0 )
> >
> > I am running consumers on a topic of 10 partitions. There are lot of
> unread
> > messages in that topic. Initially all of them , consume from the topic
> and
> > read messages. After sometime, all of them hangs and doesn't read any
> > message at all.
> > I have to manually restart them to make them consume again.
> > Surprisingly I do not see any exception or error in logs also.
> >
> > Has anybody ever faced this issue?
> >
>


Re: Backing up Kafka data and using it later?

2016-05-10 Thread Alex Loddengaard
You may find this interesting, although I don't believe it's exactly what
you're looking for:

https://github.com/pinterest/secor

I'm not sure how stable and commonly used it is.

Additionally, I see a lot of users use MirrorMaker for a "backup," where
MirrorMaker copies all topics from one Kafka cluster to another "backup"
cluster. I put "backup" in quotes because this architecture doesn't support
snapshotting like a traditional backup would. I realize this doesn't
address your specific use case, but thought you may find it interesting
regardless.

Sorry I'm a little late to the thread, too.

Alex

On Thu, May 5, 2016 at 7:05 AM, Rad Gruchalski  wrote:

> John,
>
> I’m not an expert in Kafka but I would assume so.
>
> Best regards,
> Radek Gruchalski
> ra...@gruchalski.com
> de.linkedin.com/in/radgruchalski/
>
> Confidentiality:
> This communication is intended for the above-named person and may be
> confidential and/or legally privileged.
> If it has come to you in error you must take no action based on it, nor
> must you copy or show it to anyone; please delete/destroy and inform the
> sender immediately.
>
>
>
> On Thursday, 5 May 2016 at 01:46, John Bickerstaff wrote:
>
> > Thanks - does that mean that the only way to safely back up Kafka is to
> > have replication?
> >
> > (I have done this partially - I can get the entire topic on the command
> > line, after completely recreating the server, but my code that is
> intended
> > to do the same thing just hangs)
> >
> > On Wed, May 4, 2016 at 3:18 PM, Rad Gruchalski <ra...@gruchalski.com> wrote:
> >
> > > John,
> > >
> > > I believe you mean something along the lines of:
> > > http://markmail.org/message/f7xb5okr3ujkplk4
> > > I don’t think something like this has been done.
> > >
> > > Best regards,
> > > Radek Gruchalski
> > > ra...@gruchalski.com
> > > de.linkedin.com/in/radgruchalski/
> > >
> > > Confidentiality:
> > > This communication is intended for the above-named person and may be
> > > confidential and/or legally privileged.
> > > If it has come to you in error you must take no action based on it, nor
> > > must you copy or show it to anyone; please delete/destroy and inform
> the
> > > sender immediately.
> > >
> > >
> > >
> > > On Wednesday, 4 May 2016 at 23:04, John Bickerstaff wrote:
> > >
> > > > Hi,
> > > >
> > > > I have what is probably an edge use case. I'd like to back up a
> single
> > > > Kafka instance such that I can recreate a new server, drop Kafka in,
> drop
> > > > the data in, start Kafka -- and have all my data ready to go again
> for
> > > > consumers.
> > > >
> > > > Is such a thing done? Does anyone have any experience trying this?
> > > >
> > > > I have, and I've run into some problems which suggest there's a
> setting
> > > or
> > > > some other thing I'm unaware of...
> > > >
> > > > If you like, don't think of it as a backup problem so much as a
> "cloning"
> > > > problem. I want to clone a new Kafka machine without actually
> cloning it
> > > >
> > >
> > > -
> > > > I.E. the data is somewhere else (log and index files) although
> Zookeeper
> > >
> > > is
> > > > up and running just fine.
> > > >
> > > > Thanks
>
>


Re: Multiple topics to one consumer

2016-03-08 Thread Alex Loddengaard
Hi there,

One consumer can indeed consume from multiple topics (and multiple
partitions). For example, see here:

http://kafka.apache.org/090/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html#subscribe(java.util.List)

Then, in your poll() loop, you can get the topic and partition from the
returned ConsumerRecord.
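
For example, a small sketch (the topic names and group id are made up):

import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class MultiTopicConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "multi-topic-group");
        props.put("key.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        // One consumer, several topics.
        consumer.subscribe(Arrays.asList("topic-a", "topic-b"));
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(1000);
            for (ConsumerRecord<String, String> record : records) {
                // Each record tells you which topic/partition it came from.
                System.out.printf("%s-%d: %s%n",
                    record.topic(), record.partition(), record.value());
            }
        }
    }
}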

I hope this helps.

Alex

On Tue, Mar 8, 2016 at 1:12 AM, 복영빈  wrote:

> Hi!
> I am quite new to Kafka and pub-sub systems.
>
> I am trying to make a single consumer to consume messages from multiple
> topics.
>
> Although I understand that one topic can be sent to multiple consumers via
> partitions,
>
> I could not find the part in the documentation that specifies that a single
> consumer can consume data from multiple topics.
>
> My understanding is that one consumer holds one thread until it is closed,
> by looping indefinitely on while(it.hasNext())).
>
> So I assumed that to read from multiple topics, I would have to create
> multiple consumers that each read from a single topic.
>
> Am I right in that a single consumer cannot read from multiple topics?
>
> If not, how would I achieve this?
>
> Thanks in advance!
>


Re: Enforce Unique consumer group id in Kafka

2016-02-26 Thread Alex Loddengaard
Hi Nishant,

You could use SASL authentication and authorization (ACLs) to control
access to topics. In your use case, you would require authentication and
control which principals have access to which consumer groups. These
features are available in 0.9 but not 0.8. Here are some resources:

http://docs.confluent.io/2.0.1/kafka/security.html

http://www.confluent.io/blog/apache-kafka-security-authorization-authentication-encryption

I hope this helps.

Alex

On Fri, Feb 26, 2016 at 1:44 AM, NISHANT BULCHANDANI  wrote:

> We are making a Kafka Queue into which messages are being published from
> source system. Now multiple consumers can connect to this queue to read
> messages.
>
> While doing this the consumers have to specify a groupId based on which the
> messages are distributed , if two apps have the same groupId both of them
> won't get the messages.
>
> Is there a way I can enforce every app to have a unique consumer group id?
>


Re: Creating new consumers after data has been discarded

2016-02-24 Thread Alex Loddengaard
Hi Ted,

The largest a partition can get is the size of the disk storing the
partition's log on the broker. You can use RAID to increase the "disk"
size, and hence the partition size. Whether or not you can fit all messages
in Kafka depends on your use case -- how much data you have, the size of
your disks, number of partitions, etc.

If you can't fit all messages on disk, consider using a compacted topic.
Most use cases I see that use Kafka as a source of truth are based on
compacted topics. Learn more about compaction here:

http://kafka.apache.org/documentation.html#compaction

Alex

On Wed, Feb 24, 2016 at 12:47 AM, Gerard Klijs 
wrote:

> Hi Ted,
>
> Maybe it's useful to take a look at Samza, http://samza.apache.org/ -- they
> use Kafka in a way which sounds similar to how you want to use it. As I
> recall from a YouTube conference talk, the creator of Samza also mentioned to
> never delete the events. These things are of course very dependent on your
> use case; some events aren't worth keeping around for long.
>
> On Wed, Feb 24, 2016 at 9:08 AM Ted Swerve  wrote:
>
> > Hello,
> >
> > One of the big attractions of Kafka for me was the ability to write new
> > consumers of topics that would then be able to connect to a topic and
> > replay all the previous events.
> >
> > However, most of the time, Kafka appears to be used with a retention
> period
> > - presumably in such cases, the events have been warehoused into HDFS
> > or something similar.
> >
> > So my question is - how do people typically approach the scenario where a
> > new piece of code needs to process all events in a topic from "day one",
> > but has to source some of them from e.g HDFS and then connect to the
> > real-time Kafka topic?  Are there any wrinkles with such an approach?
> >
> > Thanks,
> > Ted
> >
>


Re: Does kafka.common.QueueFullException indicate back pressure in Kafka?

2016-02-22 Thread Alex Loddengaard
Hi John,

I'm glad the info was helpful.

It's hard to diagnose this issue without monitoring. I suggest setting up
graphite to graph JMX metrics. There's a good (not designed for production)
script here (as part of a Vagrant VM):
https://github.com/gwenshap/ops_training_vm/blob/master/bootstrap.sh

Once monitoring is setup, see where the brokers spend their time servicing
a request. More producers will put more load on the brokers and, depending
on configuration, can cause requests to be ACK'd slower, which would cause
the producers to buffer more. Again, this is all very configuration
specific so it's hard to give you specific advice.

Alex


On Sat, Feb 20, 2016 at 2:59 AM, John Yost <hokiege...@gmail.com> wrote:

> Hi Alex,
>
> Excellent information, thanks! I very much appreciate your time.  BTW,
> Kafka is an EXCELLENT product.
>
> It seems like my situation may be a bit of an edge case, based upon your
> response. Specifically, when I added more producers (in the case of Storm,
> a Kafka producer is a KafkaBolt), that is when the QueueFullExceptions were
> thrown. In one instance, I went from 30 KafkaBolts (producers) to 60, and,
> after about 45 minutes or so, I started seeing QueueFullExceptions.  This
> is why I am wondering if these exceptions can also be a symptom of back
> pressure from Kafka.
>
> Is this plausible?
>
> --John
>
>
>
> I have tuned the producers
>
> On Thu, Feb 18, 2016 at 3:59 PM, Alex Loddengaard <a...@confluent.io>
> wrote:
>
> > Hi John,
> >
> > I should preface this by saying I've never used Storm and KafkaBolt and
> am
> > not a streaming expert.
> >
> > However, if you're running out of buffer in the producer (as is what's
> > happening in the other thread you referenced), you can possibly alleviate
> > this by adding more producers, or by tuning the producers. Tuning the
> > brokers or adding more brokers may help as well, but it's hard to say for
> > sure without looking at your monitors and knowing more about the use case
> > and cluster.
> >
> > I suggest setting up monitoring and looking deeply at the JMX metrics
> that
> > are created to understand where each message spends most of its time
> > (producer, broker, consumer to start, then request queues, io threads,
> > etc). The docs go through each JMX metric relevant here. Then from there
> > you can start understanding how to alleviate the problem.
> >
> > Feel free to share metrics and more information and we can explore them
> > together.
> >
> > Alex
> >
> > On Thu, Feb 18, 2016 at 5:18 AM, John Yost <hokiege...@gmail.com> wrote:
> >
> > > Hi Everyone,
> > >
> > > I am encountering this exception similar to Saurabh's report earlier
> > today
> > > as I try to scale up a Storm -> Kafka output via the KafkaBolt (i.e.,
> add
> > > more KafkaBolt executors).
> > >
> > > Question...does this necessarily indicate back pressure from Kafka
> where
> > > the Kafka writes cannot keep up with the incoming messages sent over by
> > > Storm? If so, do I add brokers to the cluster, do I add more topics, a
> > > combo thereof or something else?
> > >
> > > As always, any thoughts from people who know more than I do are
> > > appreciated. :)
> > >
> > > Thanks
> > >
> > > --John
> > >
> >
> >
> >
> > --
> > *Alex Loddengaard | **Solutions Architect | Confluent*
> > *Download Apache Kafka and Confluent Platform: www.confluent.io/download
> > <http://www.confluent.io/download>*
> >
>



-- 
*Alex Loddengaard | **Solutions Architect | Confluent*
*Download Apache Kafka and Confluent Platform: www.confluent.io/download
<http://www.confluent.io/download>*


Re: Does kafka.common.QueueFullException indicate back pressure in Kafka?

2016-02-18 Thread Alex Loddengaard
Hi John,

I should preface this by saying I've never used Storm and KafkaBolt and am
not a streaming expert.

However, if you're running out of buffer in the producer (as is what's
happening in the other thread you referenced), you can possibly alleviate
this by adding more producers, or by tuning the producers. Tuning the
brokers or adding more brokers may help as well, but it's hard to say for
sure without looking at your monitors and knowing more about the use case
and cluster.

I suggest setting up monitoring and looking deeply at the JMX metrics that
are created to understand where each message spends most of its time
(producer, broker, consumer to start, then request queues, io threads,
etc). The docs go through each JMX metric relevant here. Then from there
you can start understanding how to alleviate the problem.

Feel free to share metrics and more information and we can explore them
together.

Alex

On Thu, Feb 18, 2016 at 5:18 AM, John Yost <hokiege...@gmail.com> wrote:

> Hi Everyone,
>
> I am encountering this exception similar to Saurabh's report earlier today
> as I try to scale up a Storm -> Kafka output via the KafkaBolt (i.e., add
> more KafkaBolt executors).
>
> Question...does this necessarily indicate back pressure from Kafka where
> the Kafka writes cannot keep up with the incoming messages sent over by
> Storm? If so, do I add brokers to the cluster, do I add more topics, a
> combo thereof or something else?
>
> As always, any thoughts from people who know more than I do are
> appreciated. :)
>
> Thanks
>
> --John
>



-- 
*Alex Loddengaard | **Solutions Architect | Confluent*
*Download Apache Kafka and Confluent Platform: www.confluent.io/download
<http://www.confluent.io/download>*


Re: kafka.common.QueueFullException

2016-02-18 Thread Alex Loddengaard
Hi Saurabh,

This is occurring because the produce message queue is full when a produce
request is made. The size of the queue is configured
via queue.buffering.max.messages. You can experiment with increasing this
(which will require more JVM heap space), or fiddling with
queue.enqueue.timeout.ms to control the blocking behavior when the queue is
full. Both of these configuration options are explained here:

https://kafka.apache.org/08/configuration.html
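
For example, a sketch of how those settings are passed to the old (0.8-era,
Scala) producer that throws QueueFullException -- the broker list and values
here are illustrative:

import java.util.Properties;
import kafka.javaapi.producer.Producer;
import kafka.producer.ProducerConfig;

public class OldProducerQueueSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("metadata.broker.list", "broker1:9092,broker2:9092");
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        props.put("producer.type", "async");
        // Allow more unsent messages to buffer before the queue is considered
        // full (default 10000); this needs correspondingly more heap.
        props.put("queue.buffering.max.messages", "50000");
        // -1 blocks the caller until space frees up, 0 fails immediately, and a
        // positive value blocks for that many milliseconds before giving up.
        props.put("queue.enqueue.timeout.ms", "-1");

        Producer<String, String> producer = new Producer<>(new ProducerConfig(props));
        producer.close();
    }
}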

I didn't quite follow your last paragraph, so I'm not sure if the following
advice is applicable to you or not. You may also experiment with adding
more producers (either on the same or different machines).

I hope this helps.

Alex

On Thu, Feb 18, 2016 at 2:12 AM, Saurabh Kumar <saurabh...@gmail.com> wrote:

> Hi,
>
> We have a Kafka server deployment shared between multiple teams and I have
> created a topic with multiple partitions on it for pushing some JSON data.
>
> We have multiple such Kafka producers running from different machines which
> produce/push data to a Kafka topic. A lot of times i see the following
> exception in the logs : "*Event queue is full of unsent messages, could not
> send event"*
>
> Any idea how to solve this ? We can not synchronise the volume or timing of
> different Kafka producers across machines and between multiple processes.
> There is a limit on maximum number of concurrent processes (kafka producer)
>  that can run on a mchine but it is only going to increase with time as we
> scale. What is the right way to solve this problem ?
>
> Thanks,
> Saurabh
>



-- 
*Alex Loddengaard | **Solutions Architect | Confluent*
*Download Apache Kafka and Confluent Platform: www.confluent.io/download
<http://www.confluent.io/download>*


Re: Replication Factor and number of brokers

2016-02-17 Thread Alex Loddengaard
Thanks, Alexis and Ben!

Alex

On Wed, Feb 17, 2016 at 5:57 AM, Ben Stopford <b...@confluent.io> wrote:

> If you create a topic with more replicas than brokers it should throw an
> error but if you lose a broker you'd have under replicated partitions.
>
> B
>
> On Tuesday, 16 February 2016, Alex Loddengaard <a...@confluent.io> wrote:
>
> > Hi Sean, you'll want equal or more brokers than your replication factor.
> > Meaning, if your replication factor is 3, you'll want 3 or more brokers.
> >
> > I'm not sure what Kafka will do if you have fewer brokers than your
> > replication factor. It will either give you the highest replication
> factor
> > it can (in this case, the number of brokers), or it will put more than
> one
> > replica on some brokers. My guess is the former, but again, I'm not sure.
> >
> > Hope this helps.
> >
> > Alex
> >
> > On Tue, Feb 16, 2016 at 7:47 AM, Damian Guy <damian@gmail.com
> > <javascript:;>> wrote:
> >
> > > Then you'll have under-replicated partitions. However, even if you
> have 3
> > > brokers with a replication factor of 2 and you lose a single broker
> > you'll
> > > still likely have under-replicated partitions.
> > > Partitions are assigned to brokers, 1 broker will be the leader and n
> > > brokers will be followers. If any of the brokers with replicas of the
> > > partition on it crash then you'll have under-replicated partitions.
> > >
> > >
> > > On 16 February 2016 at 14:45, Sean Morris (semorris) <
> semor...@cisco.com
> > <javascript:;>>
> > > wrote:
> > >
> > > > So if I have a replication factor of 2, but only 2 brokers, then
> > > > replication works, but what if I lose one broker?
> > > >
> > > > Thanks,
> > > > Sean
> > > >
> > > > On 2/16/16, 9:14 AM, "Damian Guy" <damian@gmail.com
> <javascript:;>>
> > wrote:
> > > >
> > > > >Hi,
> > > > >
> > > > >You need to have at least replication factor brokers.
> > > > >replication factor  = 1 is no replication.
> > > > >
> > > > >HTH,
> > > > >Damian
> > > > >
> > > > >On 16 February 2016 at 14:08, Sean Morris (semorris) <
> > > semor...@cisco.com <javascript:;>>
> > > > >wrote:
> > > > >
> > > > >> Should your number of brokers be at least one more than your
> > > replication
> > > > >> factor of your topic(s)?
> > > > >>
> > > > >> So if I have a replication factor of 2, I need at least 3 brokers?
> > > > >>
> > > > >> Thanks,
> > > > >> Sean
> > > > >>
> > > > >>
> > > >
> > > >
> > >
> >
> >
> >
> > --
> > *Alex Loddengaard | **Solutions Architect | Confluent*
> > *Download Apache Kafka and Confluent Platform: www.confluent.io/download
> > <http://www.confluent.io/download>*
> >
>



-- 
*Alex Loddengaard | **Solutions Architect | Confluent*
*Download Apache Kafka and Confluent Platform: www.confluent.io/download
<http://www.confluent.io/download>*


Re: Consumer seek on 0.9.0 API

2016-02-17 Thread Alex Loddengaard
Hi Robin,

I believe seek() needs to be called after the consumer gets its partition
assignments. Try calling poll() before you call seek(), then poll() again
and process the records from the latter poll().
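
Something like this sketch, for example (topic name and offset taken from your
message; group id is made up and error handling is omitted):

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class SeekExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "seek-example");
        props.put("key.deserializer",
            "org.apache.kafka.common.serialization.IntegerDeserializer");
        props.put("value.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<Integer, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList("topic1"));

        // The first poll() joins the group and receives partition assignments;
        // seek() only works once the partition is actually assigned.
        consumer.poll(0);
        consumer.seek(new TopicPartition("topic1", 0), 500);

        ConsumerRecords<Integer, String> records = consumer.poll(1000);
        for (ConsumerRecord<Integer, String> record : records) {
            System.out.println(record.offset() + ": " + record.value());
        }
        consumer.close();
    }
}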

There may be a better way to do this -- let's see if anyone else has a
suggestion.

Alex

On Wed, Feb 17, 2016 at 9:13 AM, Péricé Robin <perice.ro...@gmail.com>
wrote:

> Hi,
>
> I'm trying to use the new Consumer API with this example :
>
> https://github.com/apache/kafka/tree/trunk/examples/src/main/java/kafka/examples
>
> With a Producer I sent 1000 messages to my Kafka broker. I need to know if
> it's possible, for example, to read message from offset 500 to 1000.
>
> What I did :
>
>
>- consumer.seek(new TopicPartition("topic1", 0), 500);
>
>
>- final ConsumerRecords<Integer, String> records =
>consumer.poll(1000);
>
>
> But this did nothing (when I don't use the seek() method I consume all the
> messages without any problems).
>
> Any help on this will be greatly appreciated !
>
> Regards,
>
> Robin
>



-- 
*Alex Loddengaard | **Solutions Architect | Confluent*
*Download Apache Kafka and Confluent Platform: www.confluent.io/download
<http://www.confluent.io/download>*


Re: Replication Factor and number of brokers

2016-02-16 Thread Alex Loddengaard
Hi Sean, you'll want equal or more brokers than your replication factor.
Meaning, if your replication factor is 3, you'll want 3 or more brokers.

I'm not sure what Kafka will do if you have fewer brokers than your
replication factor. It will either give you the highest replication factor
it can (in this case, the number of brokers), or it will put more than one
replica on some brokers. My guess is the former, but again, I'm not sure.

Hope this helps.

Alex

On Tue, Feb 16, 2016 at 7:47 AM, Damian Guy <damian@gmail.com> wrote:

> Then you'll have under-replicated partitions. However, even if you have 3
> brokers with a replication factor of 2 and you lose a single broker you'll
> still likely have under-replicated partitions.
> Partitions are assigned to brokers, 1 broker will be the leader and n
> brokers will be followers. If any of the brokers with replicas of the
> partition on it crash then you'll have under-replicated partitions.
>
>
> On 16 February 2016 at 14:45, Sean Morris (semorris) <semor...@cisco.com>
> wrote:
>
> > So if I have a replication factor of 2, but only 2 brokers, then
> > replication works, but what if I lose one broker?
> >
> > Thanks,
> > Sean
> >
> > On 2/16/16, 9:14 AM, "Damian Guy" <damian@gmail.com> wrote:
> >
> > >Hi,
> > >
> > >You need to have at least replication factor brokers.
> > >replication factor  = 1 is no replication.
> > >
> > >HTH,
> > >Damian
> > >
> > >On 16 February 2016 at 14:08, Sean Morris (semorris) <
> semor...@cisco.com>
> > >wrote:
> > >
> > >> Should your number of brokers be at least one more than your
> replication
> > >> factor of your topic(s)?
> > >>
> > >> So if I have a replication factor of 2, I need at least 3 brokers?
> > >>
> > >> Thanks,
> > >> Sean
> > >>
> > >>
> >
> >
>



-- 
*Alex Loddengaard | **Solutions Architect | Confluent*
*Download Apache Kafka and Confluent Platform: www.confluent.io/download
<http://www.confluent.io/download>*


Re: Encryption on disk

2016-01-15 Thread Alex Loddengaard
Have you considered encrypting at the broker filesystem level, perhaps with
something like LUKS?

Alex

On Fri, Jan 15, 2016 at 8:38 AM, Jim Hoagland <jim_hoagl...@symantec.com>
wrote:

> We did not look at compression and did not use it.  You'll probably get
> the best compression while having encryption by building a batch of
> messages, compressing that, then encrypting the compressed batch.
>
> Compressing across the batch will probably almost certainly be better
> space-wise than compressing each message separately because there are
> likely to be similarities between the messages and a good compression
> algorithm will pick up on that make the message smaller.  Even small
> similarities such as it containing a lot of ASCII can be picked up.
>
> To defy cryptanalysis, a good encryption algorithm will make the encrypted
> message appear random.  Random data will not really compress.  If it is
> reliably compressing after encryption, then your encryption is not as
> secure as it should be.  Also discussed here:
> http://security.stackexchange.com/a/19970.
>
> -- Jim
>
> On 1/15/16, 6:39 AM, "Bruno Rassaerts" <bruno.rassae...@novazone.be>
> wrote:
>
> >Thanks for the input Jim.
> >
> >We managed to reduce the encryption impact to about 25% by disabling the
> >kafka batch compression and compressing the messages ourselves before
> >encrypting them one-by-one. However we still believe we could improve by
> >batch compressing + batch encrypting.
> >
> >Can you confirm that in your tests batch compression was disabled ?
> >
> >Thanks,
> >Bruno
> >
> >
> >> On 14 Jan 2016, at 23:47, Jim Hoagland <jim_hoagl...@symantec.com>
> >>wrote:
> >>
> >> We did a proof of concept on end-to-end encryption using an approach
> >>which
> >> sounds similar to what you describe.  We blogged about it here:
> >>
> >>
> >>
> http://www.symantec.com/connect/blogs/end-end-encryption-though-kafka-our
> >>-p
> >> roof-concept
> >>
> >> You might want to review what is there to see how it differs from what
> >>you
> >> did.  In our tests, the encryption didn't add as much overhead as we
> >> thought it would.
> >>
> >> -- Jim
> >>
> >> --
> >> Jim Hoagland, Ph.D.
> >> Sr. Principal Software Engineer
> >> Big Data Analytics Team
> >> Cloud Platform Engineering
> >>
> >>
> >>
> >> On 1/14/16, 2:23 PM, "Bruno Rassaerts" <bruno.rassae...@novazone.be>
> >>wrote:
> >>
> >>> Hello,
> >>>
> >>> In our project we have a very strong requirement to protect all data,
> >>>all
> >>> the time. Even when the data is “in-rest” on disk, it needs to be
> >>> protected.
> >>> We’ve been trying to figure out how to this with Kafka, and hit some
> >>> obstacles.
> >>>
> >>> One thing we’ve tried to do is to encrypt every message we hand over to
> >>> kafka. This results in the encrypted messages being written to disk on
> >>> the brokers.
> >>> However, the performance of performing encryption has serious
> >>>performance
> >>> implications, due to the CPU intensive operation which encryption is,
> >>>and
> >>> the fact that batch compression offered by Kafka is not nearly as
> >>> efficient anymore after encrypting the data. Doing this message by
> >>> message encryption gives us a performance penalty of about 75%, even if
> >>> we compress the messages before encryption.
> >>>
> >>> What we are looking for is a way to plugin our encryption in two
> >>>possible
> >>> locations:
> >>>
> >>> 1. As a custom compression algorithm, which would batch compress, and
> >>> batch encrypt. And get the files stored as such.
> >>> 2. As a encryption plugin specifically designed for storing the kafka
> >>> broker files.
> >>>
> >>> Is there any way that this can be done using Kafka (0.9), or can
> >>>somebody
> >>> point us to the place were we could add this in the Kafka codebase.
> >>>
> >>> Thanks,
> >>> Bruno Rassaerts
> >>
> >
>
>


-- 
*Alex Loddengaard | **Solutions Architect | Confluent*
*Download Apache Kafka and Confluent Platform: www.confluent.io/download
<http://www.confluent.io/download>*


Re: Kafka Connect usage

2016-01-12 Thread Alex Loddengaard
Hi Shiti, I'm not able to reproduce the problem with the default
*.properties files you're passing to connect-standalone.sh. Can you share
these three files?

Thanks,

Alex

On Mon, Jan 11, 2016 at 10:14 PM, Shiti Saxena <ssaxena@gmail.com>
wrote:

> Hi,
>
> I tried executing the following,
>
> bin/connect-standalone.sh config/connect-standalone.properties
> config/connect-file-source.properties
> config/connect-console-sink.properties
>
> I created a file text.txt in kafka directory but get the error,
>
> ERROR Failed to flush WorkerSourceTask{id=local-file-source-0}, timed out
> while waiting for producer to flush outstanding messages, 1 left
> ({ProducerRecord(topic=connect-test, partition=null, key=[B@71bb594d,
> value=[B@7f7ca90a=ProducerRecord(topic=connect-test, partition=null,
> key=[B@71bb594d, value=[B@7f7ca90a})
> (org.apache.kafka.connect.runtime.WorkerSourceTask:237)
> [2016-01-12 11:43:51,948] ERROR Failed to commit offsets for
> WorkerSourceTask{id=local-file-source-0}
> (org.apache.kafka.connect.runtime.SourceTaskOffsetCommitter:112)
>
> Is any other configuration required?
>
> Thanks,
> Shiti
>



-- 
*Alex Loddengaard | **Solutions Architect | Confluent*
*Download Apache Kafka and Confluent Platform: www.confluent.io/download
<http://www.confluent.io/download>*


Re: Programmable API for Kafka Connect ?

2016-01-06 Thread Alex Loddengaard
Hi Shiti,

In the context of your question, there are three relevant components to
Connect: connectors, tasks, and workers.

Connectors do a number of things but at a high level they specify the tasks
(Java classes) to perform the work, how many to start, and some
coordination. The tasks do the actual work of reading/writing data to/from
Kafka (using the Connect API, not producer/consumer API). And the workers
are the daemons that run on one or many nodes that run the connectors and
tasks (in distributed mode).

The only way to start worker processes is through the command line --
bin/connect-distributed.sh.

The only way to add/remove/modify connectors in distributed mode is through
the REST API. Unfortunately, I'm not aware of any client libraries for that
API, though one may exist. The REST API is started along with the worker
process, as part of bin/connect-distributed.sh. Meaning, you don't have to
start a separate REST process.
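
For example, with the default rest.port of 8083, managing connectors from the
command line looks roughly like this (the connector name, file, and topic are
made up; treat it as a sketch):

# Create a connector
curl -X POST -H "Content-Type: application/json" \
  --data '{"name": "local-file-source", "config": {"connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector", "tasks.max": "1", "file": "/tmp/test.txt", "topic": "connect-test"}}' \
  http://localhost:8083/connectors

# List connectors, then remove one
curl http://localhost:8083/connectors
curl -X DELETE http://localhost:8083/connectors/local-file-source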

Let me know if you have any follow-up questions, Shiti.

Alex

On Tue, Jan 5, 2016 at 9:12 PM, Shiti Saxena <ssaxena@gmail.com> wrote:

> Hi,
>
> Does Kafka Connect have an API which can be used by applications to start
> Kafka Connect, add/remove Connectors?
>
> I also do not want to use the REST API and do not want to start the REST
> server.
>
> Thanks,
> Shiti
>



-- 
*Alex Loddengaard | **Solutions Architect | Confluent*
*Download Apache Kafka and Confluent Platform: www.confluent.io/download
<http://www.confluent.io/download>*


Re: Leader Election

2016-01-06 Thread Alex Loddengaard
Hi Heath,

I assume you're referring to the partition leader. This is not something
you need to worry about when using the REST API. Kafka handles leader
election, failure recovery, etc. for you behind the scenes. Do know that if
a leader fails, you'll experience a small latency hit because a new leader
needs to be elected for the partitions that were led by said failed leader.

Alex

On Wed, Jan 6, 2016 at 8:09 AM, Heath Ivie <hi...@autoanything.com> wrote:

> Hi Folks,
>
> I am trying to use the REST proxy, but I have some fundamental questions
> about how the leader election works.
>
> My understanding of how the way the leader elections work is that the
> proxy hides all of that complexity and processes my produce/consume request
> transparently.
>
> Is that the case or do I need to manage that myself?
>
> Thanks
> Heath
>
>
>
> Warning: This e-mail may contain information proprietary to AutoAnything
> Inc. and is intended only for the use of the intended recipient(s). If the
> reader of this message is not the intended recipient(s), you have received
> this message in error and any review, dissemination, distribution or
> copying of this message is strictly prohibited. If you have received this
> message in error, please notify the sender immediately and delete all
> copies.
>



-- 
*Alex Loddengaard | **Solutions Architect | Confluent*
*Download Apache Kafka and Confluent Platform: www.confluent.io/download
<http://www.confluent.io/download>*


Re: Find current kafka memory usage

2016-01-05 Thread Alex Loddengaard
Hi Dillian,

You can use a Linux script, general monitoring/alerting tool, or JMX (see
here, for example:
http://stackoverflow.com/questions/6487802/check-available-heapsize-programatically
).
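
If you'd rather check it from a small Java program than a monitoring tool,
here's a minimal sketch along the lines of that Stack Overflow answer, but
against the broker's remote JMX port (the host/port are assumptions, and the
broker must have been started with JMX enabled, e.g. by exporting
JMX_PORT=9999 before running kafka-server-start.sh):

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.openmbean.CompositeData;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class BrokerHeapCheck {
    public static void main(String[] args) throws Exception {
        JMXServiceURL url = new JMXServiceURL(
            "service:jmx:rmi:///jndi/rmi://localhost:9999/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection mbsc = connector.getMBeanServerConnection();
            // Standard JVM memory MBean: reports heap usage of the broker JVM.
            CompositeData heap = (CompositeData) mbsc.getAttribute(
                new ObjectName("java.lang:type=Memory"), "HeapMemoryUsage");
            long used = (Long) heap.get("used");
            long max = (Long) heap.get("max");
            System.out.printf("Heap used: %d of %d bytes (%.0f%%)%n",
                used, max, 100.0 * used / max);
        } finally {
            connector.close();
        }
    }
}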

Let me know if you have a more specific question.

Alex



On Mon, Jan 4, 2016 at 4:42 PM, Dillian Murphey <crackshotm...@gmail.com>
wrote:

> I was running out of heap space for my kafka broker. Is there a way I can
> check how much memory kafka is using so I can alert myself if it is
> reaching the max heap size?  Default is 1GB.
>
> Thanks
>



-- 
*Alex Loddengaard | **Solutions Architect | Confluent*
*Download Apache Kafka and Confluent Platform: www.confluent.io/download
<http://www.confluent.io/download>*


Re: Topic Deletion Issues

2016-01-05 Thread Alex Loddengaard
Hi Brenden, your previous email went through, and you got a response (that
I believe answers your question). See here:

http://mail-archives.apache.org/mod_mbox/kafka-users/201512.mbox/browser

Alex

On Mon, Jan 4, 2016 at 7:18 PM, Brenden Cobb <brendenc...@gmail.com> wrote:

> I might have sent this recently, but was not able to receive mail from
> this list (fixed)
> ---
>
> Hello-
>
> We have a use case where we're trying to create a topic, delete, then
> recreate with the same topic name.
>
> Running into inconsistent results.
>
> Creating the topic:
> /opt/kafka/bin/kafka-topics.sh --create --partitions 3
> --replication-factor 3 --topic test-01 --zookeeper zoo01:2181,
> zoo02:2181, zoo03:2181
>
> Delete:
> /opt/kafka/bin/kafka-topics.sh --delete --topic test-01 --zookeeper
> zoo01:2181, zoo02:2181, zoo03:2181
>
> Repeat creation.
>
> The results are inconsistent. Executing the above several times can be
> successful, then sporadically we get caught in "topic marked for
> deletion" and it does not clear.
>
> This appears to be a Zookeeper issue of sorts as the logs will show:
> [2015-12-30 22:32:32,946] WARN Conditional update of path
> /brokers/topics/test-01/partitions/0/state with data
>
> {"controller_epoch":21,"leader":2,"version":1,"leader_epoch":1,"isr":[2,0,1]}
> and expected version 1 failed due to
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode
> = NoNode for /brokers/topics/test-01/partitions/0/state
> (kafka.utils.ZkUtils$)
>
> In this instance no subdirectories exist beyond /brokers/topics/test-01
>
> I'd like to know if this is a common occurrence and why the Zookeeper
> node isn't "fully" created as Kafka deletion seems stuck without the
> expected node path.
>
> We are using Kafka 8.2. Purging is also an option (if achievable
> programmatically), we just need to make sure there are no messages
> left when resuming the producer.
>
> Appreciate any info/guidance.
>
> Thanks,
> BC
>



-- 
*Alex Loddengaard | **Solutions Architect | Confluent*
*Download Apache Kafka and Confluent Platform: www.confluent.io/download
<http://www.confluent.io/download>*


Re: Topic Deletion Issues

2016-01-05 Thread Alex Loddengaard
Hi Brenden, I sent the wrong permalink. Try this:

http://mail-archives.apache.org/mod_mbox/kafka-users/201512.mbox/%3CCAH7vnfhUQHQcCJ1R3pS_DYn0LvmA8fJtLtvCuvvDboWSqm-NBg%40mail.gmail.com%3E

Alex

On Tue, Jan 5, 2016 at 10:21 AM, Alex Loddengaard <a...@confluent.io> wrote:

> Hi Brenden, your previous email went through, and you got a response (that
> I believe answers your question). See here:
>
> http://mail-archives.apache.org/mod_mbox/kafka-users/201512.mbox/browser
>
> Alex
>
> On Mon, Jan 4, 2016 at 7:18 PM, Brenden Cobb <brendenc...@gmail.com>
> wrote:
>
>> I might have sent this recently, but was not able to receive mail from
>> this list (fixed)
>> ---
>>
>> Hello-
>>
>> We have a use case where we're trying to create a topic, delete, then
>> recreate with the same topic name.
>>
>> Running into inconsistent results.
>>
>> Creating the topic:
>> /opt/kafka/bin/kafka-topics.sh --create --partitions 3
>> --replication-factor 3 --topic test-01 --zookeeper zoo01:2181,
>> zoo02:2181, zoo03:2181
>>
>> Delete:
>> /opt/kafka/bin/kafka-topics.sh --delete --topic test-01 --zookeeper
>> zoo01:2181, zoo02:2181, zoo03:2181
>>
>> Repeat creation.
>>
>> The results are inconsistent. Executing the above several times can be
>> successful, then sporadically we get caught in "topic marked for
>> deletion" and it does not clear.
>>
>> This appears to be a Zookeeper issue of sorts as the logs will show:
>> [2015-12-30 22:32:32,946] WARN Conditional update of path
>> /brokers/topics/test-01/partitions/0/state with data
>>
>> {"controller_epoch":21,"leader":2,"version":1,"leader_epoch":1,"isr":[2,0,1]}
>> and expected version 1 failed due to
>> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode
>> = NoNode for /brokers/topics/test-01/partitions/0/state
>> (kafka.utils.ZkUtils$)
>>
>> In this instance no subdirectories exist beyond /brokers/topics/test-01
>>
>> I'd like to know if this is a common occurrence and why the Zookeeper
>> node isn't "fully" created as Kafka deletion seems stuck without the
>> expected node path.
>>
>> We are using Kafka 8.2. Purging is also an option (if achievable
>> programmatically), we just need to make sure there are no messages
>> left when resuming the producer.
>>
>> Appreciate any info/guidance.
>>
>> Thanks,
>> BC
>>
>
>
>
> --
> *Alex Loddengaard | **Solutions Architect | Confluent*
> *Download Apache Kafka and Confluent Platform: www.confluent.io/download
> <http://www.confluent.io/download>*
>



-- 
*Alex Loddengaard | **Solutions Architect | Confluent*
*Download Apache Kafka and Confluent Platform: www.confluent.io/download
<http://www.confluent.io/download>*


Re: Producing when broker goes down

2015-12-18 Thread Alex Loddengaard
Hi Buck,

What are your settings for:

   - acks
   - request.timeout.ms
   - timeout.ms
   - min.insync.replicas (on the broker)

Thanks,

Alex

On Fri, Dec 18, 2015 at 1:23 PM, Buck Tandyco 
wrote:

> I'm stress testing my kafka setup. I have a producer that is working just
> fine and then I kill off one of the two brokers that I have running with
> replication factor of 2.  I'm able to keep receiving from my consumer
> thread but my producer generates this exception:
> "kafka.common.FailedToSendMessageException: Failed to send messages after 3
> tries"
> I've tried messing with the producer config such as timeouts, reconnect
> intervals, etc. But haven't made any progress.
> Does anyone have any ideas of what I might try?
> Thanks,Zack
>