Re: [jira] [Created] (KAFKA-3377) add REST interface to JMX

2016-03-10 Thread Gerard Klijs
I would like to know why you want/need it to be integrated into Kafka.
For our current project we tried out Zabbix,
https://www.zabbix.com/documentation/3.0/manual/config/items/itemtypes/jmx_monitoring;
it takes some configuration, but then you can fetch all the JMX metrics you
want and put them into graphs.
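
For reference, fetching a metric over plain JMX is only a few lines of Java,
and this is roughly what Zabbix (or a Jolokia-style REST bridge) does under
the hood. A minimal sketch, assuming a broker started with JMX_PORT=9999 on
localhost and the standard kafka.server metric names:

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Minimal sketch: connect to a broker's JMX port and read one metric.
public class JmxMetricFetch {
    public static void main(String[] args) throws Exception {
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:9999/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbsc = connector.getMBeanServerConnection();
            ObjectName name = new ObjectName(
                    "kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec");
            // Meter MBeans expose Count, OneMinuteRate, etc. as attributes.
            System.out.println("MessagesInPerSec (1m rate): "
                    + mbsc.getAttribute(name, "OneMinuteRate"));
        }
    }
}
```

A REST bridge like Jolokia essentially wraps this call pattern in HTTP
endpoints, which is why it can run as an agent next to any JVM.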

On Thu, Mar 10, 2016 at 2:52 PM Christian Posta (JIRA) 
wrote:

> Christian Posta created KAFKA-3377:
> --
>
>  Summary: add REST interface to JMX
>  Key: KAFKA-3377
>  URL: https://issues.apache.org/jira/browse/KAFKA-3377
>  Project: Kafka
>   Issue Type: Improvement
>   Components: core
> Reporter: Christian Posta
>
>
> Would be awesome if we could get JMX metrics without having to use the JMX
> APIs. Would there be any interest in adding something like
> https://jolokia.org to Kafka? I'll happily volunteer :)
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.4#6332)
>


Re: [jira] [Commented] (KAFKA-1173) Using Vagrant to get up and running with Apache Kafka

2016-03-21 Thread Gerard Klijs
I had good experiences using the Vagrant setup as-is on a Mac, but I did have
to change some things. We are using Docker now. I'm not sure about the
general preference, but I would prefer a Docker Compose setup over the
Vagrant one. I don't know whether you really want it in Kafka itself, and
whether to support it there.

On Mon, Mar 21, 2016 at 9:43 PM Ewen Cheslack-Postava (JIRA) <
j...@apache.org> wrote:

>
> [
> https://issues.apache.org/jira/browse/KAFKA-1173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15205083#comment-15205083
> ]
>
> Ewen Cheslack-Postava commented on KAFKA-1173:
> --
>
> [~gwenshap] Maybe? I had been thinking of our Vagrantfile as a tool for
> Kafka developers. Technically I guess it gets shipped with the source
> version. It doesn't get shipped with the binary versions afaik, which may
> be confusing.
>
> I guess it's also a question of whether we want to treat it as
> "supported"...
>
> > Using Vagrant to get up and running with Apache Kafka
> > -
> >
> > Key: KAFKA-1173
> > URL: https://issues.apache.org/jira/browse/KAFKA-1173
> > Project: Kafka
> >  Issue Type: Improvement
> >Reporter: Joe Stein
> >Assignee: Ewen Cheslack-Postava
> > Fix For: 0.9.0.0
> >
> > Attachments: KAFKA-1173-JMX.patch, KAFKA-1173.patch,
> KAFKA-1173_2013-12-07_12:07:55.patch, KAFKA-1173_2014-11-11_13:50:55.patch,
> KAFKA-1173_2014-11-12_11:32:09.patch, KAFKA-1173_2014-11-18_16:01:33.patch
> >
> >
> > Vagrant has been getting a lot of pickup in the tech communities.  I
> have found it very useful for development and testing, and I am now working
> with a few clients who use it to help virtualize their environments in
> repeatable ways.
> > Using Vagrant to get up and running.
> > For 0.8.0 I have a patch on github https://github.com/stealthly/kafka
> > 1) Install Vagrant http://www.vagrantup.com/
> > 2) Install Virtual Box https://www.virtualbox.org/
> > In the main kafka folder
> > 1) ./sbt update
> > 2) ./sbt package
> > 3) ./sbt assembly-package-dependency
> > 4) vagrant up
> > once this is done
> > * Zookeeper will be running on 192.168.50.5
> > * Broker 1 on 192.168.50.10
> > * Broker 2 on 192.168.50.20
> > * Broker 3 on 192.168.50.30
> > When you are all up and running you will be back at a command prompt.
> > If you want you can log in to the machines using vagrant ssh
> <machineName> but you don't need to.
> > You can access the brokers and zookeeper by their IP
> > e.g.
> > bin/kafka-console-producer.sh --broker-list 192.168.50.10:9092,
> 192.168.50.20:9092,192.168.50.30:9092 --topic sandbox
> > bin/kafka-console-consumer.sh --zookeeper 192.168.50.5:2181 --topic
> sandbox --from-beginning
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.4#6332)
>


Re: [jira] [Created] (KAFKA-4078) VIP for Kafka doesn't work

2016-08-23 Thread Gerard Klijs
If you change the URL the broker is available on, you need to change the
publicly advertised hostname in the broker's configuration. This is not a
bug: there is no way the broker could know how it can be reached from the
outside, or when that changes.
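
To make the failure mode concrete, here is a hedged sketch with the 0.9 Java
producer, reusing the reporter's host names; only bootstrap.servers points at
the VIP, and everything after the first metadata response does not:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

// Sketch of why a VIP alone is not enough. bootstrap.servers is only used
// for the very first metadata request; the metadata response carries each
// broker's *advertised* hostname, and all later connections go straight to
// those hosts, bypassing the VIP.
public class VipDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kfk.chao.com:9092"); // the VIP
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // After the first metadata update this record goes to whatever
            // host the partition leader advertises (chao007kfk00x.chao007.com
            // in the log below), so repointing the VIP changes nothing until
            // the brokers' advertised.host.name / advertised.listeners change.
            producer.send(new ProducerRecord<>("chao_vip", "key", "value"));
        }
    }
}
```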

On Tue, Aug 23, 2016 at 10:33 AM chao (JIRA)  wrote:

> chao created KAFKA-4078:
> ---
>
>  Summary: VIP for Kafka  doesn't work
>  Key: KAFKA-4078
>  URL: https://issues.apache.org/jira/browse/KAFKA-4078
>  Project: Kafka
>   Issue Type: Bug
>   Components: clients
> Affects Versions: 0.9.0.1
> Reporter: chao
> Priority: Blocker
>
>
> We created a VIP in front of chao007kfk002.chao007.com:9092,
> chao007kfk003.chao007.com:9092 and chao007kfk001.chao007.com:9092.
>
> But we found that the Kafka client API has some issues: a client metadata
> update returns all three brokers, so it creates three connections, one each
> for 001, 002 and 003.
>
> When we change the VIP to chao008kfk002.chao008.com:9092,
> chao008kfk003.chao008.com:9092 and chao008kfk001.chao008.com:9092,
>
> it still produces data to the 007 brokers.
>
>
> The following is log information
>
>
> sasl.kerberos.ticket.renew.window.factor = 0.8
> bootstrap.servers = [kfk.chao.com:9092]
> client.id =
>
> 2016-08-23 07:00:48,451:DEBUG kafka-producer-network-thread | producer-1
> (NetworkClient.java:623) - Initialize connection to node -1 for sending
> metadata request
> 2016-08-23 07:00:48,452:DEBUG kafka-producer-network-thread | producer-1
> (NetworkClient.java:487) - Initiating connection to node -1 at
> kfk.chao.com:9092.
> 2016-08-23 07:00:48,463:DEBUG kafka-producer-network-thread | producer-1
> (Metrics.java:201) - Added sensor with name node--1.bytes-sent
>
>
> 2016-08-23 07:00:48,489:DEBUG kafka-producer-network-thread | producer-1
> (NetworkClient.java:619) - Sending metadata request
> ClientRequest(expectResponse=true, callback=null,
> request=RequestSend(header={api_key=3,api_version=0,correlation_id=0,client_id=producer-1},
> body={topics=[chao_vip]}), isInitiatedByNetworkClient,
> createdTimeMs=1471935648465, sendTimeMs=0) to node -1
> 2016-08-23 07:00:48,512:DEBUG kafka-producer-network-thread | producer-1
> (Metadata.java:172) - Updated cluster metadata version 2 to Cluster(nodes =
> [Node(1, chao007kfk002.chao007.com, 9092), Node(2,
> chao007kfk003.chao007.com, 9092), Node(0, chao007kfk001.chao007.com,
> 9092)], partitions = [Partition(topic = chao_vip, partition = 0, leader =
> 0, replicas = [0,], isr = [0,], Partition(topic = chao_vip, partition = 3,
> leader = 0, replicas = [0,], isr = [0,], Partition(topic = chao_vip,
> partition = 2, leader = 2, replicas = [2,], isr = [2,], Partition(topic =
> chao_vip, partition = 1, leader = 1, replicas = [1,], isr = [1,],
> Partition(topic = chao_vip, partition = 4, leader = 1, replicas = [1,], isr
> = [1,]])
>
>
>
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.4#6332)
>


Re: Message sent ordering guarantees

2016-09-01 Thread Gerard Klijs
For async sends you could set acks to -1 (all), but it stays slow, because
the producer has to wait for the broker(s) to acknowledge each message, in
order, before sending the next one; you need this if ordering is very
important. When sending async, the order can get changed when the leader
becomes temporarily unavailable (and on other errors), because data can be
sent successfully to the new leader before an earlier send fails on the old
leader and is retried against the new leader. Depending on your use case you
have different options to handle this, for example using timestamps in the
messages so you know which event happened first, but this will not work when
you use compaction.
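
Concretely, with the 0.9/0.10 Java producer the usual way to keep async sends
ordered per partition is to combine acks with a cap on in-flight requests. A
hedged sketch (broker address, topic name and key are illustrative):

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

// acks=all makes the leader wait for the in-sync replicas, and limiting
// in-flight requests to 1 stops a retried batch from overtaking a later one,
// which is exactly the leader-failover reordering described above.
public class OrderedBinlogProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");
        props.put("acks", "all");
        props.put("retries", "2147483647"); // retry rather than drop on transient errors
        props.put("max.in.flight.requests.per.connection", "1");
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Keying by table keeps each table's events in one partition, so
            // the per-partition ordering guarantee quoted below applies per table.
            producer.send(new ProducerRecord<>("binlog", "orders", "binlog event 1"));
            producer.send(new ProducerRecord<>("binlog", "orders", "binlog event 2"));
        }
    }
}
```

Capping in-flight requests costs some throughput per connection, but far less
than fully synchronous sends, because records are still batched within each
request.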

On Thu, Sep 1, 2016 at 6:19 AM 郭旭  wrote:

> Hi Kafka Experts,
>
> (Sorry to send this question to the DEV group, but it seems that I cannot
> find a related document in the user manual.)
>
> From the official documentation I can find the message delivery guarantee
> quoted below. For a *sync producer* I think it is true, but sync sends are
> very slow (about 408 messages per second if ack = all, 1000 messages per
> second if ack = 1).
>
> Batched and async sends could satisfy our throughput requirement, but I'm
> not sure whether message ordering is guaranteed in *async* style.
>
> For some critical applications, for example replicating a MySQL binlog to a
> distributed Kafka commit log, binlog ordering is important (partitioned by
> database/table/PK); throughput is also important.
>
> If I use an async producer, partition the binlog by table, and send in
> batches, is binlog ordering safe for a single table?
>
> Will an async producer guarantee the send ordering?
>
>
> Regards
> Shawn
>
> Guarantees: At a high level Kafka gives the following guarantees:
>
> - Messages sent by a producer to a particular topic partition will be
> appended in the order they are sent. That is, if a message M1 is sent by
> the same producer as a message M2, and M1 is sent first, then M1 will have
> a lower offset than M2 and appear earlier in the log.
>


Re: [jira] [Commented] (KAFKA-3044) Consumer.poll doesnot return messages when poll interval is less

2015-12-29 Thread Gerard Klijs
You can also set fetch.min.bytes. It will be a trade-off between getting the
messages as fast as possible, with fetch.min.bytes=0 and the poll timeout at
0, and getting a lot of messages in one go; the first extreme results in a
lot of empty returns and a lot of I/O overhead. If you for example set
fetch.min.bytes to max int and the poll timeout to 60000, you might only get
one big bunch of records every minute.
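
Both extremes, as a hedged sketch against the 0.9 Java consumer (broker
address, group and topic names are placeholders):

```java
import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

// Sketch of the fetch.min.bytes trade-off described above.
public class PollTradeOffDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "poll-demo");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        // Low latency, many empty polls: the broker answers a fetch as soon
        // as any (or no) data is available.
        props.put("fetch.min.bytes", "1");
        // Batch-heavy alternative: the broker waits for a lot of data, or
        // until fetch.max.wait.ms expires, before answering a fetch.
        // props.put("fetch.min.bytes", String.valueOf(Integer.MAX_VALUE));
        // props.put("fetch.max.wait.ms", "60000");
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Arrays.asList("mytopic"));
            ConsumerRecords<String, String> records = consumer.poll(60000);
            System.out.println("got " + records.count() + " records");
        }
    }
}
```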

On Tue, Dec 29, 2015 at 8:34 AM Praveen Devarao (JIRA) 
wrote:

>
> [
> https://issues.apache.org/jira/browse/KAFKA-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15073597#comment-15073597
> ]
>
> Praveen Devarao commented on KAFKA-3044:
> 
>
> Hi [~guozhang] and [~jkreps]
>
> OK.
>
> Also, could we recommend a value for timeout [based on some assumed
> factors]?
>
> Thanks
>
> Praveen
>
> > Consumer.poll doesnot return messages when poll interval is less
> > 
> >
> > Key: KAFKA-3044
> > URL: https://issues.apache.org/jira/browse/KAFKA-3044
> > Project: Kafka
> >  Issue Type: Bug
> >  Components: clients
> >Affects Versions: 0.9.0.0
> >Reporter: Praveen Devarao
> >Assignee: Jason Gustafson
> > Fix For: 0.9.0.1
> >
> >
> > When seeking to a particular position in the consumer and starting poll
> with a timeout of 0, the consumer does not come back with data even though
> data has already been published via a producer. If the timeout is increased
> slowly in chunks of 100ms, then at 700ms the consumer returns the record on
> the first call to poll.
> > The docs [
> http://kafka.apache.org/090/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html#poll(long)]
> for poll say that if the timeout is 0 then data will be returned
> immediately, but the behaviour seen is that data is not returned.
> > The test code I am using can be found here
> https://gist.github.com/praveend/013dcab01ebb8c7e2f2d
> > I have created a topic with data published as below and then running the
> test program [ConsumerPollTest.java]
> > $ bin/kafka-topics.sh --create --zookeeper localhost:2181
> --replication-factor 1 --partitions 1 --topic mytopic
> > $ bin/kafka-console-producer.sh --broker-list localhost:9092 --topic
> mytopic
> > Hello
> > Hai
> > bye
> > $ java ConsumerPollTest
> > I have published these 3 lines of data to Kafka only once; later on I
> just run the above program with different poll intervals.
> > Let me know if I am missing anything or interpreting it wrongly.
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.4#6332)
>


Re: Migrating Kafka from old VMs to new VMs in a different Cluster

2016-05-11 Thread Gerard Klijs
Depends on your use case, but I guess something like this:
- Install everything fresh on the new VMs
- Start a mirror maker on the new VMs to copy data from the old ones
- Be sure it's working right
- Shut down the old VMs and start using the new ones

The last step is the trickiest and depends a lot on the setup, like how the
clients connect to Kafka and whether you can just shut the clients down and
bring them up again. There might also be an issue with consumers getting all
the data again.
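
For that last point, one option (an assumption about the setup, not the only
way) is to let migrated consumer groups start from the latest offset on the
new cluster, since committed offsets from the old cluster are meaningless on
the mirrored one:

```java
import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;

// Broker address, group and topic names are placeholders. With a group that
// has no committed offsets on the new cluster, auto.offset.reset decides
// where it starts; "latest" avoids replaying the whole mirrored log.
public class MigratedConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "new-cluster-broker:9092");
        props.put("group.id", "migrated-group");
        props.put("auto.offset.reset", "latest"); // start at the end, not the beginning
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Arrays.asList("mytopic"));
            consumer.poll(1000); // first poll triggers assignment + offset reset
        }
    }
}
```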

On Wed, May 11, 2016 at 9:24 PM Abhinav Damarapati 
wrote:

> Hello Everyone,
>
>
> We have Kafka brokers, ZooKeepers and mirror makers running on old virtual
> machines. We need to migrate all of this to brand-new VMs in a different
> data center and bring down the old VMs. Is this possible? If so, please
> suggest a way to do it.
>
>
> Best,
>
> Abhinav
>


Re: [jira] [Updated] (KAFKA-3772) MirrorMaker crashes on Corrupted Message

2016-06-01 Thread Gerard Klijs
I just had a look at the code, as I would like some way to prevent such a
scenario from happening to us. It seems you can't prevent the mirror maker
from exiting. In the 0.10 mirror maker, any exception which is not a
ConsumerTimeoutException or a WakeupException will cause the whole mirror
maker to shut down, because the finally block in the MirrorMakerThread will
be executed. I would however expect that in the
commitOffsets(mirrorMakerConsumer) part the offsets would be committed, so
that on restart it would start just after the corrupted record. I think it
would be better to add at least InvalidMessageException to the exceptions
which are only logged, but this could possibly lead to other problems: when
there is a solvable cause for the InvalidMessageException, you are silently
losing those records, because they are just skipped.
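
To illustrate the trade-off, here is a purely hypothetical skip-on-error loop
in Java; the real MirrorMaker is Scala and this is not its code, just the
shape of the "log and skip" handling discussed above:

```java
import java.util.Iterator;
import java.util.function.Consumer;

// Hypothetical "log and skip" wrapper around a consumer iterator whose
// next() may throw on a corrupt message (e.g. InvalidMessageException).
public final class SkippingForwarder {
    public static <T> void forwardAll(Iterator<T> source,
                                      Consumer<T> sink,
                                      Consumer<RuntimeException> onCorrupt) {
        while (true) {
            T next;
            try {
                if (!source.hasNext()) {
                    return; // stream exhausted
                }
                next = source.next(); // may throw on a corrupt message
            } catch (RuntimeException e) {
                // Log and skip instead of crashing the whole process. Note
                // the catch: unless the underlying offset is advanced past
                // the bad record (the commitOffsets concern above), this can
                // spin on the same corrupt message forever.
                onCorrupt.accept(e);
                continue;
            }
            sink.accept(next);
        }
    }
}
```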

On Tue, May 31, 2016 at 5:30 PM James Ranson (JIRA)  wrote:

>
>  [
> https://issues.apache.org/jira/browse/KAFKA-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
> ]
>
> James Ranson updated KAFKA-3772:
> 
> Description:
> We recently came across an issue where a message on our source kafka
> cluster became corrupted. When MirrorMaker tried to consume this message,
> the thread crashed and caused the entire process to also crash. Each time
> we attempted to restart MM, it crashed on the same message. There is no
> information in the MM logs about which message it was trying to consume
> (what topic, what offset, etc). So the only way we were able to get past
> the issue was to go into the zookeeper tree for our mirror consumer group
> and increment the offset for every partition on every topic until the MM
> process could start without crashing. This is not a tenable operational
> solution. MirrorMaker should gracefully skip corrupt messages since they
> will never be able to be replicated anyway.
>
> {noformat}2016-05-26 20:02:26,787 FATAL  MirrorMaker$MirrorMakerThread -
> [{}] [mirrormaker-thread-3] Mirror maker thread failure due to
> kafka.message.InvalidMessageException: Message is corrupt (stored crc =
> 33747148, computed crc = 3550736267)
> at kafka.message.Message.ensureValid(Message.scala:167)
> at
> kafka.consumer.ConsumerIterator.makeNext(ConsumerIterator.scala:101)
> at
> kafka.consumer.ConsumerIterator.makeNext(ConsumerIterator.scala:33)
> at
> kafka.utils.IteratorTemplate.maybeComputeNext(IteratorTemplate.scala:66)
> at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:58)
> at
> kafka.tools.MirrorMaker$MirrorMakerOldConsumer.hasData(MirrorMaker.scala:483)
> at
> kafka.tools.MirrorMaker$MirrorMakerThread.run(MirrorMaker.scala:394)
>
> 2016-05-26 20:02:27,580 FATAL  MirrorMaker$MirrorMakerThread - [{}]
> [mirrormaker-thread-3] Mirror maker thread exited abnormally, stopping the
> whole mirror maker.{noformat}
>
>   was:
> We recently came across an issue where a message on our source kafka
> cluster became corrupted. When MirrorMaker tried to consume this message,
> the thread crashed and caused the entire process to also crash. Each time
> we attempted to restart MM, it crashed on the same message. There is no
> information in the MM logs about which message it was trying to consume
> (what topic, what offset, etc). So the only way we were able to get past
> the issue was to go into the zookeeper tree for our mirror consumer group
> and increment the offset for every partition on every topic until the MM
> process could start without crashing. This is not a tenable operational
> solution. MirrorMaker should gracefully skip corrupt messages since they
> will never be able to be replicated anyway.
>
> ```2016-05-26 20:02:26,787 FATAL  MirrorMaker$MirrorMakerThread - [{}]
> [mirrormaker-thread-3] Mirror maker thread failure due to
> kafka.message.InvalidMessageException: Message is corrupt (stored crc =
> 33747148, computed crc = 3550736267)
> at kafka.message.Message.ensureValid(Message.scala:167)
> at
> kafka.consumer.ConsumerIterator.makeNext(ConsumerIterator.scala:101)
> at
> kafka.consumer.ConsumerIterator.makeNext(ConsumerIterator.scala:33)
> at
> kafka.utils.IteratorTemplate.maybeComputeNext(IteratorTemplate.scala:66)
> at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:58)
> at
> kafka.tools.MirrorMaker$MirrorMakerOldConsumer.hasData(MirrorMaker.scala:483)
> at
> kafka.tools.MirrorMaker$MirrorMakerThread.run(MirrorMaker.scala:394)
>
> 2016-05-26 20:02:27,580 FATAL  MirrorMaker$MirrorMakerThread - [{}]
> [mirrormaker-thread-3] Mirror maker thread exited abnormally, stopping the
> whole mirror maker.```
>
>
> > MirrorMaker crashes on Corrupted Message
> > 
> >
> > Key: KAFKA-3772
> > URL: https://issues.apache.org/jira/browse/KAFKA-3772
> > Project: Kafka
> >  Issue Type: Bug
> >  

Re: Migrating from 0.7.1 to 0.10.0

2016-06-13 Thread Gerard Klijs
There have been a lot of changes, and it can also be quite a challenge to get
SSL working. And with only a null pointer, there is little to go on. I would
first focus on getting it to work with 0.10 (if possible also on a clustered
test setup), and once it all works, try to configure SSL.
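
For the SSL part, the client side for 0.9/0.10 uses the documented ssl.*
config keys; a sketch with placeholder paths and passwords (the broker needs
a matching listeners=SSL://... entry plus its own keystore settings before
anything will connect):

```java
import java.util.Properties;

// Client-side SSL settings for a 0.9/0.10 cluster; paths and passwords are
// placeholders, not real values.
public class SslClientConfig {
    public static Properties sslProps() {
        Properties props = new Properties();
        props.put("security.protocol", "SSL");
        props.put("ssl.truststore.location", "/var/private/ssl/client.truststore.jks");
        props.put("ssl.truststore.password", "changeit");
        // Only needed when the broker enforces client authentication
        // (ssl.client.auth=required on the broker):
        props.put("ssl.keystore.location", "/var/private/ssl/client.keystore.jks");
        props.put("ssl.keystore.password", "changeit");
        props.put("ssl.key.password", "changeit");
        return props;
    }
}
```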

On Mon, Jun 13, 2016 at 9:08 PM Subhash Agrawal 
wrote:

> Hi,
> I currently embed kafka 0.7.1 in my java process. To support SSL, we have
> decided to upgrade Kafka to 0.10.0.
> After upgrade, I am seeing following error during kafka startup.
>
> java.lang.NullPointerException
> at kafka.utils.Throttler.&lt;init&gt;(Throttler.scala:45)
> at kafka.log.LogCleaner.&lt;init&gt;(LogCleaner.scala:75)
> at kafka.log.LogManager.&lt;init&gt;(LogManager.scala:66)
> at
> kafka.server.KafkaServer.createLogManager(KafkaServer.scala:609)
> at kafka.server.KafkaServer.startup(KafkaServer.scala:183)
>
> Has anybody seen this error? If I run kafka externally, then I don't see
> this error.
>
> Thanks
> Subhash A.
>
>
> From: Subhash Agrawal
> Sent: Wednesday, June 08, 2016 12:40 PM
> To: 'us...@kafka.apache.org'
> Subject: Migrating from 0.7.1 to 0.10.0
>
> Hi,
> I am currently using Kafka 0.7.1 without ZooKeeper. We have a single-node
> Kafka server.
> To enhance security, we have decided to support SSL. As 0.7.1 version does
> not support SSL,
> we are upgrading to latest version 0.10.0.0. We noticed that with the
> latest version, it is
> mandatory to use zookeeper.
>
> Is there any way I can use Kafka 0.10 or 0.9 version without zookeeper?
>
> Thanks
> Subhash A.
>
>


Re: [jira] [Commented] (KAFKA-3841) MirrorMaker topic renaming

2016-06-15 Thread Gerard Klijs
We also have something similar; it's really easy to use a message handler. In
our case we check a bit to determine whether we already copied the event, and
we change the topic name in dev mode. You can configure a string to pass to
your message handler for each instance.
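
The renaming logic itself is tiny. The sketch below shows only the
transformation; the actual hook is MirrorMaker's --message.handler /
--message.handler.args options (0.9+), whose handler interface lives in
kafka.tools.MirrorMaker (Scala), so the class and method names here are
illustrative, not the real signature:

```java
import org.apache.kafka.clients.producer.ProducerRecord;

// Illustration only of what a renaming message handler would do; check the
// MirrorMakerMessageHandler trait in your Kafka version for the real
// interface to implement.
public class TopicRenamer {
    private final String suffix;

    public TopicRenamer(String suffix) { // MirrorMaker passes --message.handler.args as a String
        this.suffix = suffix;            // e.g. "_mirror_sunnyvale"
    }

    public ProducerRecord<byte[], byte[]> rename(String topic, byte[] key, byte[] value) {
        // Same key and value, new topic name; partitioning is re-derived from the key.
        return new ProducerRecord<>(topic + suffix, key, value);
    }
}
```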

On Wed, Jun 15, 2016 at 1:38 AM Ning Zhang (JIRA)  wrote:

>
> [
> https://issues.apache.org/jira/browse/KAFKA-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15330877#comment-15330877
> ]
>
> Ning Zhang commented on KAFKA-3841:
> ---
>
> Thanks. This is good to know. As the MessageHandler was only added in 0.9
> (and our production has not been upgraded there yet), we have been using
> our own solution for topic renaming.
>
> Looks like we may keep using our stuff or adopt the message handler when
> upgrading to 0.9 or above.
>
> I intend to mark this jira as "Resolved".
>
> > MirrorMaker topic renaming
> > --
> >
> > Key: KAFKA-3841
> > URL: https://issues.apache.org/jira/browse/KAFKA-3841
> > Project: Kafka
> >  Issue Type: New Feature
> >  Components: tools
> >Affects Versions: 0.10.0.0
> >Reporter: Ning Zhang
> >
> > Our organization (walmart.com) has been a Kafka user for some years now,
> and MirrorMaker has been a convenient tool to bring our Kafka data from one
> Kafka cluster to another.
> > In many of our use cases, the mirrored topic from the source cluster
> should not keep the same name in the target cluster. This is a valid
> scenario when the same topic name already exists on the target cluster, or
> when we want to append the name of the data center to the topic name in the
> target cluster, such as "grocery_items_mirror_sunnyvale", to explicitly
> identify the source (e.g. sunnyvale) and nature (e.g. mirroring) of the
> topic.
> > We have implemented the MirrorMaker topic renaming feature internally,
> and it has been used in production for a couple of years. While keeping our
> internal Kafka fork with the above "renaming" branch across version
> upgrades does not cost us too much labor, we think it may be meaningful to
> contribute it back to the community, since potentially many people may have
> a similar expectation and could benefit from this feature.
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.4#6332)
>


Re: Jars in Kafka 0.10

2016-07-29 Thread Gerard Klijs
No, if you don't use streams you don't need them. If you have no clients
(so also no mirror maker) running on the same machine you also don't need
the client jar, if you run zookeeper separately you also don't need those.

On Fri, Jul 29, 2016 at 4:22 PM Bhuvaneswaran Gopalasami <
bhuvanragha...@gmail.com> wrote:

> I have recently started looking into Kafka and noticed that the number of
> jars in Kafka 0.10 has increased compared to 0.8.2. Do we really need all
> those libraries to run Kafka?
>
> Thanks,
> Bhuvanes
>