A kafka web monitor
We have make a simple web console to monitor some kafka informations like consumer offset, logsize. https://github.com/shunfei/DCMonitor Hope you like it and offer your help to make it better :) Regards Flow
Re: Check topic exists after deleting it.
DeleteTopic makes a node in zookeeper to let controller know that there is a topic up for deletion. This doesn’t immediately delete the topic it can take time depending if all the partitions of that topic are online and brokers are available as well. Once all the Log files deleted zookeeper node gets deleted as well. Also make sure you don’t have any producers or consumers are running while the topic deleting is going on. -- Harsha On March 23, 2015 at 1:29:50 AM, anthony musyoki (anthony.musy...@gmail.com) wrote: On deleting a topic via TopicCommand.deleteTopic() I get Topic test-delete is marked for deletion. I follow up by checking if the topic exists by using AdminUtils.topicExists() which suprisingly returns true. I expected AdminUtils.TopicExists() to check both BrokerTopicsPath and DeleteTopicsPath before returning a verdict but it only checks BrokerTopicsPath Shouldn't a topic marked for deletion return false for topicExists() ?
kafka audit
Hi What is best practice for adding audit feature in kafka, Is there any framework available for enabling audit feature at producer and consumer level and any UI frameworks for monitoring. tx SunilKalva
Re: Post on running Kafka at LinkedIn
Emmanuel, if it helps, here's a little more detail on the hardware spec we are using at the moment: 12 CPU (HT enabled) 64 GB RAM 16 x 1TB SAS drives (2 are used as a RAID-1 set for the OS, 14 are a RAID-10 set just for the Kafka log segments) We don't colocate any other applications with Kafka except for a couple monitoring agents. Zookeeper runs on completely separate nodes. I suggest starting with looking at the basics - watch the CPU, memory, and disk IO usage on the brokers as you are testing. You're likely going to find one of these three is the constraint. Disk IO in particular can lead to a significant increase in produce latency as it increases even over 10-15% utilization. -Todd On Fri, Mar 20, 2015 at 3:41 PM, Emmanuel ele...@msn.com wrote: This is why I'm confused because I'm tryign to benchmark and I see numbers that seem pretty low to me...8000 events/sec on 2 brokers with 3CPU each and 5 partitions should be way faster than this and I don't know where to start to debug... the kafka-consumer-perf-test script gives me ridiculously low numbers (1000 events/sec/thread) So what could be causing this? From: jbringhu...@linkedin.com.INVALID To: users@kafka.apache.org Subject: Re: Post on running Kafka at LinkedIn Date: Fri, 20 Mar 2015 22:16:29 + Keep in mind that these brokers aren't really stressed too much at any given time -- we need to stay ahead of the capacity curve. Your message throughput will really just depend on what hardware you're using. However, in the past, we've benchmarked at 400,000 to more than 800,000 messages / broker / sec, depending on configuration ( https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines ). -Jon On Mar 20, 2015, at 3:03 PM, Emmanuel ele...@msn.com wrote:800B messages / day = 9.26M messages / sec over 1100 brokers = ~8400 message / broker / sec Do I get this right? Trying to benchmark my own test cluster and that's what I see with 2 brokers...Just wondering if my numbers are good or bad... Subject: Re: Post on running Kafka at LinkedIn From: cl...@kafka.guru Date: Fri, 20 Mar 2015 14:27:58 -0700 To: users@kafka.apache.org Yep! We are growing :) -Clark Sent from my iPhone On Mar 20, 2015, at 2:14 PM, James Cheng jch...@tivo.com wrote: Amazing growth numbers. At the meetup on 1/27, Clark Haskins presented their Kafka usage at the time. It was: Bytes in: 120 TB Messages In: 585 million Bytes out: 540 TB Total brokers: 704 In Todd's post, the current numbers: Bytes in: 175 TB (45% growth) Messages In: 800 billion (36% growth) Bytes out: 650 TB (20% growth) Total brokers: 1100 (56% growth) That much growth in just 2 months? Wowzers. -James On Mar 20, 2015, at 11:30 AM, James Cheng jch...@tivo.com wrote: For those who missed it: The Kafka Audit tool was also presented at the 1/27 Kafka meetup: http://www.meetup.com/http-kafka-apache-org/events/219626780/ Recorded video is here, starting around the 40 minute mark: http://www.ustream.tv/recorded/58109076 Slides are here: http://www.ustream.tv/recorded/58109076 -James On Mar 20, 2015, at 9:47 AM, Todd Palino tpal...@gmail.com wrote: For those who are interested in detail on how we've got Kafka set up at LinkedIn, I have just published a new posted to our Engineering blog titled Running Kafka at Scale https://engineering.linkedin.com/kafka/running-kafka-scale It's a general overview of our current Kafka install, tiered architecture, audit, and the libraries we use for producers and consumers. You'll also be seeing more posts from the SRE team here in the coming weeks on deeper looks into both Kafka and Samza. Additionally, I'll be giving a talk at ApacheCon next month on running tiered Kafka architectures. If you're in Austin for that, please come by and check it out. -Todd
Re: kafka mirrormaker cross datacenter replication
With MM, the source and destination cluster can choose different number of partitions for the mirrored topic, and hence messages may be re-grouped in the destination cluster. In addition, let's say you have two MMs piping data to the same destination from two sources, the ordering of which messages from the two source clusters arrive to the destination is non-deterministic as well. Guozhang On Sun, Mar 22, 2015 at 9:40 PM, Kane Kim kane.ist...@gmail.com wrote: I thought that ordering is guaranteed within the partition or mirror maker doesn't preserve partitions? On Fri, Mar 20, 2015 at 4:44 PM, Guozhang Wang wangg...@gmail.com wrote: I think 1) will work, but not sure if about 2), since messages replicated at two clusters may be out of order as well, hence you may get message 1,2,3,4 in one cluster and 1,3,4,2 in another. If you remember that your latest message processed in the first cluster is 2, when you fail over to the other cluster you may skip and miss message 3 and 4. Guozhang On Fri, Mar 20, 2015 at 1:07 PM, Kane Kim kane.ist...@gmail.com wrote: Also, as I understand we either have to mark all messages with unique IDs and then deduplicate them, or, if we want just store last message processed per partition we will need exactly the same partitions number in both clusters? On Fri, Mar 20, 2015 at 10:19 AM, Guozhang Wang wangg...@gmail.com wrote: Not sure if transactional messaging will help in this case, as at least for now it is still targeted within a single DC, i.e. a transaction is only defined within a Kafka cluster, not across clusters. Guozhang On Fri, Mar 20, 2015 at 10:08 AM, Jon Bringhurst jbringhu...@linkedin.com.invalid wrote: Hey Kane, When mirrormakers loose offsets on catastrophic failure, you generally have two options. You can keep auto.offset.reset set to latest and handle the loss of messages, or you can have it set to earliest and handle the duplication of messages. Although we try to avoid duplicate messages overall, when failure happens, we (mostly) take the earliest path and deal with the duplication of messages. If your application doesn't treat messages as idempotent, you might be able to get away with something like couchbase or memcached with a TTL slightly higher than your Kafka retention time and use that to filter duplicates. Another pattern may be to deduplicate messages in Hadoop before taking action on them. -Jon P.S. An option in the future might be https://cwiki.apache.org/confluence/display/KAFKA/Transactional+Messaging+in+Kafka On Mar 19, 2015, at 5:32 PM, Kane Kim kane.ist...@gmail.com wrote: Hello, What's the best strategy for failover when using mirror-maker to replicate across datacenters? As I understand offsets in both datacenters will be different, how consumers should be reconfigured to continue reading from the same point where they stopped without data loss and/or duplication? Thanks. -- -- Guozhang -- -- Guozhang -- -- Guozhang
Re: Check topic exists after deleting it.
Thanks for your prompt response. In my check for topicExists i will add a check for topic in DeleteTopicsPath. On Mon, Mar 23, 2015 at 8:21 PM, Harsha ka...@harsha.io wrote: Just to be clear, one needs to stop producers and consumers that writing/reading from a topic “test” if they are trying to delete that specific topic “test”. Not all producers and clients. -- Harsha On March 23, 2015 at 10:13:47 AM, Harsha (harsh...@fastmail.fm) wrote: Currently we have auto.create.topics.enable set to true by default. If this is set true any one who is making TopicMetadataRequest can create a topic . As both producers and consumers can send TopicMetadataRequest which will create a topic if the above config is true. So while doing deletion if there is producer or consumer running it can re-create a topic thats in deletion process. This issue going to be addressed in upcoming versions. Meanwhile if you are not creating topics via producer than turn this config off or stop producer and consumers while you are trying to delete a topic. -- Harsha On March 23, 2015 at 9:57:53 AM, Grant Henke (ghe...@cloudera.com) wrote: What happens when producers or consumers are running while the topic deleting is going on? On Mon, Mar 23, 2015 at 10:02 AM, Harsha ka...@harsha.io wrote: DeleteTopic makes a node in zookeeper to let controller know that there is a topic up for deletion. This doesn’t immediately delete the topic it can take time depending if all the partitions of that topic are online and brokers are available as well. Once all the Log files deleted zookeeper node gets deleted as well. Also make sure you don’t have any producers or consumers are running while the topic deleting is going on. -- Harsha On March 23, 2015 at 1:29:50 AM, anthony musyoki ( anthony.musy...@gmail.com) wrote: On deleting a topic via TopicCommand.deleteTopic() I get Topic test-delete is marked for deletion. I follow up by checking if the topic exists by using AdminUtils.topicExists() which suprisingly returns true. I expected AdminUtils.TopicExists() to check both BrokerTopicsPath and DeleteTopicsPath before returning a verdict but it only checks BrokerTopicsPath Shouldn't a topic marked for deletion return false for topicExists() ? -- Grant Henke Solutions Consultant | Cloudera ghe...@cloudera.com | 920-980-8979 twitter.com/ghenke http://twitter.com/gchenke | linkedin.com/in/granthenke
Kafka Sync Producer threads(ack=0) are blocked
Hi All, Currently we are using kafka_2.8.0-0.8.0-beta1 in our production system. I am using sync producer with ack=0 to send the events to broker. but I am seeing most of my producer threads are blocked. jmsListnerTaskExecutor-818 prio=10 tid=0x7f3f5c05a800 nid=0x1719 waiting for monitor entry [0x7f405935e000] * java.lang.Thread.State: BLOCKED (on object monitor)* *at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:53)* *- waiting to lock 0x000602358ee8 (a java.lang.Object)* at kafka.producer.Producer.send(Producer.scala:74) at kafka.javaapi.producer.Producer.send(Producer.scala:32) at com.snapdeal.coms.kafka.KafkaProducer.send(KafkaProducer.java:135) at com.snapdeal.coms.kafka.KafkaEventPublisher.publishCOMSEventOnESB(KafkaEventPublisher.java:61) at com.snapdeal.coms.service.EventsPublishingService.publishStateChangeEvent(EventsPublishingService.java:88) at com.snapdeal.coms.publisher.core.PublisherEventPublishingService.publishUploadIdState(PublisherEventPublishingService.java:46) at com.snapdeal.coms.publisher.splitter.VendorProductUpdateSplitter.split(VendorProductUpdateSplitter.java:112) at sun.reflect.GeneratedMethodAccessor227.invoke(Unknown Source) [image: Inline image 1] Jvisualvm also shows that most of the time producer threads are in blocked state though I don't see any exception in kafka sever logs. Any insight??
Re: [ANN] sqlstream: Simple MySQL binlog to Kafka stream
I created a wiki page that lists all the MySQL replication options that people posted, plus a couple others. People may/may not find it useful. https://github.com/wushujames/mysql-cdc-projects/wiki I wasn't sure where to host it, so I put it up on a Github Wiki. -James On Mar 17, 2015, at 11:09 PM, Xiao lixiao1...@gmail.com wrote: Linkedin Gabblin compaction tool is using Hive to perform the compaction. Does it mean Lumos is replaced? Confused… On Mar 17, 2015, at 10:00 PM, Xiao lixiao1...@gmail.com wrote: Hi, all, Do you know whether Linkedin plans to open source Lumos in the near future? I found the answer from Qiao Lin’s post about replication from Oracle/mySQL to Hadoop. - https://engineering.linkedin.com/data-ingestion/gobblin-big-data-ease At the source side, it can be DataBus-based or file based. At the target side, it is Lumos to rebuild the snapshots due to inability to do an update/delete in Hadoop. The slides about Lumos: http://www.slideshare.net/Hadoop_Summit/th-220p230-cramachandranv1 The talk about Lumos: https://www.youtube.com/watch?v=AGlRjlrNDYk Event publishing is different from database replication. Kafka is used for change publishing or maybe also used for sending changes (recorded in files). Thanks, Xiao Li On Mar 17, 2015, at 7:26 PM, Arya Ketan ketan.a...@gmail.com wrote: AFAIK , linkedin uses databus to do the same. Aesop is built on top of databus , extending its beautiful capabilities to mysql n hbase On Mar 18, 2015 7:37 AM, Xiao lixiao1...@gmail.com wrote: Hi, all, Do you know how Linkedin team publishes changed rows in Oracle to Kafka? I believe they already knew the whole problem very well. Using triggers? or directly parsing the log? or using any Oracle GoldenGate interfaces? Any lesson or any standard message format? Could the Linkedin people share it with us? I believe it can help us a lot. Thanks, Xiao Li On Mar 17, 2015, at 12:26 PM, James Cheng jch...@tivo.com wrote: This is a great set of projects! We should put this list of projects on a site somewhere so people can more easily see and refer to it. These aren't Kafka-specific, but most seem to be MySQL CDC. Does anyone have a place where they can host a page? Preferably a wiki, so we can keep it up to date easily. -James On Mar 17, 2015, at 8:21 AM, Hisham Mardam-Bey hisham.mardam...@gmail.com wrote: Pretty much a hijack / plug as well (= https://github.com/mardambey/mypipe MySQL binary log consumer with the ability to act on changed rows and publish changes to different systems with emphasis on Apache Kafka. Mypipe currently encodes events using Avro before pushing them into Kafka and is Avro schema repository aware. The project is young; and patches for improvements are appreciated (= On Mon, Mar 16, 2015 at 10:35 PM, Arya Ketan ketan.a...@gmail.com wrote: Great work. Sorry for kinda hijacking this thread, but I though that we had built some-thing on mysql bin log event propagator and wanted to share it . You guys can also look into Aesop ( https://github.com/Flipkart/aesop ). Its a change propagation frame-work. It has relays which listens to bin logs of Mysql, keeps track of SCNs and has consumers which can then (transform/map or interpret as is) the bin log-event to a destination. Consumers also keep track of SCNs and a slow consumer can go back to a previous SCN if it wants to re-listen to events ( similar to kafka's consumer view ). All the producers/consumers are extensible and you can write your own custom consumer and feed off the data to it. Common use-cases: a) Archive mysql based data into say hbase b) Move mysql based data to say a search store for serving reads. It has a decent ( not an awesome :) ) console too which gives a nice human readable view of where the producers and consumers are. Current supported producers are mysql bin logs, hbase wall-edits. Further insights/reviews/feature reqs/pull reqs/advices are all welcome. -- Arya Arya On Tue, Mar 17, 2015 at 1:48 AM, Gwen Shapira gshap...@cloudera.com wrote: Really really nice! Thank you. On Mon, Mar 16, 2015 at 7:18 AM, Pierre-Yves Ritschard p...@spootnik.org wrote: Hi kafka, I just wanted to mention I published a very simple project which can connect as MySQL replication client and stream replication events to kafka: https://github.com/pyr/sqlstream When you don't have control over an application, it can provide a simple way of consolidating SQL data in kafka. This is an early release and there are a few caveats (mentionned in the README), mostly the poor partitioning which I'm going to evolve quickly and the reconnection strategy which doesn't try to keep track of binlog position, other than that, it should work as advertised. Cheers, - pyr -- Hisham Mardam-Bey http://hisham.cc/
Kafka Meetup @ LinkedIn 3/24
Hey Everyone – Just a reminder about the Meetup tomorrow night @ LinkedIn. There will be 3 talks: Offset management - 6:35PM Joel Koshy(LinkedIn) The Netflix Data Pipeline - 7:05PM - Allen Wang Steven Wu(Netflix) Best Practices - 7:50PM - Jay Kreps(Confluent) If you are interested in attending please sign up at the link below: http://www.meetup.com/http-kafka-apache-org/events/220355031/ We will also be recording streaming the event live at: http://www.ustream.tv/linkedin-events -Clark *Clark Elliott Haskins III* LinkedIn DDS Site Reliability Engineering Kafka, Zookeeper, Samza SRE Manager https://www.linkedin.com/in/clarkhaskins *There is no place like 127.0.0.1*
Re: kafka audit
We've talked about it a little bit, but part of the problem is that it is pretty well integrated into our infrastructure, and as such it's hard to pull it out. I illustrated this a little differently than Jon did in my latest blog post (http://engineering.linkedin.com/kafka/running-kafka-scale), how the producer (and consumer) bits that handle audit are integrated in our internal libraries that wrap the open source libraries. Between the schema-registry, the publishing of the audit data back into Kafka, the audit consumers, and the database that is needed for storing the audit data, it gets woven in pretty tightly. Confluent has made a start on this by releasing a stack with schemas integrated in. This is probably a good place to start as far as building an open source audit service. -Todd On Mon, Mar 23, 2015 at 12:47 AM, Navneet Gupta (Tech - BLR) navneet.gu...@flipkart.com wrote: Are there any plans to open source the same? What alternates do we have here? We are building an internal auditing framework for our entire big data pipeline. Kafka is one of the data sources we have (ingested data). On Mon, Mar 23, 2015 at 1:03 PM, tao xiao xiaotao...@gmail.com wrote: Linkedin has an excellent tool that monitors lag/data loss/data duplication and etc. Here is the reference http://www.slideshare.net/JonBringhurst/kafka-audit-kafka-meetup-january-27th-2015 it is not open sourced though. On Mon, Mar 23, 2015 at 3:26 PM, sunil kalva kalva.ka...@gmail.com wrote: Hi What is best practice for adding audit feature in kafka, Is there any framework available for enabling audit feature at producer and consumer level and any UI frameworks for monitoring. tx SunilKalva -- Regards, Tao -- Thanks Regards, Navneet Gupta
Re: Mirror maker fetcher thread unexpectedly stopped
I think I worked out the answer to question 1. java.lang.IllegalMonitorStateException was thrown due to no ownership of ReentrantLock when trying to call await() on the lock condition. Here is the code snippet from the AbstractFetcherThread.scala in trunk partitionMapLock synchronized { partitionsWithError ++= partitionMap.keys // there is an error occurred while fetching partitions, sleep a while partitionMapCond.await(fetchBackOffMs, TimeUnit.MILLISECONDS) } as shown above partitionMapLock is not acquired before calling partitionMapCond.await we can fix this by explicitly calling partitionMapLock.lock(). below code block should work inLock(partitionMapLock) { partitionsWithError ++= partitionMap.keys // there is an error occurred while fetching partitions, sleep a while partitionMapCond.await(fetchBackOffMs, TimeUnit.MILLISECONDS) } On Mon, Mar 23, 2015 at 1:50 PM, tao xiao xiaotao...@gmail.com wrote: Hi, I was running a mirror maker and got java.lang.IllegalMonitorStateException that caused the underlying fetcher thread completely stopped. Here is the log from mirror maker. [2015-03-21 02:11:53,069] INFO Reconnect due to socket error: java.io.EOFException: Received -1 when reading from channel, socket has likely been closed. (kafka.consumer.SimpleConsumer) [2015-03-21 02:11:53,081] WARN [ConsumerFetcherThread-localhost-1426853968478-4ebaa354-0-6], Error in fetch Name: FetchRequest; Version: 0; CorrelationId: 2398588; ClientId: phx-slc-mm-user-3; ReplicaId: -1; MaxWait: 100 ms; MinBytes: 1 bytes; RequestInfo: [test.topic,0] - PartitionFetchInfo(3766065,1048576). Possible cause: java.nio.channels.ClosedChannelException (kafka.consumer.ConsumerFetcherThread) [2015-03-21 02:11:53,083] ERROR [ConsumerFetcherThread-localhost-1426853968478-4ebaa354-0-6], Error due to (kafka.consumer.ConsumerFetcherThread) java.lang.IllegalMonitorStateException at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:155) at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1260) at java.util.concurrent.locks.AbstractQueuedSynchronizer.fullyRelease(AbstractQueuedSynchronizer.java:1723) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2166) at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:106) at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:90) at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60) [2015-03-21 02:11:53,083] INFO [ConsumerFetcherThread-localhost-1426853968478-4ebaa354-0-6], Stopped (kafka.consumer.ConsumerFetcherThread) I am still investigating what caused the connection error on server side but I have a couple of questions related to mirror maker itself 1. What is root cause of java.lang.IllegalMonitorStateException? As shown in the AbstractFetcherThread source the fetcher thread should catch the java.io.EOFException thrown from underlying simplyConsumer and sleep a while before next run. 2. Mirror maker is unaware of the termination of fetcher thread. That makes it unable to detect the failure and trigger rebalancing. I have 3 mirror maker instances running in 3 different machines listening to the same topic. I would expect the mirror maker will release the partition ownership when underlying fetcher thread terminates so that rebalancing can be triggered.but in fact this is not the case. is this expected behavior or do I miss configure anything? I am running the trunk version as of commit 82789e75199fdc1cae115c5c2eadfd0f1ece4d0d -- Regards, Tao -- Regards, Tao
Check topic exists after deleting it.
On deleting a topic via TopicCommand.deleteTopic() I get Topic test-delete is marked for deletion. I follow up by checking if the topic exists by using AdminUtils.topicExists() which suprisingly returns true. I expected AdminUtils.TopicExists() to check both BrokerTopicsPath and DeleteTopicsPath before returning a verdict but it only checks BrokerTopicsPath Shouldn't a topic marked for deletion return false for topicExists() ?
Re: kafka audit
Are there any plans to open source the same? What alternates do we have here? We are building an internal auditing framework for our entire big data pipeline. Kafka is one of the data sources we have (ingested data). On Mon, Mar 23, 2015 at 1:03 PM, tao xiao xiaotao...@gmail.com wrote: Linkedin has an excellent tool that monitors lag/data loss/data duplication and etc. Here is the reference http://www.slideshare.net/JonBringhurst/kafka-audit-kafka-meetup-january-27th-2015 it is not open sourced though. On Mon, Mar 23, 2015 at 3:26 PM, sunil kalva kalva.ka...@gmail.com wrote: Hi What is best practice for adding audit feature in kafka, Is there any framework available for enabling audit feature at producer and consumer level and any UI frameworks for monitoring. tx SunilKalva -- Regards, Tao -- Thanks Regards, Navneet Gupta
Re: kafka audit
Linkedin has an excellent tool that monitors lag/data loss/data duplication and etc. Here is the reference http://www.slideshare.net/JonBringhurst/kafka-audit-kafka-meetup-january-27th-2015 it is not open sourced though. On Mon, Mar 23, 2015 at 3:26 PM, sunil kalva kalva.ka...@gmail.com wrote: Hi What is best practice for adding audit feature in kafka, Is there any framework available for enabling audit feature at producer and consumer level and any UI frameworks for monitoring. tx SunilKalva -- Regards, Tao
Re: Check topic exists after deleting it.
Currently we have auto.create.topics.enable set to true by default. If this is set true any one who is making TopicMetadataRequest can create a topic . As both producers and consumers can send TopicMetadataRequest which will create a topic if the above config is true. So while doing deletion if there is producer or consumer running it can re-create a topic thats in deletion process. This issue going to be addressed in upcoming versions. Meanwhile if you are not creating topics via producer than turn this config off or stop producer and consumers while you are trying to delete a topic. -- Harsha On March 23, 2015 at 9:57:53 AM, Grant Henke (ghe...@cloudera.com) wrote: What happens when producers or consumers are running while the topic deleting is going on? On Mon, Mar 23, 2015 at 10:02 AM, Harsha ka...@harsha.io wrote: DeleteTopic makes a node in zookeeper to let controller know that there is a topic up for deletion. This doesn’t immediately delete the topic it can take time depending if all the partitions of that topic are online and brokers are available as well. Once all the Log files deleted zookeeper node gets deleted as well. Also make sure you don’t have any producers or consumers are running while the topic deleting is going on. -- Harsha On March 23, 2015 at 1:29:50 AM, anthony musyoki ( anthony.musy...@gmail.com) wrote: On deleting a topic via TopicCommand.deleteTopic() I get Topic test-delete is marked for deletion. I follow up by checking if the topic exists by using AdminUtils.topicExists() which suprisingly returns true. I expected AdminUtils.TopicExists() to check both BrokerTopicsPath and DeleteTopicsPath before returning a verdict but it only checks BrokerTopicsPath Shouldn't a topic marked for deletion return false for topicExists() ? -- Grant Henke Solutions Consultant | Cloudera ghe...@cloudera.com | 920-980-8979 twitter.com/ghenke http://twitter.com/gchenke | linkedin.com/in/granthenke
Re: Check topic exists after deleting it.
What happens when producers or consumers are running while the topic deleting is going on? On Mon, Mar 23, 2015 at 10:02 AM, Harsha ka...@harsha.io wrote: DeleteTopic makes a node in zookeeper to let controller know that there is a topic up for deletion. This doesn’t immediately delete the topic it can take time depending if all the partitions of that topic are online and brokers are available as well. Once all the Log files deleted zookeeper node gets deleted as well. Also make sure you don’t have any producers or consumers are running while the topic deleting is going on. -- Harsha On March 23, 2015 at 1:29:50 AM, anthony musyoki ( anthony.musy...@gmail.com) wrote: On deleting a topic via TopicCommand.deleteTopic() I get Topic test-delete is marked for deletion. I follow up by checking if the topic exists by using AdminUtils.topicExists() which suprisingly returns true. I expected AdminUtils.TopicExists() to check both BrokerTopicsPath and DeleteTopicsPath before returning a verdict but it only checks BrokerTopicsPath Shouldn't a topic marked for deletion return false for topicExists() ? -- Grant Henke Solutions Consultant | Cloudera ghe...@cloudera.com | 920-980-8979 twitter.com/ghenke http://twitter.com/gchenke | linkedin.com/in/granthenke
Re: Check topic exists after deleting it.
Just to be clear, one needs to stop producers and consumers that writing/reading from a topic “test” if they are trying to delete that specific topic “test”. Not all producers and clients. -- Harsha On March 23, 2015 at 10:13:47 AM, Harsha (harsh...@fastmail.fm) wrote: Currently we have auto.create.topics.enable set to true by default. If this is set true any one who is making TopicMetadataRequest can create a topic . As both producers and consumers can send TopicMetadataRequest which will create a topic if the above config is true. So while doing deletion if there is producer or consumer running it can re-create a topic thats in deletion process. This issue going to be addressed in upcoming versions. Meanwhile if you are not creating topics via producer than turn this config off or stop producer and consumers while you are trying to delete a topic. -- Harsha On March 23, 2015 at 9:57:53 AM, Grant Henke (ghe...@cloudera.com) wrote: What happens when producers or consumers are running while the topic deleting is going on? On Mon, Mar 23, 2015 at 10:02 AM, Harsha ka...@harsha.io wrote: DeleteTopic makes a node in zookeeper to let controller know that there is a topic up for deletion. This doesn’t immediately delete the topic it can take time depending if all the partitions of that topic are online and brokers are available as well. Once all the Log files deleted zookeeper node gets deleted as well. Also make sure you don’t have any producers or consumers are running while the topic deleting is going on. -- Harsha On March 23, 2015 at 1:29:50 AM, anthony musyoki ( anthony.musy...@gmail.com) wrote: On deleting a topic via TopicCommand.deleteTopic() I get Topic test-delete is marked for deletion. I follow up by checking if the topic exists by using AdminUtils.topicExists() which suprisingly returns true. I expected AdminUtils.TopicExists() to check both BrokerTopicsPath and DeleteTopicsPath before returning a verdict but it only checks BrokerTopicsPath Shouldn't a topic marked for deletion return false for topicExists() ? -- Grant Henke Solutions Consultant | Cloudera ghe...@cloudera.com | 920-980-8979 twitter.com/ghenke http://twitter.com/gchenke | linkedin.com/in/granthenke
Re: KafkaSpout forceFromStart Issue
Hi Francois, Looks like this belong storm mailing lists. Can you please send this question on storm mailing lists. Thanks, Harsha On March 23, 2015 at 11:17:47 AM, François Méthot (fmetho...@gmail.com) wrote: Hi, We have a storm topology that uses Kafka to read a topic with 6 partitions. ( Kafka 0.8.2, Storm 0.9.3 ) Recently, we had to set the KafkaSpout to read from the beginning, so we temporary configured our KafkaConfig this way: kafkaConfig.forceFromStart=true kafkaConfig.startOffsetTime = OffsetRequest.EarliestTime() It worked well, but afterward, setting those parameters back to false and to LatestTime respectively had no effect. In fact the topology won't read from our topic anymore. When the topology starts, The spout successully logs the offset and consumer group's cursor position for each partition in the worker log. But nothing is read. The only way we can read back from our Topic is to give our SpoutConfig a new Kafka ConsumerGroup Id. So it looks like, if we don't want to modify any KafkaSpout/Kafka code, the only way to read from the beginning would be to write the position we want to read from in Zookeeper for our Consumer Group where offset are stored and to restart our topology. Would anyone know if this is a bug in the KafkaSpout or an issue inherited from bug in Kafka? Thanks Francois
Re: Updates To cwiki For Producer
If you have feedback, don't hesitate to comment on the JIRA. On Mon, Mar 23, 2015 at 4:19 PM, Pete Wright pwri...@rubiconproject.com wrote: Hi Gwen - thanks for sending this along. I'll patch my local checkout and take a look at this. Cheers, -pete On 03/20/15 21:16, Gwen Shapira wrote: We have a patch with examples: https://issues.apache.org/jira/browse/KAFKA-1982 Unfortunately, its not committed yet. Gwen On Fri, Mar 20, 2015 at 11:24 AM, Pete Wright pwri...@rubiconproject.com wrote: Thanks that's helpful. I am working on an example producer using the new API, if I have any helpful notes or examples I'll share that. I was basically trying to be lazy and poach some example code as a starting point for our internal tests :) Cheers, -pete On 03/20/15 10:59, Guozhang Wang wrote: For the new java producer, its java doc can be found here: http://kafka.apache.org/083/javadoc/index.html?org/apache/kafka/clients/producer/KafkaProducer.html We can update the wiki if there are some examples that are still missing from this java doc. Guozhang On Thu, Mar 19, 2015 at 4:37 PM, Pete Wright pwri...@rubiconproject.com wrote: Hi, Is there a plan to update the producer documentation on the wiki located here: https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+Producer+Example This would be helpful for people working on implementing the new producer class deployed in 0.8.2.x. If there are any patches available for this or other docs available for reference that would be helpful as well. Thanks! -pete -- Pete Wright Lead Systems Architect Rubicon Project pwri...@rubiconproject.com 310.309.9298 -- Pete Wright Lead Systems Architect Rubicon Project pwri...@rubiconproject.com 310.309.9298 -- Pete Wright Lead Systems Architect Rubicon Project pwri...@rubiconproject.com 310.309.9298
Re: Updates To cwiki For Producer
Hi Gwen - thanks for sending this along. I'll patch my local checkout and take a look at this. Cheers, -pete On 03/20/15 21:16, Gwen Shapira wrote: We have a patch with examples: https://issues.apache.org/jira/browse/KAFKA-1982 Unfortunately, its not committed yet. Gwen On Fri, Mar 20, 2015 at 11:24 AM, Pete Wright pwri...@rubiconproject.com wrote: Thanks that's helpful. I am working on an example producer using the new API, if I have any helpful notes or examples I'll share that. I was basically trying to be lazy and poach some example code as a starting point for our internal tests :) Cheers, -pete On 03/20/15 10:59, Guozhang Wang wrote: For the new java producer, its java doc can be found here: http://kafka.apache.org/083/javadoc/index.html?org/apache/kafka/clients/producer/KafkaProducer.html We can update the wiki if there are some examples that are still missing from this java doc. Guozhang On Thu, Mar 19, 2015 at 4:37 PM, Pete Wright pwri...@rubiconproject.com wrote: Hi, Is there a plan to update the producer documentation on the wiki located here: https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+Producer+Example This would be helpful for people working on implementing the new producer class deployed in 0.8.2.x. If there are any patches available for this or other docs available for reference that would be helpful as well. Thanks! -pete -- Pete Wright Lead Systems Architect Rubicon Project pwri...@rubiconproject.com 310.309.9298 -- Pete Wright Lead Systems Architect Rubicon Project pwri...@rubiconproject.com 310.309.9298 -- Pete Wright Lead Systems Architect Rubicon Project pwri...@rubiconproject.com 310.309.9298
Re: Anyone interested in speaking at Bay Area Kafka meetup @ LinkedIn on March 24?
Hi Jon, It the link for the 1/27 meetup you posted works for me, but I haven't found how to find that same link on the meetup site (there are links that point to the live stream, which of course is no longer happening!). Thoughts? Thanks, Jason On Mon, Mar 2, 2015 at 11:31 AM, Jon Bringhurst jbringhu...@linkedin.com.invalid wrote: The meetups are recorded. For example, here's a link to the January meetup: http://www.ustream.tv/recorded/58109076 The links to the recordings are usually posted to the comments for each meetup on http://www.meetup.com/http-kafka-apache-org/ -Jon On Feb 23, 2015, at 3:24 PM, Ruslan Khafizov ruslan.khafi...@gmail.com wrote: +1 For recording sessions. On 24 Feb 2015 07:22, Jiangjie Qin j...@linkedin.com.invalid wrote: +1, I¹m very interested. On 2/23/15, 3:05 PM, Jay Kreps jay.kr...@gmail.com wrote: +1 I think something like Kafka on AWS at Netflix would be hugely interesting to a lot of people. -Jay On Mon, Feb 23, 2015 at 3:02 PM, Allen Wang aw...@netflix.com.invalid wrote: We (Steven Wu and Allen Wang) can talk about Kafka use cases and operations in Netflix. Specifically, we can talk about how we scale and operate Kafka clusters in AWS and how we migrate our data pipeline to Kafka. Thanks, Allen On Mon, Feb 23, 2015 at 12:15 PM, Ed Yakabosky eyakabo...@linkedin.com.invalid wrote: Hi Kafka Open Source - LinkedIn will host another Bay Area Kafka meetup in Mountain View on March 24. We are planning to present on Offset Management but are looking for additional speakers. If you¹re interested in presenting a use case, operational plan, or your experience with a particular feature (REST interface, WebConsole), please reply-all to let us know. [BCC: Open Source lists] Thanks, Ed