Re: Potential memory leak in rocksdb

2017-02-09 Thread Pierre Coquentin
Here is the gist with the two gifs https://gist.github.com/PierreCoquentin/d2df46e5e1c0d3506f6311b343e6f775 On Fri, Feb 10, 2017 at 7:45 AM, Guozhang Wang wrote: > Pierre, > > The Apache mailing list has some restrictions on large attachments and I > think that is why your

Re: Need help in understanding bunch of rocksdb errors on kafka_2.10-0.10.1.1

2017-02-09 Thread Sachin Mittal
Hi, We are running RocksDB with the default configuration. I will try to monitor RocksDB; I do see the beans when I connect via a JMX client. We use RocksDB for aggregation. Our pipeline is: input .groupByKey() .aggregate(new Initializer() { public SortedSet apply() {
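For reference, a minimal self-contained sketch of the pipeline shape described above, written against the 0.10.x Streams API. The topic names, application id, and broker address are placeholders, and the SortedSet accumulator is replaced with a comma-joined String so the example needs no custom serde:

    import java.util.Properties;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.KStreamBuilder;
    import org.apache.kafka.streams.kstream.KTable;

    public class AggregateSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "agg-sketch");         // placeholder
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");  // placeholder
            props.put(StreamsConfig.KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
            props.put(StreamsConfig.VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

            KStreamBuilder builder = new KStreamBuilder();
            KStream<String, String> input = builder.stream("input-topic");        // placeholder
            KTable<String, String> aggregated = input
                .groupByKey()
                .aggregate(
                    () -> "",                                                     // Initializer: empty accumulator
                    (key, value, agg) -> agg.isEmpty() ? value : agg + "," + value, // Aggregator: append each value
                    Serdes.String(),                                              // serde for the aggregate value
                    "agg-store");                                                 // name of the RocksDB-backed store
            aggregated.to(Serdes.String(), Serdes.String(), "output-topic");      // placeholder

            new KafkaStreams(builder, props).start();
        }
    }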

Re: Need help in understanding bunch of rocksdb errors on kafka_2.10-0.10.1.1

2017-02-09 Thread Guozhang Wang
Sachin, Thanks for sharing your observations; they are very helpful. Regarding monitoring, there are indeed metrics inside the Streams library to meter state store operations; for RocksDB it records the average latency and call rate for put / get / delete / all / range and flush / restore. You
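A minimal sketch of pulling those per-store metrics out of a running application, assuming a KafkaStreams instance named `streams`; since the exact metric group names differ between versions, this simply filters on the substring "rocksdb":

    import java.util.Map;
    import org.apache.kafka.common.Metric;
    import org.apache.kafka.common.MetricName;
    import org.apache.kafka.streams.KafkaStreams;

    public class MetricsDump {
        // Print every RocksDB state-store metric registered by the Streams instance.
        public static void dumpRocksDbMetrics(KafkaStreams streams) {
            for (Map.Entry<MetricName, ? extends Metric> e : streams.metrics().entrySet()) {
                MetricName n = e.getKey();
                if (n.group().toLowerCase().contains("rocksdb")) {
                    System.out.printf("%s / %s %s = %f%n",
                        n.group(), n.name(), n.tags(), e.getValue().value());
                }
            }
        }
    }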

Re: Potential memory leak in rocksdb

2017-02-09 Thread Guozhang Wang
Pierre, The Apache mailing list has some restrictions on large attachments and I think that is why your gif files are not showing up. Could you try using a gist link? Guozhang On Wed, Feb 8, 2017 at 9:49 AM, Pierre Coquentin wrote: > Well, I am a little perplexed

Fw: Exception when using simple consumer fetching offsets.

2017-02-09 Thread James Teng
Hi, I am working on a monitoring system. When I try to monitor consumer offsets on a remote Kafka cluster, I get an error using the SimpleConsumer API. I found a post on Stack Overflow saying I should set advertised.host.name=xxx for all the broker servers. I discovered that all the broker servers
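For reference, the setting from that Stack Overflow post goes into each broker's server.properties; the hostname below is a placeholder, and it must be a name the monitoring host can actually resolve and reach:

    # server.properties on each broker (hostname is a placeholder)
    advertised.host.name=broker1.example.com
    advertised.port=9092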

Re: Need help in understanding bunch of rocksdb errors on kafka_2.10-0.10.1.1

2017-02-09 Thread Sachin Mittal
Hi, We recently upgraded to 0.10.2.0-rc0. The RocksDB issue seems to be resolved; however, we are just not able to get the streams going under our current scenario. The issue still seems to be related to RocksDB. Let me explain the situation: We have 40 partitions and 12 threads distributed across

Re: Getting CommitFailedException in 0.10.2.0 due to member id is not valid or unknown

2017-02-09 Thread Sachin Mittal
Hi, I managed to get the streams client log; the server logs were deleted since time had elapsed and they got rolled over. See if you can figure out something from these. These are not the best of logs. https://dl.dropboxusercontent.com/u/46450177/TestKafkaAdvice.StreamThread-1.log The above

Re: Partitioning behavior of Kafka Streams without explicit StreamPartitioner

2017-02-09 Thread Matthias J. Sax
It's by design. The reason is that Streams uses a single producer to write to different output topics. As different output topics might have different key and/or value types, the producer is instantiated with byte[] as key and value type, and Streams serializes the data before handing it to the
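The upshot is that partitioning decisions based on the typed key or value belong in a StreamPartitioner passed to to(), where the record is still unserialized. A minimal sketch against the 0.10.x API, assuming String keys and values and a placeholder topic name:

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.processor.StreamPartitioner;

    public class PartitionerSketch {
        // Route records by a hash of the typed key, before Streams serializes it.
        static void writeWithPartitioner(KStream<String, String> stream) {
            StreamPartitioner<String, String> byKeyHash =
                (key, value, numPartitions) -> (key.hashCode() & 0x7fffffff) % numPartitions;
            stream.to(Serdes.String(), Serdes.String(), byKeyHash, "output-topic"); // placeholder topic
        }
    }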

Partitioning behavior of Kafka Streams without explicit StreamPartitioner

2017-02-09 Thread Steven Schlansker
Hi, I discovered what I consider to be really confusing behavior -- wondering if this is by design or a bug. The Kafka Partitioner interface: public int partition(String topic, Object key, byte[] keyBytes, Object value, byte[] valueBytes, Cluster cluster); has both "Object value" and "byte[]
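A sketch of a custom producer Partitioner that relies only on the serialized key bytes, which is the safe choice given the discussion in this thread that the typed Object forms may just be the already-serialized byte arrays. It reuses the murmur2 hash that, to my knowledge, the default partitioner uses; the class name is a placeholder:

    import java.util.Map;
    import org.apache.kafka.clients.producer.Partitioner;
    import org.apache.kafka.common.Cluster;
    import org.apache.kafka.common.utils.Utils;

    public class KeyBytesPartitioner implements Partitioner {
        @Override
        public int partition(String topic, Object key, byte[] keyBytes,
                             Object value, byte[] valueBytes, Cluster cluster) {
            int numPartitions = cluster.partitionsForTopic(topic).size();
            if (keyBytes == null) {
                return 0; // simplistic fallback for unkeyed records
            }
            return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
        }

        @Override public void close() {}
        @Override public void configure(Map<String, ?> configs) {}
    }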

mirror-maker and firewalls

2017-02-09 Thread Robert Friberg
I'm setting up inter-cluster mirroring between cloud and on-prem. The zookeepers and brokers are on private subnets. The MirrorMaker will run at the target cluster and I'm guessing it needs to talk to both the zookeepers and brokers in the source cluster. I suppose I could set up NAT rules to
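For reference, a typical MirrorMaker invocation run at the target side (file names are placeholders): consumer.properties points at the source cluster and producer.properties at the local target cluster. With the old consumer, MirrorMaker also needs to reach the source ZooKeeper ensemble; with the new (0.9+) consumer it only needs the source brokers, which simplifies the firewall rules:

    # run on a host in the target cluster; file names are placeholders
    bin/kafka-mirror-maker.sh \
        --consumer.config consumer.properties \
        --producer.config producer.properties \
        --whitelist ".*"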

Re: Shutting down a Streams job

2017-02-09 Thread Eno Thereska
Hi Dmitry, Elias, You raise a valid point, and thanks for opening https://issues.apache.org/jira/browse/KAFKA-4748 Elias. We'll hopefully have some ideas to share soon. Eno > On 9 Feb 2017, at 16:54, Dmitry Minkovsky

ConsumerInterceptor in Tomcat app

2017-02-09 Thread Isabelle Giguère
Hi; We have recently begun to integrate Kafka into our product, using Kafka 0.10.1.0. An instance of org.apache.kafka.clients.consumer.KafkaConsumer is run from a Tomcat application. When I first read about ConsumerInterceptor, which can possibly "mutate" records, it seemed like it
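For anyone following along, a minimal sketch of the interceptor interface being discussed (class name and types are placeholders); it is wired in through the interceptor.classes consumer config rather than instantiated by the application:

    import java.util.Map;
    import org.apache.kafka.clients.consumer.ConsumerInterceptor;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.OffsetAndMetadata;
    import org.apache.kafka.common.TopicPartition;

    public class LoggingInterceptor implements ConsumerInterceptor<String, String> {
        @Override
        public ConsumerRecords<String, String> onConsume(ConsumerRecords<String, String> records) {
            // Inspect or "mutate" records here, before the application sees them.
            return records;
        }

        @Override
        public void onCommit(Map<TopicPartition, OffsetAndMetadata> offsets) {
            // Called after offsets have been committed.
        }

        @Override public void close() {}
        @Override public void configure(Map<String, ?> configs) {}
    }

It is enabled with props.put("interceptor.classes", LoggingInterceptor.class.getName()) on the consumer configuration.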

Re: Shutting down a Streams job

2017-02-09 Thread Dmitry Minkovsky
That makes sense. That's what I was kind of worried about (launching soon). Hope someone else posts! On Wed, 8 Feb 2017 at 16:54, Elias Levy : > It is certainly possible, but when you got dozens of workers, that would > take a very long time, especially if you got a

Re: ProcessorContext commit question

2017-02-09 Thread Eno Thereska
It's not needed if you are just doing the equivalent of to() and you don't have a state store. I'm realising the Javadoc for commit() needs updating; it doesn't explain much at the moment. Eno > On 9 Feb 2017, at 16:22, Adrian McCague wrote: > > Thanks Eno that

RE: ProcessorContext commit question

2017-02-09 Thread Adrian McCague
Thanks Eno, that makes sense. If, then, this is an implementation of Transformer in a DSL topology with DSL sinks, i.e. `to()`, is the commit surplus to requirements? I suspect it will do no harm at the very least. Thanks Adrian -Original Message- From: Eno Thereska

Re: ProcessorContext commit question

2017-02-09 Thread Eno Thereska
Hi Adrian, It's also done in the DSL, but at a different point, in Task.commit(), since the flow is slightly different. Yes, once data is stored in stores, the offsets should be committed, so in case of a crash the same offsets are not processed again. Thanks Eno > On 9 Feb 2017, at 16:06,

ProcessorContext commit question

2017-02-09 Thread Adrian McCague
Hi all, In Processor and Transformer implementations, what are the use cases for calling `context.commit()`? Examples imply it should be called when state store modifications are complete, but the Streams DSL implementations do not fall in line with the examples, e.g. KStreamAggregate. Thanks Adrian
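For context, a minimal sketch of the pattern the examples imply (the store name is a placeholder); note that context.commit() only requests a commit at the next opportunity rather than committing synchronously:

    import org.apache.kafka.streams.processor.Processor;
    import org.apache.kafka.streams.processor.ProcessorContext;
    import org.apache.kafka.streams.state.KeyValueStore;

    public class CountingProcessor implements Processor<String, String> {
        private ProcessorContext context;
        private KeyValueStore<String, Long> store;

        @Override
        @SuppressWarnings("unchecked")
        public void init(ProcessorContext context) {
            this.context = context;
            this.store = (KeyValueStore<String, Long>) context.getStateStore("counts-store");
        }

        @Override
        public void process(String key, String value) {
            Long count = store.get(key);
            store.put(key, count == null ? 1L : count + 1);
            context.commit(); // request a commit once the state-store modification is done
        }

        @Override public void punctuate(long timestamp) {}
        @Override public void close() {}
    }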

Re: How to measure the load capacity of kafka cluster

2017-02-09 Thread David Garcia
From my experience, the bottleneck of a Kafka cluster is the writing (i.e., the producers). The most direct way to measure how "stressed" the writing threads are is to directly observe the producer purgatory buffer of your brokers. The larger it gets, the more likely a leader will report an
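A sketch of polling that purgatory gauge over JMX, assuming JMX is enabled on the broker; the host and port are placeholders, and the MBean name below is my understanding for 0.8.2+ brokers and may differ by version:

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class PurgatoryProbe {
        public static void main(String[] args) throws Exception {
            JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://broker1.example.com:9999/jmxrmi"); // placeholder host:port
            try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
                MBeanServerConnection conn = connector.getMBeanServerConnection();
                ObjectName name = new ObjectName(
                    "kafka.server:type=DelayedOperationPurgatory,name=PurgatorySize,delayedOperation=Produce");
                System.out.println("Produce purgatory size: " + conn.getAttribute(name, "Value"));
            }
        }
    }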

Re: KIP-119: Drop Support for Scala 2.10 in Kafka 0.11

2017-02-09 Thread Ismael Juma
So far the feedback is positive, but there weren't many responses. I'll start a vote next week if there are no objections before then. Ismael On Fri, Feb 3, 2017 at 2:30 PM, Ismael Juma wrote: > Hi all, > > I have posted a KIP for dropping support for Scala 2.10 in Kafka

Re: Kafka consumer offset location

2017-02-09 Thread Igor Kuzmenko
I don't control the consumer directly. I'm using Apache Storm with the Kafka spout to read the topic. On Thu, Feb 9, 2017 at 5:28 PM, Mahendra Kariya wrote: > You can use the seekToBeginning method of KafkaConsumer. > >

Re: Kafka consumer offset location

2017-02-09 Thread Mahendra Kariya
You can use the seekToBeginning method of KafkaConsumer. https://kafka.apache.org/0100/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html#seekToBeginning(java.util.Collection) On Thu, Feb 9, 2017 at 7:56 PM, Igor Kuzmenko wrote: > Hello, I'm using new consumer to
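A minimal sketch of the rewind, assuming an already-configured KafkaConsumer and a placeholder topic name; the poll(0) matters because the consumer needs an assignment before there is anything to seek on:

    import java.util.Collections;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class RewindSketch {
        static void rewindToBeginning(KafkaConsumer<String, String> consumer) {
            consumer.subscribe(Collections.singletonList("test-topic")); // placeholder topic
            consumer.poll(0);                                // join the group, receive an assignment
            consumer.seekToBeginning(consumer.assignment()); // rewind every assigned partition
        }
    }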

Kafka consumer offset location

2017-02-09 Thread Igor Kuzmenko
Hello, I'm using the new consumer to read a Kafka topic. For testing, I want to read the same topic from the beginning multiple times, with the same consumer. Before restarting a test, I want to delete the consumer offsets, so the consumer starts reading from the beginning. Where can I find the offsets?

Re: Kafka Connect - Unknown magic byte

2017-02-09 Thread Nick DeCoursin
Hello, Here is a github repo with the failing case: https://github.com/decoursin/kafka-connect-test. I've tried other similar things and nothing seems to work. Thanks, Nick On 9 February 2017 at 04:40, Nick DeCoursin wrote: > Any help here? I can create a git repo

Re: Kafka Client 0.8.2.2 talks to Kafka Server 0.10.1.1

2017-02-09 Thread Ismael Juma
Jeffrey, The new Java consumer was incomplete in the 0.8.2 clients JAR, so it doesn't work in any scenario. If you are using 0.10.1 brokers, my recommendation is to use the 0.10.1 client jars. It will be much easier than using the soon-to-be-deprecated Scala consumer. Ismael On Wed, Feb 8, 2017 at

Re: Reducing replication factor of Kafka Topic

2017-02-09 Thread Karolis Pocius
If you don't want to do it manually, you can try Kafka Assigner https://github.com/linkedin/kafka-tools/wiki/Kafka-Assigner Specifically "Set Replication Factor" module https://github.com/linkedin/kafka-tools/wiki/module-set-replication-factor On 2017.02.09 13:39, Manikumar wrote: We have to

Re: Reducing replication factor of Kafka Topic

2017-02-09 Thread Manikumar
We have to manually create the replica assignment JSON file and invoke replica reassignment using the kafka-reassign-partitions.sh tool. There will be a leader change, which should not cause any data loss. On Thu, Feb 9, 2017 at 4:05 PM, Shaik M wrote: > Hi, > > Can we
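For reference, a sketch of such an assignment file for shrinking one topic from three replicas to two, with placeholder topic name and broker ids, followed by the invocation:

    # reassign.json -- list only the replicas you want to keep per partition (placeholders)
    {"version": 1,
     "partitions": [
       {"topic": "my-topic", "partition": 0, "replicas": [1, 2]},
       {"topic": "my-topic", "partition": 1, "replicas": [2, 3]}
     ]}

    # then apply it:
    bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 \
        --reassignment-json-file reassign.json --execute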

Reducing replication factor of Kafka Topic

2017-02-09 Thread Shaik M
Hi, Can we reduce the replication factor of a Kafka topic without any interruption to the current data flow? Thanks, Shaik M

Re: Getting CommitFailedException in 0.10.2.0 due to member id is not valid or unknown

2017-02-09 Thread Sachin Mittal
I am getting the logs, but could you please look at the line rebalanceException = t; https://github.com/apache/kafka/blob/0.10.2/streams/src/main/java/org/apache/kafka/streams/processor/internals/StreamThread.java#L261 Why are we setting rebalanceException in case of a commit failed exception on

Re: Kafka monitoring

2017-02-09 Thread Sharninder
All consumers will "eventually" get the messages. What is it that you want to achieve by monitoring that? For the brokers you can monitor lag, for the producers you can have a counter that tracks messages sent and for consumers have one that tracks messages consumed. Although, just tracking lag
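For the lag side of this, the tooling shipped with 0.10.x can report per-partition lag for a group; the group name below is a placeholder:

    # new-consumer groups (0.10.x syntax)
    bin/kafka-consumer-groups.sh --new-consumer --bootstrap-server localhost:9092 \
        --describe --group my-consumer-group

    # old ZooKeeper-based consumer groups
    bin/kafka-consumer-groups.sh --zookeeper localhost:2181 \
        --describe --group my-consumer-group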

Kafka monitoring

2017-02-09 Thread Nabajyoti Dash
Hi all, I have a requirement to monitor whether every message sent by the producer is received by each and every Kafka consumer or not. That is, if any message is not delivered then it should be taken care of properly. I googled but didn't find any satisfactory answers. Please suggest.

Re: Getting CommitFailedException in 0.10.2.0 due to member id is not valid or unknown

2017-02-09 Thread Damian Guy
Might be easiest to just send all the logs if possible. On Thu, 9 Feb 2017 at 08:10 Sachin Mittal wrote: > I will try to get the logs soon. > One quick question: I have three brokers which run in a cluster with default > logging. > > Which log4j logs would be of interest at

Re: Kerberos/SASL Enabled Kafka - broker fails NoAuth due ACL

2017-02-09 Thread Ashish Bhushan
Any help? On 09-Feb-2017 1:13 PM, "Ashish Bhushan" wrote: > Hi, > > I used the same principal and keytab across all brokers' JAAS files (Client > section) > > Still not working; now the second broker that starts is throwing an > 'Authentication failure' exception > > Do I need
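For reference, a sketch of the JAAS Client section in question; the keytab path, principal, and realm are placeholders, and the file is passed to each broker via -Djava.security.auth.login.config. The Client section authenticates the broker to ZooKeeper, and since ZooKeeper ACLs are keyed on the authenticated identity, every broker must present the same primary for the znode ACLs to match:

    // JAAS configuration file (placeholders throughout)
    Client {
        com.sun.security.auth.module.Krb5LoginModule required
        useKeyTab=true
        storeKey=true
        keyTab="/etc/security/keytabs/kafka.keytab"
        principal="kafka@EXAMPLE.COM";
    };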

Re: Getting CommitFailedException in 0.10.2.0 due to member id is not valid or unknown

2017-02-09 Thread Sachin Mittal
I will try to get the logs soon. One quick question: I have three brokers which run in a cluster with default logging. Which log4j logs would be of interest on the broker side, and from which broker, or do I need to send logs from all three? My topic is partitioned and replicated across all three, so kafka-logs

Re: Getting CommitFailedException in 0.10.2.0 due to member id is not valid or unknown

2017-02-09 Thread Damian Guy
Sachin, Can you provide the full logs from the broker and the streams app? It is hard to understand what is going on with only snippets of information. It seems like the rebalance is taking too long, but I can't tell from this. Thanks, Damian On Thu, 9 Feb 2017 at 07:53 Sachin Mittal