Re: Kafka/zookeeper logs in every command

2018-01-29 Thread José Ribeiro
Hello, I tried what you told me to, but I already have tools-log4j.properties set to WARN, and I don't have a KAFKA_LOG4J_OPTS variable set. I even tried to disable all the logs, but the same thing happens. It's not that Kafka doesn't work, because it does, but it shows too much info. Could

Log segment deletion

2018-01-29 Thread Martin Kleppmann
Hi all, We are debugging an issue with a Kafka Streams application that is producing incorrect output. The application is a simple group-by on a key, and then count. As expected, the application creates a repartitioning topic for the group-by stage. The problem appears to be that messages are g

Re: Log segment deletion

2018-01-29 Thread Martin Kleppmann
Follow-up: I think we figured out what was happening. Setting the broker config log.message.timestamp.type=LogAppendTime (instead of the default value CreateTime) stopped the messages disappearing. The messages in the Streams app's input topic are older than the 24 hours default retention perio
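
A minimal sketch of the effect described in this follow-up, assuming a simplified model in which a log segment becomes eligible for time-based deletion once its largest record timestamp is older than the retention period. The `Record` class, the `segment_expired` function, and the 24-hour constant are hypothetical names and values used for illustration only; they are not Kafka APIs, and a real broker applies further rules that this toy model ignores.

```python
from dataclasses import dataclass

DAY_MS = 24 * 60 * 60 * 1000
RETENTION_MS = DAY_MS  # assumption: the 24-hour retention mentioned in the thread


@dataclass
class Record:
    create_time_ms: int  # timestamp set by the producer ("event time")
    append_time_ms: int  # broker wall-clock time when the record was appended


def segment_expired(records, now_ms, timestamp_type, retention_ms=RETENTION_MS):
    """Return True if a segment holding `records` is past time-based retention.

    Simplified model: only the largest timestamp in the segment is compared
    with the broker's current wall-clock time.
    """
    if timestamp_type == "CreateTime":
        largest = max(r.create_time_ms for r in records)
    elif timestamp_type == "LogAppendTime":
        largest = max(r.append_time_ms for r in records)
    else:
        raise ValueError(f"unknown timestamp type: {timestamp_type}")
    return now_ms - largest > retention_ms


# A record whose event time is three days old, written to the log just now.
now = 10 * DAY_MS
segment = [Record(create_time_ms=now - 3 * DAY_MS, append_time_ms=now)]

expired_with_create_time = segment_expired(segment, now, "CreateTime")
expired_with_append_time = segment_expired(segment, now, "LogAppendTime")
```

In this model the segment counts as expired under CreateTime, because the producer-supplied timestamp is already older than the retention period, and as not expired under LogAppendTime, because the broker stamped the record at write time.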

Re: JSONSchema Kafka Connect Converter

2018-01-29 Thread Randall Hauch
Hi, Andrew. The Converter is part of Connect's public API, so it certainly is valid/encouraged to create new implementations when that makes sense for users. I know of several Converter implementations that are outside of the Apache Kafka project. The project's existing JSON converter is fairly li

Choose the number of partitions/topics

2018-01-29 Thread Maria Pilar
Hi everyone, I have designed an integration between 2 systems through our API Stream Kafka, and the requirements are too unclear to choose the number of partitions/topics properly. This is the use case: my producer will send 28 different types of events, so I have decided to create 28 topics. The max si

Re: skipped-records-rate vs skippedDueToDeserializationError-rate metric in streams app

2018-01-29 Thread Guozhang Wang
Hi Srikanth, How did you set the LogAndContinueExceptionHandler in the configs? Could you copy the code snippet here? Guozhang On Sun, Jan 28, 2018 at 11:26 PM, Srikanth wrote: > Kafka-streams version "1.0.0". > > Thanks, > Srikanth > > On Mon, Jan 29, 2018 at 12:23 AM, Guozhang Wang > wrote

Re: Log segment deletion

2018-01-29 Thread Guozhang Wang
Hello Martin, What you've observed is correct. More generally speaking, for various broker-side operations that are based on record timestamps and treat them as wall-clock time, there is a mismatch between the stream records' timestamps, which are basically "event time", and the broker's system wa

New post about CDC and Kafka

2018-01-29 Thread Ofir Sharony
Hi all, The following post I wrote describes the usage of change-data-capture and Kafka for user behavior analysis, connecting database record changes to user context in near real-time. https://medium.com/myheritage-engineering/achieving-real-time-analytics-via-change-data-capture-d69ed2ead889 En

Re: Choose the number of partitions/topics

2018-01-29 Thread Chicolo, Robert (rchic...@student.cccs.edu)
So it goes beyond the throughput that Kafka can support. You have to decide what degree of parallelism your application can support. If processing one message depends on the processing of another message, that limits the degree to which you can process in parallel. Depending on how much time

Kafka Consumers not rebalancing.

2018-01-29 Thread satyajit vegesna
Hi All, I was experimenting with the new consumer API and have a question regarding the rebalance process. I start a consumer group with a single thread and make the thread sleep while processing the records retrieved from the first consumer.poll call. I was making sure the Thread.sleep time goes bey

monitor consumer offset lag script/code

2018-01-29 Thread Sunil Parmar
We're using 0.9 (CDH) and consumer offsets are stored within Kafka. What is the preferred way to get consumer offsets from code or a script for monitoring? Is there any sample code/script to do so? Thanks, Sunil Parmar
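
As a rough illustration of what such a monitoring script usually computes: lag per partition is the log end offset minus the group's committed offset. The sketch below covers only that arithmetic. The `consumer_lag` function and the sample offsets are hypothetical, and how the two offset maps are fetched from a 0.9 cluster is deliberately left out.

```python
def consumer_lag(log_end_offsets, committed_offsets):
    """Compute per-partition lag from two {partition: offset} mappings.

    A partition with no committed offset gets a lag of None, since the
    group has not recorded a position for it yet.
    """
    lag = {}
    for partition, end_offset in log_end_offsets.items():
        committed = committed_offsets.get(partition)
        if committed is None:
            lag[partition] = None
        else:
            # Clamp at zero in case the two values were sampled at slightly
            # different moments.
            lag[partition] = max(0, end_offset - committed)
    return lag


# Hypothetical sample values for a three-partition topic.
end_offsets = {0: 1500, 1: 980, 2: 40}
committed = {0: 1450, 1: 980}

lag_by_partition = consumer_lag(end_offsets, committed)
```

With these sample values, partition 0 is 50 records behind, partition 1 is caught up, and partition 2 has no committed offset.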

Re: Recommended max number of topics (and data separation)

2018-01-29 Thread Andrey Falko
On Sun, Jan 28, 2018 at 8:45 AM, David Espinosa wrote: > Hi Monty, > > I'm also planning to use a big amount of topics in Kafka, so recently I > made a test within a 3 nodes kafka cluster where I created 100k topics with > one partition. Sent 1M messages in total. Are your topic partitions replic

ReadOnlyKeyValueStore.range API

2018-01-29 Thread Debasish Ghosh
Hello - The above API gives me the range of values between fromKey and toKey for a local state store. Suppose I have an application running in distributed mode (multiple nodes, same application id). How does this API translate to multiple nodes? I know the basic implementation is for a local nod
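
A toy model of the situation in this question, assuming each application instance holds only the keys of the partitions assigned to it, so that a range query on one instance sees only that instance's share. Plain dictionaries stand in for the local state stores. `local_range` and `global_range` are hypothetical helpers rather than Kafka Streams APIs, and treating both bounds as inclusive is an assumption of this sketch.

```python
def local_range(store, from_key, to_key):
    """Return the (key, value) pairs of one instance's store within the range."""
    return sorted((k, v) for k, v in store.items() if from_key <= k <= to_key)


def global_range(stores, from_key, to_key):
    """Query every instance's store and merge the partial results."""
    merged = []
    for store in stores:
        merged.extend(local_range(store, from_key, to_key))
    # Sorted only to make the result deterministic in this sketch.
    return sorted(merged)


# Two instances of the same application, each owning a disjoint set of keys.
instance_a = {"apple": 3, "cherry": 7}
instance_b = {"banana": 5, "date": 1}

only_a = local_range(instance_a, "a", "d")
everything = global_range([instance_a, instance_b], "a", "d")
```

Here `only_a` holds just the two keys owned by the first instance, while `everything` also includes "banana" from the second instance. "date" sorts after the upper bound "d" and falls outside the range in both cases.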