Re: [VOTE] 2.2.0 RC1

2019-03-06 Thread Matthias J. Sax
serious. Otherwise the RC1 worked fine for me. > > Jakub > > > On Fri, Mar 1, 2019 at 8:48 PM Matthias J. Sax > wrote: > >> Hello Kafka users, developers and client-developers, >> >> This is the second candidate for release of Apache Kafka 2.2.0. >

[VOTE] 2.2.0 RC1

2019-03-01 Thread Matthias J. Sax
Hello Kafka users, developers and client-developers, This is the second candidate for release of Apache Kafka 2.2.0. - Added SSL support for custom principle name - Allow SASL connections to periodically re-authenticate - Improved consumer group management - default group.id is `null` inste

Re: [VOTE] 2.2.0 RC0 [CANCELED]

2019-02-28 Thread Matthias J. Sax
) up to step 6 without >> issue. >> >> (+1 non-binding). >> >> Adam >> >> >> >> On Mon, Feb 25, 2019 at 9:19 PM Matthias J. Sax >> wrote: >> >>> @Stephane >>> >>> Thanks! You are right (I copied the list from

Re: [VOTE] 2.2.0 RC0

2019-02-27 Thread Matthias J. Sax
> If english locale is expected to build I might have missed it. > > br, Patrik > > On Sun, 24 Feb 2019 at 00:57, Matthias J. Sax wrote: > >> Hello Kafka users, developers and client-developers, >> >> This is the first candidate for the release of Apache Kafka 2.

Re: PGP No Public Key

2019-02-26 Thread Matthias J. Sax
gt; gpg: key 934EEC7DBE7EFC73: "Ewen Cheslack-Postava " not > changed > gpg: key DDAA34525234D94F: "Ewen Cheslack-Postava " not > changed > gpg: key 4B606607518830CF: "Damian Guy (CODE SIGNING KEY) > " not changed > gpg: key 0CF65F72E4609424: "Ra

[VOTE] 2.2.0 RC0

2019-02-23 Thread Matthias J. Sax
Hello Kafka users, developers and client-developers, This is the first candidate for the release of Apache Kafka 2.2.0. This is a minor release with the follow highlight: - Added SSL support for custom principle name - Allow SASL connections to periodically re-authenticate - Improved consumer

Re: Streams reset offsets for no apparent reason

2019-02-21 Thread Matthias J. Sax
resent, in which the Java consumer Fetcher may print the log message > "Resetting offset for partition {} to offset {}."? > > Regards, > Raman > > > On Thu, Feb 21, 2019 at 2:00 AM Matthias J. Sax > wrote: > >> Thanks for reporting the issue! >> >

Re: Kafka stores range ordering

2019-02-21 Thread Matthias J. Sax
It's _not_ a public contract that data of a range() query are returned ordered by key. Thus, you should not rely on it. It depends on the store implementation, and it just happens that default RocksDB does return the data ordered. -Matthias On 2/21/19 9:23 AM, Yurii Demchenko wrote: > Dear Kafk

Re: Questions on Exactly Once Semantics

2019-02-21 Thread Matthias J. Sax
sactional.id>’ acts more as a supersede and hence nullifies > and handles all the (ill)effects of PID - scenarios like producer > restarts, crashes etc. > > Please reply if my understanding is incorrect. > > Thanks > > > On 20 February 2019 at 23:57:17, Matthias J. Sax

Re: Streams reset offsets for no apparent reason

2019-02-20 Thread Matthias J. Sax
Thanks for reporting the issue! Are you able to reproduce it? If yes, can you maybe provide broker and client logs in DEBUG level? -Matthias On 2/20/19 7:07 PM, Raman Gupta wrote: > I have an exactly-once stream that reads a topic, transforms it, and writes > new messages into the same topic as

Re: Questions on Exactly Once Semantics

2019-02-20 Thread Matthias J. Sax
KAFKA/FAQ > > So that we can refer future questions to the page than answering them > repeatedly. @Matthias J Sax <mailto:matth...@confluent.io> : would you > like to do it? > > > Guozhang > > On Tue, Feb 19, 2019 at 3:12 PM Matthias J. Sax <mailto:matth...@co

Re: what's in the rocksdb in the tmp dir?

2019-02-20 Thread Matthias J. Sax
> when a stream app get restarted, can the store data >> directly loaded from this folder? Yes. That's the purpose of local state. > because I see there is very heavy traffic >> on the network to read from broker, assuming it's trying to rebuild the >> store. This should only happen if the loca

Re: Kafka streams application unaware of connection to broker lost

2019-02-20 Thread Matthias J. Sax
It's a known issue: https://issues.apache.org/jira/browse/KAFKA-6520 On 2/20/19 3:25 AM, Javier Arias Losada wrote: > Hello Kafka users, > > working on a Kafka-Streams stateless application; we want to implement some > healthchecks so that whenever connection to Kafka is lost for more than a > t

Re: Questions on Exactly Once Semantics

2019-02-19 Thread Matthias J. Sax
Even if the question was sent 4 times to the mailing list, I am only answering is exactly-once (sorry for the bad joke -- could not resist...) You have to distinguish between "idempotent producer" and "transactional producer". If you enable idempotent writes (config `enable.idempotence`), your p

Re: [ANNOUNCE] New Committer: Randall Hauch

2019-02-15 Thread Matthias J. Sax
Congrats Randall! -Matthias On 2/14/19 6:16 PM, Guozhang Wang wrote: > Hello all, > > The PMC of Apache Kafka is happy to announce another new committer joining > the project today: we have invited Randall Hauch as a project committer and > he has accepted. > > Randall has been participating i

Re: Accessing Kafka stream's KTable underlying RocksDB memory usage

2019-02-15 Thread Matthias J. Sax
Cross posted on SO: https://stackoverflow.com/questions/54701449/accessing-kafka-streams-ktable-underlying-rocksdb-memory-usage On 2/14/19 9:24 PM, P. won wrote: > Hi, > > I have a kafka stream app that currently takes 3 topics and aggregates > them into a KTable. This app resides inside a micros

Re: [ANNOUNCE] New Committer: Bill Bejeck

2019-02-13 Thread Matthias J. Sax
Congrats! Well deserved! -Matthias On 2/13/19 4:56 PM, Guozhang Wang wrote: > Hello all, > > The PMC of Apache Kafka is happy to announce that we've added Bill Bejeck > as our newest project committer. > > Bill has been active in the Kafka community since 2015. He has made > significant contrib

Re: Can I query ktable/stream/store with SQL like statment.

2019-02-13 Thread Matthias J. Sax
t; call for this? > > Thanks, > Nan > > On Tue, Feb 12, 2019 at 6:17 PM Matthias J. Sax > wrote: > >> You could do a range query from "abc" to "abd" for example (in this >> case, you would need to make sure to check the result form the iterator >

Re: kstream transform forward to different topics

2019-02-13 Thread Matthias J. Sax
The goal of KIP-307 is a different one. It's about providing names to make debugging easier. Thus, I don't think "307 is doing it the wrong way" -- the question is, what problem is addressed, and KIP-307 addresses a different one as discussed on this question. -Matthias On 2/13/19 1:53 AM, Jan

Re: Can I query ktable/stream/store with SQL like statment.

2019-02-12 Thread Matthias J. Sax
You could do a range query from "abc" to "abd" for example (in this case, you would need to make sure to check the result form the iterator and drop "abd" though). Note, that range queries are executed on the raw bytes. Thus, you need to understand how the serializes you use work. In doubt, you ma

Re: Kafka exactly-once with multiple producers

2019-02-05 Thread Matthias J. Sax
Each producer will need to use it's own `transactional.id`. Otherwise, one producer would fence-off and "block" the other. Both producers can start transactions independently from each other, and also commit independently (or abort, or a mix of commit/abort between both). Messages of both producer

Re: Issues with KTable to KTable leftJoin

2019-02-01 Thread Matthias J. Sax
wrong to me. Calling a joiner with the old value > AFTER the new value, does not allow this function to know what's the last > known value for that message. > Still, even if you are not convinced this is an issue, I believe at least > the "leftJoin"'s Javadoc sho

Re: Kafka streams messages duplicates with non-overlapping gap-less windows

2019-02-01 Thread Matthias J. Sax
> > 0 - [1,2,3,4,5] > 1 - [6,7,8,9,10] > > But instead I found something like: > > key - value > > 0 - [1,2,3,4,5] > 0 - [2,3,4,5] > 1 - [6,7,8,9] > 1 - [6,7,8,9,10] > > So the question is, did I do something wrong tr

Re: Hopping Window

2019-02-01 Thread Matthias J. Sax
Compare: https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Stream+Usage+Patterns -Matthias On 1/29/19 8:50 AM, Rene Richard wrote: > I think I figured out my problem!!! > > > 19/01/29 12:40:01 INFO HBProcessor: > - > 19/01/

Re: Issues with KTable to KTable leftJoin

2019-02-01 Thread Matthias J. Sax
Sounds like expected behavior to me. Note, that by default, the result KTable for a KTable-KTable join is not materialized. To be able to compute the correct result for this case, we need to send the old and new join result downstream to allow the downstream join to compute the correct result. It'

Re: kafka stream depends on it's own derived table

2019-01-28 Thread Matthias J. Sax
It might be easier to use a `Transformer` with a state store. Each time you receive an input record, you first check if the parent entry is in the store. If yes, add the new record, otherwise not. -Matthias On 1/28/19 2:37 PM, Nan Xu wrote: > hi, > I was writing a simple stream app, all it does i

Re: Kafka streams messages duplicates with non-overlapping gap-less windows

2019-01-26 Thread Matthias J. Sax
I am not 100% sure, what you mean by >> I've a input topic where I'm 100% sure there are no duplicate keys or >> messaged If this is the case (ie, each key is unique), it would imply that each window contains exactly one record per key. Hence, why do you aggregate? Each aggregate would consist o

Re: Bug in TopologyTestDriver

2019-01-26 Thread Matthias J. Sax
This question is cross posted on SO. I answered it there: https://stackoverflow.com/questions/54372134/topologytestdriver-sending-incorrect-message-on-ktable-aggregations On 1/25/19 6:32 PM, Murilo Tavares wrote: > Hi > I am new to this mailing list, so not sure if this is the right place to >

Re: Kafka Consumer Not Assigned Partitions

2019-01-23 Thread Matthias J. Sax
Calling `consumer.subscribe()` is a local call. Only when you call `consumer.poll()` the consumer will connect to the broker to get its assignment. Thus, it's save to call `poll()` directly. `assignment()` will return the assignment only after the first `poll()` call. -Matthias On 1/23/19 9:00

Re: NullPointerException in KafkaStreams during startup

2019-01-21 Thread Matthias J. Sax
That is expected... It's not possible to change the subscription during a rolling restart. You need to stop all instances and afterwards start new instances with the new subscription. I did not look into the details of your change, but you might also need to reset your application before starting

Re: Why do the offsets of the consumer-group (app-id) of my Kafka Streams Application get reset after application restart?

2019-01-20 Thread Matthias J. Sax
Seems this question was cross posted on SO: https://stackoverflow.com/questions/54145281/why-do-the-offsets-of-the-consumer-group-app-id-of-my-kafka-streams-applicatio On 1/14/19 8:49 AM, Jonathan Santilli wrote: > Hello Bill, thanks a lot for the reply, > I will implement your recommendation abo

Re: Warning when adding GlobalKTable to toplogy

2019-01-19 Thread Matthias J. Sax
I don't thinks you need to worry about it. However, it sound like a bug. The main consumer should not subscribe to those topics. Could you open a JIRA for this, as "minor bug". -Matthias On 1/19/19 9:48 AM, Patrik Kleindl wrote: > Hi > That is because the global tables are handled separately by

Re: max.task.idle.ms behavior

2019-01-18 Thread Matthias J. Sax
date message in a particular polled > chunk, it reverts back to NORMAL processing mode... > > Is my understanding correct? > > Thanks, Peter > > On 1/16/19 8:15 AM, Matthias J. Sax wrote: >> The parameter applies too all three topics (input, intermediate, >> repartition

Re: max.task.idle.ms behavior

2019-01-15 Thread Matthias J. Sax
The parameter applies too all three topics (input, intermediate, repartitions topics) and it's a global config. About the blocking behavior: If one partitions becomes empty, all other partitions are paused() and Streams only poll() for the empty partition. If no data is returned within the timeou

Re: message ordering within a transaction

2019-01-08 Thread Matthias J. Sax
For this case no reordering will occur, but the original send order will be preserved. This is even true without transactions if you use idempotent producer. -Matthias On 1/6/19 6:00 PM, Mark Horton wrote: > I was curious if re-ordering is possible within a transaction? Let's > say a KafkaProduc

Re: Why do I get an IllegalStateException when accessing record metadata?

2019-01-03 Thread Matthias J. Sax
I see. When updating the FAQ, it should be clear what you mean. Your current proposal was unclear to me, and thus, it might be unclear to other users, too. -Matthias On 1/3/19 9:13 PM, Eric Lalonde wrote: > > >> On Jan 2, 2019, at 6:31 AM, Matthias J. Sax > <mailto:mat

Re: Why do I get an IllegalStateException when accessing record metadata?

2019-01-02 Thread Matthias J. Sax
Thanks for reporting this. Feel free to edit the Wiki with the FAQ directly. What is unclear to me: what do you mean by "the state store [...] was errantly scoped to the TransformerProvider, not the Transformer" ? I would like to understand the actual issue. -Matthias On 12/31/18 2:36 AM, Eric

Re: Whey does the window final result is not emitted after the window has elapsed?

2019-01-02 Thread Matthias J. Sax
> After some time, the window closes. This is not correct. Windows are based on event-time, and because no new input record is processed, the window is not closed. That is the reason why you don't get any output. Only a new input record can advance "stream time" and close the window. In practice,

Re: UnknownProducerIdException problems

2019-01-02 Thread Matthias J. Sax
This is a known issues, but you don't need to worry about it too much. You can just create a new producer for this case an continue. For more details see https://cwiki.apache.org/confluence/display/KAFKA/KIP-360%3A+Improve+handling+of+unknown+producer -Matthias On 1/2/19 1:05 AM, 이도현 wrote: > I

Re: steram cannot re-initialize the task

2018-12-28 Thread Matthias J. Sax
Hard to say what the root cause is. If you get `OutOfMemoryError`, it seems the you need to increase the memory to provide to the JVM? Does this happen before or after the log entry? Also, can you verify that your Kafka cluster is healthy? -Matthias On 12/28/18 4:30 PM, 徐华 wrote: > Hi, > > Wh

Re: How does the /tmp/kafka-streams folder work?

2018-12-28 Thread Matthias J. Sax
ast message per key is kept and others are discarded)? > > Regards, Peter > > On 12/28/18 4:13 PM, Matthias J. Sax wrote: >>>> Is it really necessary to keep the whole log topic of a particular >>>> local >>>> store? Such log will grow indefinitel

Re: How does the /tmp/kafka-streams folder work?

2018-12-28 Thread Matthias J. Sax
ted without any data loss. -Matthias On 12/28/18 1:42 PM, Peter Levart wrote: > Hi Matthias, > > Just a couple of questions about that... > > On 12/27/18 9:57 PM, Matthias J. Sax wrote: >> All data is backed in the Kafka cluster. Data that is stored locally, is >&g

Re: How does the /tmp/kafka-streams folder work?

2018-12-27 Thread Matthias J. Sax
All data is backed in the Kafka cluster. Data that is stored locally, is basically a cache, and Kafka Streams will recreate the local data if you loose it. Thus, I am not sure how the KTable data could be stale. One possibility might be a miss-configuration: I assume that you read the topic direct

Re: stream shutdown error

2018-12-27 Thread Matthias J. Sax
I am not 100% sure what you exactly do to upgrade to 2.1. But if you add a `suppress()` operator, you change your topology and therefore the old and new program is not compatible any longer. To avoid breaking changes like this, you can now (since 2.1) name all operators, stores, and changelog topi

Re: Local city Kafka user Meet-up

2018-12-22 Thread Matthias J. Sax
You can open a PR against https://github.com/apache/kafka-site repo for web page updates. For the wiki, if you create an account, we can give you write access if you share your wiki account ID here :) -Matthias On 12/22/18 8:56 AM, Ascot Moss wrote: > Hi, > > From Kafka site, I can see there

Re: High end-to-end latency with processing.guarantee=exactly_once

2018-12-20 Thread Matthias J. Sax
The problem is repartitions topics: Kafka Streams considers those topics as transient and purges consumed data aggressively (cf https://issues.apache.org/jira/browse/KAFKA-6150) resulting in lost producer state for those topics :( -Matthias On 12/20/18 3:18 AM, Dmitry Minkovsky wrote: > Also, I

Re: MAX_TASK_IDLE_MS_CONFIG not working?

2018-12-14 Thread Matthias J. Sax
If you have out-of-order data, there is no guarantee that the current record has larger timestamp than previous record. Data is still processed in offset order. Also, you use selectKey() and groupByKey() thus triggering a repartioning that may introduce out-of-order data downstream even if all you

Re: Session Window emitting null values when converted to stream?

2018-12-02 Thread Matthias J. Sax
The nulls are expected. It's not about expired session windows though: sessions window are stored as `<(key,start-timestamp,end-timestamp), value>`. If the window boundaries changed due to new incoming events (or maybe a merge of two windows due to late arriving records), the window is updated via

Re: Partial updates in Kafka Streams stores

2018-11-30 Thread Matthias J. Sax
ere is a network outage. Will the node that picks up the partition see the > first update? > > Thank you > > > Sent with ProtonMail Secure Email. > > ‐‐‐ Original Message ‐‐‐ > On Friday, 30 November 2018 21:23, Matthias J. Sax > wrote: > >> You

Re: Partial updates in Kafka Streams stores

2018-11-30 Thread Matthias J. Sax
You can enable exactly-once processing guarantees to guard against inconsistent stores in fail over scenario. -Matthias On 11/30/18 12:57 AM, Yoshimo wrote: > Hello Kafka users, > > I am currently building a Kafka Streams application and I am using a > transform step with two KeyValue stores,

Re: kafka-streams batch restore state

2018-11-29 Thread Matthias J. Sax
`kill -9` will force-stop the application the hard way, thus, it cannot write a checkpoint file that it needs for a clean restart. Just you should use `kill` (SIGTERM, not SIGKILL == -9) to tell Kafka Streams to shutdown gracefully. This will allow Kafka Streams to write a checkpoint file. This wa

Re: kafka-streams batch restore state

2018-11-29 Thread Matthias J. Sax
Not sure. Maybe you can provide a larger part of the log? Maybe in DEBUG level? -Matthias On 11/28/18 10:08 PM, meigong.wang wrote: > When I restart my kafka-streams application, sometimes I can see following > log: > > > [Consumer > clientId=market-kline-stream-wmg-ticker-d891eeaf-2932-45b

Re: The limit on the number of consumers in a group.

2018-11-27 Thread Matthias J. Sax
Rebalancing is expected to take longer for larger groups. But it should work nevertheless. I would recommend to dig into the logs: does a single rebalance "hang" or do you get multiple rebalances triggered after each other? -Matthias On 10/23/18 12:16 AM, Dominic Kim wrote: > Dear all. > > Is

Re: UNKNOWN_PRODUCER_ID error when running Streams WordCount demowith processing.guarantee set to EXACTLY_ONCE

2018-11-22 Thread Matthias J. Sax
The broker side problem is there since 0.11.0. However, it is not exposed using older Kafka Streams versions. Since version 1.1.0, Kafka Streams uses the new purge data capabilities to reduce storage requirements for repartition topics (https://issues.apache.org/jira/browse/KAFKA-6150). Because o

Re: Producer throughput with varying acks=0,1,-1

2018-11-19 Thread Matthias J. Sax
Nokia - IN/Bangalore) wrote: > But if sends are not done in blocking way (with .get()) how does acks matter ? > > -Original Message----- > From: Matthias J. Sax > Sent: Saturday, November 17, 2018 12:15 AM > To: users@kafka.apache.org > Subject: Re: Producer throughput

Re: Offsets/Lags for global state stores not shown

2018-11-18 Thread Matthias J. Sax
ch valid or would you expect some pitfalls? > > We have not used this approach more because it doesn't not work for global > stores inside a streams application, but it might be beneficial to split > that up again. > > best regards > > Patrik > > On Tue, 6 Nov 2018

Re: Producer throughput with varying acks=0,1,-1

2018-11-16 Thread Matthias J. Sax
I you enable acks, it's not fire and forget any longer. -Matthias On 11/16/18 1:00 AM, Abhishek Choudhary wrote: > Hi, > > I have been doing some performance tests with kafka cluster for my project. > I have a question regarding the send call and the 'acks' property of > producer. I observed bel

Re: Kafka Streams Session store performance degradation from 0.10.2.1 to 0.11.0.3

2018-11-08 Thread Matthias J. Sax
U Cache much slower for our use case. What would you recommend for our > next steps? > > Jonathan > > On 2018/11/06 19:22:16, "Matthias J. Sax" wrote: >> Not sure atm why you see a performance degradation. Would need to dig >> into the details. >> >

Re: Kafka Streams Session store performance degradation from 0.10.2.1 to 0.11.0.3

2018-11-06 Thread Matthias J. Sax
Not sure atm why you see a performance degradation. Would need to dig into the details. However, did you consider to upgrade to 2.0 instead or 0.11? Also note that we added a new operator `suppress()` in upcoming 2.1 release, that allows you to do rate control without caching: https://cwiki.apach

Re: Offsets/Lags for global state stores not shown

2018-11-06 Thread Matthias J. Sax
The topics of global stores are not included by design. The "problem" is, that each instance needs to consume *all* topic-partitions from and thus topis, we thus they cannot be include into the consumer group that would assign each partition to exactly one instance. Hence, an additional consumer i

Re: Bemchmarks for KTable Joins and Queries

2018-11-04 Thread Matthias J. Sax
ded in-line to your > questions below. Thank you for taking the time to understand my question. > > On Fri, Nov 2, 2018 at 1:28 PM Matthias J. Sax > wrote: > >>>> At a high level, KTables provide a capability to query for data. >> >> Can you elaborate? Do you

Re: Bemchmarks for KTable Joins and Queries

2018-11-02 Thread Matthias J. Sax
Tables... > > That's the kind of information in terms of benchmarks I'd be interested in > knowing exists or not. > > Thank you, > > Tom > > On Thu, Nov 1, 2018, 16:24 Matthias J. Sax >> I am not aware if benchmarks, but want to point out,

Re: consumer fetch multiple topic partitions committed offset

2018-11-02 Thread Matthias J. Sax
end multiple request to > coordinator, maybe block in the middle request, why not accumulate the > multiple TopicPartitions and send one fetch offset request to coordinator? > >> On Nov 2, 2018, at 04:20, Matthias J. Sax wrote: >> >> You need to call `committed()` multiple times.

Re: Bemchmarks for KTable Joins and Queries

2018-11-01 Thread Matthias J. Sax
I am not aware if benchmarks, but want to point out, that KTables work somewhat different to relational database system. Thus, you might want to evaluate not base on performance, but on the semantics KTable provide. Recall, that Kafka Streams is a stream processing library while a database system

Re: consumer fetch multiple topic partitions committed offset

2018-11-01 Thread Matthias J. Sax
You need to call `committed()` multiple times. -Matthias On 11/1/18 12:28 AM, hacker win7 wrote: > Hi, > > After reviewing the KafkaConsumer source about API of *committed():* > I found that old consumer support committed(mutipleTopicPartitions) to > return multiple committed offset, while in ne

Re: Despite of log.retention.hours=168 the log files of messages sente yesterday are not present

2018-11-01 Thread Matthias J. Sax
/tmp/kafka-logs is just a convenient default, but not a reliable folder to store data. /tmp/ might be cleared by the operation system. Note that the quickstart is not designed to give "production ready" configuration etc. It's just to play with the system. You should change the config accordingly.

Re: Error when using mockSchemaregistry

2018-10-29 Thread Matthias J. Sax
cer record .I am kind of able to > write this for simple data types but have no luck with avro. > > Produces record > topic > consumer > doWork() > produce . > > On Mon, Oct 29, 2018 at 4:50 PM Matthias J. Sax > wrote: > >> You set: >>

Re: Get count of messages

2018-10-29 Thread Matthias J. Sax
That is quite expensive to do... Might be best to write a short Java program that uses Consumer#endOffset() and Consumer#beginningOffsets() -Matthias On 10/29/18 3:02 PM, Burton Williams wrote: > you can user kafkacat starting at that offset to head and pipe the output > to "wc -l" (word count)

Re: Error when using mockSchemaregistry

2018-10-29 Thread Matthias J. Sax
You set: > senderProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,avroSerializer.getClass()); This will tell the Producer to create a new AvroSerailizer object, and this object expects "schema.registry.url" to be set during initialization, ie, you need to add the config to `senderProps`. H

Re: Problem with kafka-streams aggregate windowedBy

2018-10-29 Thread Matthias J. Sax
Make sure to call `KafkaStreams#close()` to get the latest offsets committed. Beside this, you can check the consumer and Streams logs in DEBUG mode, to see what offset is picked up (or not). -Matthias On 10/29/18 11:43 AM, Patrik Kleindl wrote: > Hi > How long does your application run? More t

Re: Converting a Stream to a Table - groupBy/reduce vs. stream.to/builder.table

2018-10-26 Thread Matthias J. Sax
te: > >> Hello Matthias, >> thank you for the explanation. >> Streaming back to a topic and consuming this as a KTable does respect the >> null values as deletes, correct? But at the price of some overhead. >> Is there any (historical, technical or emotional;-)) re

Re: Consumer Pause & Scheduled Resume

2018-10-25 Thread Matthias J. Sax
not safe for multi threaded > access . I am trying to see how can call pause and resume on the same > thread. There will be only one thread running for consumption. > > > On Wed, Oct 24, 2018 at 3:43 PM Matthias J. Sax > wrote: > >> There is no issue if you call `poll

Re: running kafka streams inside kafka connect

2018-10-25 Thread Matthias J. Sax
Streams is not designed to be run inside Connect, and this won't work. What you can do is, to import the data via connect into a "staging topic" and then read this "staging topic" with a Kafka Streams application and apply the transformations etc to write the data into the actual target topics.

Re: Converting a Stream to a Table - groupBy/reduce vs. stream.to/builder.table

2018-10-25 Thread Matthias J. Sax
Patrik, `null` values in a KStream don't have delete semantics (it's not a changelog stream). That's why we drop them in the KStream#reduce implemenation. If you want to explicitly remove results for a key from the result KTable, your `Reducer#apply()` implementation must return `null` -- the res

Re: Consumer Pause & Scheduled Resume

2018-10-24 Thread Matthias J. Sax
There is no issue if you call `poll()` is all partitions are paused. If fact, if you want to make sure that the consumer does not fall out of the consumer group, you must call `poll()` in regular interval to not hit `max.poll.interval.ms` timeout. -Matthias On 10/24/18 10:25 AM, pradeep s wrote:

Re: kafka client 1.1.0 broker compatibility

2018-10-22 Thread Matthias J. Sax
If Logstash's internal client is 1.1.0, it is be compatible with Kafka brokers 2.0.0. Note: Brokers are always backward compatible to older clients (your case). Additionally, since, 0.10.0.0 release, broker are also forward compatible to newer clients. -Matthias On 10/22/18 4:06 AM, Dayananda S

Re: Kafka Streams "processed" definition

2018-10-22 Thread Matthias J. Sax
Same reply as to your other email asking the same question: > A message is considered processed, if all state updates are done and all > output messages are written. > > Note, that this notion of "processed" is based on sub-topologies, but > not the full topology. > > Hope this helps. > > > -M

Re: Kafka commit interval

2018-10-19 Thread Matthias J. Sax
There is not upper limit. And yes, you are right about rebalancing. This would be an issue and yes, you can use the rebalance listener to address it (it's the purpose of the rebalance listener to be used for cases like this). -Matthias On 10/16/18 2:19 PM, pradeep s wrote: > Hi, > I have a usec

Re: Kafka Streams, when is considered processed?

2018-10-19 Thread Matthias J. Sax
A message is considered processed, if all state updates are done and all output messages are written. Note, that this notion of "processed" is based on sub-topologies, but not the full topology. Hope this helps. -Matthias On 10/18/18 4:28 AM, Tobias Johansson wrote: > Hi, > > > I can't find

Re: Kafka on Eclipse Setup

2018-10-15 Thread Matthias J. Sax
You can try to run "mvn eclipse:eclipse" can restart Eclipse afterwards. -Matthias On 10/15/18 8:41 AM, James Kwan wrote: > Vahid > > I have followed the link, but it does not resolve the stream compilation > issues. I will try Manna’s suggestion. The issue is not to start a > zookeeper or K

Re: Kafka UnknownProducerIdException

2018-10-12 Thread Matthias J. Sax
This error is related to Kafka EOS feature. I am not familiar with Golden Gate though. The error can happen, if the state of a transactional producer is lost. The state is stored in the topic itself, thus, it could be lost due to long truncation (for example, if a producer does not send data for a

Re: [ANNOUNCE] New Committer: Manikumar Reddy

2018-10-11 Thread Matthias J. Sax
Congrats! On 10/11/18 2:31 PM, Yishun Guan wrote: > Congrats Manikumar! > On Thu, Oct 11, 2018 at 1:20 PM Sönke Liebau > wrote: >> >> Great news, congratulations Manikumar!! >> >> On Thu, Oct 11, 2018 at 9:08 PM Vahid Hashemian >> wrote: >> >>> Congrats Manikumar! >>> >>> On Thu, Oct 11, 2018 a

Re: Global/Restore consumer use auto.offset.reset = none vs. OffsetOutOfRangeException

2018-10-04 Thread Matthias J. Sax
e case is closed for me at the moment. > > Thanks again and best regards > Patrik > > On Thu, 4 Oct 2018 at 02:58, Matthias J. Sax wrote: > >> I double checked the code and discussed with a colleague. >> >> There are two places when we call `globalConsumer.poll

Re: Global/Restore consumer use auto.offset.reset = none vs. OffsetOutOfRangeException

2018-10-03 Thread Matthias J. Sax
ubscriptions.requestOffsetReset(tp); > } else { > throw new > OffsetOutOfRangeException(Collections.singletonMap(tp, fetchOffset)); > } > > So this means that for global/restore the exception will always be thrown > without some specia

Re: Global/Restore consumer use auto.offset.reset = none vs. OffsetOutOfRangeException

2018-10-02 Thread Matthias J. Sax
It is by design to set the reset policy to "none" (https://issues.apache.org/jira/browse/KAFKA-6121), and not allowed by design to overwrite this (there might be a workaround for you though). However, Streams should not die but catch the exception and recover from it automatically. What version do

Re: Asking about how to consume the message from follower partition

2018-10-01 Thread Matthias J. Sax
s. > > It's also not uncommon to have different replicas in different availability > zones or racks. Consumers may prefer reading from their local AZ, even if > that includes followers. > > Ryanne > > On Sun, Sep 30, 2018, 1:13 PM Matthias J. Sax wrote: > >> Kafka

Re: Asking about how to consume the message from follower partition

2018-09-30 Thread Matthias J. Sax
Kafka scales via partitions, not via replication. Replication guarantees fault-tolerance. Thus, it's not possible atm, to consume from follower partitions -- also note, even if different consumer would read from follower partitions, each of them would read the same data. I assume, you want differen

Re: Subtractor

2018-09-25 Thread Matthias J. Sax
`Substractor` in only needed for KTable aggregations. For `KStream` aggregations it's not needed: for this case, each window is materialized into one row of the result KTable and just updated with the `Aggregator` for each incoming record (ie, corresponding to key -- and timestamp, for windowed-ag

Re: [ANNOUNCE] New committer: Colin McCabe

2018-09-25 Thread Matthias J. Sax
Congrats Colin! The was over due for some time :) -Matthias On 9/25/18 1:51 AM, Edoardo Comar wrote: > Congratulations Colin ! > -- > > Edoardo Comar > > IBM Event Streams > IBM UK Ltd, Hursley Park, SO21 2JN > > > > > From: Ismael Juma > T

Re: State flush & recovery during failures

2018-09-23 Thread Matthias J. Sax
ons get restored after failure? If not what is the right way > to do so? Is it by storing the info currently active punctuations in some > state and by scheduling them again in the init method? > > Thanks, > Vishnu > > On Thu, Sep 20, 2018 at 3:42 PM Matthias J. Sax > wro

Re: Subcribe kafka

2018-09-21 Thread Matthias J. Sax
Follow instructions here: https://kafka.apache.org/contact On 9/21/18 5:01 AM, ?? wrote: > Hi: > I am new a user??I want to subscribe Kafka > signature.asc Description: OpenPGP digital signature

Re: State flush & recovery during failures

2018-09-20 Thread Matthias J. Sax
1. The record will be re-read, but the state would not be re-build (ie, no undo of step (2). Thus, on re-processing you would add the record again, and you would "over count" in step (3) -- trigger would still fire I assume. 2. I assume, by "forward" you mean writing to an output topic: If you wri

Re: Understanding default.deserialization.exception.handler

2018-09-19 Thread Matthias J. Sax
to spend a fair amount of time working out what was going on I thought I'd > flag it up in case it wasn't known about. > > Tim Ward > > -----Original Message- > From: Matthias J. Sax > Sent: 14 September 2018 19:57 > To: u

Re: Questions about manage offset in external storage and consumer failure detect

2018-09-19 Thread Matthias J. Sax
1. If you don't have a good reason to store offsets externally, I would not recommend it, but use the client's built-in mechanism. It will be more work (ie, code you need to write) if you store offsets externally. For some use-cases, it's beneficial to store the offsets in an external system to ge

Re: Question regarding consumer getting a new message on the topic subscribed poll/select

2018-09-18 Thread Matthias J. Sax
You will need to `poll()` -- that is how Kafka works. The brokers practically don't even know that a consumer exist and thus cannot sent any notification. It's be design. -Matthias On 9/18/18 8:26 AM, Siddhartha Khaitan wrote: > Hello, > > I have a basic question. > > Lets say a kafka consume

Re: Best way for reading all messages and close

2018-09-14 Thread Matthias J. Sax
Using Kafka Streams this is a little tricky. The API itself has no built-in mechanism to do this. You would need to monitor the lag of the application, and if the lag is zero (assuming you don't write new data into the topic in parallel), terminate the application. -Matthias On 9/14/18 4:19 AM,

Re: Understanding default.deserialization.exception.handler

2018-09-14 Thread Matthias J. Sax
Your observation is correct. It's a known bug: https://issues.apache.org/jira/browse/KAFKA-6502 In practice, it should not be a big issue though. - you would only hit this bug if you don't process a "good message" afterwards - even if you hit this bug, you would just skip the message again Thus

Re: SAM Scala aggregate

2018-09-09 Thread Matthias J. Sax
Why can't you use Kafka Streams 2.0? Note: Kafka Streams is backward compatible and it can connect to older brokers -- it's not required to upgrade your cluster to use Kafka Streams 2.0 -- updating you maven/gradle dependency is sufficient. Also, AFAIK SAM conversions are only available in Scala

Re: Kafka stream only consume n messages

2018-09-05 Thread Matthias J. Sax
1. There is not API for this 2. I guess it might be possible, but might not be the best way to do it. 3. That is also not possible. I would recommend something like this: > final AtomicBoolean shutdown = new AtomicBoolean(false); > > StreamsBuilder builder = ... > > KStream stream = builder

Re: GlobalKTable/KTable initialization differences

2018-09-05 Thread Matthias J. Sax
I create https://issues.apache.org/jira/browse/KAFKA-7380 to track this. -Matthias On 8/27/18 12:07 PM, Guozhang Wang wrote: > Hello Patrik, > > Thanks for the email and this is indeed a good question! :) > > There are some historic reasons that we did the global state restoration in > a differ

<    1   2   3   4   5   6   7   8   9   10   >