Re: Controlled topics creation in Kafka cluster, how does that affect a Kafka Streams App that uses join?

2018-09-05 Thread Matthias J. Sax
You can figure out the names via builder.build().describe(). The naming pattern is <application.id>-<name>-[repartition|changelog]. You will also need to create those topics with the correct number of partitions (usually the number of input topic partitions). With this information, you can pre-create the topics and Kaf
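A minimal sketch of the naming pattern, assuming internal topic names are built as <application.id>-<name>-changelog and <application.id>-<name>-repartition. The class and method names are hypothetical and not part of Kafka; the authoritative names are the ones printed by builder.build().describe().

```java
// Hypothetical helper, not a Kafka API: it only mirrors the assumed naming
// pattern <application.id>-<name>-[repartition|changelog].
final class InternalTopicNamesSketch {
    private InternalTopicNamesSketch() {}

    /** Name of the changelog topic that backs a state store. */
    static String changelog(String applicationId, String storeName) {
        return applicationId + "-" + storeName + "-changelog";
    }

    /** Name of a repartition topic created for an operator. */
    static String repartition(String applicationId, String operatorName) {
        return applicationId + "-" + operatorName + "-repartition";
    }

    public static void main(String[] args) {
        // "my-app" and "counts-store" are made-up example values.
        System.out.println(changelog("my-app", "counts-store"));
        System.out.println(repartition("my-app", "counts-store"));
    }
}
```

Names produced this way should be compared against the topology description before pre-creating topics, since operator names can be auto-generated.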

Re: Kafka-streams: setting internal topics cleanup policy to delete doesn't work

2018-09-02 Thread Matthias J. Sax
> I wanted to set *retention bytes* and change *cleanup policy* to *delete* > to prevent the storage being full. I set following configs in kafka > streams code: Are you sure that you use Kafka Streams correctly? Seems like a misconfiguration to switch from compaction to deletion policy. Also n

Re: Regarding issue - https://lists.apache.org/thread.html/1f2ffc93483cbe71167fa47875c5ecda8dbcd5275d3d41b5af3220d9@%3Cusers.kafka.apache.org%3E

2018-08-25 Thread Matthias J. Sax
No. The cleanup interval configures when old state, that is no longer used, will be deleted. This does not imply a TTL feature. It's about tasks that got assigned to a different KafkaStreams instance. State would only grow unbounded if your program increases the state unbounded. For example, if

Re: Is it possible to send a message more than once with transactional.id set?

2018-08-22 Thread Matthias J. Sax
I assume that you are referring to "commit markers". Each time you call commitTransaction(), a special message called a commit marker is written to the log to indicate a successful transaction (there are also "abort markers" if a transaction gets aborted). Those markers "eat up" one offset, but won't
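The offset gaps caused by markers can be illustrated with a small simulation. This is a sketch in plain Java, not Kafka client code: the log layout (one marker appended after each transaction's records) and all class and method names are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Simulation only: models a partition log in which every committed
// transaction is followed by one marker that consumes an offset.
final class MarkerOffsetsSketch {
    /** One log entry: a data record or a transaction marker. */
    static final class Entry {
        final long offset;
        final boolean marker;

        Entry(long offset, boolean marker) {
            this.offset = offset;
            this.marker = marker;
        }
    }

    /** Appends recordsPerTxn data records plus one commit marker per transaction. */
    static List<Entry> writeTransactions(int transactions, int recordsPerTxn) {
        List<Entry> log = new ArrayList<>();
        long next = 0;
        for (int t = 0; t < transactions; t++) {
            for (int r = 0; r < recordsPerTxn; r++) {
                log.add(new Entry(next++, false));
            }
            log.add(new Entry(next++, true)); // the marker takes an offset too
        }
        return log;
    }

    /** Offsets of the entries an application would receive (markers filtered out). */
    static List<Long> visibleOffsets(List<Entry> log) {
        List<Long> offsets = new ArrayList<>();
        for (Entry e : log) {
            if (!e.marker) {
                offsets.add(e.offset);
            }
        }
        return offsets;
    }

    public static void main(String[] args) {
        System.out.println(visibleOffsets(writeTransactions(2, 2)));
    }
}
```

With two transactions of two records each, the visible offsets skip the positions taken by the markers, which is why consecutive records are not guaranteed to have consecutive offsets.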

Re: Improve error message when trying to produce message without key for compacted topic

2018-08-21 Thread Matthias J. Sax
That makes sense to me. Please file a Jira. Thanks a lot for reporting this issue! -Matthias On 8/21/18 3:29 AM, Patrik Kleindl wrote: > Hello > > Yesterday we had the following exception: > > Exception thrown when sending a message with key='null' and payload='...' > to topic sometopic:: org

Re: Issue in Kafka 2.0.0 ?

2018-08-20 Thread Matthias J. Sax
Thanks for reporting and for creating the ticket! -Matthias On 8/20/18 5:17 PM, Ted Yu wrote: > I was able to reproduce what you saw with modification > to StreamToTableJoinScalaIntegrationTestImplicitSerdes.scala > I have logged KAFKA-7316 and am looking for a fix. > > FYI > > On Mon, Aug 20,

Re: kafka stream latency

2018-08-18 Thread Matthias J. Sax
I cannot spot any obvious reasons. As you consume from the result topic for verification, we should verify that the latency spikes originate on write and not on read: you might want to have a look into Kafka Streams JMX metrics to see if processing latency spikes or throughput drops. Also watch for

Re: Usage of cleanup.policy=compact,delete

2018-08-12 Thread Matthias J. Sax
I think, at command line tool level, you need to use --add-config log.cleanup.policy=[compact,delete] i.e., you need to add square brackets to mark the config as a list. This is different to Java code for which you would use props.put("log.cleanup.policy", "compact,delete"); The config should be

Re: Error persisting into KeyValueStore

2018-08-07 Thread Matthias J. Sax
You might be hitting a RocksDB issue as reported here: https://issues.apache.org/jira/browse/KAFKA-6327 -Matthias On 8/6/18 11:07 PM, Siva Ram wrote: > Hi, > > We switched to a virtual machine (from physical node) and we are observing > the following exception occurs and all our the stream application

Re: Viewing transactional markers in client

2018-08-02 Thread Matthias J. Sax
Messages don't contain any information about which transaction they belong to. Note that `transaction markers` are special messages that only indicate a commit or abort. There is no `transaction marker` _in_ a message. -Matthias On 8/2/18 12:04 PM, ma...@kafkatool.com wrote: > > > This would

Re: Creating and deleting Kafka Topics in Scala App

2018-08-01 Thread Matthias J. Sax
t. But it doesn't work either. > > Pulkit > > On Tue, Jul 31, 2018 at 9:22 PM, Matthias J. Sax > wrote: > >> Is `delete.topic.enable` set to `true`? It's a broker configuration. >> >> >> -Matthias >> >> On 7/31/18 8:57 AM, Pulkit

Re: Creating and deleting Kafka Topics in Scala App

2018-07-31 Thread Matthias J. Sax
Is `delete.topic.enable` set to `true`? It's a broker configuration. -Matthias On 7/31/18 8:57 AM, Pulkit Manchanda wrote: > HI All, > > I am want to create and delete Kafka topics on runtime in my Application. > I followed few projects on GitHub like > https://github.com/simplesteph/kafka-0.11

Re: Viewing transactional markers in client

2018-07-31 Thread Matthias J. Sax
No. Transaction markers are not exposed. Why would you need them? They are considered implementation details. -Matthias On 7/27/18 5:58 AM, ma...@kafkatool.com wrote: > Is there any way for a KafkaConsumer to view/get the transactional > marker messages? > > -- > Best regards, > Mark

Re: Kafka Streams: Share state store across processors

2018-07-18 Thread Matthias J. Sax
31y8> > groups.google.com > Posted 11/13/17 2:50 AM, 8 messages > > > > From: Matthias J. Sax > Sent: Wednesday, July 18, 2018 10:20:02 AM > To: users@kafka.apache.org > Subject: Re: Kafka Streams: Share state store across processors

Re: [kafka-clients] Re: [VOTE] 1.1.1 RC3

2018-07-18 Thread Matthias J. Sax
Thanks Dong. I am a little late, but +1, too. - verified signatures - build from sources - run unit test suite - run streams quickstart Thanks for running the release! -Matthias On 7/18/18 10:24 AM, Dong Lin wrote: > Thank you all for taking time to certify and vote for the release! > >

Re: Kafka Streams: Share state store across processors

2018-07-18 Thread Matthias J. Sax
You can connect both stores to both processor for this. -Matthias On 7/17/18 11:12 PM, Druhin Sagar Goel wrote: > Hi, > > I am new to the Kafka Streams framework. I have the following streams use > case: > > State store A > State store B > > Processor A > Processor B > > State store A is onl

Re: Kafka-streams calling subtractor with null aggregator value in KGroupedTable.reduce() and other weirdness

2018-07-17 Thread Matthias J. Sax
h cases (not a general case) > can be improved. > On Tue, Jul 17, 2018 at 1:48 AM Matthias J. Sax wrote: >> >> It is not possible to use a single message, because both messages may go >> to different partitions and may be processed by different applications >> instan

Re: Kafka-streams calling subtractor with null aggregator value in KGroupedTable.reduce() and other weirdness

2018-07-16 Thread Matthias J. Sax
> > Would this change make sense? > On Mon, Jul 16, 2018 at 10:34 PM Matthias J. Sax > wrote: >> >> Vasily, >> >> yes, it can happen. As you noticed, both messages might be processed on >> different machines. Thus, Kafka Streams provides 'eventual cons

Re: Kafka-streams calling subtractor with null aggregator value in KGroupedTable.reduce() and other weirdness

2018-07-16 Thread Matthias J. Sax
Vasily, yes, it can happen. As you noticed, both messages might be processed on different machines. Thus, Kafka Streams provides 'eventual consistency' guarantees. -Matthias On 7/16/18 6:51 AM, Vasily Sulatskov wrote: > Hi John, > > Thanks a lot for you explanation. It does make much more sens

[ANNOUCE] Apache Kafka 1.0.2 Released

2018-07-15 Thread Matthias J. Sax
York Times, Uber, Yelp, and Zalando, among others. A big thank you for the following 32 contributors to this release! Matthias J. Sax, Rajini Sivaram, Anna Povzner, Jason Gustafson, Ewen Cheslack-Postava, Guozhang Wang, Dong Lin, huxi, John Roesler, Ismael Juma, Jun Rao, Manikumar Reddy O, Max

Re: Apache Kafka QuickStart

2018-07-15 Thread Matthias J. Sax
> After executing the first command to start zookeeper, do i have to open a > Terminal to run the Kafka Server? Yes. What is the problem with this? "but run into problem in Step 2" If you follow the quickstart, you download the binaries and start multiple processes (Zookeeper, Kafka Broker, Ka

Re: Questions about state stores and KSQL

2018-07-15 Thread Matthias J. Sax
To understand joins better, you might want to check out: https://www.confluent.io/blog/crossing-streams-joins-apache-kafka/ KSQL uses the same join semantics as Kafka Streams. -Matthias On 7/11/18 8:01 AM, Guozhang Wang wrote: > Hello Jonathan, > > At the very high-level, KSQL statements is c

Re: Kafka Streams Application Failing to Start Due to State Store Recovery Time Exceeding Producer Transaction Timeout

2018-07-10 Thread Matthias J. Sax
d Kafka Streams on version 1.1.0 so > I wonder why this issue is still occurring? > > -David > >> On Jul 10, 2018, at 9:38 AM, Matthias J. Sax wrote: >> >> Can it be, that you hit: https://issues.apache.org/jira/browse/KAFKA-6634 >> >> -Matthias >

Re: cleanup.policy - doesn't accept compact,delete

2018-07-10 Thread Matthias J. Sax
Try to remove the space after the comma. -Matthias On 7/10/18 10:43 AM, Jayaraman, AshokKumar (CCI-Atlanta-CON) wrote: > Hi, > > When we try to use the same (square brackets), the internal topics are > failing to get created. Any suggestions? > > changelogConfig.put("cleanup.policy", "[compac
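A minimal sketch of building the list value without whitespace, assuming (as suggested in this thread and in the related cleanup.policy thread) that Java code passes the policies as a plain comma-separated string without square brackets. The helper names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper: joins list-valued config entries with a bare comma so
// that no space ends up after the separator.
final class CleanupPolicyConfigSketch {
    private CleanupPolicyConfigSketch() {}

    /** Trims each value and joins them with "," (no space). */
    static String joinList(String... values) {
        return Arrays.stream(values)
                .map(String::trim)
                .collect(Collectors.joining(","));
    }

    /** Example changelog topic config; the values are illustrative only. */
    static Map<String, String> changelogConfig() {
        Map<String, String> config = new HashMap<>();
        config.put("cleanup.policy", joinList("compact", " delete"));
        return config;
    }

    public static void main(String[] args) {
        System.out.println(changelogConfig());
    }
}
```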

Re: flatMapValues does not calculate timestamp for each record generated

2018-07-10 Thread Matthias J. Sax
w (instead of two), which gives > incorrect result. > > But when I look at the API document, I had the impression that it would > behave like sending individual metrics. > > Hopefully I gave a better explanation of what I try to achieve. > > Thanks, > Sicheng > >

Re: Kafka Streams Application Failing to Start Due to State Store Recovery Time Exceeding Producer Transaction Timeout

2018-07-10 Thread Matthias J. Sax
Can it be, that you hit: https://issues.apache.org/jira/browse/KAFKA-6634 -Matthias On 7/9/18 7:58 PM, David Chu wrote: > I have a Kafka Streams application which is currently failing to start due to > the following ProducerFencedException: > > "Caused by: org.apache.kafka.common.errors.Produce

Re: flatMapValues does not calculate timestamp for each record generated

2018-07-10 Thread Matthias J. Sax
Not sure what you mean by "does not reset recordContext". Note, that the "contract" for `flatMapValues` is that the output records inherit the timestamp of the input record. Not sure what behavior you expect? Maybe you can elaborate? -Matthias On 7/9/18 7:27 PM, Sicheng Liu wrote: > Hi, > > I
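The timestamp contract described here can be sketched with a plain-Java simulation. This is not the Kafka Streams implementation; the Rec type and the flatMapValues helper are hypothetical stand-ins that only demonstrate that every output record carries the input record's timestamp.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Simulation only: each output record inherits key and timestamp from the input.
final class FlatMapTimestampSketch {
    /** Minimal stand-in for a keyed record with a timestamp. */
    static final class Rec<V> {
        final String key;
        final V value;
        final long timestamp;

        Rec(String key, V value, long timestamp) {
            this.key = key;
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    /** Expands one input record into many, copying key and timestamp unchanged. */
    static <V, R> List<Rec<R>> flatMapValues(Rec<V> in, Function<V, List<R>> mapper) {
        List<Rec<R>> out = new ArrayList<>();
        for (R value : mapper.apply(in.value)) {
            out.add(new Rec<>(in.key, value, in.timestamp));
        }
        return out;
    }

    public static void main(String[] args) {
        Rec<String> in = new Rec<>("host-1", "cpu,mem", 42L);
        List<Rec<String>> out =
                FlatMapTimestampSketch.<String, String>flatMapValues(in, v -> Arrays.asList(v.split(",")));
        for (Rec<String> r : out) {
            System.out.println(r.key + " " + r.value + " @" + r.timestamp);
        }
    }
}
```

If each generated record needs its own timestamp, the mapper alone cannot provide it under this contract; the timestamp would have to be assigned elsewhere.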

Re: [kafka-clients] [VOTE] 1.0.2 RC1

2018-07-08 Thread Matthias J. Sax
> Harsha > > On Mon, Jul 2nd, 2018 at 11:57 AM, Jun Rao <mailto:j...@confluent.io>> wrote: > > > > > > > > > Hi, Matthias, > > > > Thanks for the running the release. Verified quickstart on scala 2.12

Re: [kafka-clients] [VOTE] 1.0.2 RC1

2018-07-08 Thread Matthias J. Sax
>> wrote: > > > > > > > > > Hi, Matthias, > > > > Thanks for the running the release. Verified quickstart on scala 2.12 > > binary. +1 > > > > Jun > > > > On Fri, Jun 29, 2018 at 10:

[ANNOUNCE] Apache Kafka 0.10.2.2 Released

2018-07-03 Thread Matthias J. Sax
, Rabobank, Target, The New York Times, Uber, Yelp, and Zalando, among others. A big thank you for the following 30 contributors to this release! Ewen Cheslack-Postava, Matthias J. Sax, Randall Hauch, Eno Thereska, Damian Guy, Rajini Sivaram, Colin P. Mccabe, Kelvin Rutt, Kyle Winkelman, Max Zheng

[ANNOUNCE] Apache Kafka 0.11.0.3 Released

2018-07-03 Thread Matthias J. Sax
, Rabobank, Target, The New York Times, Uber, Yelp, and Zalando, among others. A big thank you for the following 26 contributors to this release! Matthias J. Sax, Ewen Cheslack-Postava, Konstantine Karantasis, Guozhang Wang, Rajini Sivaram, Randall Hauch, tedyu, Jagadesh Adireddi, Jarek Rudzinski

Re: [RESULTS] [VOTE] Release Kafka version 0.11.0.3

2018-07-02 Thread Matthias J. Sax
votes -1 votes * No votes -Matthias On 7/2/18 9:54 AM, Matthias J. Sax wrote: > This vote passes with 8 +1 votes (3 bindings) and no 0 or -1 votes. > > +1 votes PMC Members: > * Jun > * Rajini > * Ismael > > Committers: > * Matthias > > Community: > * Va

[RESULTS] [VOTE] Release Kafka version 0.11.0.3

2018-07-02 Thread Matthias J. Sax
This vote passes with 8 +1 votes (3 bindings) and no 0 or -1 votes. +1 votes PMC Members: * Jun * Rajini * Ismael Committers: * Matthias Community: * Vahid * Manikumar * Ted * Harash 0 votes * No votes -1 votes * No votes Vote thread: http://search-hadoop.com/m/Kafka/uyzND1yVsGqxIVN5?subj=+VO

Re: [kafka-clients] Re: [VOTE] 0.11.0.3 RC0

2018-07-02 Thread Matthias J. Sax
se! > > Ismael > > On Fri, Jun 22, 2018 at 3:14 PM Matthias J. Sax > mailto:matth...@confluent.io>> wrote: > > Hello Kafka users, developers and client-developers, > > This is the first candidate for release of Apache Kafka

[RESULTS] [VOTE] Release Kafka version 0.10.2.2

2018-07-02 Thread Matthias J. Sax
This vote passes with 5 +1 votes (3 bindings) and no 0 or -1 votes. +1 votes PMC Members: * Jason * Guozhang * Jun Committers: * Matthias Community: * Ted 0 votes * No votes -1 votes * No votes Vote thread: http://search-hadoop.com/m/Kafka/uyzND1Wt6721GMzOE1?subj=+VOTE+0+10+2+2+RC1 I'll cont

Re: [kafka-clients] [VOTE] 0.10.2.2 RC1

2018-07-02 Thread Matthias J. Sax
+1 -Matthias On 6/29/18 2:46 PM, Jun Rao wrote: > Hi, Matthias, > > Thanks for running the release. Verified quickstart on scala 2.12 binary. +1 > > Jun > > On Fri, Jun 22, 2018 at 6:43 PM, Matthias J. Sax <mailto:matth...@confluent.io>> wrote: > >

Re: Change in email id for Subscription

2018-07-02 Thread Matthias J. Sax
It's self-service. Unsubscribe with your old email and subscribe with your new one: https://kafka.apache.org/contact On 7/1/18 9:19 PM, Malik, Shibha (GE Renewable Energy, consultant) wrote: > Hi, > > I want to change my email id for subscription. Is this the right group to > email to ? > > Tha

Re: [kafka-clients] [VOTE] 1.1.1 RC2

2018-06-29 Thread Matthias J. Sax
Hi Dong, it seems that the kafka-streams-quickstart artifacts are missing. Is it just me or is the RC incomplete? -Matthias On 6/29/18 4:07 PM, Rajini Sivaram wrote: > Hi Dong, > > +1 (binding) > > Verified binary using quick start, ran tests from source, checked > release notes. > > Thanks

[VOTE] 1.0.2 RC1

2018-06-29 Thread Matthias J. Sax
Hello Kafka users, developers and client-developers, This is the second candidate for release of Apache Kafka 1.0.2. This is a bug fix release addressing 27 tickets: https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+1.0.2 Release notes for the 1.0.2 release: http://home.apache.org/~

Re: Visible intermediate state for KGroupedTable aggregation

2018-06-29 Thread Matthias J. Sax
:16 AM Matthias J. Sax > wrote: >> You cannot suppress those records, because both are required for >> correctness. Note, that each event might go to a different instance in >> the downstream aggregation -- that's why both records are required. >> >> Not sure

Re: Visible intermediate state for KGroupedTable aggregation

2018-06-29 Thread Matthias J. Sax
Hi, You cannot suppress those records, because both are required for correctness. Note, that each event might go to a different instance in the downstream aggregation -- that's why both records are required. Not sure what the problem for your business logic is. Note, that Kafka Streams provides e

Re: [VOTE] 1.0.2 RC0

2018-06-29 Thread Matthias J. Sax
> >>>> Checked signatures. >>>> >>>> On Fri, Jun 22, 2018 at 11:42 AM, Vahid S Hashemian < >>>> vahidhashem...@us.ibm.com> wrote: >>>> >>>>> +1 (non-binding) >>>>> >>>>> Built from s

Re: Can multiple producers write to the same partition ?

2018-06-27 Thread Matthias J. Sax
Yes. If you do this, the writes of both producers will interleave and there are no ordering guarantees between records written by different producers. -Matthias On 6/27/18 11:26 AM, Malik, Shibha (GE Renewable Energy, consultant) wrote: > Hi, > > Can multiple producers write to the same partit

Re: Retries

2018-06-25 Thread Matthias J. Sax
> >> On Jun 24, 2018, at 8:03 PM, Matthias J. Sax wrote: >> >> Michael, >> >> It depends on the semantics you want to get. About retries in general, >> as long as a producer retries internally, you would not even notice. >> Only after retries are exhaus

Re: Retries

2018-06-24 Thread Matthias J. Sax
Michael, It depends on the semantics you want to get. About retries in general, as long as a producer retries internally, you would not even notice. Only after retries are exhausted, an exception is thrown. Kafka Streams allows you to implement a handler for this (cf https://kafka.apache.org/11/d

Re: Process messages from StateStore

2018-06-24 Thread Matthias J. Sax
Jozsef, Your question is a little unclear to me. > To detect lost messages For what topology? >> KTable inputTable = builder.table("inputTopic", >> Consumed.with(...).filter(...)); The code you show contains a `filter()` that can remove records. Could this be the issue? It's also unclear to m

Re: Kafka Streams Session store fetch latency very high with caching turned on

2018-06-24 Thread Matthias J. Sax
Sam, Thanks for your email. This is a very interesting find. I did not double check the code but your reasoning makes sense to me. Note, that caching was _not_ introduced to reduce the writes to RocksDB, but to reduce the writes to the changelog topic and to reduce the number of records sent downs

[VOTE] 0.10.2.2 RC1

2018-06-22 Thread Matthias J. Sax
Hello Kafka users, developers and client-developers, This is the second candidate for release of Apache Kafka 0.10.2.2. Note, that RC0 was created before the upgrade to Gradle 4.8.1 and thus, we discarded it in favor of RC1 (without sending out an email for RC0). This is a bug fix release closing

[VOTE] 0.11.0.3 RC0

2018-06-22 Thread Matthias J. Sax
Hello Kafka users, developers and client-developers, This is the first candidate for release of Apache Kafka 0.11.0.3. This is a bug fix release closing 27 tickets: https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+0.11.0.3 Release notes for the 0.11.0.3 release: http://home.apache.

[VOTE] 1.0.2 RC0

2018-06-22 Thread Matthias J. Sax
Hello Kafka users, developers and client-developers, This is the first candidate for release of Apache Kafka 1.0.2. This is a bug fix release closing 26 tickets: https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+1.0.2 Release notes for the 1.0.2 release: http://home.apache.org/~mjsa

Re: Kafka Streams - Expiring Records By Process Time

2018-06-21 Thread Matthias J. Sax
ivered, for example, > backfill.). If you use event-time for retention, these old metrics could be > dropped and won't be aggregated. If we use process-time, at least it will > stay in state-store for some time for aggregation. > > On Thu, Jun 21, 2018 at 1:24 PM, Matthias J. Sa

Re: Kafka Streams - Expiring Records By Process Time

2018-06-21 Thread Matthias J. Sax
I don't understand why event-time retention time cannot be used. Can you elaborate? -Matthias On 6/21/18 10:59 AM, Sicheng Liu wrote: > Hi All, > > We have a use case that we aggregate some metrics with its event-time > (timestamp on the metric itself) using the simplest tumbling window. The > wi

Re: Consulting ReadOnlyKeyValueStore from Processor can lead to deadlock

2018-06-05 Thread Matthias J. Sax
Created a Jira for each: - https://issues.apache.org/jira/browse/KAFKA-6998 - https://issues.apache.org/jira/browse/KAFKA-6999 -Matthias On 5/11/18 10:06 AM, Guozhang Wang wrote: > Hello Steven, thanks for pointing it out. I think both of the mentioned > issues worth be improving: > > 1. The

Re: UnknownProducerIdException in Kafka streams when enabling exactly once

2018-06-05 Thread Matthias J. Sax
Sorry for the late reply. > The source stream contains millions of messages produced over several months. What is the retention time of the output topic? If it is smaller than the message timestamp (that I expect to be multiple months old), on write the data would be deleted quickly, because it's olde

Re: Is there expiration for committed Offset in the partition

2018-06-01 Thread Matthias J. Sax
It is a known issue. You can increase the retention time for stored offsets via configs, though. There is already an open PR to fix this issue: https://issues.apache.org/jira/browse/KAFKA-4682 -Matthias On 6/1/18 2:00 AM, Dinesh Subramanian wrote: > Hi M. Manna, > > Am planning to store outsid

Re: StreamsConfig.DEFAULT_PRODUCTION_EXCEPTION_HANDLER_CLASS_CONFIG

2018-05-31 Thread Matthias J. Sax
As a workaround, you can specify the config just as a string directly: props.put("default.deserialization.exception.handler", ...) -Matthias On 5/31/18 7:48 AM, Guozhang Wang wrote: > Hello Sumit, > > We are going to release 2.0 soon which should contain this fix: > https://issues.apache.org/

Re: Round-Robin assignment when non-nullable record key

2018-05-31 Thread Matthias J. Sax
You can also pass in a custom partitioner instead of using the default partitioner. -Matthias On 5/31/18 7:39 AM, Hans Jespersen wrote: > Why don’t to just put the metadata in the header and leave the key null so it > defaults to round robin? > > -hans > >> On May 31, 2018, at 6:54 AM, M. Man

Re: Correct usage of consumer groups

2018-05-29 Thread Matthias J. Sax
About the docs: Config `cleanup.policy` states: > A string that is either "delete" or "compact". > This string designates the retention policy to > use on old log segments. The default policy > ("delete") will discard old > segments when their > retention time or size limit has been reached. > The

Re: Effect of settings segment.ms and retention.ms not accurate

2018-05-29 Thread Matthias J. Sax
s some > issues. If it is mentioned clearly then everyone will be aware. Could you > please point in right direction about reading timestamp of log message? I > will see about implementing that solution in code. > > On Tue, May 29, 2018 at 11:37 AM Matthias J. Sax > wrote: > >>

Re: Effect of settings segment.ms and retention.ms not accurate

2018-05-28 Thread Matthias J. Sax
Retention time is a lower bound for how long it is guaranteed that data will be stored. This guarantee works "one way" only. There is no guarantee when data will be deleted after the bound has passed. However, client side, you can always check the record timestamp and just drop older data that is still
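The client-side check mentioned above can be sketched as a timestamp filter. This is plain Java with a hypothetical record type rather than the Kafka consumer API; in a real consumer the timestamp would come from each consumed record, and the retention value here is an assumption chosen by the application.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: drop records older than the application's own retention cutoff,
// even if the broker has not deleted them yet.
final class RetentionFilterSketch {
    /** Hypothetical stand-in for a consumed record with a timestamp. */
    static final class TimestampedRecord {
        final String value;
        final long timestampMs;

        TimestampedRecord(String value, long timestampMs) {
            this.value = value;
            this.timestampMs = timestampMs;
        }
    }

    /** Keeps only records whose timestamp is within retentionMs of nowMs. */
    static List<TimestampedRecord> dropExpired(List<TimestampedRecord> records, long nowMs, long retentionMs) {
        long cutoff = nowMs - retentionMs;
        List<TimestampedRecord> kept = new ArrayList<>();
        for (TimestampedRecord r : records) {
            if (r.timestampMs >= cutoff) {
                kept.add(r);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<TimestampedRecord> records = new ArrayList<>();
        records.add(new TimestampedRecord("old", 1_000L));
        records.add(new TimestampedRecord("fresh", 9_500L));
        // now = 10_000 ms, retention = 1_000 ms: only "fresh" survives.
        for (TimestampedRecord r : dropExpired(records, 10_000L, 1_000L)) {
            System.out.println(r.value);
        }
    }
}
```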

Re: subscribe mail list

2018-05-25 Thread Matthias J. Sax
To subscribe, please follow instructions here: https://kafka.apache.org/contact On 5/24/18 8:16 PM, wrote: > subscribe mail list

Re: Ktable from compacted topic is using a changelog topic

2018-05-24 Thread Matthias J. Sax
Your understanding is correct. Unfortunately, a regression slipped into the 1.0 release such that the described optimization is not done... It's fixed in the upcoming 2.0 release. -Matthias On 5/24/18 4:52 PM, Todd Hughes wrote: > From what I've read, a Ktable directly sourced from a compacted topic is

Re: Non duplicated WindowStore in Kstream - KStream Join?

2018-05-23 Thread Matthias J. Sax
Question cross-posted at SO: https://stackoverflow.com/questions/50492491/customize-window-store-implementation-in-kstream-kstream-join I did put an answer there. -Matthias On 5/23/18 8:56 AM, Edmondo Porcu wrote: > We need to perform a Kstream - Kstream join with a very large window, where > a

Re: How to achieve exactly one semantic in Kafka consumer

2018-05-23 Thread Matthias J. Sax
Assuming that there are no duplicates in your input topic, as long as no failure occurs, the consumer will read every message exactly-once by default. Only in case of failure, when the consumer "falls back" to an older offset, you might see some duplicates. You will need to write custom code to h
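One possible shape for such custom code is sketched below: remember the highest processed offset per partition and skip anything at or below it. This is an assumption about what the custom code could look like, not the approach given in the thread, and all names are hypothetical. For the check to survive a crash, the map would have to be stored durably together with the processing results, which this sketch does not do.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: in-memory duplicate detection keyed by partition, based on offsets.
final class OffsetDedupSketch {
    private final Map<Integer, Long> lastProcessed = new HashMap<>();

    /**
     * Returns true if the record is new and records its offset; returns false
     * for a redelivered record (offset at or below the last processed one).
     */
    boolean shouldProcess(int partition, long offset) {
        Long last = lastProcessed.get(partition);
        if (last != null && offset <= last) {
            return false;
        }
        lastProcessed.put(partition, offset);
        return true;
    }

    public static void main(String[] args) {
        OffsetDedupSketch dedup = new OffsetDedupSketch();
        long[] delivered = {0, 1, 2, 1, 2, 3}; // offsets 1 and 2 are redelivered after a rewind
        for (long offset : delivered) {
            System.out.println(offset + " -> " + dedup.shouldProcess(0, offset));
        }
    }
}
```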

Re: Best practices

2018-05-21 Thread Matthias J. Sax
he best > practice to write to same topic on different broker? Is there one? I should > be able to get a list of brokers programmatically from zk path /brokers/ids > ? > > On Sun, May 20, 2018, 3:21 PM Matthias J. Sax wrote: > >> You can register a callback for each sent

Re: Best practices

2018-05-20 Thread Matthias J. Sax
You can register a callback for each sent record to learn about a successful write or a failure: > producer.send(record, callback); For replication, you don't need to send twice. If the replication factor is configured broker side, the broker takes care of replication automatically. You can also configu

Re: Issue with GlobalKTable on sink topic

2018-05-18 Thread Matthias J. Sax
f I try to run again my job (after the failure) without cleaning my > environment (so on the same topics), my GlobalKTable is going to read the > records as expected (so not-null records). > > > > > 2018-05-18 0:04 GMT+02:00 Matthias J. Sax : > >>>> So, are you impl

Re: Issue with GlobalKTable on sink topic

2018-05-17 Thread Matthias J. Sax
at runtime by the same job, > but it needs instead a pre-populated topic either some other application > populating it? > If so, I can not execute the next left join because globalKTable records > must be consumed at runtime. > > Is there any possibility of using another method to ob

Re: Issue with GlobalKTable on sink topic

2018-05-16 Thread Matthias J. Sax
Should be ok to read the topic. I cannot spot any error in your configs/program either. However, I am not entirely sure, if I understand the problem correctly. >> The problem is in the first run, where GlobalKTable reads null records (I >> have a json serializer and it reads a record with null v

Re: AW: Exception stopps data processing (Kafka Streams)

2018-05-16 Thread Matthias J. Sax
ht? > > Best, > Claudia > > -Ursprüngliche Nachricht- > Von: Matthias J. Sax > Gesendet: Dienstag, 15. Mai 2018 22:58 > An: users@kafka.apache.org > Betreff: Re: Exception stopps data processing (Kafka Streams) > > Claudia, > > I leader change is a

Re: Producer#commitTransaction() Not Being Called if New Records Aren't Processed by StreamTask

2018-05-15 Thread Matthias J. Sax
Thanks for reporting this @David! @Guozhang: I actually think this is two different issues. This is also exposed in a current PR: https://github.com/apache/kafka/pull/4912/files#r188179256 I created https://issues.apache.org/jira/browse/KAFKA-6906 for this issue. -Matthias On 5/11/18 10:39 A

Re: Exception stopps data processing (Kafka Streams)

2018-05-15 Thread Matthias J. Sax
Claudia, A leader change is a retryable error. What is your producer config for `retries`? You might want to increase it such that the producer does not throw the exception immediately but retries a couple of times -- you might also want to adjust `retry.backoff.ms` that sets the time to wait until
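A minimal sketch of the two producer settings named above, using java.util.Properties only. The config keys `retries` and `retry.backoff.ms` come from the message; the concrete values and the helper name are assumptions for illustration, not recommendations.

```java
import java.util.Properties;

// Sketch: builds the retry-related producer settings as plain properties.
final class ProducerRetryConfigSketch {
    private ProducerRetryConfigSketch() {}

    /** Returns properties with the given retry count and backoff in milliseconds. */
    static Properties retryProps(int retries, long backoffMs) {
        Properties props = new Properties();
        props.setProperty("retries", Integer.toString(retries));
        props.setProperty("retry.backoff.ms", Long.toString(backoffMs));
        return props;
    }

    public static void main(String[] args) {
        // Illustrative values only: 10 retries, 500 ms between attempts.
        System.out.println(retryProps(10, 500L));
    }
}
```

These properties would be merged into the configuration handed to the producer (or to the Kafka Streams application that creates it).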

Re: Kafka streaming partition assignment

2018-05-13 Thread Matthias J. Sax
It depends on your version. The behavior is known and we put one improvement into the 1.1 release: https://github.com/apache/kafka/pull/4410 Thus, it's "by design" (for 1.0 and older) but we want to improve it. Cf: https://issues.apache.org/jira/browse/KAFKA-4969 -Matthias On 5/13/18 7:52 PM, Lia

Re: How many are supported by kafka cluster

2018-05-11 Thread Matthias J. Sax
This might be interesting: https://www.confluent.io/blog/how-to-choose-the-number-of-topicspartitions-in-a-kafka-cluster/ Not sure what you mean by "streams" exactly but for brokers the number of partitions is the dominating factor, not the number of topics. -Matthias On 5/11/18 2:01 AM, Sathi

Re: What is the performance impact of setting max.poll.records=1

2018-05-11 Thread Matthias J. Sax
`max.poll.records` only configures how many records are returned from poll(). Internally, the consumer buffers a batch of records, and only if this batch is empty will it do a new fetch request within poll(). -Matthias On 5/10/18 10:46 PM, Mads Tandrup wrote: > Hi > > I forgot to metion that I
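The buffering behaviour can be sketched with a small simulation. This is not the KafkaConsumer implementation; the class, its fields, and the fixed fetch batch size are assumptions that only model "fetch when the buffer is empty, hand out at most max.poll.records per poll()".

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Simulation only: poll() serves records from an internal buffer and issues a
// new (simulated) fetch request only when that buffer is empty.
final class BufferedPollSketch {
    private final Deque<Integer> buffer = new ArrayDeque<>();
    private final int fetchBatchSize;
    private final int maxPollRecords;
    private int nextRecord = 0;
    private int fetchRequests = 0;

    BufferedPollSketch(int fetchBatchSize, int maxPollRecords) {
        this.fetchBatchSize = fetchBatchSize;
        this.maxPollRecords = maxPollRecords;
    }

    /** Returns up to maxPollRecords records, fetching a new batch only if needed. */
    List<Integer> poll() {
        if (buffer.isEmpty()) {
            fetchRequests++;
            for (int i = 0; i < fetchBatchSize; i++) {
                buffer.add(nextRecord++);
            }
        }
        List<Integer> out = new ArrayList<>();
        while (out.size() < maxPollRecords && !buffer.isEmpty()) {
            out.add(buffer.poll());
        }
        return out;
    }

    /** Number of simulated fetch requests issued so far. */
    int fetchRequests() {
        return fetchRequests;
    }

    public static void main(String[] args) {
        BufferedPollSketch consumer = new BufferedPollSketch(5, 1);
        for (int i = 0; i < 5; i++) {
            consumer.poll();
        }
        System.out.println("fetch requests after 5 polls: " + consumer.fetchRequests());
    }
}
```

In this model, setting the per-poll limit to 1 changes how many records each poll() returns but not how often a fetch is issued.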

Re: Kafka as K/V store

2018-05-09 Thread Matthias J. Sax
You might want to look into Kafka Streams. In particular KTable and Interactive Queries (IQ). A `put` would be a write to the table source topic, while a `get` can be implemented via IQ. To subscribe to a particular key, you would consume the whole source topic and filter for the key you are inter

Re: Kafka offset problem when using Spark Streaming...

2018-05-09 Thread Matthias J. Sax
Hard to say. Might be a Spark issue though... On 5/9/18 3:42 AM, Pena Quijada Alexander wrote: > Hi all, > > We're facing some problems with ours Spark Streaming jobs, from yesterday we > have got the following error into our logs when the jobs fail: > > java.lang.AssertionError: assertion fail

Re: Kafka streams state directory - help

2018-05-03 Thread Matthias J. Sax
artRunnerWithArgs(JUnit4IdeaTestRunner.java:68) > at > com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47) > at > com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242) > at com.intellij.rt.execution.ju

Re: Kafka streams state directory - help

2018-04-21 Thread Matthias J. Sax
You are hitting: https://issues.apache.org/jira/browse/KAFKA-6499 Was fixed in 1.1 release. Thus, you can just ignore the checkpoint file. There should be no issue with running on Kubernetes. Also, if there is no store (independent of disk based or in-memory) there will be no changelog topic.

Re: Ordering guarantee after branching a stream

2018-04-20 Thread Matthias J. Sax
As long as no repartitioning happens, yes. If either of the two sub-streams is repartitioned, there is no guarantee. -Matthias On 4/20/18 4:31 PM, Botuck, Jacob (STL) - contr wrote: > If I call branch on a kStream, and I send record A into the trunkStream > followed by record B. If records A and B go

Re: Is KTable cleaned up automatically in a Kafka streams application?

2018-04-19 Thread Matthias J. Sax
never be deleted and will stay in the KTable. Is this correct? > > Thanks, > Mihaela Stoycheva > > On Thu, Apr 19, 2018 at 3:12 PM, Matthias J. Sax > wrote: > >> Not sure what you mean by "old state that is not longer needed" ? >> >> key-

Re: Is KTable cleaned up automatically in a Kafka streams application?

2018-04-19 Thread Matthias J. Sax
Not sure what you mean by "old state that is not longer needed" ? key-value entries are kept forever, and there is no TTL. If you want to delete something from the store, you can return `null` as aggregation result though. -Matthias On 4/19/18 2:28 PM, adrien ruffie wrote: > Hi Mihaela, > > >

Re: Possible to Disable Offset Checkpointing for an In-Memory Global Store?

2018-04-18 Thread Matthias J. Sax
Makes total sense and is a known issue: https://issues.apache.org/jira/browse/KAFKA-6711 Please follow up on the ticket and/or PR -- not sure what the current status is. -Matthias On 4/17/18 11:16 PM, David Chu wrote: > I have a custom in-memory state store which I’d like to configure as a global

Re: Scaling up kafka consumer applications(kafka streams)

2018-04-11 Thread Matthias J. Sax
9, 2018 at 1:44 AM, Matthias J. Sax > wrote: > >> It depend on the concrete program you write... Most likely it's 4, but >> hard to say without more information. >> >> If you read all 4 topics as a single stream (i.e. pattern subscription) >> it should be 4

Re: Kafka-streams: mix Processor API with windowed grouping

2018-04-09 Thread Matthias J. Sax
em: we added 250 hits to remaining, but actually we had to add only > 150 hits. We have to subtract previous count and it means we need to keep > them all somewhere. That's where we hope access to KV store can help. > > > > > > > > > > > On

Re: join 2 topic streams --> to another topic

2018-04-09 Thread Matthias J. Sax
imestamps why not, is it possible with windowing ? > > > Thank Matthias > > Adrien > > > De : Matthias J. Sax > Envoyé : dimanche 8 avril 2018 23:04:24 > À : users@kafka.apache.org > Objet : Re: join 2 topic streams --> to another topic >

Re: join 2 topic streams --> to another topic

2018-04-08 Thread Matthias J. Sax
Check out this blog post that explains how the different joins work: https://www.confluent.io/blog/crossing-streams-joins-apache-kafka/ It's hard to give a general answer -- it depends on the context of your application. Are keys unique? Do you want to get exactly one result or should a single stoc

Re: Scaling up kafka consumer applications(kafka streams)

2018-04-08 Thread Matthias J. Sax
It depends on the concrete program you write... Most likely it's 4, but hard to say without more information. If you read all 4 topics as a single stream (i.e. pattern subscription) it should be 4. If you repartition data in your application later, it might be more, though. -Matthias On 4/8/1

Re: Kafka-streams: mix Processor API with windowed grouping

2018-04-07 Thread Matthias J. Sax
>> ok, question then - is it possible to use state store with .aggregate()? Not sure what exactly you mean by this. An aggregation always uses a store; it's a stateful operation and cannot be computed without a store. For TopN, if you get the hit-count as input, you can use a `.aggregate()` oper
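As a rough illustration of the TopN idea, here is a plain-Java sketch of an aggregation function with the same shape as an aggregator, (key, value, aggregate) -> new aggregate. It is not the Kafka Streams API: the class name, `N = 3`, and the use of a `Map` as the aggregate are assumptions made for the example, and it assumes every hit-count arrives under one grouping key so that a single aggregate sees all pages.

```java
import java.util.HashMap;
import java.util.Map;

public class TopNAggregate {
    // Hypothetical size of the ranking.
    static final int N = 3;

    // Takes the latest hit-count for a page and returns the updated top-N aggregate.
    static Map<String, Long> add(String page, long hits, Map<String, Long> topN) {
        topN.put(page, hits);   // the newest count replaces any earlier one
        if (topN.size() > N) {
            // Evict the entry with the smallest count to get back to N entries.
            String smallest = topN.entrySet().stream()
                .min(Map.Entry.comparingByValue())
                .get()
                .getKey();
            topN.remove(smallest);
        }
        return topN;
    }

    public static void main(String[] args) {
        Map<String, Long> agg = new HashMap<>();
        add("a", 10, agg);
        add("b", 250, agg);
        add("c", 150, agg);
        add("d", 300, agg);     // pushes out the page with the fewest hits
        System.out.println(agg.keySet());
    }
}
```

Because the input is already a count per page, the aggregate only has to remember the current best N entries rather than every count ever seen.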

Re: Kafka Streams Internal Topic Retention not applied

2018-04-07 Thread Matthias J. Sax
mbstones records will be deleted after "delete.retention.ms", right? > Which defaults to 24 hours - meaning that the internal topic should only > contain data for 24 hours + window Size? Is this somehow right? > > Again, thank you very much for taking the time to answer these ques

Re: Kafka Streams Internal Topic Retention not applied

2018-04-06 Thread Matthias J. Sax
Björn, broker configs are defaults, but they can be overridden when a topic is created. And this happens when Kafka Streams creates internal topics. Thus, you need to change the setting Kafka Streams applies when creating topics. Also note: if cleanup.policy = compact, the setting of `log.retent
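One way to change what Kafka Streams applies to the internal topics it creates is the `topic.` config prefix, which Streams strips and passes on as topic-level configs (available since the 1.0 release, as far as I know). A minimal sketch using plain `java.util.Properties`; the application id, bootstrap address, and the chosen values are placeholders, and whether a given setting has any effect still depends on the topic's cleanup policy.

```java
import java.util.Properties;

public class InternalTopicConfig {
    // Configs with this prefix are forwarded to internal topics created by Kafka Streams.
    static final String TOPIC_PREFIX = "topic.";

    static Properties streamsProps() {
        Properties props = new Properties();
        props.put("application.id", "my-streams-app");       // placeholder
        props.put("bootstrap.servers", "localhost:9092");    // placeholder
        // Topic-level overrides for internal topics (values are examples only).
        props.put(TOPIC_PREFIX + "retention.ms", "86400000");
        props.put(TOPIC_PREFIX + "segment.bytes", "104857600");
        return props;
    }

    public static void main(String[] args) {
        streamsProps().forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

These properties would then be handed to the `KafkaStreams` constructor along with the rest of the application config.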

Re: Kafka-streams: mix Processor API with windowed grouping

2018-04-06 Thread Matthias J. Sax
KGroupedStream and TimeWindowedKStream are only logical representations at DSL level. They don't really "do" anything. Thus, you can mimic them as follows: builder.addStore(...) in.selectKey().through(...).transform(..., "storeName"). selectKey() sets the new key for the grouping and the throug

Re: message queueing questions?

2018-04-05 Thread Matthias J. Sax
Hi, multiple answers to this question: 1) it depends on whether you send messages sync or async to the brokers. Producers do buffer messages in-memory for more efficient writes to the brokers. If messages are successfully sent to the brokers, you can get an acknowledgment back that you can check on the pr

Re: Move to Kafka 1.1 from 0.10.2.x ?

2018-04-05 Thread Matthias J. Sax
Check out the upgrade notes: https://kafka.apache.org/documentation/#upgrade You should consider all notes for 0.11, 1.0 and 1.1 releases. And yes, 1.1 is absolutely ready for production. -Matthias On 4/5/18 11:57 AM, Raghav wrote: > Hi > > Are there anything that needs to be taken care for

Re: kafka-streams TopologyTestDriver problem with EXACTLY_ONCE

2018-04-05 Thread Matthias J. Sax
Thanks Fred! On 4/5/18 3:41 AM, Frederic Arno wrote: > That's what I'm doing now, I override the config from within my tests. > > Reported here: https://issues.apache.org/jira/browse/KAFKA-6749 > > Thanks, Fred > > On 04/04/2018 10:56 PM, Matthias J. Sax wro

Re: kafka-streams Invalid transition attempted from state READY to state ABORTING_TRANSACTION

2018-04-05 Thread Matthias J. Sax
ps://github.com/apache/kafka/pull/4826 >> >> I will fill in JIRA Id once Frederic creates the JIRA. >> >> Cheers >> >> On Wed, Apr 4, 2018 at 4:29 PM, Matthias J. Sax >> wrote: >> >>> Yes. That looks promising to me. Feel free to open an PR afte

Re: kafka-streams Invalid transition attempted from state READY to state ABORTING_TRANSACTION

2018-04-04 Thread Matthias J. Sax
mbie) { > +if (!isZombie && transactionInFlight) { > producer.abortTransaction(); > } > transactionInFlight = false; > > On Wed, Apr 4, 2018 at 2:02 PM, Matthias J. Sax > wro

Re: kafka-streams Invalid transition attempted from state READY to state ABORTING_TRANSACTION

2018-04-04 Thread Matthias J. Sax
Thanks for reporting this. It's indeed a bug in Kafka Streams. It's related to this fix: https://issues.apache.org/jira/browse/KAFKA-6634 -- the corresponding PR introduces the issue. Because we initialize the TX lazily, in your case we never initialize the TX, and thus aborting the TX fails. Please

Re: kafka-streams TopologyTestDriver problem with EXACTLY_ONCE

2018-04-04 Thread Matthias J. Sax
Just a side remark. As a workaround it should be fine to remove the config. TopologyTestDriver will not produce duplicates anyway and is also not suitable to test EOS. -Matthias On 4/4/18 1:26 PM, Guozhang Wang wrote: > Thanks Frederic for reporting the issue, I think it is indeed a missing > pie

Re: Error with kafka stream and kafka

2018-04-03 Thread Matthias J. Sax
What broker and Streams version do you use? Can you share the log/error message/stacktrace? -Matthias On 4/3/18 10:58 AM, Ariel Debernardi wrote: > Hi, > > I have a service with Kafka with three brokers, and another with Kafka > Streams. > > In Kafka we have a topic "dep-stream", with 12 partiti
