Re: How are custom keys compared during joins

2019-12-07 Thread Sachin Mittal
I figured out the issue. Basically my table2Mapped which is created from stream2 has some messages that arrive later than they arrive at stream1 for same key. After checking stream to table semantics I found that the left side is joined to right side only for the record that exist for that key tha

Re: How to set concrete names for state stores and internal topics backed by these

2019-12-07 Thread John Roesler
Hi Sachin, I’m glad it helped! What you have in mind is a good thing to do. One thing to watch out for is _not_ to add names using Materialized for KTable operations that otherwise would not create a store. For example, if you filter or mapValues a KTable, those operations usually do not actua

Re: Reducing streams startup bandwidth usage

2019-12-07 Thread Alessandro Tagliapietra
Could be, but since we have a limite amount of input keys (~30), windowing generates new keys but old ones are never touched again since the data per key is in order, I assume it shouldn't be a big deal for it to handle 30 keys I'll have a look at cache metrics and see if something pops out Thanks

Re: Reducing streams startup bandwidth usage

2019-12-07 Thread John Roesler
Hmm, that’s a good question. Now that we’re talking about caching, I wonder if the cache was just too small. It’s not very big by default. On Sat, Dec 7, 2019, at 11:16, Alessandro Tagliapietra wrote: > Ok I'll check on that! > > Now I can see that with caching we went from 3-4MB/s to 400KB/s,

Re: Reducing streams startup bandwidth usage

2019-12-07 Thread Alessandro Tagliapietra
Ok I'll check on that! Now I can see that with caching we went from 3-4MB/s to 400KB/s, that will help with the bill. Last question, any reason why after a while the regular windowed stream starts sending every update instead of caching? Could it be because it doesn't have any more memory availab

Re: Reducing streams startup bandwidth usage

2019-12-07 Thread John Roesler
Ah, yes. Glad you figured it out! Caching does not reduce EOS guarantees at all. I highly recommend using it. You might even want to take a look at the caching metrics to make sure you have a good hit ratio. -John On Sat, Dec 7, 2019, at 10:51, Alessandro Tagliapietra wrote: > Never mind I've

Re: Reducing streams startup bandwidth usage

2019-12-07 Thread Alessandro Tagliapietra
Never mind I've found out I can use `.withCachingEnabled` on the store builder to achieve the same thing as the windowing example as `Materialized.as` turns that on by default. Does caching in any way reduces the EOS guarantees? -- Alessandro Tagliapietra On Sat, Dec 7, 2019 at 1:12 AM Alessand

KafkaProducer - Oversized batches

2019-12-07 Thread Tomoyuki Saito
Hi, ## Questions 1. Any possible way to make sure to avoid batch split, or oversized batches? 2. Any progress/discussion to fix the issue mentioned in the following PR: https://github.com/apache/kafka/pull/6469 (kafka#6469) ## Background `FlinkKafkaProducer` expects that callbacks for sent reco

Re: Kafka transactions not working in 2.3.1

2019-12-07 Thread Jonathan Santilli
Hello, have you ensured you have installed the same version in all brokers? Did you restart all brokers after the update as indicated in the rolling upgrade instructions? Cheers! On Sat, Dec 7, 2019, 2:38 PM Soman Ullah wrote: > Hello, > > I recently upgraded our kafka cluster from Kafka versi

Kafka transactions not working in 2.3.1

2019-12-07 Thread Soman Ullah
Hello, I recently upgraded our kafka cluster from Kafka version 0.10.1 to 2.3.1 I've confirmed the version has updated using the following command: /home/wrte/install/kafka/bin/kafka-topics.sh --version 2.3.1.1_212.11 (Commit:8c923fb4c62a38ae) However I'm unable to use the transactional features

Re: How to set concrete names for state stores and internal topics backed by these

2019-12-07 Thread Sachin Mittal
Hi John, This was very helpful. However I am still confused about when to set the names for Materialized and Grouped. I am basically setting the names because to have definite names of state stores and internal topics identifiable for debugging purpose. So when we set a name, do we also need to se

Re: Reducing streams startup bandwidth usage

2019-12-07 Thread Alessandro Tagliapietra
Seems my journey with this isn't done just yet, This seems very complicated to me but I'll try to explain it as best I can. To better understand the streams network usage I've used prometheus with the JMX exporter to export kafka metrics. To check the amount of data we use I'm looking at the incre