Re: Tracking topic consumers

2021-10-07 Thread Boyang Chen
eep track of consumer > offset or consumer groups. > Thanks > Murilo > > On Thu, 7 Oct 2021 at 15:33, Boyang Chen > wrote: > > > Hey Murilo, could you explain what you mean by `goka views`? If you are > > talking about https://github.com/lovoo/goka, they shoul

Re: Tracking topic consumers

2021-10-07 Thread Boyang Chen
Hey Murilo, could you explain what you mean by `goka views`? If you are talking about https://github.com/lovoo/goka, they should use consumer groups as well IIUC. Boyang On Thu, Oct 7, 2021 at 11:55 AM Murilo Tavares wrote: > Hi. Looking for some insights here. > We use Kafka at a large scale,

Re: DescribeTopics could return deleted topic

2021-08-17 Thread Boyang Chen
metadata update request). But I would > not expect it to "always" return the deleted topic partition, unless the > broker being queried is partitioned from the controller and hence would > never receive the metadata update request. > > > Guozhang > > On Sun,

Re: Kafka Streams leave group behaviour

2021-08-12 Thread Boyang Chen
You are right Uwe, Kafka Streams won't leave group no matter dynamic or static membership. If you want to have fast scale down, consider trying static membership and use the admin command `removeMemberFromGroup` when you need to rescale. Boyang On Thu, Aug 12, 2021 at 4:37 PM Lerh Chuan Low

DescribeTopics could return deleted topic

2021-08-08 Thread Boyang Chen
Hey there, Has anyone experienced the case where the admin delete topic command was issued but the DescribeTopics command always returns the topic partition? What's the expected time for the topic metadata to disappear? Boyang

Embedded Kafka connector in data ingestion service

2021-06-21 Thread Boyang Chen
Hey there, I'm wondering if anyone has the need to use an embedded Kafka connector module. The goal we want to achieve is to avoid letting customers maintain a separate component when they stream data from their Kafka cluster to our service, so that they just need to provide the cluster

Re: [ANNOUNCE] New Committer: Bruno Cadonna

2021-04-08 Thread Boyang Chen
Congratulations Bruno! On Thu, Apr 8, 2021 at 2:42 AM Mickael Maison wrote: > Congratulations Bruno! > > On Thu, Apr 8, 2021 at 10:07 AM David Jacot > wrote: > > > > Congrats Bruno! > > > > On Thu, Apr 8, 2021 at 9:53 AM Tom Bentley wrote: > > > > > Congratulations Bruno! > > > > > > On Thu,

Re: Abort transaction semantics

2021-02-18 Thread Boyang Chen
Thanks for the question. I think Gary provided an excellent answer. Additionally, you could check out the code example for EOS, which shows you how to reset the state while

Re: Kafka transactions commit message consumability issues

2021-01-26 Thread Boyang Chen
Have you set consumer isolation level? If it was set to uncommitted, it will be able to see messages you produced, without commitTransaction call On Tue, Jan 26, 2021 at 7:43 AM 积淀智慧 wrote: > Hello, everybody, > > > I'm running some tests while using Kafka transactions. > > > test 1 : > String

Re: [ANNOUNCE] Apache Kafka 2.7.0

2020-12-28 Thread Boyang Chen
> > > between systems or applications. > > > > > > ** Building real-time streaming applications that transform or react > > > to the streams of data. > > > > > > > > > Apache Kafka is in use at large and small companies worldwide, > includ

Re: [VOTE] 2.7.0 RC4

2020-12-06 Thread Boyang Chen
Hey Bill, Unfortunately we have found another regression in 2.7 streams, which I have filed a blocker here . The implementation is done, and I will try to get reviews and merge ASAP. Best, Boyang On Fri, Dec 4, 2020 at 3:14 PM Jack Yang wrote:

Re: [VOTE] 2.7.0 RC1

2020-11-05 Thread Boyang Chen
Hey Bill, we (Kudos to Magnus) found a blocker for 2.7 on the new Producer error code: https://issues.apache.org/jira/browse/KAFKA-10687 that needs to be fixed. I will start preparing the patch ASAP, just FYI. Best, Boyang On Wed, Nov 4, 2020 at 11:42 AM Bill Bejeck wrote: > * Successful

Re: [VOTE] KIP-657: Add Customized Kafka Streams Logo

2020-08-19 Thread Boyang Chen
e: > > > > > > > > I'm leaning towards design B primarily because it reminds me of the > > > Firefox > > > > logo which I like a lot. But I also share Adam's concern that it > should > > > > better not obscure the Kafka logo --- so i

Re: [ANNOUNCE] Apache Kafka 2.5.1

2020-08-12 Thread Boyang Chen
; > > application: > > > > > > ** Building real-time streaming data pipelines that reliably > > > get data between systems or applications. > > > > > > ** Building real-time streaming applications that transform > > > or react to the s

Re: [ANNOUNCE] New committer: Xi Hu

2020-06-24 Thread Boyang Chen
Congratulations Xi! Well deserved. On Wed, Jun 24, 2020 at 10:10 AM AJ Chen wrote: > Congratulations, Xi. > -aj > > > > On Wed, Jun 24, 2020 at 9:27 AM Guozhang Wang wrote: > > > The PMC for Apache Kafka has invited Xi Hu as a committer and we are > > pleased to announce that he has accepted!

Re: [ANNOUNCE] New committer: Boyang Chen

2020-06-22 Thread Boyang Chen
> Congratulations Boyang! Well deserved. > > > > -Bill > > > > On Mon, Jun 22, 2020 at 7:35 PM Colin McCabe wrote: > > > >> Congratulations, Boyang! > >> > >> cheers, > >> Colin > >> > >> On Mon, Jun 22, 2020, at

Re: dynamic produce and consume new topics

2020-06-13 Thread Boyang Chen
>> Thanks, Boyang. > >> > >> Spring''s KafkaTemplate can easily create new topic on the flight > >> already. > >> > >> On consumer side, is there java sample code to replace this annotation? > >> That will create a new listener for a new t

Re: dynamic produce and consume new topics

2020-06-12 Thread Boyang Chen
Hey AJ, you should be able to make the subscribed topics and output topics dynamic configs for Consumer and Producer. If you need to create new topics on runtime, there is some broker side setting you could use to allow topic creation based of produced records:

Re: idempotency issue and transactions on producer

2020-05-15 Thread Boyang Chen
Hey Raffaele, the producer id is getting assigned upon receiving the producer.initTransaction call at the broker side. It guarantees the uniqueness of a producer for current lifecycle, which you don't have to configure manually. Transactional API on the other hand, includes idempotent produce

Re: How to add partitions to an existing kafka topic

2020-04-15 Thread Boyang Chen
Hey Sachin, your observation is correct, unfortunately Kafka Streams doesn't support adding partitions online. The rebalance could not guarantee the same key routing to the same partition when the input topic partition changes, as this is the upstream producer's responsibility to consistently

Re: Global state store: Lazy loading

2020-04-02 Thread Boyang Chen
Hey Navneeth, could you clarify a bit on what you mean by `lazy load`, specifically how you make it happen with local KV store? On Thu, Apr 2, 2020 at 12:09 PM Navneeth Krishnan wrote: > Hi All, > > Is there a recommend way for lazy loading the global state store. I'm using > PAPI and I have

Re: Statestore restoration - Error while range compacting during restoring

2020-03-31 Thread Boyang Chen
Thanks Nicolas for the report, so are you suggesting that you couldn't turn on compactions for the state store? Is there a workaround? On Tue, Mar 31, 2020 at 9:54 AM Nicolas Carlot wrote: > After some more testing and debugging, it seems that it is caused by the > compaction option I've

Re: [VOTE] 2.5.0 RC2

2020-03-20 Thread Boyang Chen
Hey David, I would like to raise https://issues.apache.org/jira/browse/KAFKA-9701 as a 2.5 blocker. The impact of this bug is that it could throw fatal exception and kill a stream thread on Kafka Streams level. It could also create a crashing scenario for plain Kafka Consumer users as well as the

Re: Integrating Kafka with Stateful Spark Streaming

2020-03-04 Thread Boyang Chen
Hey there, have you already sought help from Spark community? Currently I don't think we could attribute the symptom to Kafka. Boyang On Wed, Mar 4, 2020 at 7:37 AM Something Something wrote: > Need help integrating Kafka with 'Stateful Spark Streaming' application. > > In a Stateful Spark

Re: Use a single consumer or create consumer per topic

2020-02-26 Thread Boyang Chen
Hey Mark, you could use a consumer group (check the consumer #subscribe API) to consume from 50 topics in a dynamic fashion, as long as the data processing function is the same for all the records. Consumer group could provide basic guarantees for balancing the number of partitions for each

Re: Adding a new sub-topolgy requires reset?

2020-02-06 Thread Boyang Chen
Hey Murilo, feel free to file a JIRA and paste your full topology. It seems like a bug to me. Boyang On Thu, Feb 6, 2020 at 8:17 AM Murilo Tavares wrote: > Answering my own question, obviously this is a stateless application, so > there’s no reset needed. Mu bad. > But the NPE does seem to be

Re: min.insync.replicas and producer acks

2020-01-24 Thread Boyang Chen
Hey Pushkar, producer ack only has 3 options: none, one, or all. You could not nominate an arbitrary number. On Fri, Jan 24, 2020 at 7:53 PM Pushkar Deole wrote: > Hi All, > > I am a bit confused about min.insync.replicas and producer acks. Are these > two configurations achieve the same

Re: Python and subscribing to topic based on key

2020-01-02 Thread Boyang Chen
Hey George, there is no support for server side filtering based on key at the moment, as it may significantly impact broker side performance. On Thu, Jan 2, 2020 at 12:26 PM George wrote: > Hi all > > is it possible to subscribe to a topic based on a specified key... without > specifying the

Re: complicated logic for tombstone records

2020-01-02 Thread Boyang Chen
Hey Jan, although I believe your case is much more complicated, but would time based retention work for you at all? If yes, time window store is like the best option. If no, streams has no out-of-box solution for invalidating the aggregation record. It seems at least we could provide an API to

Re: Kafka consumer group keeps moving to PreparingRebalance and stops consuming

2019-12-06 Thread Boyang Chen
Hey Avshalom, the consumer instance is initiated per stream thread. You will not be creating new consumers so the root cause is definitely member timeout. Have you changed the max.poll.interval by any chance? That config controls how long you tolerate the interval between poll calls to make sure

Re: [ANNOUNCE] New committer: John Roesler

2019-11-12 Thread Boyang Chen
Great work John! Well deserved On Tue, Nov 12, 2019 at 1:56 PM Guozhang Wang wrote: > Hi Everyone, > > The PMC of Apache Kafka is pleased to announce a new Kafka committer, John > Roesler. > > John has been contributing to Apache Kafka since early 2018. His main > contributions are primarily

Re: Attempt to prove Kafka transactions work

2019-10-27 Thread Boyang Chen
Hey Edward, just to summarize and make sure I understood your question, you want to implement some Chaos testing to validate Kafka EOS model, but not sure how to start or curious about whether there are already works in the community doing that? For the correctness of Kafka EOS, we have tons of

Re: Kafka Streams State stuck in rebalancing after one of the StreamThread encounters java.lang.IllegalStateException: No current assignment for partition

2019-10-23 Thread Boyang Chen
; Let's move the discussion to the ticket. > > > -Matthias > > On 10/21/19 1:19 PM, Boyang Chen wrote: > > Hey Amuthan, > > > > I replied you on the Slack community. Feel free to choose either continue > > the discussion here or in Slack. > > > > Boyang > >

Re: Kafka Streams Daily Aggregation

2019-10-23 Thread Boyang Chen
Hey Zongzhen, I have implemented some similar functionality with KStream before. You could just set tumbling window to 24 hours to get daily aggregation result. As you just need calendar dates, the tumbling window computation starts from system time 0 which is exactly cut-off daily. Boyang On

Re: Allow keys to specify partitionKey

2019-10-21 Thread Boyang Chen
Hey Jan, correct me if I'm wrong, my understanding of your problem is as follows: 1. This is a Kafka Streams question 2. You want to preserve the same primary key, but to repartition based on some certain field with key/value If this is the case, have you considered using `through(topic,

Re: Kafka Streams State stuck in rebalancing after one of the StreamThread encounters java.lang.IllegalStateException: No current assignment for partition

2019-10-21 Thread Boyang Chen
Hey Amuthan, I replied you on the Slack community. Feel free to choose either continue the discussion here or in Slack. Boyang On Mon, Oct 21, 2019 at 1:06 PM Amuthan wrote: > I have a Kafka stream application that stores the incoming messages into a > state store, and later during the

Re: Kstream to Kafka table

2019-10-04 Thread Boyang Chen
hen. Can we do through commands instead of using programming > library? > > Sent from my iPhone > > > On Oct 4, 2019, at 3:51 PM, Boyang Chen > wrote: > > > > Hey Asmath, > > > > just for the KStream question, feel free to checkout our official doc:

Re: Kafka Streams changelog topic has 5 times higher out-traffic than in-traffic

2019-10-04 Thread Boyang Chen
Hey Xiyuan, to better understand the situation, we need to clarify the actual consumer of the changelog topic which contributes to the volume increase. You could attempt to expose some broker metrics to see the actual client, but normally there are two types of changelog consumers: 1. restore

Re: Stale data in KStream->KTable join

2019-10-04 Thread Boyang Chen
Hey Trey, as I was reading, several suggestions I have are: 1. Could you revert 0ms commit interval to default? It will not help with the situation as you will try to commit on every poll() 2. I couldn't know how you actually write your code, but you could try something really simple as print

Re: Kstream to Kafka table

2019-10-04 Thread Boyang Chen
Hey Asmath, just for the KStream question, feel free to checkout our official doc: https://kafka.apache.org/23/documentation/streams/developer-guide/dsl-api.html if you need to push to a output topic, in DSL there is a #to(String topic) function which will write the stream output to your intended

Re: KSTREAM basics

2019-10-03 Thread Boyang Chen
Hey David, if you are talking about calling toStream() on KTABLE, from the java doc: *Note that this is a logical operation and only changes the "interpretation" of the stream, i.e., each record of * this changelog stream is no longer treated as an updated record* So the answer is yes to

Re: Kafka Streams can't run normally after restart/redeployment

2019-09-25 Thread Boyang Chen
Hey Xiyuan, I would assume it's easier for us to help you by reading your application with a full paste of code (a prototype). Changing application id would work suggests that re-process all the data again shall work, do I understand that correctly? Boyang On Wed, Sep 25, 2019 at 8:16 AM Xiyuan

Re: Byzantine Fault Tolerance Implementation

2019-08-27 Thread Boyang Chen
Hey Nayak, there is an on-going KIP in the community about deprecating zookeeper: https://cwiki.apache.org/confluence/display/KAFKA/KIP-500%3A+Replace+ZooKeeper+with+a+Self-Managed+Metadata+Quorum It should be a good place to raise your question about making consensus algorithm pluggable in the

Re: How do I tell Kafka Streams not to repartition?

2019-08-09 Thread Boyang Chen
In case I'm not making myself clear, any operation that changes the record key will result in repartition. Since you don't want that, you shall choose to call groupByKey afterwards and aggregation will happen on `parent id` level. On Fri, Aug 9, 2019 at 3:27 PM Boyang Chen wrote: > Hey

Re: How do I tell Kafka Streams not to repartition?

2019-08-09 Thread Boyang Chen
Hey Tim, I think the functionality you need is groupByKey() which avoids repartitioning, feel free to check it out here: https://docs.confluent.io/current/streams/developer-guide/dsl-api.html#aggregating. Recommend you to read the whole thing but feel free just to search `groupByKey`. On Fri,

Re: How to start a stream from only new records?

2019-08-09 Thread Boyang Chen
Hey Tim, if you are talking about avoid re-processing data and start consumption from latest, you could set your `offset.reset.policy` to latest. Let me know if this answers your question. On Fri, Aug 9, 2019 at 7:09 AM Tim Ward wrote: > With a real time application, nobody is interested in

Re: Kafka Streams & Processor API

2019-08-05 Thread Boyang Chen
Hey Navneeth, thank you for your interest in trying out Kafka Streams! Normally we will redirect new folks to the stream FAQ first in case you haven't checked it out. For details to your question: 1. Joining 2 topics using processor API (we

Re: (Re-)joining group for a longer time

2019-08-03 Thread Boyang Chen
Hey Kamesh, thank you for the question. Could you also check the broker side log to see if the group is forming generations properly? Information we have for now is a bit hard to tell what's going on. Also since you have upgraded to 2.3, during incremental rebalancing you will experience 2

Re: Group Coordinator stuck on PrepareRebalance state.

2019-07-10 Thread Boyang Chen
Hey Aravind, If your client/broker just upgraded to 2.3, Jason has filed a blocker for 2.3: https://issues.apache.org/jira/browse/KAFKA-8653 and a fix is on its way: https://github.com/apache/kafka/pull/7072/files Let me know if you are actually on a different version. Boyang On Wed, Jul 10,

Kafka Trunk Build Failure with Gradle 5.0

2018-12-12 Thread Boyang Chen
Hey friends, I'm wondering anyone has seen this error before: Could not find method annotationProcessor() for arguments [org.openjdk.jmh:jmh-generator-annprocess:1.21] on object of type org.gradle.api.internal.artifacts.dsl.dependencies.DefaultDependencyHandler. Basically I'm building the