Re: Apache zookeeper going down every 168 hours

2024-05-04 Thread Haruki Okada
…Hello, > > Thanks for your prompt response. > > How to apply the patch for this? Could you please provide further details? > > Regards > Yogeshkumar A > > On Sat, 4 May 2024 at 9:33 AM, Haruki Okada wrote: > > > Thanks for sharing logs. > > > > Kafka…

Re: Apache zookeeper going down every 168 hours

2024-05-03 Thread Haruki Okada
…on a production environment), so I recommend you try it on Linux (or on WSL at least). Thanks, On Sat, May 4, 2024 at 10:20, Yogeshkumar Annadurai wrote: > Hello, > > We see a timeout error in server.log > log files and properties files are attached for your reference > > regards > Yogeshkumar A >

Re: Apache zookeeper going down every 168 hours

2024-05-03 Thread Haruki Okada
Hi. log.retention shouldn't be related to the phenomenon. It sounds like we need to understand the situation more precisely to answer. > apache zookeeper connection is going down automatically How did you confirm this? From the ZooKeeper log? Also, did you see any logs on the Kafka side? (on stdout or…

Re: Kafka Producer avoid sending more records when met exception while also keeping performance

2024-03-14 Thread Haruki Okada
Hi. > By setting max.in.flight.requests.per.connection to 1, I'm concerned that this could become a performance bottleneck As Greg pointed out, this is a trade-off between the ordering guarantee and throughput. So you should first measure the throughput of…
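A minimal sketch of the two producer setups being weighed here (config names are the standard Java producer configs; the broker address and values are placeholders, not from the thread):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.ProducerConfig;

    // Ordering first: only one in-flight request per connection
    Properties strict = new Properties();
    strict.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
    strict.put(ProducerConfig.ACKS_CONFIG, "all");
    strict.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, "1");

    // Throughput first: pipelining enabled; idempotence preserves ordering across retries
    Properties fast = new Properties();
    fast.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
    fast.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");
    fast.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, "5");

Benchmarking both against your actual workload is the only way to know whether the single-in-flight setting is really the bottleneck for you.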

Re: Does Kafka wait for an fsync to send back and ACK for a published message ?

2024-03-14 Thread Haruki Okada
Hi. By default, Kafka returns the ack without waiting for fsync to the disk, but you can change this behavior with the log.flush.interval.messages config. For data durability, Kafka mainly relies on replication instead. > then there is potential for message loss if the node crashes before On the crashed…
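For illustration only, a hedged sketch of that broker-side knob (server.properties; flushing after every message is an extreme value shown just to make the setting concrete, and it will hurt throughput):

    # fsync the log after every N messages (the default is effectively unlimited, i.e. rely on the OS and replication)
    log.flush.interval.messages=1
    # the per-topic equivalent is the flush.messages topic config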

Re: Kafka Producer avoid sending more records when met exception while also keeping performance

2024-03-11 Thread Haruki Okada
Hi. > I immediately stop sending more new records and stop the kafka producer, but some extra records were still sent I still don't get why you need this behavior, though. As long as you set max.in.flight.requests.per.connection to greater than 1, it's impossible to avoid this because…
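If it helps, a hedged sketch (not the exact code from the thread) of stopping new sends after the first failure; note that records already batched or in flight can still reach the broker unless max.in.flight.requests.per.connection is 1:

    import java.util.List;
    import java.util.concurrent.atomic.AtomicBoolean;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    void sendUntilFirstError(KafkaProducer<String, String> producer,
                             List<ProducerRecord<String, String>> records) {
        AtomicBoolean failed = new AtomicBoolean(false);
        for (ProducerRecord<String, String> record : records) {
            if (failed.get()) break;                     // stop handing new records to the producer
            producer.send(record, (metadata, exception) -> {
                if (exception != null) failed.set(true); // remember the first failure
            });
        }
        producer.flush();
    }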

Re: During cluster peak, KAFKA NetworkProcessorAvgIdlePercent is lower than 0.2

2024-01-21 Thread Haruki Okada
…and the > idle rate is low? > > Haruki Okada wrote on Mon, Jan 15, 2024 at 21:56: > > > You should investigate the cause of the request-queue-full situation first, > > since I guess the low network idle ratio is a consequence of that. > > (Network threads would block on queueing when…

Re: kafka cluster question

2024-01-19 Thread Haruki Okada
Hi. Which server did you shut down in testing? If it was 192.168.20.223, it is natural that the kafka-consumer-groups script fails, because you passed only 192.168.20.223 as the bootstrap-server arg. In an HA setup, you have to pass multiple brokers (as a comma-separated string) to bootstrap-server so…
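For example, something along these lines (the extra addresses and the group name are placeholders, not taken from the thread):

    bin/kafka-consumer-groups.sh \
      --bootstrap-server 192.168.20.223:9092,192.168.20.224:9092,192.168.20.225:9092 \
      --describe --group <your-group>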

Re: During cluster peak, KAFKA NetworkProcessorAvgIdlePercent is lower than 0.2

2024-01-15 Thread Haruki Okada
You should investigate the cause of the request-queue-full situation first, since I guess the low network idle ratio is a consequence of that. (Network threads would block on queueing when the request queue is full.) I recommend running async-profiler to take a profile of the broker process if possible…
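As a hedged example of taking such a profile (flags vary by async-profiler version; <broker-pid> is the Kafka broker's process id):

    ./profiler.sh -e cpu -d 60 -f /tmp/broker-cpu.html <broker-pid>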

Re: Relation between fetch.max.bytes, max.partition.fetch.bytes & max.poll.records

2023-12-09 Thread Haruki Okada
…follow-up question: since max.poll.records has nothing to do with > > fetch requests, is there any gain in the number of network calls being > > made between consumer & broker if max.poll.records is set to 1, as against, > > let's say, the default 500? > > > > On We…

Re: Relation between fetch.max.bytes, max.partition.fetch.bytes & max.poll.records

2023-12-06 Thread Haruki Okada
…? If not, what could be the reason that > may cause poll-idle-ratio-avg to approach 1.0? > > > Can you let me know what > > On Sat, 2 Dec 2023, 07:05 Haruki Okada wrote: > > > Hi. > > > > `max.poll.records` has nothing to do with fetch…

Re: Relation between fetch.max.bytes, max.partition.fetch.bytes & max.poll.records

2023-12-01 Thread Haruki Okada
Hi. `max.poll.records` has nothing to do with fetch requests (refs: https://kafka.apache.org/35/documentation.html#consumerconfigs_max.poll.records ). Then, how many records will be returned for a single fetch request depends on the partition-leader assignment. (Note: we assume follower fetch is not…
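To make the distinction concrete, a hedged sketch of the consumer configs involved (values shown are just the defaults): max.poll.records only limits what poll() hands back from the already-fetched buffer, while the two fetch settings shape the actual network requests.

    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;

    Properties props = new Properties();
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // placeholder
    // Caps records returned per poll(); does not change fetch requests on the wire
    props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "500");
    // These shape the size of fetch requests/responses
    props.put(ConsumerConfig.FETCH_MAX_BYTES_CONFIG, "52428800");          // ~50 MB per fetch response
    props.put(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, "1048576"); // ~1 MB per partition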

Re: Kafka 2.7.2 to 3.5.1 upgrade

2023-12-01 Thread Haruki Okada
Hi. I'm not sure whether KafkaManager has such a bug, but you should first check whether there actually are any under-replicated partitions, using the `kafka-topics.sh` command with the `--under-replicated-partitions` option. On Thu, Nov 30, 2023 at 23:41, Lud Antonie wrote: > Hello, > > After upgrading from 2.7.2 to 3.5.1 some…
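For example (the bootstrap server is a placeholder):

    bin/kafka-topics.sh --bootstrap-server broker1:9092 --describe --under-replicated-partitions

An empty result means every partition is fully replicated and the KafkaManager display is probably misleading.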

Re: How does Kafka Consumer send JoinRequest?

2023-11-26 Thread Haruki Okada
Hi. The JoinGroup request is sent from the polling/user thread. In your example, the consumer instance will be removed from the group because it didn't join the group within the timeout, so the partition will be assigned to another consumer and be processed. On Sun, Nov 26, 2023 at 18:09, Debraj Manna wrote: >…

Re: [Question] About Kafka producer design decision making

2023-11-14 Thread Haruki Okada
Hi. I also guess the main reason for using Future was JDK 1.7 support, which is no longer necessary in the current Kafka version. Actually, there's a KIP about this: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=100829459 but it seems it's not active now. > I wonder if it is…

Re: About Kafka Java Client Producer Retry And Callback

2023-11-13 Thread Haruki Okada
> will the callback be executed for each retry The callback will be triggered only once, when the produce finally ends up succeeding or failing after retries. > is there any way to make Kafka producers retry locally The easiest way would be to make the produce fail artificially. It can be done…
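A hedged sketch of what that single final invocation looks like with the Java producer (topic, key, and value are placeholders):

    producer.send(new ProducerRecord<>("my-topic", "key", "value"), (metadata, exception) -> {
        if (exception != null) {
            // runs once, only after the producer has given up retrying
            System.err.println("finally failed: " + exception);
        } else {
            System.out.println("finally succeeded at offset " + metadata.offset());
        }
    });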

Re: Multiple consumers for a Single Partition in Kafka

2023-10-03 Thread Haruki Okada
Hi Sree. Yeah, the parallel-processing-per-partition requirement often arises, particularly when throughput is limited by external I/O latency, and there are some solutions: - https://github.com/line/decaton * provides a per-key ordering guarantee (or can be unordered if no ordering is…

Re: ISR expansion vs. shrink eligibility

2023-03-28 Thread Haruki Okada
Hi. So the question is about the difference between the leader LEO (shrink criteria) and the leader HW (expand criteria), right? 1. Why the shrink criteria uses the leader LEO: since the HW is defined as "the latest offset that is replicated to all ISRs", it can't be used to kick a replica out of the ISR…

Re: kafka producer exception due to TimeoutException: Expiring records for topic 120000ms has passed since batch creation

2022-04-05 Thread Haruki Okada
Hi, Pushkar. As the error message shows, it means that some messages couldn't be produced successfully within 120 seconds. There are many causes that could lead to this phenomenon, so it's hard to suggest a solution unless more information is provided. For example: - the Kafka broker side's…
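For reference, the 120000 ms in the message corresponds to the producer's delivery.timeout.ms, which defaults to 2 minutes. Raising it is only a mitigation, not a fix for the underlying cause; a hedged sketch on your existing producer Properties, with example values only:

    props.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, "180000"); // allow 3 minutes before expiring batches
    props.put(ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG, "30000");   // per-request timeout, kept below the delivery timeout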

Re: Kafka performance when it comes to throughput

2022-01-06 Thread Haruki Okada
Hi, Marisa. Kafka is well designed to make full use of system resources, so I think calculating based on the machine's spec is a good start. Let's say we have servers with a 10 Gbps full-duplex NIC. Also, let's say we set the topic's replication factor to 3 (so the cluster will have a minimum of 3 servers),…
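Continuing that back-of-the-envelope estimate (rough numbers only): 3 brokers x 10 Gbps inbound = 30 Gbps ≈ 3.75 GB/s of total inbound NIC capacity. With replication factor 3, every produced byte crosses an inbound NIC 3 times (once to the leader, once to each of the 2 followers), so cluster-wide produce throughput is bounded by roughly 30 Gbps / 3 = 10 Gbps ≈ 1.25 GB/s, before accounting for consumer traffic, protocol overhead, and disk limits.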

Re: Help needed to migrate from one infra to another without downtime

2021-10-22 Thread Haruki Okada
Hi, Rijo. This slide might help you create a procedure to migrate the ZooKeeper ensemble without downtime: https://speakerdeck.com/line_developers/split-brain-free-online-zookeeper-migration The slide is based on ZooKeeper 3.4, so in your environment (3.5) the procedure might be simplified thanks to…

Re: Kafka Scaling Ideas

2020-12-22 Thread Haruki Okada
…performance? > > On Mon, Dec 21, 2020 at 4:16 PM Haruki Okada wrote: > > > About the "first layer", right? > > Then it's better to make sure not to get() the result of Producer#send() > > for each message, because in that way it spoils the ability of…
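In other words, a hedged sketch of the difference (topic and value are placeholders):

    // Per-message synchronous send: blocks on every ack and defeats batching/pipelining
    producer.send(new ProducerRecord<>("topic", value)).get();

    // Asynchronous send: let the producer batch, handle the outcome in the callback
    producer.send(new ProducerRecord<>("topic", value), (metadata, e) -> {
        if (e != null) {
            // log or route the failure here instead of blocking the send loop
        }
    });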

Re: Kafka Scaling Ideas

2020-12-21 Thread Haruki Okada
…batch inserts, if you aren't yet. > Say, each consumer waits for 1000 messages or 5 seconds to have passed (whichever comes first) and then does a single bulk insert of the msgs it has received, followed by a manual…

Re: Kafka Scaling Ideas

2020-12-21 Thread Haruki Okada
…for load testing Kafka? > > On Sun, Dec 20, 2020 at 7:23 PM Haruki Okada wrote: > > > It depends on how you manually commit offsets. > > Auto-commit basically commits offsets in an async manner, so as long as > > you do manual commits in the same way, there should be…
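For illustration, a hedged sketch of manual commits done in that same async manner (the poll loop and process() are placeholders for your own code):

    while (running) {
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
        process(records);                        // your processing logic
        consumer.commitAsync((offsets, e) -> {   // async like auto-commit; does not block the poll loop
            if (e != null) {
                // a later successful commit normally supersedes this failure
            }
        });
    }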

Re: Kafka Scaling Ideas

2020-12-20 Thread Haruki Okada
…response: > - "One possible solution is creating an intermediate topic" - I already did > it > - I'll look at Decaton - thx > > Are there any thoughts on auto commit vs manual commit - whether it can > improve performance while consuming? > > Yana

Re: Kafka Scaling Ideas

2020-12-19 Thread Haruki Okada
Hi. Yeah, Spring-Kafka processes messages sequentially, so the consumer throughput would be capped by the database latency per single process. One possible solution is creating an intermediate topic (or altering the source topic) with many more partitions, as Marina suggested. I'd like to suggest…

Re: multi-threaded consumer configuration like stream threads?

2020-11-23 Thread Haruki Okada
…processed again until successful. > > > On Mon, Nov 23, 2020 at 10:16 PM Haruki Okada wrote: > > > Hi Pushkar. > > > > Just for your information, https://github.com/line/decaton is a Kafka > > consumer framework that supports parallel processing per single partition…

Re: multi-threaded consumer configuration like stream threads?

2020-11-23 Thread Haruki Okada
Hi Pushkar. Just for your information, https://github.com/line/decaton is a Kafka consumer framework that supports parallel processing per single partition. It manages the committable offset (i.e. the offset up to which all preceding offsets have been processed) internally, so that it preserves at-least-once…

Protocol evolution/versioning docs are missing

2020-06-09 Thread Haruki Okada
Hi, Kafka. While reading through the protocol docs, I found that the doc about protocol evolution and versioning is missing from protocol.html, while the TOC contains a section for it. https://github.com/apache/kafka/blob/trunk/docs/protocol.html#L47-L50 Is there any plan to add a doc about protocol…