[jira] [Commented] (KAFKA-10526) Explore performance impact of leader fsync deferral

2021-03-05 Thread Sagar Rao (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17295880#comment-17295880
 ] 

Sagar Rao commented on KAFKA-10526:
---

Hey [~hachikuji], sorry to bug you again about this, but could you please help 
me out with the queries above? 

> Explore performance impact of leader fsync deferral
> ---
>
> Key: KAFKA-10526
> URL: https://issues.apache.org/jira/browse/KAFKA-10526
> Project: Kafka
> Issue Type: Sub-task
> Reporter: Jason Gustafson
> Assignee: Sagar Rao
> Priority: Major
>
> In order to commit a write, a majority of nodes must call fsync in order to 
> ensure the data has been written to disk. An interesting optimization option 
> to consider is letting the leader defer fsync until the high watermark is 
> ready to be advanced. This potentially allows us to reduce the number of 
> flushes on the leader.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-10526) Explore performance impact of leader fsync deferral

2021-02-26 Thread Sagar Rao (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17291485#comment-17291485
 ] 

Sagar Rao commented on KAFKA-10526:
---

[~hachikuji], I looked at the codebase and the KIP further, and here's what I 
understood:

1) The leader updates its local state as soon as it receives new records. This 
happens in the maybeAppendBatches method, which invokes flushLeaderLog. For 
each batch of records, flushLeaderLog updates the leader's local state and 
checks whether the HWM can be advanced. Note that after this step, the log is 
always flushed to disk in flushLeaderLog.

2) The followers send fetch requests to fetch records. When the leader 
receives one, it invokes tryCompleteFetchRequest, which validates the request. 
At this point, it reads a batch of records that can be returned to the 
follower and tries to update the replicaState. It also tries to update the 
HWM and, if it succeeds, the HWM on the log is advanced as well. 

3) The follower, when it receives a FetchResponse, appends the records in the 
response to its log and flushes them to disk. I believe it also updates the 
follower's high watermark here.

 

So, in this flow, a flush happens in two places: first, when the leader 
completes a batch, and second, when a FetchResponse is received by the 
follower. As per the original description in the ticket, fsync is called a 
number of times on the followers, so that is the latter case. A few questions 
that I have:

 

1) Basic question, but I see all this logic in KafkaRaftClient. Where does the 
instance of the class get instantiated? Is it on the leader?

2) Looking at this flow, I am slightly confused about how the leader knows 
which records have been committed successfully on the followers. It seems to 
maintain a local copy of the replicas and their offsets and epochs, but how 
does it know which records have been committed? Is it via the fetch requests 
received from the followers?

3) Where does the optimisation that you have talked about need to happen in 
this flow? Is it while handling fetch responses or when appending new records 
in the batch? Or is it some other place?
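
For question 2, one reading of a pull-based replication design such as the one 
in KIP-595 is that the fetch offset in a follower's next Fetch request doubles 
as its acknowledgement: a fetch at offset N implies the follower already holds 
everything below N. The sketch below illustrates only that idea. The Leader and 
Follower classes and all of their members are hypothetical and are not taken 
from KafkaRaftClient.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of pull-based acknowledgement; not Kafka code. */
public class FetchOffsetAckSketch {

    /** Leader side: holds the log and what it has learned about each follower. */
    static class Leader {
        final List<String> log = new ArrayList<>();
        final Map<Integer, Long> followerEndOffsets = new HashMap<>();

        /** A fetch at offset N tells the leader the follower holds offsets [0, N). */
        List<String> handleFetch(int followerId, long fetchOffset) {
            followerEndOffsets.put(followerId, fetchOffset);
            return new ArrayList<>(log.subList((int) fetchOffset, log.size()));
        }
    }

    /** Follower side: appends fetched records and flushes before fetching again. */
    static class Follower {
        final int id;
        final List<String> log = new ArrayList<>();
        int flushes = 0;

        Follower(int id) {
            this.id = id;
        }

        void fetchFrom(Leader leader) {
            // The follower's current log end offset is sent as the fetch offset.
            List<String> records = leader.handleFetch(id, log.size());
            if (!records.isEmpty()) {
                log.addAll(records);
                flushes++; // stands in for an fsync of the follower's log
            }
        }
    }

    public static void main(String[] args) {
        Leader leader = new Leader();
        Follower follower = new Follower(1);
        leader.log.add("a");
        leader.log.add("b");

        follower.fetchFrom(leader); // fetch at offset 0, receives "a" and "b"
        System.out.println(leader.followerEndOffsets.get(1)); // prints 0

        follower.fetchFrom(leader); // fetch at offset 2 acts as the acknowledgement
        System.out.println(leader.followerEndOffsets.get(1)); // prints 2
    }
}
```

In this model the leader never receives an explicit "write succeeded" message; 
it only learns the follower's progress from the offset in the following fetch.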

 



[jira] [Commented] (KAFKA-10526) Explore performance impact of leader fsync deferral

2021-02-25 Thread Sagar Rao (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17291436#comment-17291436
 ] 

Sagar Rao commented on KAFKA-10526:
---

Hi [~hachikuji], can you please validate my understanding from the above 
comment whenever you get the chance?



[jira] [Commented] (KAFKA-10526) Explore performance impact of leader fsync deferral

2021-02-11 Thread Sagar Rao (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17282971#comment-17282971
 ] 

Sagar Rao commented on KAFKA-10526:
---

[~hachikuji], I have looked at the codebase and KIP-595 and tried to 
understand this. 

One thing I want to confirm is that log replication happens via the Fetch 
request/response dance. 

The leader gets a Fetch request and, if all preconditions are met, finds a 
batch of records and returns a FetchResponse. During that process, it keeps 
updating its LocalState and the replicated state for each Fetch request that 
comes through. It also checks whether the high watermark can be moved ahead by 
finding out whether a majority of followers are at a point > the current HWM 
offset.
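
The majority check described above can be sketched as follows. This is a 
minimal illustration, assuming the leader knows one log end offset per voter 
(itself included); the class and method names are hypothetical and are not 
the ones used in the Kafka codebase.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of a majority-based high watermark; not Kafka code. */
public class HighWatermarkSketch {

    /**
     * Returns the largest offset that a strict majority of voters have reached.
     * endOffsets holds one log end offset per voter, leader included.
     */
    static long majorityOffset(List<Long> endOffsets) {
        List<Long> sorted = new ArrayList<>(endOffsets);
        sorted.sort(Collections.reverseOrder());
        // With n voters sorted in descending order, the entry at index n / 2
        // has been reached by n / 2 + 1 voters, which is a strict majority.
        return sorted.get(sorted.size() / 2);
    }

    public static void main(String[] args) {
        // Leader at 10, followers at 7 and 5: two of three voters hold offset 7.
        System.out.println(majorityOffset(Arrays.asList(10L, 7L, 5L))); // prints 7
    }
}
```

The high watermark would then move ahead only when this value exceeds the 
current one.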

 

The follower, when it receives the FetchResponse, looks at the messages and 
checks whether it needs to truncate its log, whether the leader has been 
fenced, and so on, and then finally writes the records passed in the 
FetchResponse to its log. 

What I am not able to figure out is how the leader knows that a write has been 
committed on the follower side. I could find the code that checks whether the 
HWM should be incremented based on the ReplicaState.

The FetchResponse handler finally returns whether the fetch was successful, 
but how is that value propagated back to the leader? There are some listener 
contexts; is it through those or via the NetworkChannels? I also see a 
correlation id being used in the Raft inbound messages.

 

In terms of the optimisation that you have suggested: instead of updating 
LocalState/ReplicaState every time the leader receives a FetchRequest, it can 
wait until the majority has committed the writes and flush only then. Is that 
the correct understanding? 
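
To make the trade-off concrete, here is a small simulation of the idea in the 
ticket description: the leader skips the flush on append and flushes only when 
the high watermark is ready to be advanced. It is a sketch under simplifying 
assumptions (a single-threaded leader, a conservative flush before every 
advance past the flushed offset), and every name in it is hypothetical rather 
than taken from the Kafka codebase.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of leader fsync deferral; not Kafka code. */
public class DeferredFsyncSketch {
    final boolean deferFlush;
    final Map<Integer, Long> followerOffsets = new HashMap<>();
    long logEndOffset = 0;
    long flushedOffset = 0;
    long highWatermark = 0;
    int flushCount = 0;

    DeferredFsyncSketch(boolean deferFlush, int... followerIds) {
        this.deferFlush = deferFlush;
        for (int id : followerIds) {
            followerOffsets.put(id, 0L);
        }
    }

    /** Appends records; in eager mode every append is followed by a flush. */
    void append(int recordCount) {
        logEndOffset += recordCount;
        if (!deferFlush) {
            flush();
        }
    }

    /** Records a follower's fetch offset and advances the high watermark if possible. */
    void onFetch(int followerId, long fetchOffset) {
        followerOffsets.put(followerId, fetchOffset);
        List<Long> offsets = new ArrayList<>(followerOffsets.values());
        offsets.add(logEndOffset); // the leader counts as one voter
        offsets.sort(Collections.reverseOrder());
        long candidate = offsets.get(offsets.size() / 2); // offset held by a majority
        if (candidate > highWatermark) {
            // Assumption: flush before exposing offsets beyond the flushed point.
            if (candidate > flushedOffset) {
                flush();
            }
            highWatermark = candidate;
        }
    }

    private void flush() {
        flushedOffset = logEndOffset; // stands in for an fsync of the leader's log
        flushCount++;
    }

    public static void main(String[] args) {
        DeferredFsyncSketch eager = new DeferredFsyncSketch(false, 1, 2);
        DeferredFsyncSketch deferred = new DeferredFsyncSketch(true, 1, 2);
        for (int i = 0; i < 3; i++) {
            eager.append(1);
            deferred.append(1);
        }
        eager.onFetch(1, 3L);
        deferred.onFetch(1, 3L);
        System.out.println(eager.flushCount);    // prints 3
        System.out.println(deferred.flushCount); // prints 1
    }
}
```

Both leaders end with the same high watermark, but the deferred one flushes 
once for the three appends. Whether that saving holds up in practice is what 
the ticket asks to be measured.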

 



[jira] [Commented] (KAFKA-10526) Explore performance impact of leader fsync deferral

2021-01-11 Thread Sagar Rao (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17262724#comment-17262724
 ] 

Sagar Rao commented on KAFKA-10526:
---

[~hachikuji], while KAFKA-10652 is getting reviewed, I was wondering if I can 
get started on this one. Any pointers/docs on how to perform the benchmarking? 
I will also start looking at the points where fsync can be deferred until 
after the high watermark is crossed.



[jira] [Commented] (KAFKA-10526) Explore performance impact of leader fsync deferral

2020-10-28 Thread Sagar Rao (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1746#comment-1746
 ] 

Sagar Rao commented on KAFKA-10526:
---

Sure, thank you [~hachikuji]! I have assigned this one and the other 2 to 
myself. I will go through the code.



[jira] [Commented] (KAFKA-10526) Explore performance impact of leader fsync deferral

2020-10-27 Thread Jason Gustafson (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17221881#comment-17221881
 ] 

Jason Gustafson commented on KAFKA-10526:
-

[~sagarrao] Yes, of course. I might suggest KAFKA-10652 as a lower hanging 
fruit to get into the code a little bit.



[jira] [Commented] (KAFKA-10526) Explore performance impact of leader fsync deferral

2020-10-16 Thread Sagar Rao (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17215353#comment-17215353
 ] 

Sagar Rao commented on KAFKA-10526:
---

Hey [~hachikuji], is this something that I can pick up? I am not sure if you 
or someone else is planning to pick it up, as it seems to be related to the 
Raft protocol KIP. Please let me know.
