[ 
https://issues.apache.org/jira/browse/KUDU-3383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shenxingwuying updated KUDU-3383:
---------------------------------
    Description: 
As described in https://issues.apache.org/jira/browse/KUDU-3382.

This issue is about linearizable reads.
h1. Background && Motivation

Linearizable reads are a very developer-friendly feature, and Kudu can 
support them. As far as I can tell, Kudu may not implement them yet.
h1. Issue of linearizable reads from the leader

We need to discuss this issue.

The feature is especially important for KV systems, whereas Kudu is mainly 
OLAP-oriented. In some scenarios, though, the feature also provides advantages.

Kudu implements reads as scans, even when reading a single row. The client 
sends a ScanRequest with NewScanRequest and then sends ContinueScanRequest. 
The feature would target NewScanRequest.

Kudu's Raft implementation uses a strong leader: the leader's state machine is 
never older than the followers'. A follower can be elected, switching the 
leader, when its heartbeat timeout expires or when it receives a leader 
election request (leader transfer).

If Kudu needs linearizable reads, reading from the leader is not enough, 
because two leaders may exist for a very short period of time.

I provide two scenarios. The first one:

 

!image-2022-07-20-23-17-40-718.png!

 
 # A Raft group has 3 replicas: L1, F2, and F3. Their state is steady during 
term 1.
 # If a network partition occurs, F2 and F3 lose the leader's heartbeat; F3 
starts an election and F2 votes for it.
 # F3 becomes leader; we can call it L3. At this moment there are 2 leaders: 
L1 (term 1) and L3 (term 2).
 # This state continues until the network partition heals. That may take a 
short or a long time.
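
The four steps above can be sketched as a small simulation. This is only an illustrative model, not Kudu code: the `Replica` class, the `write` helper, and the key and value names are all hypothetical, and elections and heartbeats are reduced to a few assignments.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    name: str
    term: int = 1
    believes_leader: bool = False
    data: dict = field(default_factory=dict)


def write(leader, reachable, key, value, cluster_size=3):
    """Apply a write only if the leader plus reachable followers form a majority."""
    if (1 + len(reachable)) * 2 <= cluster_size:
        return False  # no quorum: the write cannot commit
    for replica in [leader, *reachable]:
        replica.data[key] = value
    return True


# Step 1: steady state in term 1, L1 leads F2 and F3.
l1, f2, f3 = Replica("L1", believes_leader=True), Replica("F2"), Replica("F3")
write(l1, [f2, f3], "k", "v1")

# Steps 2-3: L1 is partitioned away; F3 wins term 2 with F2's vote.
f2.term = f3.term = 2
f3.believes_leader = True  # L1 has heard nothing, so it still believes it leads

# Step 4: the new leader commits a write; the old leader cannot.
committed_new = write(f3, [f2], "k", "v2")
committed_old = write(l1, [], "k", "v3")

stale_read = l1.data["k"]  # served locally by the old leader
fresh_read = f3.data["k"]  # served by the new leader
print(stale_read, fresh_read)
```

In this model a client that reads from L1 after the write to L3 has committed sees the old value, which is exactly the non-linearizable read described here.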

While two leaders coexist, reads are not linearizable. So Kudu should avoid 
having two leaders at any time; the cost is having no leader for a short 
period of time. Kudu has to make a choice. Users usually need linearizability, 
so I think Kudu should support it. The brief unavailability while there is no 
leader can be handled by the client's fault tolerance.

Whether reading from the leader is currently a linearizable read still needs 
to be verified: someone can confirm it, or I can run an experiment.

Kudu should avoid having two leaders, even for a very short period of time 
when a network fault happens. I reviewed the code, and I think the problem 
currently exists.
h1. Solution

To avoid the two-leader problem, the leader should have a keep-alive. If a 
leader does not receive enough heartbeat responses within a period of time, it 
should step down and then start another election, just like a follower does. 
The leader's timeout should be shorter than the followers' election timeout.
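
A minimal sketch of this step-down rule follows, assuming bounded clock drift between nodes. The `LeaderLease` class and its method names are hypothetical and are not part of Kudu; time is passed in explicitly so the logic is easy to follow.

```python
class LeaderLease:
    """Hypothetical helper: a leader steps down once it can no longer prove
    that a majority heard from it within leader_timeout."""

    def __init__(self, peers, leader_timeout, election_timeout, now=0.0):
        if leader_timeout >= election_timeout:
            raise ValueError("leader timeout must be shorter than election timeout")
        self.cluster_size = len(peers) + 1
        self.leader_timeout = leader_timeout
        # Send time of the most recent heartbeat each peer acknowledged.
        self.last_acked_send = {peer: now for peer in peers}
        self.is_leader = True

    def on_heartbeat_ack(self, peer, sent_at):
        # Use the time the heartbeat was sent, not the time the ack arrived:
        # the follower reset its election timer no earlier than sent_at.
        self.last_acked_send[peer] = max(self.last_acked_send[peer], sent_at)

    def tick(self, now):
        if not self.is_leader:
            return False
        fresh = sum(
            1 for sent in self.last_acked_send.values()
            if now - sent < self.leader_timeout
        )
        if (fresh + 1) * 2 <= self.cluster_size:  # +1 counts the leader itself
            self.is_leader = False  # step down before followers can elect
        return self.is_leader


lease = LeaderLease(["F2", "F3"], leader_timeout=1.0, election_timeout=1.5)
lease.on_heartbeat_ack("F2", sent_at=0.5)
still_leader = lease.tick(now=1.2)   # F2's ack is 0.7s old: majority intact
after_silence = lease.tick(now=1.6)  # no ack is fresh any more: step down
```

Because the leader gives up before any follower's election timer can fire, the window with two leaders becomes a window with no leader, which is the trade-off proposed above.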

Another scheme: before serving a read, the leader sends a heartbeat to the two 
followers to make sure it is still the valid leader.
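
The second scheme could look roughly like the sketch below. All names (`confirm_leadership`, `linearizable_read`, the `send_heartbeat` callback) are hypothetical, and the sketch leaves out a step a complete implementation would likely need: waiting until the state machine has applied everything that was committed when the read arrived.

```python
def confirm_leadership(term, peers, send_heartbeat):
    """Return True if a majority (including this node) still accepts `term`.

    send_heartbeat(peer, term) is an assumed callback that returns the peer's
    current term, or None if the peer is unreachable.
    """
    cluster_size = len(peers) + 1
    acks = 1  # the leader counts itself
    for peer in peers:
        peer_term = send_heartbeat(peer, term)
        if peer_term is None:
            continue  # unreachable peer: no ack
        if peer_term > term:
            return False  # a newer leader exists
        acks += 1
    return acks * 2 > cluster_size


def linearizable_read(store, key, term, peers, send_heartbeat):
    if not confirm_leadership(term, peers, send_heartbeat):
        raise RuntimeError("not the leader; the client should retry elsewhere")
    return store[key]


# The new leader (term 2) can reach F2, so its read is served.
value = linearizable_read(
    {"k": "v2"}, "k", 2, ["L1", "F2"],
    lambda peer, term: term if peer == "F2" else None,
)

# The partitioned old leader (term 1) reaches nobody, so its read is refused.
try:
    linearizable_read({"k": "v1"}, "k", 1, ["F2", "F3"], lambda peer, term: None)
    old_leader_served = True
except RuntimeError:
    old_leader_served = False
```

Compared with the timeout-based scheme, this adds a round trip to every read but does not depend on clocks.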

  was:
Over the next 3 days, I'll firm up the idea behind this issue. Once the problem 
is confirmed, maybe I should provide a document for it.

 

As described in https://issues.apache.org/jira/browse/KUDU-3382.

 

 
h1. Background && Motivation

Linearizable reads are a very developer-friendly feature, and Kudu can 
support them. As far as I can tell, Kudu may not implement them yet.
h1. Issue of linearizable reads from the leader

We need to discuss this issue.

The feature is especially important for KV systems, whereas Kudu is mainly 
OLAP-oriented. In some scenarios, though, the feature also provides advantages.

Kudu implements reads as scans, even when reading a single row. The client 
sends a ScanRequest with NewScanRequest and then sends ContinueScanRequest. 
The feature would target NewScanRequest.

Kudu's Raft implementation uses a strong leader: the leader's state machine is 
never older than the followers'. A follower can be elected, switching the 
leader, when its heartbeat timeout expires or when it receives a leader 
election request (leader transfer).

If Kudu needs linearizable reads, reading from the leader is not enough, 
because two leaders may exist for a very short period of time.

I provide two scenarios. The first one:

 

!image-2022-07-20-23-17-40-718.png!

 
 # A Raft group has 3 replicas: L1, F2, and F3. Their state is steady during 
term 1.
 # If a network partition occurs, F2 and F3 lose the leader's heartbeat; F3 
starts an election and F2 votes for it.
 # F3 becomes leader; we can call it L3. At this moment there are 2 leaders: 
L1 (term 1) and L3 (term 2).
 # This state continues until the network partition heals. That may take a 
short or a long time.

While two leaders coexist, reads are not linearizable. So Kudu should avoid 
having two leaders at any time; the cost is having no leader for a short 
period of time. Kudu has to make a choice. Users usually need linearizability, 
so I think Kudu should support it. The brief unavailability while there is no 
leader can be handled by the client's fault tolerance.

Whether reading from the leader is currently a linearizable read still needs 
to be verified: someone can confirm it, or I can run an experiment.

Kudu should avoid having two leaders, even for a very short period of time 
when a network fault happens. I reviewed the code, and I think the problem 
currently exists.
h1. Solution

To avoid the two-leader problem, the leader should have a keep-alive. If a 
leader does not receive enough heartbeat responses within a period of time, it 
should step down and then start another election, just like a follower does. 
The leader's timeout should be shorter than the followers' election timeout.

Another scheme: before serving a read, the leader sends a heartbeat to the two 
followers to make sure it is still the valid leader.


> About strong consistency read from leader
> -----------------------------------------
>
>                 Key: KUDU-3383
>                 URL: https://issues.apache.org/jira/browse/KUDU-3383
>             Project: Kudu
>          Issue Type: Improvement
>            Reporter: shenxingwuying
>            Assignee: shenxingwuying
>            Priority: Major
>         Attachments: image-2022-07-20-23-14-34-519.png, 
> image-2022-07-20-23-17-40-718.png
>
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
