[ https://issues.apache.org/jira/browse/IGNITE-18742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Denis Chudov updated IGNITE-18742:
----------------------------------
    Description:
Maintenance phase of the placement driver is the management of already existing leases by the placement driver's active actor.

!screenshot-1.png!

In words, there should be a worker, triggered once per period equal to leaseInterval/2, that does the following:
- it iterates over all partition groups managed by the placement driver and either extends their leases or chooses new ones if there is no lease for a group for some reason (for example, because a leaseholder was never chosen, or the current leaseholder left the cluster);
- if there is no lease, it should be assigned using the lease candidates balancer, see IGNITE-18879; otherwise, the placement driver should determine whether the existing lease is going to expire;
- a new timestamp (let's name it leaseValidUntil), until which the lease will be valid, should be calculated as currentTime + leaseInterval;
- after that, the placement driver should invoke meta storage to refresh the data about leases, lease candidates and their new leaseValidUntil timestamps;
- if the invoke is successful (it may fail if two nodes consider themselves the placement driver's active actor), a LeaseGrantMessage containing leaseValidUntil should be sent to leaseholders and lease candidates;
- a lease candidate is able to decline the LeaseGrantMessage and send a LeaseGrantResponse with a redirect proposal containing an alternative candidate. This proposal should also be handled by the lease candidates balancer, and as a result one of the following is possible: the previously chosen candidate is enforced, in which case the placement driver sends LeaseGrantMessage(force=true) to that lease candidate; or the proposed candidate is accepted by the placement driver;
- if the lease candidate accepts its role, the placement driver makes an invoke to meta storage, confirming the new leaseholder and its lease validity interval.
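The per-group decision made by the worker can be illustrated with a small, self-contained Java sketch. It is not the Ignite implementation: the class LeaseMaintenanceSketch, its Lease type, the in-memory map standing in for meta storage (whose writes always succeed here), and the "first live node" balancer are all hypothetical names chosen for illustration. The rule that a holder is kept only while its lease has not expired and the node is still in the cluster is likewise an assumption, and message sending is not modelled.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** In-memory sketch of one maintenance pass. Every name here is hypothetical. */
class LeaseMaintenanceSketch {
    /** Lease record: the holder (or candidate) node and the timestamp until which the lease is valid. */
    static final class Lease {
        final String holder;
        final long validUntil;

        Lease(String holder, long validUntil) {
            this.holder = holder;
            this.validUntil = validUntil;
        }
    }

    /** Stand-in for meta storage: replication group id -> lease. Writes always succeed here. */
    final Map<String, Lease> metaStorage = new HashMap<>();

    final long leaseInterval;

    LeaseMaintenanceSketch(long leaseInterval) {
        this.leaseInterval = leaseInterval;
    }

    /** Stand-in for the lease candidates balancer: simply picks the first live node. */
    static String pickCandidate(List<String> liveNodes) {
        return liveNodes.get(0);
    }

    /**
     * Runs one pass over the groups and returns, per group, the node that would
     * receive a LeaseGrantMessage (message sending itself is not modelled).
     */
    Map<String, String> maintain(List<String> groups, List<String> liveNodes, long now) {
        Map<String, String> grantTargets = new HashMap<>();
        if (liveNodes.isEmpty()) {
            return grantTargets; // nobody to grant a lease to
        }
        long leaseValidUntil = now + leaseInterval; // currentTime + leaseInterval
        for (String group : groups) {
            Lease current = metaStorage.get(group);
            // Assumption: keep the holder only if its lease has not expired and it is still in the cluster.
            boolean keepHolder = current != null
                    && current.validUntil > now
                    && liveNodes.contains(current.holder);
            String target = keepHolder ? current.holder : pickCandidate(liveNodes);
            metaStorage.put(group, new Lease(target, leaseValidUntil));
            grantTargets.put(group, target);
        }
        return grantTargets;
    }

    public static void main(String[] args) {
        LeaseMaintenanceSketch driver = new LeaseMaintenanceSketch(10);
        // No lease exists yet, so a candidate is chosen.
        System.out.println(driver.maintain(List.of("g1"), List.of("A", "B"), 100));
        // The previous holder is no longer among the live nodes, so a new candidate is chosen.
        System.out.println(driver.maintain(List.of("g1"), List.of("B"), 105));
    }
}
```

In a real worker, maintain would be driven by a scheduler with a period of leaseInterval/2, so that every lease is refreshed roughly twice before it can expire.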
Pseudocode:
{code:java}
scheduleAtFixedRate(leaseInterval / 2) {
    for (group in replicationGroups) {
        leaseValidUntil = now() + leaseInterval // timestamp until which the leases are prolonged

        lease = lease(group)

        if (lease == null) { // null if there is no lease at all, or the active leaseholder left the cluster
            lease = leaseBalancer.get(group) // in this case the lease is just a candidate
        }

        if (invokeMetaStorage(grantLease(lease, leaseValidUntil))) {
            // send the message to the replica; the response is handled by onLeaseGrantResponse
            sendLeaseGrantMessage(lease, leaseValidUntil)
        }
    }
}

onLeaseGrantResponse(leaseGrantResponse) {
    leaseValidUntil = leaseGrantResponse.leaseValidUntil

    if (leaseGrantResponse.redirectProposal) {
        leaseCandidateNew = leaseBalancer.considerRedirectProposal(leaseGrantResponse.sender, leaseGrantResponse.redirectProposal)

        if (invokeMetaStorage(grantLease(leaseCandidateNew, leaseValidUntil))) {
            sendLeaseGrantMessage(leaseCandidateNew, leaseValidUntil, force) // force lease grant message
        }
    } else {
        assert(leaseGrantResponse.accepted)

        leaseholder = leaseGrantResponse.sender

        invokeMetaStorage(leaseConfirmed(leaseholder, leaseValidUntil))
    }
}
{code}

  was:
Maintenance phase of the placement driver is the management of already existing leases by the placement driver's active actor.

!screenshot-1.png!

In words, there should be a worker, triggered once per period equal to leaseInterval/2, that does the following:
- it iterates over all partition groups managed by the placement driver and either extends their leases or chooses new ones if there is no lease for a group for some reason (for example, because a leaseholder was never chosen, or the current leaseholder left the cluster);
- if there is no lease, it should be assigned using the lease candidates balancer, see IGNITE-18879;
- a new timestamp (let's name it leaseValidUntil), until which the lease will be valid, should be calculated as currentTime + leaseInterval;
- after that, the placement driver should invoke meta storage to refresh the data about leases and their new leaseValidUntil timestamps;
- if the invoke is successful (it may fail if two nodes consider themselves the placement driver's active actor), a LeaseGrantMessage containing leaseValidUntil should be sent to leaseholders and lease candidates;
- a lease candidate is able to decline the LeaseGrantMessage and send a LeaseGrantResponse with a redirect proposal containing an alternative candidate. This proposal should also be handled by the lease candidates balancer, and as a result one of the following is possible: the previously chosen candidate is enforced, in which case the placement driver sends LeaseGrantMessage(force=true) to that lease candidate; or the proposed candidate is accepted by the placement driver;
- if the lease candidate accepts its role, the placement driver makes an invoke to meta storage, confirming the new leaseholder and its lease validity interval.
Pseudocode:
{code:java}
scheduleAtFixedRate(leaseInterval / 2) {
    for (group in replicationGroups) {
        leaseValidUntil = now() + leaseInterval // timestamp until which the leases are prolonged

        lease = lease(group)

        if (lease == null) { // null if there is no lease at all, or the active leaseholder left the cluster
            lease = leaseBalancer.get(group) // in this case the lease is just a candidate
        }

        if (invokeMetaStorage(grantLease(lease, leaseValidUntil))) {
            // send the message to the replica; the response is handled by onLeaseGrantResponse
            sendLeaseGrantMessage(lease, leaseValidUntil)
        }
    }
}

onLeaseGrantResponse(leaseGrantResponse) {
    leaseValidUntil = leaseGrantResponse.leaseValidUntil

    if (leaseGrantResponse.redirectProposal) {
        leaseCandidateNew = leaseBalancer.considerRedirectProposal(leaseGrantResponse.sender, leaseGrantResponse.redirectProposal)

        if (invokeMetaStorage(grantLease(leaseCandidateNew, leaseValidUntil))) {
            sendLeaseGrantMessage(leaseCandidateNew, leaseValidUntil, force) // force lease grant message
        }
    } else {
        assert(leaseGrantResponse.accepted)

        leaseholder = leaseGrantResponse.sender

        invokeMetaStorage(leaseConfirmed(leaseholder, leaseValidUntil))
    }
}
{code}

> Implement logic for a maintenance phase of group lease management
> -----------------------------------------------------------------
>
>                 Key: IGNITE-18742
>                 URL: https://issues.apache.org/jira/browse/IGNITE-18742
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Alexander Lapin
>            Priority: Major
>              Labels: ignite-3
>         Attachments: screenshot-1.png
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)