Definitely. With respect to BDB usage & client retries, there are basically 4 types of Hyperspace operations:
1. read only (readdir, attrget etc.)
2. single write txn (open, lock etc.)
3. multi write txn (close, release)
4. remove expired sessions (multi write txn, not client initiated)

Of these, only 2 and 3 need request logging, since they may be retried by the client (or be genuinely new requests) while the master is trying to complete execution of such a previously interrupted operation.

-Sanjit

On Thu, Sep 3, 2009 at 4:56 PM, Luke <[email protected]> wrote:
>
> Is it possible to only persist modifying requests, and not the
> majority of read/keepalive requests?
>
> On Thu, Sep 3, 2009 at 4:10 PM, Sanjit Jhala <[email protected]> wrote:
> > This is required to ensure that the state remains consistent whether or
> > not the client retries. On a retry the client provides the master with
> > the initial request id, allowing the master to synchronize the response
> > with the completion of the initial request.
> >
> > There could be some requests (e.g. close()) which can cause multiple BDB
> > transactions (release lock, grant pending lock requests, delete if
> > ephemeral, delete handle) along with having to persist notifications (to
> > ensure they are not lost in a crash). A crash in the middle of these
> > operations leaves the system in an inconsistent state. The idea is to
> > resume incomplete operations while simultaneously handling client retries.
> >
> > -Sanjit
> >
> > On Thu, Sep 3, 2009 at 3:35 PM, Luke <[email protected]> wrote:
> >>
> >> Why does the master need to persist a list of requests if the client can
> >> already retry? I think we should store as little as possible in BDB,
> >> as it's the bottleneck.
> >>
> >> On Thu, Sep 3, 2009 at 1:39 PM, Sanjit Jhala <[email protected]> wrote:
> >> > Recovery algorithm:
> >> >
> >> > 1. Set Master state to recovering, respond to any client
> >> >    requests/keepalives with "master_recovering" status (clients move
> >> >    into a recovery state where they don't send new requests but
> >> >    continue sending keepalives)
> >> > 2. Read in session data persisted in BDB to memory and recreate the
> >> >    session map
> >> > 3. Identify in-progress operations, add them to the "completion map"
> >> >    and enqueue them in the worker queue.
> >> > 4. Create another thread to complete the session expiration for any
> >> >    sessions marked for expiry
> >> > 5. Set Master state to ready and resume normal operations
> >> >
> >> > In order to ensure correct completion of requests interrupted by the
> >> > master crash and ensure the client can safely retry these operations
> >> > we need request ids (to uniquely identify requests) and a completion
> >> > map.
> >> >
> >> > Request ids:
> >> > Clients maintain an increasing 64-bit request id and a min heap of
> >> > outstanding request ids. Before sending a request to the server, the
> >> > client inserts the new id and deletes it after it receives a server
> >> > response. On each keepalive request the client sends the top of the
> >> > heap (or 0 if the heap is empty), which is its lowest in-progress
> >> > request. (Requires a mutex lock on insert and delete from heap)
> >> >
> >> > The server maintains a list of in-progress request ids per session (in
> >> > BDB) as well as the most recently purged request id. New ids are added
> >> > as part of processing the request. The result of processing the
> >> > request is also persisted (the state of the processing might also be
> >> > stored if needed). When the server receives a keepalive request it
> >> > checks the client-reported in-progress request id. If this value is
> >> > non-zero and different from the most recently purged request id
> >> > (stored at the server) then the server updates to the new id and
> >> > deletes info on all previously stored requests with smaller ids.
> >> > (Shouldn't require BDB ops/locks in the common case, i.e. no
> >> > in-progress requests)
> >> >
> >> > Completion map:
> >> > During master recovery, all incomplete requests will be enqueued and
> >> > resumed from where they were left off. In addition an entry in a
> >> > completion map containing "session id+request id" --> "completion
> >> > object" will be created. Clients will attempt to retry these requests
> >> > (with a retry flag). When the master sees the retry flag it will check
> >> > the completion map and wait for the completion object to signal that
> >> > the request processing is complete. The master then looks up the
> >> > request result (stored in BDB and possibly in the completion object)
> >> > and sends it back to the client.
> >> >
> >> > -Sanjit

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Hypertable Development" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/hypertable-dev?hl=en
-~----------~----~----~----~------~----~------~--~---
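To make the request id and completion map bookkeeping from the thread a bit more concrete, here is a rough Python sketch. It is not Hyperspace code: the class and method names (ClientRequestIds, ServerSessionLog, CompletionMap and their methods) are all made up for illustration, an in-memory dict stands in for the BDB-backed store, and the lazy deletion from the heap is just one possible way to implement "delete after response".

```python
import heapq
import threading


class ClientRequestIds:
    """Client side: increasing request ids plus a min-heap of outstanding ids."""

    def __init__(self):
        self._lock = threading.Lock()  # mutex on heap insert/delete, as in the thread
        self._next_id = 1              # 0 is reserved for "no request in progress"
        self._heap = []
        self._done = set()             # answered ids not yet popped (lazy deletion)

    def begin(self):
        """Allocate an id and mark it outstanding before sending the request."""
        with self._lock:
            rid = self._next_id
            self._next_id += 1
            heapq.heappush(self._heap, rid)
            return rid

    def finish(self, rid):
        """Drop an id once the server response has arrived."""
        with self._lock:
            self._done.add(rid)
            while self._heap and self._heap[0] in self._done:
                self._done.remove(heapq.heappop(self._heap))

    def lowest_in_progress(self):
        """Value sent with each keepalive: top of the heap, or 0 if empty."""
        with self._lock:
            return self._heap[0] if self._heap else 0


class ServerSessionLog:
    """Server side, one per session. A dict stands in for the BDB records."""

    def __init__(self):
        self.requests = {}    # request id -> persisted result (None while running)
        self.last_purged = 0

    def record(self, rid, result=None):
        self.requests[rid] = result

    def on_keepalive(self, lowest_in_progress):
        # Common case (nothing in progress, or nothing new): no store access.
        if lowest_in_progress == 0 or lowest_in_progress == self.last_purged:
            return
        for rid in [r for r in self.requests if r < lowest_in_progress]:
            del self.requests[rid]
        self.last_purged = lowest_in_progress


class CompletionMap:
    """(session id, request id) -> completion object, filled during recovery."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}

    def add(self, session_id, rid):
        with self._lock:
            self._entries[(session_id, rid)] = threading.Event()

    def complete(self, session_id, rid):
        with self._lock:
            event = self._entries.get((session_id, rid))
        if event is not None:
            event.set()

    def wait(self, session_id, rid, timeout=None):
        """Block a retried request until the resumed original has finished."""
        with self._lock:
            event = self._entries.get((session_id, rid))
        return True if event is None else event.wait(timeout)
```

In this sketch a retry handler would call wait() for the (session id, request id) pair and then read the stored result from the session log; a request with no entry in the map is treated as not interrupted and returns immediately. Whether the result also lives in the completion object, as the thread suggests it might, is left out here.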
