This is required to ensure that the state remains consistent whether or not
the client retries. On a retry the client provides the master with the
initial request id, allowing the master to synchronize the response with the
completion of the initial request.

Some requests (e.g. close()) can trigger multiple BDB transactions (release
lock, grant pending lock requests, delete if ephemeral, delete handle) and
also have to persist notifications (to ensure they are not lost in a crash).
A crash in the middle of these operations leaves the system in an
inconsistent state. The idea is to resume incomplete operations while
simultaneously handling client retries.

-Sanjit

On Thu, Sep 3, 2009 at 3:35 PM, Luke <[email protected]> wrote:

>
> Why does the master need to persist a list of requests if the client can
> already retry? I think we should store as little as possible in BDB,
> as it's the bottleneck.
>
> On Thu, Sep 3, 2009 at 1:39 PM, Sanjit Jhala<[email protected]> wrote:
> > Recovery algorithm:
> >
> > 1. Set Master state to recovering, respond to any client requests/keepalives
> >    with "master_recovering" status (clients move into a recovery state where
> >    they don't send new requests but continue sending keepalives)
> > 2. Read in session data persisted in BDB to memory and recreate session map
> > 3. Identify in progress operations, add them to the "completion map" and
> >    enqueue in the worker queue.
> > 4. Create another thread to complete the session expiration for any sessions
> >    marked for expiry
> > 5. Set Master state to ready and resume normal operations
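
The five recovery steps above can be sketched roughly as follows. This is
only an illustration, not Hypertable code: the Master class, its method
names, and the "expire" flag are hypothetical, and plain dicts stand in for
the session data persisted in BDB.

```python
import queue
import threading


class Master:
    """Hypothetical master skeleton; plain dicts stand in for BDB."""

    def __init__(self, persisted_sessions, in_progress_ops):
        self.state = "recovering"                  # step 1
        self._persisted_sessions = persisted_sessions
        self._in_progress_ops = in_progress_ops    # list of (session id, request id)
        self.sessions = {}
        self.completion_map = {}
        self.work_queue = queue.Queue()

    def handle_request(self, request):
        """Answer every request/keepalive with a status while recovering."""
        if self.state == "recovering":
            return "master_recovering"
        return "ok"

    def _expire_marked(self):
        # Finish expiring every session that was marked for expiry.
        marked = [sid for sid, info in self.sessions.items() if info.get("expire")]
        for sid in marked:
            del self.sessions[sid]

    def recover(self):
        """Run steps 2 to 5 and return the expiration thread."""
        self.sessions = dict(self._persisted_sessions)    # step 2
        for key in self._in_progress_ops:                 # step 3
            self.completion_map[key] = None               # placeholder completion object
            self.work_queue.put(key)
        expirer = threading.Thread(target=self._expire_marked)   # step 4
        expirer.start()
        self.state = "ready"                              # step 5
        return expirer
```

The expiration thread is returned only so that a caller can join it; the
steps above do not say whether the master waits for it.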
> >
> > In order to ensure correct completion of requests interrupted by the master
> > crash and ensure the client can safely retry these operations we need
> > request ids (to uniquely identify requests) and a completion map.
> >
> > Request ids:
> > Clients maintain an increasing 64-bit request id and a min heap of
> > outstanding request ids. Before sending a request to the server, the client
> > inserts the new id into the heap, and deletes it after it receives the
> > server response. On each keepalive request the client sends the top of the
> > heap (or 0 if the heap is empty), which is its lowest in-progress request.
> > (Requires a mutex lock on insert and delete from the heap.)
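
A minimal sketch of the client-side bookkeeping described above, assuming a
hypothetical RequestIdTracker class (the name and methods are not from the
Hypertable code). Python integers are unbounded, so the 64-bit width is not
modelled, and deletion from the middle of the heap is done lazily.

```python
import heapq
import threading


class RequestIdTracker:
    """Hypothetical client-side tracker for outstanding request ids."""

    def __init__(self):
        self._lock = threading.Lock()  # the mutex mentioned in the text
        self._next_id = 1              # ids start at 1 so that 0 can mean "none"
        self._heap = []                # min heap of outstanding ids
        self._done = set()             # finished ids still buried in the heap

    def begin(self):
        """Allocate a new id and record it as outstanding before sending."""
        with self._lock:
            rid = self._next_id
            self._next_id += 1
            heapq.heappush(self._heap, rid)
            return rid

    def complete(self, rid):
        """Forget an id once the server response has arrived."""
        with self._lock:
            self._done.add(rid)
            # Pop finished ids off the top so heap[0] is always outstanding.
            while self._heap and self._heap[0] in self._done:
                self._done.remove(heapq.heappop(self._heap))

    def lowest_outstanding(self):
        """Value to send on a keepalive: heap top, or 0 if the heap is empty."""
        with self._lock:
            return self._heap[0] if self._heap else 0
```

Responses can arrive out of order, which is why a completed id that is not at
the top is only marked and removed later.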
> >
> > The server maintains a list of in-progress request ids per session (in BDB)
> > as well as the most recently purged request id. New ids are added as part
> > of processing the request. The result of processing the request is also
> > persisted (the state of the processing might also be stored if needed).
> > When the server receives a keepalive request it checks the client-reported
> > in-progress request id. If this value is non-zero and different from the
> > most recently purged request id (stored at the server) then the server
> > updates to the new id and deletes info on all previously stored requests
> > with smaller ids. (Shouldn't require BDB ops/locks in the common case, i.e.
> > no in-progress requests.)
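
The server-side purge on keepalive might look something like the sketch
below. SessionRequestLog and its fields are hypothetical names, and an
in-memory dict stands in for the per-session records kept in BDB.

```python
class SessionRequestLog:
    """Hypothetical per-session request record; a dict stands in for BDB."""

    def __init__(self):
        self.results = {}       # request id -> result (None while in progress)
        self.purged_up_to = 0   # most recently purged request id

    def start(self, rid):
        """Record a new id as part of processing the request."""
        self.results[rid] = None

    def finish(self, rid, result):
        """Persist the result of processing the request."""
        self.results[rid] = result

    def on_keepalive(self, lowest_in_progress):
        """Purge records with smaller ids; return the ids that were purged."""
        if lowest_in_progress == 0 or lowest_in_progress == self.purged_up_to:
            return []           # common case: the store is not touched
        stale = sorted(r for r in self.results if r < lowest_in_progress)
        for r in stale:
            del self.results[r]
        self.purged_up_to = lowest_in_progress
        return stale
```

Every id below the client's lowest in-progress id has already been answered,
so the client will never retry it and its stored result can be dropped.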
> >
> > Completion map:
> > During master recovery, all incomplete requests will be enqueued and resumed
> > from where they left off. In addition, an entry in a completion map
> > containing "session id+request id" --> "completion object" will be created.
> > Clients will attempt to retry these requests (with a retry flag). When the
> > master sees the retry flag it will check the completion map and wait for the
> > completion object to signal that the request processing is complete. The
> > master then looks up the request result (stored in BDB and possibly in the
> > completion object) and sends it back to the client.
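
One possible shape for the completion map, sketched with a threading.Event as
the "completion object". The CompletionMap class and its method names are
assumptions for illustration, and a dict stands in for the results stored in
BDB.

```python
import threading


class CompletionMap:
    """Hypothetical map of (session id, request id) -> completion event."""

    def __init__(self):
        self._lock = threading.Lock()
        self._events = {}    # (session id, request id) -> threading.Event
        self._results = {}   # stands in for results persisted in BDB

    def register(self, session_id, request_id):
        """Called during recovery for each request that is being resumed."""
        with self._lock:
            self._events[(session_id, request_id)] = threading.Event()

    def complete(self, session_id, request_id, result):
        """Called by the worker once the resumed request has finished."""
        key = (session_id, request_id)
        with self._lock:
            self._results[key] = result
            event = self._events.get(key)
        if event is not None:
            event.set()

    def handle_retry(self, session_id, request_id, timeout=None):
        """Block a retried request until the resumed original completes."""
        key = (session_id, request_id)
        with self._lock:
            event = self._events.get(key)
        if event is not None and not event.wait(timeout):
            raise TimeoutError("request still in progress")
        with self._lock:
            return self._results.get(key)
```

A retry for a request with no entry in the map returns whatever result is
already stored (None here if there is none), without waiting.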
> >
> >
> > -Sanjit

