[ 
https://issues.apache.org/jira/browse/LIVY-782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17162031#comment-17162031
 ] 

Andras Beni commented on LIVY-782:
----------------------------------

[~anfog], thanks for bringing up this problem. 
Let me share my thoughts.

h4. Current functionality

I believe there is a workaround for the problem. Sessions have an optional 
{{name}} field that is 
[guaranteed|https://github.com/apache/incubator-livy/blob/97cf2f75929ef6c152afc468adbead269bd0758f/server/src/main/scala/org/apache/livy/sessions/SessionManager.scala#L101]
 to be unique for both session types. So whenever a POST request fails, you can 
check if the session was created or not by listing sessions.
It might happen that another user creates a session with the same name, so you 
must still compare the properties of the session you wanted to create and the 
one that actually exists.
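
A minimal sketch of that check, assuming the session listing has already been fetched and parsed into a list of dictionaries. The function name {{find_created_session}}, the sample data, and the property keys compared here are illustrative assumptions, not part of Livy:

```python
from typing import Any, Dict, List, Optional


def find_created_session(
    sessions: List[Dict[str, Any]],
    name: str,
    expected: Dict[str, Any],
) -> Optional[Dict[str, Any]]:
    """Return the listed session that has `name` and matches `expected`, else None."""
    for session in sessions:
        if session.get("name") != name:
            continue
        # Names are unique, so at most one session can carry this name.
        # It may still belong to another user, so compare the properties
        # we originally asked for before treating it as ours.
        if all(session.get(key) == value for key, value in expected.items()):
            return session
        return None
    return None


# Hypothetical listing, standing in for the parsed response of a list call.
listed = [
    {"id": 11, "name": "etl-nightly", "kind": "spark", "proxyUser": "alice"},
    {"id": 12, "name": "report-42", "kind": "pyspark", "proxyUser": "bob"},
]

mine = find_created_session(listed, "report-42", {"kind": "pyspark", "proxyUser": "bob"})
other = find_created_session(listed, "report-42", {"kind": "pyspark", "proxyUser": "carol"})
missing = find_created_session(listed, "adhoc-7", {"kind": "spark"})
```

Here {{mine}} is the session with ID 12, {{other}} is {{None}} because the name is taken by a session with different properties, and {{missing}} is {{None}} because no session has that name, so the POST can be retried.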

h4. Your suggested solution

Suppose the PUT request does not reach the server and another client then 
creates a session with the same ID. In that case you will wrongly assume that 
your session was created and only the response was lost, and you will end up 
using the session created by the other client. Even if the check involves 
other properties, I'm not sure we can prevent this.
I also don't really like the idea of generating IDs outside the application, 
because it complicates ID generation (especially in the HA scenario).

h4. Your alternative solution

The POST request proposed in this solution is not idempotent either. But 
wasting a session identifier is not as serious a problem as creating a session 
that the client will not know about, so this is an improvement.
Should session identifiers time out if they are not used? Should there be a 
quota or can users allocate as many (unused) session identifiers as they like?
In your example, why does the client query Livy before retrying?

h4. My proposed solution

Add an optional {{requestId}} field (a string, in practice a UUID) to the 
current POST request. If it is null, or the server does not know of a session 
with the same {{requestId}}, a session gets created. If there is one with the 
same {{requestId}} and exactly the same properties, it is returned as if it 
had been created by this request. Otherwise (when there is one with the same 
{{requestId}} but different properties), an error is raised.
I believe that with these semantics the POST request is idempotent whenever a 
{{requestId}} is supplied. I haven't given 
much thought to how this plays together with the HA feature under development 
though.
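
To make the three cases concrete, here is a rough sketch using an in-memory registry. {{SessionRegistry}}, {{RequestIdConflict}}, and the property names are hypothetical stand-ins for illustration only; they do not correspond to existing Livy classes, and persistence and HA are ignored:

```python
import uuid
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


class RequestIdConflict(Exception):
    """The same requestId was reused with different session properties."""


@dataclass
class SessionRegistry:
    """In-memory stand-in for the server-side session bookkeeping."""

    next_id: int = 0
    by_request_id: Dict[str, Dict[str, Any]] = field(default_factory=dict)

    def create(
        self, properties: Dict[str, Any], request_id: Optional[str] = None
    ) -> Dict[str, Any]:
        existing = None
        if request_id is not None:
            existing = self.by_request_id.get(request_id)
        if existing is not None:
            # Known requestId: return the earlier session only if the
            # properties are exactly the same, otherwise raise an error.
            if existing["properties"] == properties:
                return existing
            raise RequestIdConflict(request_id)
        # Null or unknown requestId: create a new session.
        session = {"id": self.next_id, "properties": dict(properties)}
        self.next_id += 1
        if request_id is not None:
            self.by_request_id[request_id] = session
        return session


registry = SessionRegistry()
request_id = str(uuid.uuid4())
props = {"kind": "spark", "executorMemory": "2g"}

first = registry.create(props, request_id)   # creates a session
retry = registry.create(props, request_id)   # returns the same session
anonymous = registry.create(props)           # no requestId: always a new session
```

A retry with the same {{requestId}} and properties gets the original session back, while reusing the {{requestId}} with different properties raises {{RequestIdConflict}}. Requests without a {{requestId}} behave like the current POST.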

Let me know what you think about my comments.


> Idempotent Livy session creation
> --------------------------------
>
>                 Key: LIVY-782
>                 URL: https://issues.apache.org/jira/browse/LIVY-782
>             Project: Livy
>          Issue Type: New Feature
>          Components: API, Server
>            Reporter: Andrew Fogarty
>            Priority: Major
>
> h2. Problem description
> Livy currently has POST APIs for creating sessions:
>  * To create a batch session, a client must submit a POST request to 
> “/batches”.
>  * To create an interactive session, a client must submit a POST request to 
> “/sessions”.
> Both APIs generate a unique session ID which is returned to the client as 
> part of the response payload.
> These APIs are not idempotent.  That is, if either the request or the 
> response is lost in transit, the client has no way to validate whether that 
> job has started.  The only way to retry is to submit another POST, which 
> could potentially start a second job.
> For example, suppose a client submits a POST to create a new batch session. 
> Livy receives the request and starts the batch session with ID=12. When Livy 
> sends the response, assume it is lost in transit due to some networking 
> issue. The client never receives the response, so it does not know if the 
> batch started correctly and does not have an ID to query the status of the 
> batch session.
> This document contains two proposed solutions for this idempotence problem. 
> These solutions introduce APIs for creating sessions in an idempotent manner. 
> Neither solution makes changes to existing APIs.
> h2. Suggested solution
> This proposed solution introduces 1 new API:
>  * PUT(“/\{session type}/”) -> Session
> This API is described below.  *Note:* ‘->’ indicates the call “returns”.
> h3. API: PUT(“/\{session type}/”) -> Session – Create session with given 
> request ID header
> This new API is a PUT to create a new session (batch or interactive) for the 
> given session ID.  This new API is very similar to the existing POST API to 
> create a session and expects the request payload to be a CreateBatchRequest 
> or CreateInteractiveRequest as appropriate.
> The difference between this PUT API and the existing session POST API is that 
> requests to this API must contain a “requestId” header with a GUID value.  If 
> the requestId is not provided, then PUT will fail with an error. This 
> requestId is saved as an optional field on the metadata object 
> (BatchRecoveryMetadata or InteractiveRecoveryMetadata) stored in the 
> SessionStore.
> When creating the session, before storing the metadata object in the 
> SessionStore, we query the SessionStore to see if some session already exists 
> with that requestId. If a session with the requestId already exists, then we 
> return that session instead of creating a new one.  If there is no existing 
> session with that requestId, then we create the session normally.
> h3. Example
> This solution solves the idempotence problem by ensuring that repeat calls to 
> PUT with the same requestId will return the first created session. If a 
> client makes a request to PUT but for some reason does not receive a 
> response, then they can retry that request with the same requestId. If the 
> session had not started, then it will start. Otherwise, if the session has 
> already started, then its session object will be returned to the client.
> h2. Alternative solution
> Introduce 2 new APIs:
>  # POST(“/\{session type}/id”) -> \{sessionId: Int, putKey: GUID}
>  # PUT(“/\{session type}/\{session id}”) -> Session
> Both are described below.
> h3. API 1: POST(“/\{session type}/id”) -> \{sessionId: Int, putKey: GUID} – 
> Generates a new unique sessionId
> The first API is a POST to generate a new unique session ID for the given 
> session type (batch or interactive).
> This API would:
>  # Increment the existing sessionId incrementor.
>  # Store a new value \{“/putkey/\{session type}/\{session id}” -> putKey} in 
> the SessionStore.
>  # Return \{sessionId: Int, putKey: GUID} payload.
> The returned payload contains the session ID as well as the “putKey”, which 
> is a GUID used in the second API to validate the sessionID. We call this the 
> “putKey” because it represents a unique key used to identify the PUT request. 
> We store the mapping from session ID to putKey in the SessionStore so that 
> the second API can validate that a provided session ID matches its putKey.
> h3. API 2: PUT(“/\{session type}/\{session id}”) -> Session – Create a 
> session with the given session ID
> The second API is a PUT to create a new session (batch or interactive) for 
> the given session ID. This new API is very similar to the existing POST API 
> to create a session and expects the request payload to be a 
> CreateBatchRequest or CreateInteractiveRequest as appropriate. 
> CreateBatchRequest and CreateInteractiveRequest will contain the optional 
> putKey field.
> This API would:
>  # Validate that the provided session ID matches the putKey by reading the 
> “/putkey/\{session type}/\{session id}” value from the SessionStore.
>  ## If no putKey is provided, or the session ID does not match the putKey, 
> then we fail the request. This is to ensure that the provided sessionID was 
> generated by the first API, and that some client isn’t using a sessionID that 
> it should not have permission to use.
>  # Follow the usual code path to create a session, except pass down the 
> session ID and the putKey.
>  ** For this feature, we would change that code path in BatchSession and 
> InteractiveSession. Before saving the session metadata record to 
> SessionStore, we check that some record with this ID does not already exist 
> in the SessionStore. If it does, then we just return that session and do not 
> create a new session.
> h3. Example
> With these new APIs, a client can get a valid session ID before submitting 
> their batch or interactive session to Livy. 
> The sequence would be:
>  # Call POST(“/\{session type}/id”) to get a new valid session ID and putKey.
>  # Call PUT(“/\{session type}/\{session id}”) to start a new session with that 
> valid session ID.
>  # If for some reason the client does not receive a response, use the ID to 
> query Livy for the status. Otherwise, they can re-submit the PUT request. 
> When a request is re-submitted:
>  ## If the session had not started, it will start.
>  ## If the session had started already, its session object will be returned 
> to the client.



