[ 
https://issues.apache.org/jira/browse/HIVE-8568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215847#comment-14215847
 ] 

Vaibhav Gumashta commented on HIVE-8568:
----------------------------------------

[~mohitsabharwal] Thanks so much for being patient. I have more review time now 
:)

I believe there is a natural "has many" association between an operation and 
the jobs it launches. In fact, the job ids come into existence only when the 
OperationState transitions to RUNNING. I think operation.getJobIDs will be a 
useful call to add. In that sense, IMO there is nothing wrong with returning 
more related information from GetOperationStatus. Another benefit is that we 
avoid adding a second call that some clients of the thrift API might poll in 
the future (I feel polling calls are risky if the server does not have control 
over them!). For GetOperationStatus, CLIService.getOperationStatus already 
keeps the server from getting bombarded by frequent polling RPCs by using a 
long polling approach, which you don't have to rewrite.
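
To illustrate the long polling idea mentioned above, here is a minimal Python 
sketch. It is not the CLIService code: get_operation_status, run_query, 
poll_until_done, the state strings and LONG_POLL_TIMEOUT_S are all hypothetical 
stand-ins. The point is only that the status call blocks on the server for a 
bounded time, so a client looping on it cannot issue requests faster than one 
per timeout while the query is still running.

```python
import concurrent.futures
import time

# Hypothetical stand-in for a server-side long-polling timeout (seconds).
LONG_POLL_TIMEOUT_S = 0.2


def get_operation_status(background_handle, timeout_s=LONG_POLL_TIMEOUT_S):
    """Wait up to timeout_s for the operation to finish, then report its state."""
    try:
        background_handle.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        # Still running: the caller was held for timeout_s before getting a reply.
        return "RUNNING"
    except Exception:
        # The background work itself raised.
        return "ERROR"
    return "FINISHED"


def run_query(duration_s):
    """Pretend query: just sleeps for the given time."""
    time.sleep(duration_s)
    return "done"


def poll_until_done(duration_s):
    """Client loop: call the status API until the operation leaves RUNNING."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        handle = pool.submit(run_query, duration_s)
        calls = 0
        while True:
            calls += 1
            state = get_operation_status(handle)
            if state != "RUNNING":
                return state, calls
```

Because each call that finds the operation still running is held for the full 
timeout, the number of status RPCs is bounded by roughly the query duration 
divided by the timeout, regardless of how tightly the client loops.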

However, one issue that I overlooked is the extra network traffic we would 
generate by returning job ids with every call to GetOperationStatus, which the 
JDBC driver calls pretty frequently. We could extend TGetOperationStatusReq and 
TGetOperationStatusResp to handle that in the following backward compatible 
manner:
{code}
struct TGetOperationStatusReq {
  // Operation whose status is being requested
  1: required TOperationHandle operationHandle
  2: optional bool getJobIds = false
}

struct TGetOperationStatusResp {
  1: required TStatus status
  2: optional TOperationState operationState
  // If operationState is ERROR_STATE, then the following fields may be set
  // sqlState as defined in the ISO/IEC CLI specification
  3: optional string sqlState
  // Internal error code
  4: optional i32 errorCode
  // Error message
  5: optional string errorMessage
  6: optional bool hasJobIds = false
  7: optional list<string> jobIds
}
{code}
This way, clients that are interested in the job ids can construct the 
appropriate request object.
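
As a rough illustration of the opt-in behaviour, the Python sketch below models 
the two structs with plain dataclasses instead of Thrift-generated classes. The 
handler handle_get_operation_status, the "op-1" handle and the "job_0001" id 
are made up for the example, and it assumes hasJobIds simply signals that 
jobIds was populated, which the proposal above does not spell out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TGetOperationStatusReq:
    # Plain string used here as a stand-in for TOperationHandle.
    operationHandle: str
    # New optional field; defaults to False so existing clients are unaffected.
    getJobIds: bool = False


@dataclass
class TGetOperationStatusResp:
    operationState: str
    # New optional fields; left unset unless the client asked for job ids.
    hasJobIds: bool = False
    jobIds: Optional[List[str]] = None


def handle_get_operation_status(req, state, launched_job_ids):
    """Hypothetical server-side handler: attach job ids only on request."""
    resp = TGetOperationStatusResp(operationState=state)
    if req.getJobIds:
        resp.hasJobIds = True
        resp.jobIds = list(launched_job_ids)
    return resp


# An existing client that never sets getJobIds gets the same payload as before.
legacy = handle_get_operation_status(
    TGetOperationStatusReq("op-1"), "RUNNING_STATE", ["job_0001"])

# A client that wants the job ids opts in explicitly.
detailed = handle_get_operation_status(
    TGetOperationStatusReq("op-1", getJobIds=True), "RUNNING_STATE", ["job_0001"])
```

Frequent pollers such as the JDBC driver would keep sending the default 
request, so the extra bytes are paid only by the clients that ask for them.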

[~thejas] What are your thoughts?

> Add HS2 API to fetch Job IDs for a given query
> ----------------------------------------------
>
>                 Key: HIVE-8568
>                 URL: https://issues.apache.org/jira/browse/HIVE-8568
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Mohit Sabharwal
>            Assignee: Mohit Sabharwal
>         Attachments: HIVE-8568.1.patch, HIVE-8568.patch
>
>
> Fetching Job IDs corresponding to all running MR/Tez tasks is useful for 
> clients like Hue. 


