[ https://issues.apache.org/jira/browse/HIVE-4321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13653318#comment-13653318 ]

Carl Steinbach commented on HIVE-4321:
--------------------------------------

bq. It would make sense to use 'prepare' if we are trying to address the 
'prepare' + 'executePrepared' use case. Unlike OLTP oriented databases, where 
prepare+executePrepared are going to be useful for doing things like large 
number of single row inserts, it is not going to be as useful in hive. 

In the ODBC/JDBC world, "prepare" is synonymous with "compile"; for example, 
here's a direct quote from Microsoft's ODBC docs:

{quote}
Prepared execution is an efficient way to execute a statement more than once. 
The statement is first compiled, or prepared, into an access plan. The access 
plan is then executed one or more times at a later time. For more information 
about access plans, see Processing an SQL Statement.
{quote}
Ref: 
http://msdn.microsoft.com/en-us/library/windows/desktop/ms716365(v=vs.85).aspx
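The prepare-once / execute-many split described in the quote can be sketched roughly as follows. This is a hypothetical illustration (the class and method names are invented for the example, not HiveServer2 or ODBC code), where "compilation" is stood in for by simple tokenization:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of prepared execution: prepare() pays the
// compilation cost once, execute() reuses the resulting plan.
public class PreparedPlanDemo {
    /** Stand-in for a compiled access plan (here: just the tokenized query). */
    static class AccessPlan {
        final List<String> tokens;
        AccessPlan(List<String> tokens) { this.tokens = tokens; }
    }

    static int compileCount = 0;

    /** "Compile" the statement once; in a real engine this is parse + plan. */
    static AccessPlan prepare(String sql) {
        compileCount++;
        List<String> tokens = new ArrayList<>();
        for (String t : sql.trim().split("\\s+")) tokens.add(t.toUpperCase());
        return new AccessPlan(tokens);
    }

    /** Execute a previously prepared plan; no recompilation happens here. */
    static String execute(AccessPlan plan) {
        return String.join(" ", plan.tokens);
    }

    public static void main(String[] args) {
        AccessPlan plan = prepare("select col from tbl");
        // The same plan is executed several times without recompiling.
        for (int i = 0; i < 3; i++) execute(plan);
        System.out.println(compileCount);
        System.out.println(execute(plan));
    }
}
```

The point is only the shape of the API: the expensive step happens once in prepare(), and execute() can be called repeatedly against the same plan.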

In case I didn't make this clear, I'm not suggesting that we support a 
BindParameter() call, but I also don't think we should use Compile() instead of 
Prepare() just because we don't plan on supporting parameter binding.

bq. My main worry is that this use case would add more state to be stored in 
hive server. Once we add support for high-availability, maintaining additional 
state on hive server 2 would come at additional costs (I am worried about costs 
of storing the whole plan in something like an rdbms or zookeeper).

How much additional state we plan to maintain is entirely up to us. Supporting 
the ability to execute a prepared statement multiple times does not mean that 
we have to save the query plan in between calls to execute(). I think the Hive 
JDBC driver already supports PreparedStatements, and we're basically just 
faking it. The fact that we're faking it doesn't matter because, as you 
mentioned, Hive is not an OLTP database.
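One way a driver can "fake" PreparedStatement support without keeping any per-statement state on the server is to substitute parameter values into the SQL text on the client side and ship a plain query. The sketch below is an assumption about the general technique, not the actual Hive JDBC driver code:

```java
// Sketch of client-side parameter substitution: the "prepared" statement
// is just a SQL string with '?' placeholders, filled in locally before
// the query is sent, so the server never holds a saved plan.
public class ClientSideBindDemo {
    /** Quote a string value as a SQL literal, escaping embedded quotes. */
    static String toLiteral(String value) {
        return "'" + value.replace("'", "\\'") + "'";
    }

    /** Replace each '?' placeholder with the corresponding literal. */
    static String substitute(String sql, String... params) {
        StringBuilder out = new StringBuilder();
        int p = 0;
        for (int i = 0; i < sql.length(); i++) {
            char c = sql.charAt(i);
            if (c == '?' && p < params.length) {
                out.append(toLiteral(params[p++]));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(substitute("SELECT * FROM t WHERE name = ?", "o'brien"));
    }
}
```

Since the server only ever sees an ordinary query string, nothing needs to survive between calls to execute(), which is why this approach carries no HA cost.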

Anyway, I'm getting a little off topic. The main reason I think we should call 
this "Prepare" instead of "Compile" is so that we maintain the close 
relationship between the CLIService API and the ODBC API. I designed it this 
way for the following reasons:

# People who are familiar with ODBC can look at the CLIService API and quickly 
understand how it works.
# We can reference the ODBC documentation instead of having to write our own. 
We already take advantage of this in the TCliService.thrift IDL file (though in 
some cases we also reference JDBC).
# Getting APIs right is tough. ODBC has gone through a bunch of revisions for 
precisely this reason. It still has bugs, but at least those bugs are well 
understood. As soon as we diverge significantly from ODBC we risk inventing new 
bugs, and then we will have the added challenge of figuring out how to 
reconcile our API bugs with the ODBC API bugs.

bq. HS2 is now a single point of failure in the system, I think we should start 
considering high-availability issues while adding features. Keeping state in 
client instead of server will help with that.

I agree that we should start thinking about HA, but I don't understand the part 
about maintaining state in the client. Wouldn't that imply that the client has 
to be HA as well? There are also some security problems with that approach.


> Add Compile/Execute support to Hive Server
> ------------------------------------------
>
>                 Key: HIVE-4321
>                 URL: https://issues.apache.org/jira/browse/HIVE-4321
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2, Thrift API
>            Reporter: Sarah Parra
>            Assignee: Sarah Parra
>         Attachments: CompileExecute.patch
>
>
> Adds support for query compilation in Hive Server 2 and adds Thrift support 
> for compile/execute APIs.
> This enables scenarios that need to compile a query before it is executed, 
> e.g. an ODBC driver that implements SQLPrepare/SQLExecute. This is commonly 
> used by a client that needs metadata for the query before it is executed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira