[ 
https://issues.apache.org/jira/browse/HIVE-20977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16734700#comment-16734700
 ] 

Karthik Manamcheri commented on HIVE-20977:
-------------------------------------------

[~alangates] [~thejas] I want to hear your thoughts on a particular issue 
regarding this change. In this ticket, I am changing the behavior of bulk 
listing partitions API. Previously, if you issue a getPartitionsByNames (or one 
of the similar APIs) with an empty database, we threw an exception. After this 
change, we will return an empty list of partitions instead. This behavior is 
similar to what happens if you issue a getTablesByNames call (an empty list of 
tables are returned).

Here is my thoughts on why this is an ok change. In the master branch (without 
my change), if you remove the TransactionalValidationListener from the list of 
pre-listeners, the GetPartitions API tests fail! The bulk get partitions APIs 
behavior actually depends on the absence or presence of a pre-event listener. 
If that is the case, then clients already expect different return behavior 
(throwing an exception or returning an empty list). So the API contract itself 
is that it can throw an exception (OR return an empty list). However, we 
committed HIVE-12064 which made it so that there will *always* be a pre-event 
listener (the TransactionalValidationListener is added in code 
unconditionally). The unit tests were written around the fact that there is 
always a pre-event listener. I don't think that is a good assumption to make. 
Listeners are supposed to be plugins and hence we shouldn't design API 
contracts around there being a listener. 

My change will revert back to the original behavior (before HIVE-12064), which 
is that the list partitions APIs can throw an exception (or return an empty 
list) if an invalid table/db name is specified. We should also fix the tests so 
that they don't depend of the existence (or non-existence) of listeners or 
plugins.

Thoughts? 

> Lazy evaluate the table object in PreReadTableEvent to improve get_partition 
> performance
> ----------------------------------------------------------------------------------------
>
>                 Key: HIVE-20977
>                 URL: https://issues.apache.org/jira/browse/HIVE-20977
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Karthik Manamcheri
>            Assignee: Karthik Manamcheri
>            Priority: Minor
>         Attachments: HIVE-20977.1.patch, HIVE-20977.2.patch, 
> HIVE-20977.3.patch, HIVE-20977.4.patch
>
>
> The PreReadTableEvent is generated for non-table operations (such as 
> get_partitions), but only if there is an event listener attached. However, 
> this is also not necessary if the event listener is not interested in the 
> read table event.
> For example, the TransactionalValidationListener's onEvent looks like this
> {code:java}
> @Override
> public void onEvent(PreEventContext context) throws MetaException, 
> NoSuchObjectException,
>     InvalidOperationException {
>   switch (context.getEventType()) {
>     case CREATE_TABLE:
>       handle((PreCreateTableEvent) context);
>       break;
>     case ALTER_TABLE:
>       handle((PreAlterTableEvent) context);
>       break;
>     default:
>       //no validation required..
>   }
> }{code}
>  
> Note that for read table events it is a no-op. The problem is that the 
> get_table is evaluated when creating the PreReadTableEvent finally to be just 
> ignored!
> Look at the code below.. {{getMS().getTable(..)}} is evaluated irrespective 
> of if the listener uses it or not.
> {code:java}
> private void fireReadTablePreEvent(String catName, String dbName, String 
> tblName)
>     throws MetaException, NoSuchObjectException {
>   if(preListeners.size() > 0) {
>     // do this only if there is a pre event listener registered (avoid 
> unnecessary
>     // metastore api call)
>     Table t = getMS().getTable(catName, dbName, tblName);
>     if (t == null) {
>       throw new NoSuchObjectException(TableName.getQualified(catName, dbName, 
> tblName)
>           + " table not found");
>     }
>     firePreEvent(new PreReadTableEvent(t, this));
>   }
> }
> {code}
> This can be improved by using a {{Supplier}} and lazily evaluating the table 
> when needed (once when the first time it is called, memorized after that).
> *Motivation*
> Whenever a partition call occurs (get_partition, etc.), we fire the 
> PreReadTableEvent. This affects performance since it fetches the table even 
> if it is not being used. This change will improve performance on the 
> get_partition calls.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to