[ https://issues.apache.org/jira/browse/HIVE-20977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16734700#comment-16734700 ]
Karthik Manamcheri commented on HIVE-20977: ------------------------------------------- [~alangates] [~thejas] I want to hear your thoughts on a particular issue regarding this change. In this ticket, I am changing the behavior of bulk listing partitions API. Previously, if you issue a getPartitionsByNames (or one of the similar APIs) with an empty database, we threw an exception. After this change, we will return an empty list of partitions instead. This behavior is similar to what happens if you issue a getTablesByNames call (an empty list of tables are returned). Here is my thoughts on why this is an ok change. In the master branch (without my change), if you remove the TransactionalValidationListener from the list of pre-listeners, the GetPartitions API tests fail! The bulk get partitions APIs behavior actually depends on the absence or presence of a pre-event listener. If that is the case, then clients already expect different return behavior (throwing an exception or returning an empty list). So the API contract itself is that it can throw an exception (OR return an empty list). However, we committed HIVE-12064 which made it so that there will *always* be a pre-event listener (the TransactionalValidationListener is added in code unconditionally). The unit tests were written around the fact that there is always a pre-event listener. I don't think that is a good assumption to make. Listeners are supposed to be plugins and hence we shouldn't design API contracts around there being a listener. My change will revert back to the original behavior (before HIVE-12064), which is that the list partitions APIs can throw an exception (or return an empty list) if an invalid table/db name is specified. We should also fix the tests so that they don't depend of the existence (or non-existence) of listeners or plugins. Thoughts? > Lazy evaluate the table object in PreReadTableEvent to improve get_partition > performance > ---------------------------------------------------------------------------------------- > > Key: HIVE-20977 > URL: https://issues.apache.org/jira/browse/HIVE-20977 > Project: Hive > Issue Type: Improvement > Reporter: Karthik Manamcheri > Assignee: Karthik Manamcheri > Priority: Minor > Attachments: HIVE-20977.1.patch, HIVE-20977.2.patch, > HIVE-20977.3.patch, HIVE-20977.4.patch > > > The PreReadTableEvent is generated for non-table operations (such as > get_partitions), but only if there is an event listener attached. However, > this is also not necessary if the event listener is not interested in the > read table event. > For example, the TransactionalValidationListener's onEvent looks like this > {code:java} > @Override > public void onEvent(PreEventContext context) throws MetaException, > NoSuchObjectException, > InvalidOperationException { > switch (context.getEventType()) { > case CREATE_TABLE: > handle((PreCreateTableEvent) context); > break; > case ALTER_TABLE: > handle((PreAlterTableEvent) context); > break; > default: > //no validation required.. > } > }{code} > > Note that for read table events it is a no-op. The problem is that the > get_table is evaluated when creating the PreReadTableEvent finally to be just > ignored! > Look at the code below.. {{getMS().getTable(..)}} is evaluated irrespective > of if the listener uses it or not. > {code:java} > private void fireReadTablePreEvent(String catName, String dbName, String > tblName) > throws MetaException, NoSuchObjectException { > if(preListeners.size() > 0) { > // do this only if there is a pre event listener registered (avoid > unnecessary > // metastore api call) > Table t = getMS().getTable(catName, dbName, tblName); > if (t == null) { > throw new NoSuchObjectException(TableName.getQualified(catName, dbName, > tblName) > + " table not found"); > } > firePreEvent(new PreReadTableEvent(t, this)); > } > } > {code} > This can be improved by using a {{Supplier}} and lazily evaluating the table > when needed (once when the first time it is called, memorized after that). > *Motivation* > Whenever a partition call occurs (get_partition, etc.), we fire the > PreReadTableEvent. This affects performance since it fetches the table even > if it is not being used. This change will improve performance on the > get_partition calls. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)