[ https://issues.apache.org/jira/browse/HIVE-21028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16720808#comment-16720808 ]
Hive QA commented on HIVE-21028:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12951735/HIVE-21028.5.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/15307/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15307/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15307/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL https://issues.apache.org/jira/secure/attachment/12951735/HIVE-21028.5.patch was found in seen patch url's cache and a test was probably run already on it. Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12951735 - PreCommit-HIVE-Build

> get_table_meta should use a fetch plan to avoid race conditions ending up in
> NucleusObjectNotFoundException
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-21028
>                 URL: https://issues.apache.org/jira/browse/HIVE-21028
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Karthik Manamcheri
>            Assignee: Karthik Manamcheri
>            Priority: Major
>         Attachments: HIVE-21028.1.patch, HIVE-21028.2.patch,
> HIVE-21028.3.patch, HIVE-21028.4.patch, HIVE-21028.5.patch
>
>
> The {{getTableMeta}} call retrieves the tables, loops through the tables and
> during this loop it retrieves the database object to get the containing
> database name. DataNucleus does a lazy retrieval and so, when the first call
> to get all the tables is done, it does not retrieve the database objects.
> When this query is executed
> {code}
> query = pm.newQuery(MTable.class, filterBuilder.toString());
> {code}
> it loads all the tables, and when you do
> {code}
> table.getDatabase().getName()
> {code}
> it then goes and retrieves the database object.
> *However*, there could be another thread which actually has deleted the
> database!! If this happens, we end up with exceptions such as
> {code}
> 2018-12-04 22:25:06,525 INFO DataNucleus.Datastore.Retrieve:
> [pool-7-thread-191]: Object with id
> "6930391[OID]org.apache.hadoop.hive.metastore.model.MTable" not found !
> 2018-12-04 22:25:06,527 WARN DataNucleus.Persistence: [pool-7-thread-191]:
> Exception thrown by StateManager.isLoaded
> No such database row
> org.datanucleus.exceptions.NucleusObjectNotFoundException: No such database
> row
> {code}
> We see this happen especially with calls which retrieve all the tables in all
> the databases (basically a call to get_table_meta with dbNames="\*" and
> tableNames="\*").
> To avoid this, we can define a custom fetch plan and activate it only for the
> get_table_meta query. This fetch plan would fetch the database object along
> with the MTable object.
> We would first define a fetch group on the pmf
> {code}
> pmf.getFetchGroup(MTable.class,
> "mtable_db_fetch_group").addMember("database");
> {code}
> Then we add it to the fetch plan just before calling the query
> {code}
> pm.getFetchPlan().addGroup("mtable_db_fetch_group");
> query = pm.newQuery(MTable.class, filterBuilder.toString());
> Collection<MTable> tables = (Collection<MTable>) query.executeWithArray(...);
> ...
> {code}
> Before the API call ends, we can remove the group from the fetch plan by
> {code}
> pm.getFetchPlan().removeGroup("mtable_db_fetch_group");
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
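
The race described in the issue can be illustrated with a small plain-Java simulation. This is only a sketch under stated assumptions: `FetchRaceSketch`, `Store`, `TableRecord`, `queryTables` and `readAfterConcurrentDrop` are hypothetical names invented for illustration and are not Hive or DataNucleus classes, and the "concurrent" delete is simulated deterministically on a single thread. It contrasts a lazy lookup of the database name (resolved after the query) with an eager one (resolved together with the table, which is the effect the proposed fetch group aims for).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.concurrent.ConcurrentHashMap;

// Plain-Java simulation only: none of these types are Hive or DataNucleus classes.
final class FetchRaceSketch {

    // Hypothetical stand-in for the backing store: database id -> database name.
    static final class Store {
        final Map<Long, String> databases = new ConcurrentHashMap<>();

        String loadDatabaseName(long dbId) {
            String name = databases.get(dbId);
            if (name == null) {
                throw new NoSuchElementException("No such database row: " + dbId);
            }
            return name;
        }
    }

    // Hypothetical table record; dbName stays null until it has been loaded.
    static final class TableRecord {
        final String tableName;
        final long dbId;
        String dbName;

        TableRecord(String tableName, long dbId) {
            this.tableName = tableName;
            this.dbId = dbId;
        }

        // Lazy path: the first access triggers a second lookup in the store.
        String databaseName(Store store) {
            if (dbName == null) {
                dbName = store.loadDatabaseName(dbId);
            }
            return dbName;
        }
    }

    // Runs the simulated query. With eager == true the database name is loaded
    // together with each table, so no later lookup is needed.
    static List<TableRecord> queryTables(Store store, boolean eager) {
        List<TableRecord> tables = new ArrayList<>();
        for (String name : new String[] {"t1", "t2"}) {
            TableRecord table = new TableRecord(name, 1L);
            if (eager) {
                table.dbName = store.loadDatabaseName(table.dbId);
            }
            tables.add(table);
        }
        return tables;
    }

    // Query, then drop the database (as another thread might), then read names.
    static List<String> readAfterConcurrentDrop(boolean eager) {
        Store store = new Store();
        store.databases.put(1L, "db1");
        List<TableRecord> tables = queryTables(store, eager);
        store.databases.remove(1L); // simulated delete by another thread
        List<String> qualifiedNames = new ArrayList<>();
        for (TableRecord table : tables) {
            qualifiedNames.add(table.databaseName(store) + "." + table.tableName);
        }
        return qualifiedNames;
    }
}
```

In this simulation, `FetchRaceSketch.readAfterConcurrentDrop(false)` throws `NoSuchElementException` because the lazy lookup runs after the row is gone, while `readAfterConcurrentDrop(true)` returns the qualified table names because nothing is loaded after the query.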