[ https://issues.apache.org/jira/browse/HIVE-3826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547920#comment-13547920 ]
Hudson commented on HIVE-3826:
------------------------------
Integrated in Hive-trunk-hadoop2 #54 (See
[https://builds.apache.org/job/Hive-trunk-hadoop2/54/])
HIVE-3826 Rollbacks and retries of drops cause
org.datanucleus.exceptions.NucleusObjectNotFoundException: No such database row)
(Kevin Wilfong via namit) (Revision 1425247)
Result = ABORTED
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1425247
Files :
* /hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
> Rollbacks and retries of drops cause
> org.datanucleus.exceptions.NucleusObjectNotFoundException: No such database
> row)
> ---------------------------------------------------------------------------------------------------------------------
>
> Key: HIVE-3826
> URL: https://issues.apache.org/jira/browse/HIVE-3826
> Project: Hive
> Issue Type: Bug
> Components: Metastore
> Affects Versions: 0.11.0
> Reporter: Kevin Wilfong
> Assignee: Kevin Wilfong
> Fix For: 0.11.0
>
> Attachments: HIVE-3826.1.patch.txt
>
>
> I'm not sure whether this is the only cause of the exception
> "org.datanucleus.exceptions.NucleusObjectNotFoundException: No such database
> row)" from the metastore, but one cause seems to be a drop command failing
> and then being retried by the client.
> Following a single metastore thread with DEBUG-level logging, I saw that the
> objects that were meant to be dropped remained in the PersistenceManager
> cache even after a rollback. The steps seemed to be as follows:
> 1) The first attempt to drop the table pulls the table into the
> PersistenceManager cache for the purposes of dropping it.
> 2) The drop fails, e.g. due to a lock wait timeout on the SQL backend, which
> causes a rollback of the transaction.
> 3) The drop is retried on a different thread of the metastore Thrift server,
> or on a different server, and succeeds.
> 4) Back on the original thread of the original Thrift server, someone
> performs a write operation that produces a commit. This causes the detached
> objects related to the dropped table to attempt to reattach, so JDO queries
> the SQL backend for those objects, which it cannot find. This causes the
> exception.
> I was able to reproduce this regularly using the following setup and
> sequence of commands:
> Hive client 1 (Hive1): connected to a metastore Thrift server running a
> single thread. I hard-coded a RuntimeException into the ObjectStore code that
> drops a table, right before the commit in preDropStorageDescriptor, to
> induce a rollback. I also turned off all retries at all layers of the
> metastore.
> Hive client 2 (Hive2): connected to a separate metastore Thrift server
> running with standard configs and code.
> 1: On Hive1, CREATE TABLE t1 (c STRING);
> 2: On Hive1, DROP TABLE t1; // This failed due to the hard-coded exception
> 3: On Hive2, DROP TABLE t1; // Succeeds
> 4: On Hive1, CREATE DATABASE d1; // This database already existed. I'm not
> sure why this step was necessary, but the reproduction didn't work without
> it; it seemed to have an effect on the order in which objects were committed
> in the next step
> 5: On Hive1, CREATE DATABASE d2; // This database didn't exist; the command
> would fail with the NucleusObjectNotFoundException
> The object that caused the exception varied: I saw the MTable, the
> MSerDeInfo, and the MTablePrivilege of the table whose drop was attempted.
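
The failure sequence described in the report can be illustrated with a small
toy model. This is only a sketch under stated assumptions: it does not use
DataNucleus, JDO, or Hive's ObjectStore, and every name in it (Backend,
Session, replay, evictOnRollback) is hypothetical. It models a per-thread
object cache that keeps its entries after a rollback, and a later commit that
looks those entries up in the shared backend. The evictOnRollback flag shows
one plausible mitigation (clearing the cache on rollback); the message above
does not say what the committed patch changed, so that part is an assumption.
Step 4 of the reproduction (the CREATE DATABASE d1 that already existed) is
left out because the toy model has no commit ordering to influence.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

/** Toy model only: not DataNucleus and not Hive's ObjectStore. All names are hypothetical. */
final class StaleCacheSketch {

    /** Stands in for the shared SQL backend: just the set of row ids that exist. */
    static final class Backend {
        final Set<String> rows = new HashSet<>();
    }

    /** Stands in for one metastore thread's persistence manager and its object cache. */
    static final class Session {
        private final Backend backend;
        private final boolean evictOnRollback;
        private final Set<String> cache = new LinkedHashSet<>();

        Session(Backend backend, boolean evictOnRollback) {
            this.backend = backend;
            this.evictOnRollback = evictOnRollback;
        }

        /** Drops a row. With failBeforeCommit set, simulates the induced exception and rolls back. */
        boolean drop(String id, boolean failBeforeCommit) {
            cache.add(id); // the object is loaded into the cache so it can be dropped
            if (failBeforeCommit) {
                rollback();
                return false;
            }
            backend.rows.remove(id);
            cache.remove(id);
            return true;
        }

        /** Rolls back. Without eviction, cached objects survive, as observed in the report. */
        void rollback() {
            if (evictOnRollback) {
                cache.clear(); // assumed mitigation, not confirmed by the source
            }
        }

        /** Any later write: on commit, every cached object is looked up in the backend again. */
        void create(String id) {
            for (String cached : cache) {
                if (!backend.rows.contains(cached)) {
                    throw new IllegalStateException("No such database row: " + cached);
                }
            }
            backend.rows.add(id);
        }
    }

    /** Replays steps 1, 2, 3 and 5 of the report; returns true if the final write succeeds. */
    static boolean replay(boolean evictOnRollback) {
        Backend backend = new Backend();
        Session hive1 = new Session(backend, evictOnRollback);
        Session hive2 = new Session(backend, evictOnRollback);
        hive1.create("t1");      // 1: CREATE TABLE t1 on Hive1
        hive1.drop("t1", true);  // 2: DROP TABLE t1 on Hive1 fails and rolls back
        hive2.drop("t1", false); // 3: DROP TABLE t1 on Hive2 succeeds
        try {
            hive1.create("d2");  // 5: the next write on Hive1 commits
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }
}
```

Calling StaleCacheSketch.replay(false) returns false, because the stale t1
entry is looked up after Hive2 has already removed the row. Calling
StaleCacheSketch.replay(true) returns true, because the cache is emptied when
the first drop rolls back.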