[jira] [Commented] (HIVE-3826) Rollbacks and retries of drops cause org.datanucleus.exceptions.NucleusObjectNotFoundException: No such database row)

Kevin Wilfong (JIRA) Thu, 20 Dec 2012 16:11:15 -0800

    [ 
https://issues.apache.org/jira/browse/HIVE-3826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13537534#comment-13537534
 ]


Kevin Wilfong commented on HIVE-3826:
-------------------------------------

The tests pass.
                
> Rollbacks and retries of drops cause 
> org.datanucleus.exceptions.NucleusObjectNotFoundException: No such database 
> row)
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-3826
>                 URL: https://issues.apache.org/jira/browse/HIVE-3826
>             Project: Hive
>          Issue Type: Bug
>          Components: Metastore
>    Affects Versions: 0.11
>            Reporter: Kevin Wilfong
>            Assignee: Kevin Wilfong
>         Attachments: HIVE-3826.1.patch.txt
>
>
> I'm not sure if this is the only cause of the exception 
> "org.datanucleus.exceptions.NucleusObjectNotFoundException: No such database 
> row)" from the metastore, but one cause seems to be related to a drop command 
> failing, and being retried by the client.
> Following a single metastore thread with DEBUG level logging, I saw that 
> the objects intended to be dropped remained in the PersistenceManager cache 
> even after a rollback.  The steps seemed to be as follows:
> 1) On the first attempt to drop the table, the table is pulled into the 
> PersistenceManager cache for the purposes of dropping.
> 2) The drop fails, e.g. due to a lock wait timeout on the SQL backend, and 
> the transaction is rolled back.
> 3) The drop is retried on a different thread of the metastore Thrift 
> server, or on a different server, and succeeds.
> 4) Back on the original thread of the original Thrift server, someone 
> performs a write operation which produces a commit.  This causes the 
> detached objects related to the dropped table to attempt to reattach, so 
> JDO queries the SQL backend for those objects, which it can't find.  This 
> causes the exception.
> I was able to reproduce this regularly using the following sequence of 
> commands:
> Hive client 1 (Hive1): connected to a metastore Thrift server running a 
> single thread.  I hard coded a RuntimeException into the code that drops a 
> table in the ObjectStore, specifically right before the commit in 
> preDropStorageDescriptor, to induce a rollback.  I also turned off all 
> retries at all layers of the metastore.
> Hive client 2 (Hive2): connected to a separate metastore Thrift server 
> running with standard configs and code.
> 1: On Hive1, CREATE TABLE t1 (c STRING);
> 2: On Hive1, DROP TABLE t1; // This failed due to the hard coded exception
> 3: On Hive2, DROP TABLE t1; // Succeeds
> 4: On Hive1, CREATE DATABASE d1; // This database already existed.  I'm not 
> sure why this step was necessary, but the repro didn't work without it; it 
> seemed to have an effect on the order objects were committed in the next step
> 5: On Hive1, CREATE DATABASE d2; // This database didn't exist; the command 
> would fail with the NucleusObjectNotFoundException
> The object that caused the exception varied: I saw the MTable, the 
> MSerDeInfo, and the MTablePrivilege of the table whose drop was attempted.
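
The sequence in steps 1) through 4) of the quoted description can be sketched with a toy model. This is not DataNucleus or Hive code: Backend, ToyPersistenceManager, ObjectNotFound and the evict_on_rollback flag are all hypothetical names, and the commit-time re-validation is only an assumption about the behaviour described above. The flag illustrates one possible mitigation (clearing the cache on rollback); it is not necessarily what the attached patch does.

```python
class ObjectNotFound(Exception):
    """Stands in for NucleusObjectNotFoundException in this toy model."""


class Backend:
    """Shared SQL backend, modelled as a set of row keys."""

    def __init__(self):
        self.rows = set()


class ToyPersistenceManager:
    """Per-thread object cache over the shared backend (hypothetical)."""

    def __init__(self, backend, evict_on_rollback):
        self.backend = backend
        self.evict_on_rollback = evict_on_rollback
        self.cache = set()

    def load(self, key):
        # Pull an object into the cache, as a drop does before deleting it.
        if key not in self.backend.rows:
            raise ObjectNotFound(key)
        self.cache.add(key)

    def rollback(self):
        # Assumption: without eviction, cached objects survive the rollback.
        if self.evict_on_rollback:
            self.cache.clear()

    def drop(self, key, fail=False):
        self.load(key)
        if fail:
            # Mirrors the hard coded exception right before the commit.
            self.rollback()
            raise RuntimeError("induced failure before commit")
        self.backend.rows.discard(key)
        self.cache.discard(key)

    def commit(self, new_key):
        # Assumption: on commit, every cached object is re-read from the backend.
        for key in sorted(self.cache):
            if key not in self.backend.rows:
                raise ObjectNotFound(key)
        self.backend.rows.add(new_key)


def run_scenario(evict_on_rollback):
    backend = Backend()
    pm1 = ToyPersistenceManager(backend, evict_on_rollback)  # original thread
    pm2 = ToyPersistenceManager(backend, evict_on_rollback)  # retry thread
    backend.rows.add("t1")            # CREATE TABLE t1
    try:
        pm1.drop("t1", fail=True)     # steps 1-2: drop fails and rolls back
    except RuntimeError:
        pass
    pm2.drop("t1")                    # step 3: the retry succeeds elsewhere
    try:
        pm1.commit("d2")              # step 4: unrelated write on the original thread
    except ObjectNotFound as exc:
        return f"ObjectNotFound: {exc}"
    return "ok"
```

Calling run_scenario(evict_on_rollback=False) reproduces the failure in this model, because the stale "t1" entry is still cached on the original thread when the unrelated commit runs; with evict_on_rollback=True the same sequence completes.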

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
