[ https://issues.apache.org/jira/browse/HIVE-3463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459424#comment-13459424 ]
Alexander Alten-Lorenz commented on HIVE-3463: ---------------------------------------------- More insight: 1. The drop table command deletes entries in different tables as UNIT transactions, so if there is a communication break or any other uncertain condition, database could rollback that individual transaction leaving database in inconsistent state. 2. The drop table is not serialized action, which is causing Replication break 3. As cascade is not defined in the FK definition during table creation, the stale could present in certain conditions. 4. The database with id 511 is showing the tables of default database who’s db_id is 1 5. The select on any portioned table with a limit operator scans through all the partitions using UNIT transactions, which just query single entry per transaction, means to say, if a table has got 100000 partitions, it does 100000 queries to database before showing the result. Instead, it should have done one bulk query fetching all the partition details at once. Example: In our database we are having 3500 partitioned tables, with total of 950000 partitions. Every day we get hit on 90% of tables. And 60% of the users say, the queries takes its own sweet time to return, even though we kept the metastore level performance(MySQL) at 2miliSec to 20 miliSec, the query time at hive is always varies between few minutes to few hours. Even simple USE statement takes up to 5 min in some cases. (new thing I am bringing it up, as it took some time for me to confirm this) 6. Dead locks happening on hive tables, during normal operations, and quite a number of rollbacks happening at database side. What I noticed is the transaction serialization issue, which is issue a delete before an update, see below for the detail on the dead lock. With upgrade to mysql 5.5, we are not seeing it not so frequently now. One nit, Cascading for InnoDB is a great feature, since Hive doesn't support MySQL's binlog replication (Transaction level 'READ-COMMITTED' in InnoDB is not safe for binlog mode 'STATEMENT'). I was digging a bit deeper, that could be avoided by setting "SET GLOBAL binlog_format = 'ROW';: inside of MySQL. But this needs to be confirmed first. I raise the FR up. > Add CASCADING to MySQL's InnoDB schema > -------------------------------------- > > Key: HIVE-3463 > URL: https://issues.apache.org/jira/browse/HIVE-3463 > Project: Hive > Issue Type: Improvement > Components: Metastore > Affects Versions: 0.9.0 > Reporter: Alexander Alten-Lorenz > Priority: Minor > > Cascading could help to cleanup the tables when a FK is deleted. > http://dev.mysql.com/doc/refman/5.5/en/innodb-foreign-key-constraints.html -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira