[ https://issues.apache.org/jira/browse/HIVE-17691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Eugene Koifman updated HIVE-17691:
----------------------------------
    Description:
# DDLSemanticAnalyzer.alterTableOutput is unused
# DDLTask.generateAddMmTasks(Table) - stmtId should probably come from TransactionManager
# DDLTask.createTable(Hive db, CreateTableDesc crtTbl) has _Long mmWriteId = crtTbl.getInitialMmWriteId();_ - the logic is unclear; this ID is only set in one place.
# FileSinkOperator has multiple places that look like _conf.getWriteType() == AcidUtils.Operation.NOT_ACID || conf.isMmTable()_ - what is the writeType for MM tables? Seems that Wei opted for "work.getLoadTableWork().getWriteType() != AcidUtils.Operation.NOT_ACID && !tbd.isMmTable()" to mean MM, e.g. MoveTask.handleStaticParts() call to Hive.loadPartition()
# HiveConf.HIVE_TXN_OPERATIONAL_PROPERTIES - the doc/explanation there is obsolete
# Compactor Initiator likely doesn't work for MM tables. It's triggered by info in TXN_COMPONENTS/COMPLETED_TXN_COMPONENTS. MM tables don't write to either because DbTxnManager.acquireLocks() does _compBuilder.setIsAcid(AcidUtils.isFullAcidTable(t));_ i.e. it treats MM as non-acid tables
# In general, integration with full Acid seems confused with respect to MM and seems to treat MM as a special table type rather than a subtype of Acid table (mostly, but not always).
## e.g. _SemanticAnalyzer.genBucketingSortingDest(String dest, Operator input, QB qb, TableDesc table_desc, Table dest_tab, SortBucketRSCtx ctx)_
# LoadSemanticAnalyzer.analyzeInternal(ASTNode) sets statementId to 0 rather than getting it from TM
# ImportCommitTask - doesn't currently do anything. It used to commit mmID. Need to verify we properly commit the txn in the Driver
# As far as I can tell, all the mm_*.q tests run on TestCliDriver, which means MR. This doesn't exercise some code specifically for dealing with writes from Union All queries (CTAS, Insert into).
On MR this requires "hive.optimize.union.remove=true" (false by default)
# Remove MoveWork().setNoop(boolean) and usages per todo in _GenMapRedUtils.createMRWorkForMergingFiles (FileSinkOperator fsInput, Path finalName, DependencyCollectionTask dependencyTask, List<Task<MoveWork>> mvTasks, HiveConf conf, Task<? extends Serializable> currTask)_
# PartialScanWork.tblDesc - unused
# _Partition.getBucketPath(int bucketNum)_ has "// Note: this makes assumptions that won't work with MM tables, unions, etc.". File Jira?
# _PartitionDesc.LOG_ is unused
# Insert Overwrite for MM is incomplete - see comments in HIVE-15212 regarding IOW and multi IOW

  was:
# DDLSemanticAnalyzer.alterTableOutput is unused
# DDLTask.generateAddMmTasks(Table) - stmtId should probably come from TransactionManager
# DDLTask.createTable(Hive db, CreateTableDesc crtTbl) has _Long mmWriteId = crtTbl.getInitialMmWriteId();_ - the logic is unclear; this ID is only set in one place.
# FileSinkOperator has multiple places that look like _conf.getWriteType() == AcidUtils.Operation.NOT_ACID || conf.isMmTable()_ - what is the writeType for MM tables? Seems that Wei opted for "work.getLoadTableWork().getWriteType() != AcidUtils.Operation.NOT_ACID && !tbd.isMmTable()" to mean MM, e.g. MoveTask.handleStaticParts() call to Hive.loadPartition()
# HiveConf.HIVE_TXN_OPERATIONAL_PROPERTIES - the doc/explanation there is obsolete
# Compactor Initiator likely doesn't work for MM tables. It's triggered by info in TXN_COMPONENTS/COMPLETED_TXN_COMPONENTS. MM tables don't write to either because DbTxnManager.acquireLocks() does _compBuilder.setIsAcid(AcidUtils.isFullAcidTable(t));_ i.e. it treats MM as non-acid tables
# In general, integration with full Acid seems confused with respect to MM and seems to treat MM as a special table type rather than a subtype of Acid table (mostly, but not always).
## e.g.
_SemanticAnalyzer.genBucketingSortingDest(String dest, Operator input, QB qb, TableDesc table_desc, Table dest_tab, SortBucketRSCtx ctx)_
# LoadSemanticAnalyzer.analyzeInternal(ASTNode) sets statementId to 0 rather than getting it from TM
# ImportCommitTask - doesn't currently do anything. It used to commit mmID. Need to verify we properly commit the txn in the Driver
# As far as I can tell, all the mm_*.q tests run on TestCliDriver, which means MR. This doesn't exercise some code specifically for dealing with writes from Union All queries (CTAS, Insert into). On MR this requires "hive.optimize.union.remove=true" (false by default)
# Remove MoveWork().setNoop(boolean) and usages per todo in _GenMapRedUtils.createMRWorkForMergingFiles (FileSinkOperator fsInput, Path finalName, DependencyCollectionTask dependencyTask, List<Task<MoveWork>> mvTasks, HiveConf conf, Task<? extends Serializable> currTask)_
# PartialScanWork.tblDesc - unused
# _Partition.getBucketPath(int bucketNum)_ has "// Note: this makes assumptions that won't work with MM tables, unions, etc.". File Jira?
# _PartitionDesc.LOG_ is unused
#


> Miscellaneous List
> ------------------
>
>                 Key: HIVE-17691
>                 URL: https://issues.apache.org/jira/browse/HIVE-17691
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Transactions
>            Reporter: Eugene Koifman
>
> # DDLSemanticAnalyzer.alterTableOutput is unused
> # DDLTask.generateAddMmTasks(Table) - stmtId should probably come from
> TransactionManager
> # DDLTask.createTable(Hive db, CreateTableDesc crtTbl) has _Long mmWriteId =
> crtTbl.getInitialMmWriteId();_ - the logic is unclear; this ID is only set in
> one place.
> # FileSinkOperator has multiple places that look like _conf.getWriteType() ==
> AcidUtils.Operation.NOT_ACID || conf.isMmTable()_ - what is the writeType for
> MM tables? Seems that Wei opted for "work.getLoadTableWork().getWriteType()
> != AcidUtils.Operation.NOT_ACID && !tbd.isMmTable()" to mean MM, e.g.
> MoveTask.handleStaticParts() call to Hive.loadPartition()
> # HiveConf.HIVE_TXN_OPERATIONAL_PROPERTIES - the doc/explanation there is
> obsolete
> # Compactor Initiator likely doesn't work for MM tables. It's triggered by
> info in TXN_COMPONENTS/COMPLETED_TXN_COMPONENTS. MM tables don't write to
> either because DbTxnManager.acquireLocks() does
> _compBuilder.setIsAcid(AcidUtils.isFullAcidTable(t));_ i.e. it treats MM as
> non-acid tables
> # In general, integration with full Acid seems confused with respect to MM
> and seems to treat MM as a special table type rather than a subtype of Acid
> table (mostly, but not always).
> ## e.g. _SemanticAnalyzer.genBucketingSortingDest(String dest, Operator
> input, QB qb, TableDesc table_desc, Table dest_tab, SortBucketRSCtx ctx)_
> # LoadSemanticAnalyzer.analyzeInternal(ASTNode) sets statementId to 0 rather
> than getting it from TM
> # ImportCommitTask - doesn't currently do anything. It used to commit mmID.
> Need to verify we properly commit the txn in the Driver
> # As far as I can tell, all the mm_*.q tests run on TestCliDriver, which
> means MR. This doesn't exercise some code specifically for dealing with
> writes from Union All queries (CTAS, Insert into). On MR this requires
> "hive.optimize.union.remove=true" (false by default)
> # Remove MoveWork().setNoop(boolean) and usages per todo in
> _GenMapRedUtils.createMRWorkForMergingFiles (FileSinkOperator fsInput, Path
> finalName, DependencyCollectionTask dependencyTask, List<Task<MoveWork>>
> mvTasks, HiveConf conf, Task<? extends Serializable> currTask)_
> # PartialScanWork.tblDesc - unused
> # _Partition.getBucketPath(int bucketNum)_ has "// Note: this makes
> assumptions that won't work with MM tables, unions, etc.". File Jira?
> # _PartitionDesc.LOG_ is unused
> # Insert Overwrite for MM is incomplete - see comments in HIVE-15212
> regarding IOW and multi IOW



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
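
For reference on the FileSinkOperator / MoveTask item, a minimal standalone sketch of the two boolean shapes quoted in the description may help. It does not use any Hive classes: `WriteType`, `fileSinkCheck` and `moveTaskCheck` are hypothetical stand-ins invented for illustration, and the sketch assumes the two `isMmTable()` flags describe the same property of the table. It only enumerates the truth table and makes no claim about which write type Hive actually assigns to MM tables.

```java
public class MmPredicateTable {
    // Hypothetical stand-in for AcidUtils.Operation; the issue only names NOT_ACID,
    // so every other value is collapsed into OTHER here.
    enum WriteType { NOT_ACID, OTHER }

    // Shape of the check quoted for FileSinkOperator:
    //   writeType == NOT_ACID || isMmTable
    static boolean fileSinkCheck(WriteType writeType, boolean isMmTable) {
        return writeType == WriteType.NOT_ACID || isMmTable;
    }

    // Shape of the check quoted for MoveTask:
    //   writeType != NOT_ACID && !isMmTable
    static boolean moveTaskCheck(WriteType writeType, boolean isMmTable) {
        return writeType != WriteType.NOT_ACID && !isMmTable;
    }

    public static void main(String[] args) {
        // Print all four combinations of (writeType, isMmTable).
        for (WriteType w : WriteType.values()) {
            for (boolean mm : new boolean[] {false, true}) {
                System.out.printf("writeType=%s isMm=%b fileSink=%b moveTask=%b%n",
                        w, mm, fileSinkCheck(w, mm), moveTaskCheck(w, mm));
            }
        }
    }
}
```

By De Morgan's law the second expression is the exact negation of the first, so under the stated assumption it is false for any MM table regardless of the write type the table carries. That may be worth keeping in mind when deciding what the writeType for MM tables should be.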