[jira] [Commented] (HIVE-17947) Concurrent inserts might fail for ACID table since HIVE-17526 on branch-1
[ https://issues.apache.org/jira/browse/HIVE-17947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16247410#comment-16247410 ]

Daniel Voros commented on HIVE-17947:
-------------------------------------

Thank you all!

> Concurrent inserts might fail for ACID table since HIVE-17526 on branch-1
> -------------------------------------------------------------------------
>
>                 Key: HIVE-17947
>                 URL: https://issues.apache.org/jira/browse/HIVE-17947
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 1.3.0
>            Reporter: Daniel Voros
>            Assignee: Daniel Voros
>            Priority: Blocker
>             Fix For: 1.3.0
>
>         Attachments: HIVE-17947.1-branch-1.patch, HIVE-17947.2-branch-1.patch, HIVE-17947.3-branch-1.patch
>
>
> HIVE-17526 (only on branch-1) disabled conversion to ACID if there are *_copy_N files under the table, but the filesystem checks introduced there run on every insert, because the MoveTask at the end of the insert eventually calls alterTable.
> The filename check also recurses into staging directories created by other inserts. If one of those is removed while its files are being listed, the insert fails with the following exception:
> {code}
> java.io.FileNotFoundException: File hdfs://mycluster/apps/hive/warehouse/dvoros.db/concurrent_insert/.hive-staging_hive_2017-10-30_13-23-35_056_2844419018556002410-2/-ext-10001 does not exist.
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1081) ~[hadoop-hdfs-2.7.3.2.6.3.0-235.jar:?]
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1059) ~[hadoop-hdfs-2.7.3.2.6.3.0-235.jar:?]
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$23.doCall(DistributedFileSystem.java:1004) ~[hadoop-hdfs-2.7.3.2.6.3.0-235.jar:?]
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$23.doCall(DistributedFileSystem.java:1000) ~[hadoop-hdfs-2.7.3.2.6.3.0-235.jar:?]
> 	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[hadoop-common-2.7.3.2.6.3.0-235.jar:?]
> 	at org.apache.hadoop.hdfs.DistributedFileSystem.listLocatedStatus(DistributedFileSystem.java:1018) ~[hadoop-hdfs-2.7.3.2.6.3.0-235.jar:?]
> 	at org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:1735) ~[hadoop-common-2.7.3.2.6.3.0-235.jar:?]
> 	at org.apache.hadoop.fs.FileSystem$6.handleFileStat(FileSystem.java:1864) ~[hadoop-common-2.7.3.2.6.3.0-235.jar:?]
> 	at org.apache.hadoop.fs.FileSystem$6.hasNext(FileSystem.java:1841) ~[hadoop-common-2.7.3.2.6.3.0-235.jar:?]
> 	at org.apache.hadoop.hive.metastore.TransactionalValidationListener.containsCopyNFiles(TransactionalValidationListener.java:226) [hive-exec-2.1.0.2.6.3.0-235.jar:2.1.0.2.6.3.0-235]
> 	at org.apache.hadoop.hive.metastore.TransactionalValidationListener.handleAlterTableTransactionalProp(TransactionalValidationListener.java:104) [hive-exec-2.1.0.2.6.3.0-235.jar:2.1.0.2.6.3.0-235]
> 	at org.apache.hadoop.hive.metastore.TransactionalValidationListener.handle(TransactionalValidationListener.java:63) [hive-exec-2.1.0.2.6.3.0-235.jar:2.1.0.2.6.3.0-235]
> 	at org.apache.hadoop.hive.metastore.TransactionalValidationListener.onEvent(TransactionalValidationListener.java:55) [hive-exec-2.1.0.2.6.3.0-235.jar:2.1.0.2.6.3.0-235]
> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.firePreEvent(HiveMetaStore.java:2478) [hive-exec-2.1.0.2.6.3.0-235.jar:2.1.0.2.6.3.0-235]
> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:4145) [hive-exec-2.1.0.2.6.3.0-235.jar:2.1.0.2.6.3.0-235]
> 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_environment_context(HiveMetaStore.java:4117) [hive-exec-2.1.0.2.6.3.0-235.jar:2.1.0.2.6.3.0-235]
> 	at sun.reflect.GeneratedMethodAccessor107.invoke(Unknown Source) ~[?:?]
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_144]
> 	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
> 	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148) [hive-exec-2.1.0.2.6.3.0-235.jar:2.1.0.2.6.3.0-235]
> 	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107) [hive-exec-2.1.0.2.6.3.0-235.jar:2.1.0.2.6.3.0-235]
> 	at com.sun.proxy.$Proxy32.alter_table_with_environment_context(Unknown Source) [?:?]
> 	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_table_with_environmentContext(HiveMetaStoreClient.java
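The description above hinges on detecting *_copy_N files, i.e. duplicate data files left by non-ACID writes. As a minimal illustration (not Hive's actual code; the class name and regex here are assumptions), a filename test of that kind could look like:

```java
import java.util.regex.Pattern;

public class CopyNCheck {
    // Hypothetical pattern for duplicate files such as "000000_0_copy_1";
    // the exact regex used by TransactionalValidationListener may differ.
    private static final Pattern COPY_N = Pattern.compile(".*_copy_[0-9]+$");

    static boolean isCopyN(String fileName) {
        return COPY_N.matcher(fileName).matches();
    }

    public static void main(String[] args) {
        System.out.println(isCopyN("000000_0_copy_1")); // expect: true
        System.out.println(isCopyN("000000_0"));        // expect: false
    }
}
```

The point of the bug is not the pattern itself but that evaluating it requires a recursive directory listing, which can race with concurrent staging-directory cleanup.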
[ https://issues.apache.org/jira/browse/HIVE-17947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16241136#comment-16241136 ]

Hive QA commented on HIVE-17947:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12895378/HIVE-17947.3-branch-1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 165 failed/errored test(s), 8123 tests executed

*Failed tests:*
{noformat}
The following suites did not produce a TEST-*.xml file (likely timed out):
TestAcidOnTez (batchId=376)
TestAdminUser (batchId=358)
TestAuthorizationPreEventListener (batchId=391)
TestAuthzApiEmbedAuthorizerInEmbed (batchId=368)
TestAuthzApiEmbedAuthorizerInRemote (batchId=374)
TestBeeLineWithArgs (batchId=398)
TestCLIAuthzSessionContext (batchId=416)
TestClearDanglingScratchDir (batchId=383)
TestClientSideAuthorizationProvider (batchId=390)
TestCompactor (batchId=379)
TestCreateUdfEntities (batchId=378)
TestCustomAuthentication (batchId=399)
TestDBTokenStore (batchId=342)
TestDDLWithRemoteMetastoreSecondNamenode (batchId=377)
TestDynamicSerDe (batchId=345)
TestEmbeddedHiveMetaStore (batchId=355)
TestEmbeddedThriftBinaryCLIService (batchId=402)
TestFilterHooks (batchId=350)
TestFolderPermissions (batchId=385)
TestHS2AuthzContext (batchId=419)
TestHS2AuthzSessionContext (batchId=420)
TestHS2ClearDanglingScratchDir (batchId=406)
TestHS2ImpersonationWithRemoteMS (batchId=407)
TestHiveAuthorizerCheckInvocation (batchId=394)
TestHiveAuthorizerShowFilters (batchId=393)
TestHiveHistory (batchId=396)
TestHiveMetaStoreTxns (batchId=370)
TestHiveMetaStoreWithEnvironmentContext (batchId=360)
TestHiveMetaTool (batchId=373)
TestHiveServer2 (batchId=422)
TestHiveServer2SessionTimeout (batchId=423)
TestHiveSessionImpl (batchId=403)
TestHs2Hooks (batchId=375)
TestHs2HooksWithMiniKdc (batchId=451)
TestJdbcDriver2 (batchId=410)
TestJdbcMetadataApiAuth (batchId=421)
TestJdbcWithLocalClusterSpark (batchId=415)
TestJdbcWithMiniHS2 (batchId=412)
TestJdbcWithMiniKdc (batchId=448)
TestJdbcWithMiniKdcCookie (batchId=447)
TestJdbcWithMiniKdcSQLAuthBinary (batchId=445)
TestJdbcWithMiniKdcSQLAuthHttp (batchId=450)
TestJdbcWithMiniMr (batchId=411)
TestJdbcWithSQLAuthUDFBlacklist (batchId=417)
TestJdbcWithSQLAuthorization (batchId=418)
TestLocationQueries (batchId=382)
TestMTQueries - did not produce a TEST-*.xml file (likely
[ https://issues.apache.org/jira/browse/HIVE-17947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16239636#comment-16239636 ]

Daniel Voros commented on HIVE-17947:
-------------------------------------

All failed tests either pass locally or fail in the same way without the patch.
[ https://issues.apache.org/jira/browse/HIVE-17947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16236625#comment-16236625 ]

Hive QA commented on HIVE-17947:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12895378/HIVE-17947.3-branch-1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 163 failed/errored test(s), 8125 tests executed

*Failed tests:*
{noformat}
The following suites did not produce a TEST-*.xml file (likely timed out):
TestAcidOnTez (batchId=376)
TestAdminUser (batchId=358)
TestAuthorizationPreEventListener (batchId=391)
TestAuthzApiEmbedAuthorizerInEmbed (batchId=368)
TestAuthzApiEmbedAuthorizerInRemote (batchId=374)
TestBeeLineWithArgs (batchId=398)
TestCLIAuthzSessionContext (batchId=416)
TestClearDanglingScratchDir (batchId=383)
TestClientSideAuthorizationProvider (batchId=390)
TestCompactor (batchId=379)
TestCreateUdfEntities (batchId=378)
TestCustomAuthentication (batchId=399)
TestDBTokenStore (batchId=342)
TestDDLWithRemoteMetastoreSecondNamenode (batchId=377)
TestDynamicSerDe (batchId=345)
TestEmbeddedHiveMetaStore (batchId=355)
TestEmbeddedThriftBinaryCLIService (batchId=402)
TestFilterHooks (batchId=350)
TestFolderPermissions (batchId=385)
TestHS2AuthzContext (batchId=419)
TestHS2AuthzSessionContext (batchId=420)
TestHS2ClearDanglingScratchDir (batchId=406)
TestHS2ImpersonationWithRemoteMS (batchId=407)
TestHiveAuthorizerCheckInvocation (batchId=394)
TestHiveAuthorizerShowFilters (batchId=393)
TestHiveHistory (batchId=396)
TestHiveMetaStoreTxns (batchId=370)
TestHiveMetaStoreWithEnvironmentContext (batchId=360)
TestHiveMetaTool (batchId=373)
TestHiveServer2 (batchId=422)
TestHiveServer2SessionTimeout (batchId=423)
TestHiveSessionImpl (batchId=403)
TestHs2Hooks (batchId=375)
TestHs2HooksWithMiniKdc (batchId=451)
TestJdbcDriver2 (batchId=410)
TestJdbcMetadataApiAuth (batchId=421)
TestJdbcWithLocalClusterSpark (batchId=415)
TestJdbcWithMiniHS2 (batchId=412)
TestJdbcWithMiniKdc (batchId=448)
TestJdbcWithMiniKdcCookie (batchId=447)
TestJdbcWithMiniKdcSQLAuthBinary (batchId=445)
TestJdbcWithMiniKdcSQLAuthHttp (batchId=450)
TestJdbcWithMiniMr (batchId=411)
TestJdbcWithSQLAuthUDFBlacklist (batchId=417)
TestJdbcWithSQLAuthorization (batchId=418)
TestLocationQueries (batchId=382)
TestMTQueries - did not produce a TEST-*.xml file (likely
[ https://issues.apache.org/jira/browse/HIVE-17947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16236051#comment-16236051 ]

Eugene Koifman commented on HIVE-17947:
---------------------------------------

+1 patch 3
[ https://issues.apache.org/jira/browse/HIVE-17947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16235491#comment-16235491 ]

Daniel Voros commented on HIVE-17947:
-------------------------------------

Thanks [~ekoifman], I agree, that would be better. Also, if I'm not mistaken, it's enough to check the names of files (and not directories). I'll upload a new patch.
[ https://issues.apache.org/jira/browse/HIVE-17947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16234451#comment-16234451 ]

Eugene Koifman commented on HIVE-17947:
---------------------------------------

Nit: It may be more efficient to check the file before putting it on the stack, so that only directories are ever on the stack, though I don't know if it will make a difference in practical situations.

+1 pending tests
[jira] [Commented] (HIVE-17947) Concurrent inserts might fail for ACID table since HIVE-17526 on branch-1
[ https://issues.apache.org/jira/browse/HIVE-17947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16227525#comment-16227525 ] Hive QA commented on HIVE-17947: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12894991/HIVE-17947.1-branch-1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 176 failed/errored test(s), 7967 tests executed *Failed tests:* {noformat} TestAcidOnTez - did not produce a TEST-*.xml file (likely timed out) (batchId=376) TestAdminUser - did not produce a TEST-*.xml file (likely timed out) (batchId=358) TestAuthorizationPreEventListener - did not produce a TEST-*.xml file (likely timed out) (batchId=391) TestAuthzApiEmbedAuthorizerInEmbed - did not produce a TEST-*.xml file (likely timed out) (batchId=368) TestAuthzApiEmbedAuthorizerInRemote - did not produce a TEST-*.xml file (likely timed out) (batchId=374) TestBeeLineWithArgs - did not produce a TEST-*.xml file (likely timed out) (batchId=398) TestCLIAuthzSessionContext - did not produce a TEST-*.xml file (likely timed out) (batchId=416) TestClearDanglingScratchDir - did not produce a TEST-*.xml file (likely timed out) (batchId=383) TestClientSideAuthorizationProvider - did not produce a TEST-*.xml file (likely timed out) (batchId=390) TestCompactor - did not produce a TEST-*.xml file (likely timed out) (batchId=379) TestCreateUdfEntities - did not produce a TEST-*.xml file (likely timed out) (batchId=378) TestCustomAuthentication - did not produce a TEST-*.xml file (likely timed out) (batchId=399) TestDBTokenStore - did not produce a TEST-*.xml file (likely timed out) (batchId=342) TestDDLWithRemoteMetastoreSecondNamenode - did not produce a TEST-*.xml file (likely timed out) (batchId=377) TestDynamicSerDe - did not produce a TEST-*.xml file (likely timed out) (batchId=345) TestEmbeddedHiveMetaStore - did not produce a TEST-*.xml file 
(likely timed out) (batchId=355) TestEmbeddedThriftBinaryCLIService - did not produce a TEST-*.xml file (likely timed out) (batchId=402) TestEncryptedHDFSCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=437) TestFilterHooks - did not produce a TEST-*.xml file (likely timed out) (batchId=350) TestFolderPermissions - did not produce a TEST-*.xml file (likely timed out) (batchId=385) TestHCatLoaderEncryption - did not produce a TEST-*.xml file (likely timed out) (batchId=283) TestHS2AuthzContext - did not produce a TEST-*.xml file (likely timed out) (batchId=419) TestHS2AuthzSessionContext - did not produce a TEST-*.xml file (likely timed out) (batchId=420) TestHS2ClearDanglingScratchDir - did not produce a TEST-*.xml file (likely timed out) (batchId=406) TestHS2ImpersonationWithRemoteMS - did not produce a TEST-*.xml file (likely timed out) (batchId=407) TestHiveAuthorizerCheckInvocation - did not produce a TEST-*.xml file (likely timed out) (batchId=394) TestHiveAuthorizerShowFilters - did not produce a TEST-*.xml file (likely timed out) (batchId=393) TestHiveHistory - did not produce a TEST-*.xml file (likely timed out) (batchId=396) TestHiveMetaStoreTxns - did not produce a TEST-*.xml file (likely timed out) (batchId=370) TestHiveMetaStoreWithEnvironmentContext - did not produce a TEST-*.xml file (likely timed out) (batchId=360) TestHiveMetaTool - did not produce a TEST-*.xml file (likely timed out) (batchId=373) TestHiveServer2 - did not produce a TEST-*.xml file (likely timed out) (batchId=422) TestHiveServer2SessionTimeout - did not produce a TEST-*.xml file (likely timed out) (batchId=423) TestHiveSessionImpl - did not produce a TEST-*.xml file (likely timed out) (batchId=403) TestHs2Hooks - did not produce a TEST-*.xml file (likely timed out) (batchId=375) TestHs2HooksWithMiniKdc - did not produce a TEST-*.xml file (likely timed out) (batchId=451) TestJdbcDriver2 - did not produce a TEST-*.xml file (likely timed out) (batchId=410) 
TestJdbcMetadataApiAuth - did not produce a TEST-*.xml file (likely timed out) (batchId=421) TestJdbcWithLocalClusterSpark - did not produce a TEST-*.xml file (likely timed out) (batchId=415) TestJdbcWithMiniHS2 - did not produce a TEST-*.xml file (likely timed out) (batchId=412) TestJdbcWithMiniKdc - did not produce a TEST-*.xml file (likely timed out) (batchId=448) TestJdbcWithMiniKdcCookie - did not produce a TEST-*.xml file (likely timed out) (batchId=447) TestJdbcWithMiniKdcSQLAuthBinary - did not produce a TEST-*.xml file (likely timed out) (batchId=445) TestJdbcWithMiniKdcSQLAuthHttp - did not produce a TEST-*.xml file (likely timed out) (batchId=450) TestJdbcWithMiniMr - did not produce a TEST-*.xml file (likely timed out) (batchId=411) TestJdbcWithSQLAuthUDFBlacklist - did not produce a TEST-*.xml file (likely timed out) (batchId=417) TestJdbcWithSQLAuthorization - did not produce a TEST-*
[jira] [Commented] (HIVE-17947) Concurrent inserts might fail for ACID table since HIVE-17526 on branch-1
[ https://issues.apache.org/jira/browse/HIVE-17947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16227233#comment-16227233 ] Daniel Voros commented on HIVE-17947: - I agree, those uses only run on a single partition if the table is partitioned. And it should be, right? (: Yeah, something along those lines should work. I'll submit a new patch shortly.
[jira] [Commented] (HIVE-17947) Concurrent inserts might fail for ACID table since HIVE-17526 on branch-1
[ https://issues.apache.org/jira/browse/HIVE-17947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16227198#comment-16227198 ] Eugene Koifman commented on HIVE-17947: --- My $0.02: the other uses of listStatusRecursively() look like they wouldn't be run on a whole table, but rather one partition at a time, so they are much less likely to see a lot of files. Couldn't something like this work?
{noformat}
public static boolean containsCopyNFiles(FileSystem fs, FileStatus fileStatus,
    PathFilter filter) throws IOException {
  if (fileStatus.isDir()) {
    for (FileStatus stat : fs.listStatus(fileStatus.getPath(), filter)) {
      if (containsCopyNFiles(fs, stat, filter)) {
        return true; // stop walking as soon as a *_copy_N file is found
      }
    }
    return false;
  }
  return isCopyFile(fileStatus);
}
{noformat}
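The walk-and-check idea sketched in the comment above can be tried out as a self-contained program. The sketch below uses plain java.io.File as a stand-in for Hadoop's FileSystem/FileStatus API, and the class name, method name, and the *_copy_N regex are assumptions made for this demo, not Hive's actual implementation:

```java
import java.io.File;
import java.util.regex.Pattern;

public class CopyFileScan {
    // Matches bucket files like 000000_0_copy_1 (the *_copy_N pattern the check looks for).
    private static final Pattern COPY_N = Pattern.compile(".*_copy_[0-9]+");

    // Walk-and-check with early exit: only directories are ever recursed into,
    // and the scan stops at the first matching file instead of listing the
    // whole table up front.
    public static boolean containsCopyNFile(File dir) {
        File[] children = dir.listFiles();
        if (children == null) {
            // Directory vanished between finding it and listing it (the
            // concurrent-insert race this ticket is about); treat as empty.
            return false;
        }
        for (File child : children) {
            String name = child.getName();
            if (name.startsWith(".") || name.startsWith("_")) {
                continue; // skip hidden/staging entries, like Hive's hidden-file filter
            }
            if (child.isDirectory()) {
                if (containsCopyNFile(child)) {
                    return true;
                }
            } else if (COPY_N.matcher(name).matches()) {
                return true;
            }
        }
        return false;
    }
}
```

The early return means a table with no copy_N files is still fully scanned, but as soon as one match is found no further directories are listed, which is the efficiency point of checking files before descending.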
[jira] [Commented] (HIVE-17947) Concurrent inserts might fail for ACID table since HIVE-17526 on branch-1
[ https://issues.apache.org/jira/browse/HIVE-17947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16227188#comment-16227188 ] Daniel Voros commented on HIVE-17947: - Thanks [~ekoifman], I'll take a look at that to see if it's applicable to branch-1. The "best worst case" we can do is to keep the contents of a directory in memory, since that's what the iterator approach is doing under the hood. We can achieve the same with your walk-and-check approach using {{listFiles():FileStatus[]}}.
[jira] [Commented] (HIVE-17947) Concurrent inserts might fail for ACID table since HIVE-17526 on branch-1
[ https://issues.apache.org/jira/browse/HIVE-17947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16227164#comment-16227164 ] Daniel Voros commented on HIVE-17947: - I think one of the reasons for the exception above is using RemoteIterator. Its {{hasNext()}} function seems to fail if the file was removed since creating the iterator. Another reason why I've decided to go with the {{FileUtils#listStatusRecursively()}} (that uses {{FileSystem#listStatus():FileStatus[]}}) is because there's no (public) method in FileSystem that accepts a PathFilter and returns a RemoteIterator. Without that we would fail again at hasNext() when trying to iterate the results to filter out hidden files that were removed in the meantime. Directories being removed between finding them and listing their contents would mess with the current solution as well, but since we're able to filter out hidden files we're not listing the staging directories that are likely to be removed. After a quick look it seems to me that we're already using {{listStatusRecursively()}} to list files under a table [here|https://github.com/apache/hive/blob/cd08cd6d3103ed70a8da4cce7eaaad251eb2a12f/ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java#L662], so we would probably have come across the memory limit if it could be an issue in this case. Please let me know what you think! 
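The hidden-file filtering mentioned above relies on the convention that staging output lives under dot- or underscore-prefixed names (e.g. .hive-staging_hive_&lt;timestamp&gt;_...). A minimal stand-in for such a filter, written against plain java.io types rather than org.apache.hadoop.fs.Path, so the class name here is illustrative, not Hive's:

```java
import java.io.File;
import java.io.FileFilter;

// Rejects hidden entries: anything whose name starts with "." or "_".
// Staging directories like ".hive-staging_hive_..." start with a dot, so a
// listing restricted by this filter never descends into directories that a
// concurrent insert may delete mid-scan.
public class HiddenFileFilter implements FileFilter {
    @Override
    public boolean accept(File f) {
        String name = f.getName();
        return !name.startsWith(".") && !name.startsWith("_");
    }
}
```

Passing such a filter into every listing call is what keeps the recursion out of the staging directories that are likely to disappear while the check runs.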
[jira] [Commented] (HIVE-17947) Concurrent inserts might fail for ACID table since HIVE-17526 on branch-1
[ https://issues.apache.org/jira/browse/HIVE-17947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16227156#comment-16227156 ] Eugene Koifman commented on HIVE-17947: --- [~dvoros] since this checks all the files in a (bucketed) table, across all partitions, I think this can hit the 70K limit (16 buckets with hourly partitions gets there after 182 days worth of data: 16 * 24 * 182 ≈ 70,000 files). Perhaps the process can walk and check at the same time; that way it will at most have to keep the directories in memory. Alternatively, you could consider backporting HIVE-16688. Since this check is now done only once, when a table is marked transactional, and with HIVE-16688 it runs under an exclusive lock, there shouldn't be any change to the file tree while it runs, so iterating without a filter may be OK.
[jira] [Commented] (HIVE-17947) Concurrent inserts might fail for ACID table since HIVE-17526 on branch-1
[ https://issues.apache.org/jira/browse/HIVE-17947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16227055#comment-16227055 ] Ashutosh Chauhan commented on HIVE-17947: - {{RemoteIterator}} is the recommended one to use. The reason is that other similar methods return an array, which requires all FileStatus objects to be loaded into memory before being returned to the client. RemoteIterator, on the other hand, fetches FileStatus objects on demand from the NN and never needs to load all of them into memory at once. You only need to consider this if the number of files can be large; if it is guaranteed they can't be, then the above doesn't matter. IIRC, I once saw an OOM in a Tez AM while it loaded all file statuses to do split computation. If memory serves, there were ~70K file status objects there.
[jira] [Commented] (HIVE-17947) Concurrent inserts might fail for ACID table since HIVE-17526 on branch-1
[ https://issues.apache.org/jira/browse/HIVE-17947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226933#comment-16226933 ] Eugene Koifman commented on HIVE-17947: --- [~ashutoshc], I think you told me that not using {noformat} RemoteIterator iterator = fs.listFiles() {noformat} has some performance issues. Could you provide some context? Is {noformat} FileUtils.listStatusRecursively() {noformat} an adequate replacement? [~dvoros], aside from the above, one minor nit (technically from HIVE-17526): you already have the _Warehouse_ from _Warehouse wh = HiveMetaStore.HMSHandler.getWh();_
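The FileNotFoundException reported in this issue arises when a staging directory is deleted by a concurrent insert while the recursive listing is in progress. A minimal sketch of tolerating vanishing directories, using java.nio.file as a stand-in for the HDFS FileSystem API (the actual patch may handle this differently):

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Sketch of a recursive listing that tolerates directories deleted while
// the walk is in progress (analogous to staging dirs removed by concurrent
// inserts). Uses java.nio.file, not the HDFS FileSystem API.
public class TolerantWalk {
    static List<Path> listRecursive(Path root) throws IOException {
        List<Path> files = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(root)) {
            for (Path entry : entries) {
                if (Files.isDirectory(entry)) {
                    files.addAll(listRecursive(entry));
                } else {
                    files.add(entry);
                }
            }
        } catch (NoSuchFileException e) {
            // The directory vanished between discovery and listing; treat
            // it as empty instead of failing the whole operation.
        }
        return files;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("walk");
        Files.createFile(root.resolve("a.txt"));
        Path sub = Files.createDirectory(root.resolve("staging"));
        Files.createFile(sub.resolve("b.txt"));
        System.out.println(listRecursive(root).size());
    }
}
```

Skipping hidden staging directories (names starting with "." or "_") before descending into them would avoid the race entirely for this case, since those directories are transient by design.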