[ https://issues.apache.org/jira/browse/HIVE-23437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17105218#comment-17105218 ]
Marc Demierre edited comment on HIVE-23437 at 5/12/20, 8:25 AM:
----------------------------------------------------------------

We tried a workaround on the client side to ensure the calls are not simultaneous by delaying them. It didn't solve the issue, only made it rarer.

We also observed a second instance of the problem:
* T1:
** R1 creates the directory, then is paused/waiting
* T2:
** R2 arrives, does not create the directory as it already exists
** R2 creates the partition (wins the race on the DB) and completes
* T3:
** R1 resumes, sees that its DB transaction failed, and deletes the folder

Relevant logs (R1 = thread 2558, R2 = thread 2556):
{code:java}
2020-05-11 20:00:00,944 INFO [pool-7-thread-2558]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(775)) - 2558: append_partition_by_name: db=sbd_dev_autodump tbl=sbd_dev_debug_hive_4 part=time=ingestion/bucket=hourly/date=2020-05-11/hour=18
2020-05-11 20:00:00,945 INFO [pool-7-thread-2558]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(319)) - ugi=sbd-fh-connect-autodumph...@sbd.corproot.net ip=10.222.76.2 cmd=append_partition_by_name: db=sbd_dev_autodump tbl=sbd_dev_debug_hive_4 part=time=ingestion/bucket=hourly/date=2020-05-11/hour=18
2020-05-11 20:00:01,311 INFO [pool-7-thread-2556]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(775)) - 2556: append_partition_by_name: db=sbd_dev_autodump tbl=sbd_dev_debug_hive_4 part=time=ingestion/bucket=hourly/date=2020-05-11/hour=18
2020-05-11 20:00:01,311 INFO [pool-7-thread-2556]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(319)) - ugi=sbd-fh-connect-autodumph...@sbd.corproot.net ip=10.222.76.2 cmd=append_partition_by_name: db=sbd_dev_autodump tbl=sbd_dev_debug_hive_4 part=time=ingestion/bucket=hourly/date=2020-05-11/hour=18
2020-05-11 20:00:01,481 INFO [pool-7-thread-2558]: common.FileUtils (FileUtils.java:mkdir(573)) - Creating directory if it doesn't exist: hdfs://sbd/data/sbd/dev/sbd.dev.debug-hive-4/time=ingestion/bucket=hourly/date=2020-05-11/hour=18
2020-05-11 20:00:01,521 WARN [pool-7-thread-2556]: hive.log (MetaStoreUtils.java:updatePartitionStatsFast(352)) - Updating partition stats fast for: sbd_dev_debug_hive_4
2020-05-11 20:00:01,537 WARN [pool-7-thread-2556]: hive.log (MetaStoreUtils.java:updatePartitionStatsFast(355)) - Updated size to 0
2020-05-11 20:00:01,764 INFO [pool-7-thread-2558]: metastore.hivemetastoressimpl (HiveMetaStoreFsImpl.java:deleteDir(41)) - deleting hdfs://sbd/data/sbd/dev/sbd.dev.debug-hive-4/time=ingestion/bucket=hourly/date=2020-05-11/hour=18
2020-05-11 20:00:01,787 INFO [pool-7-thread-2558]: fs.TrashPolicyDefault (TrashPolicyDefault.java:moveToTrash(168)) - Moved: 'hdfs://sbd/data/sbd/dev/sbd.dev.debug-hive-4/time=ingestion/bucket=hourly/date=2020-05-11/hour=18' to trash at: hdfs://sbd/user/sbd-fh-connect-autodumphdfs/.Trash/Current/data/sbd/dev/sbd.dev.debug-hive-4/time=ingestion/bucket=hourly/date=2020-05-11/hour=18
2020-05-11 20:00:01,787 INFO [pool-7-thread-2558]: metastore.hivemetastoressimpl (HiveMetaStoreFsImpl.java:deleteDir(48)) - Moved to trash: hdfs://sbd/data/sbd/dev/sbd.dev.debug-hive-4/time=ingestion/bucket=hourly/date=2020-05-11/hour=18
2020-05-11 20:00:01,788 ERROR [pool-7-thread-2558]: metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(217)) - Retrying HMSHandler after 2000 ms (attempt 1 of 10) with error: javax.jdo.JDODataStoreException: Insert of object "org.apache.hadoop.hive.metastore.model.MPartition@3254e57d" using statement "INSERT INTO "PARTITIONS" ("PART_ID","CREATE_TIME","LAST_ACCESS_TIME","PART_NAME","SD_ID","TBL_ID") VALUES (?,?,?,?,?,?)" failed : ERROR: duplicate key value violates unique constraint "UNIQUEPARTITION"
2020-05-11 20:00:03,788 INFO [pool-7-thread-2558]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(775)) - 2558: append_partition_by_name: db=sbd_dev_autodump tbl=sbd_dev_debug_hive_4 part=time=ingestion/bucket=hourly/date=2020-05-11/hour=18
2020-05-11 20:00:03,788 INFO [pool-7-thread-2558]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(319)) - ugi=sbd-fh-connect-autodumph...@sbd.corproot.net ip=10.222.76.2 cmd=append_partition_by_name: db=sbd_dev_autodump tbl=sbd_dev_debug_hive_4 part=time=ingestion/bucket=hourly/date=2020-05-11/hour=18
2020-05-11 20:00:03,869 ERROR [pool-7-thread-2558]: metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(203)) - AlreadyExistsException(message:Partition already exists:Partition(values:[ingestion, hourly, 2020-05-11, 18], dbName:sbd_dev_autodump, tableName:sbd_dev_debug_hive_4, createTime:0, lastAccessTime:0, sd:StorageDescriptor(cols:[FieldSchema(name:name, type:string, comment:null), FieldSchema(name:age, type:int, comment:null), FieldSchema(name:salary, type:int, comment:null)], location:hdfs://sbd/data/sbd/dev/sbd.dev.debug-hive-4/time=ingestion/bucket=hourly/date=2020-05-11/hour=18, inputFormat:org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.avro.AvroSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), parameters:null))
{code}
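Both interleavings go through the same create-directory / insert-row / delete-on-rollback sequence in {{HiveMetaStore.append_partition_common()}}. The following is a condensed paraphrase of that logic, not a verbatim excerpt (details trimmed, variable names approximate; see the source links in the quoted issue description below):

{code:java}
// Condensed paraphrase of HiveMetaStore.append_partition_common()
// (Hive 2.1.0 / 3.1.2); details trimmed, see the linked sources.
boolean success = false, madeDir = false;
try {
  ms.openTransaction();

  // Existence pre-check: concurrent requests can both pass it, because
  // neither insert has committed when the other one checks.
  Partition oldPart;
  try {
    oldPart = ms.getPartition(part.getDbName(), part.getTableName(), part.getValues());
  } catch (NoSuchObjectException e) {
    oldPart = null; // no existing partition
  }
  if (oldPart != null) {
    // The automatic retry lands here (the AlreadyExistsException at
    // 20:00:03,869 above): by then the winner's row is committed.
    throw new AlreadyExistsException("Partition already exists:" + part);
  }

  if (!wh.isDir(partLocation)) {
    // Both requests can pass the isDir() check, and HDFS mkdirs() also
    // succeeds when the directory already exists, so both requests can
    // end up believing they created it (madeDir == true).
    if (!wh.mkdirs(partLocation, true)) {
      throw new MetaException(partLocation + " is not a directory or unable to create one");
    }
    madeDir = true;
  }

  // Only one request can win this insert; the other hits the
  // UNIQUEPARTITION constraint and falls into the finally block.
  success = ms.addPartition(part) && ms.commitTransaction();
} finally {
  if (!success) {
    ms.rollbackTransaction();
    if (madeDir) {
      // The loser deletes the directory that the winner's committed
      // partition now points to -- the invalid state described here.
      wh.deleteDir(partLocation, true);
    }
  }
}
{code}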
> Concurrent partition creation requests cause underlying HDFS folder to be deleted
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-23437
>                 URL: https://issues.apache.org/jira/browse/HIVE-23437
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2, Standalone Metastore
>    Affects Versions: 2.1.0, 3.1.2
>         Environment: * Hive version: 1.2.1000.2.6.5.0-292 (HDP 2.6.5)
> * HDFS version: 2.7.3.2.6.5.0-292 (HDP 2.6.5)
>            Reporter: Marc Demierre
>            Priority: Major
>
> There seems to be a race condition in the Hive Metastore when issuing several concurrent partition creation requests for the same new partition.
> In our case, this was triggered by the Kafka Connect Hive integration, which fires simultaneous partition creation requests from all its tasks when syncing to Hive.
> We are running HDP 2.6.5, but a quick survey of the upstream code shows the same logic is still present in 3.1.2 (the latest Hive release).
> Our investigation pointed to the following code (here in Hive 2.1.0, the base for HDP 2.6.5):
> [https://github.com/apache/hive/blob/rel/release-2.1.0/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L2127]
> Same code in 3.1.2:
> https://github.com/apache/hive/blob/rel/release-3.1.2/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L3202
> The generic scenario is the following (the roles of R1 and R2 could be swapped):
> # T1 (time period 1):
> ** R1 (request 1) creates the HDFS dir
> ** R2 also tries creating the HDFS dir
> ** Both calls succeed, because creating a directory that already exists succeeds
> # T2:
> ** R1 creates the partition in the metastore DB, all OK
> # T3:
> ** R2 tries to create the partition in the metastore DB, gets an exception from the DB because it already exists, and rolls back its transaction
> ** R2 thinks it created the directory (in fact both requests did, and we cannot tell which one actually won), so it removes it
> # T4: The state is now invalid:
> ## The partition exists in the metastore
> ## The HDFS folder does not exist
> ## Some Hive/Spark queries fail when trying to use the folder
> Here are some logs of the issue happening on our cluster in a standalone metastore (R1 = thread 2303, R2 = thread 2302):
> {code:java}
> 2020-05-11 13:43:46,379 INFO [pool-7-thread-2303]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(775)) - 2303: append_partition_by_name: db=sbd_dev_autodump tbl=sbd_dev_debug_hive_4 part=time=ingestion/bucket=hourly/date=2020-05-11/hour=11
> 2020-05-11 13:43:46,379 INFO [pool-7-thread-2302]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(775)) - 2302: append_partition_by_name: db=sbd_dev_autodump tbl=sbd_dev_debug_hive_4 part=time=ingestion/bucket=hourly/date=2020-05-11/hour=11
> 2020-05-11 13:43:46,379 INFO [pool-7-thread-2303]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(319)) - ugi=sbd-fh-connect-autodumph...@sbd.corproot.net ip=10.222.76.1 cmd=append_partition_by_name: db=sbd_dev_autodump tbl=sbd_dev_debug_hive_4 part=time=ingestion/bucket=hourly/date=2020-05-11/hour=11
> 2020-05-11 13:43:46,379 INFO [pool-7-thread-2302]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(319)) - ugi=sbd-fh-connect-autodumph...@sbd.corproot.net ip=10.222.76.1 cmd=append_partition_by_name: db=sbd_dev_autodump tbl=sbd_dev_debug_hive_4 part=time=ingestion/bucket=hourly/date=2020-05-11/hour=11
> 2020-05-11 13:43:47,953 INFO [pool-7-thread-2302]: common.FileUtils (FileUtils.java:mkdir(573)) - Creating directory if it doesn't exist: hdfs://sbd/data/sbd/dev/sbd.dev.debug-hive-4/time=ingestion/bucket=hourly/date=2020-05-11/hour=11
> 2020-05-11 13:43:47,957 INFO [pool-7-thread-2303]: common.FileUtils (FileUtils.java:mkdir(573)) - Creating directory if it doesn't exist: hdfs://sbd/data/sbd/dev/sbd.dev.debug-hive-4/time=ingestion/bucket=hourly/date=2020-05-11/hour=11
> 2020-05-11 13:43:47,986 INFO [pool-7-thread-2302]: metastore.hivemetastoressimpl (HiveMetaStoreFsImpl.java:deleteDir(41)) - deleting hdfs://sbd/data/sbd/dev/sbd.dev.debug-hive-4/time=ingestion/bucket=hourly/date=2020-05-11/hour=11
> 2020-05-11 13:43:47,992 INFO [pool-7-thread-2302]: fs.TrashPolicyDefault (TrashPolicyDefault.java:moveToTrash(168)) - Moved: 'hdfs://sbd/data/sbd/dev/sbd.dev.debug-hive-4/time=ingestion/bucket=hourly/date=2020-05-11/hour=11' to trash at: hdfs://sbd/user/sbd-fh-connect-autodumphdfs/.Trash/Current/data/sbd/dev/sbd.dev.debug-hive-4/time=ingestion/bucket=hourly/date=2020-05-11/hour=11
> 2020-05-11 13:43:47,993 INFO [pool-7-thread-2302]: metastore.hivemetastoressimpl (HiveMetaStoreFsImpl.java:deleteDir(48)) - Moved to trash: hdfs://sbd/data/sbd/dev/sbd.dev.debug-hive-4/time=ingestion/bucket=hourly/date=2020-05-11/hour=11
> 2020-05-11 13:43:47,993 ERROR [pool-7-thread-2302]: metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(217)) - Retrying HMSHandler after 2000 ms (attempt 1 of 10) with error: javax.jdo.JDODataStoreException: Insert of object "org.apache.hadoop.hive.metastore.model.MPartition@548a5b6c" using statement "INSERT INTO "PARTITIONS" ("PART_ID","CREATE_TIME","LAST_ACCESS_TIME","PART_NAME","SD_ID","TBL_ID") VALUES (?,?,?,?,?,?)" failed : ERROR: duplicate key value violates unique constraint "UNIQUEPARTITION"
>   Detail: Key ("PART_NAME", "TBL_ID")=(time=ingestion/bucket=hourly/date=2020-05-11/hour=11, 6015512) already exists.
>     at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543)
>     at org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:720)
>     at org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:740)
>     at org.apache.hadoop.hive.metastore.ObjectStore.addPartition(ObjectStore.java:1626)
>     at sun.reflect.GeneratedMethodAccessor255.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:103)
>     at com.sun.proxy.$Proxy11.addPartition(Unknown Source)
>     at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.append_partition_common(HiveMetaStore.java:2247)
>     at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.append_partition_by_name_with_environment_context(HiveMetaStore.java:4294)
>     at sun.reflect.GeneratedMethodAccessor214.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
>     at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
>     at com.sun.proxy.$Proxy13.append_partition_by_name_with_environment_context(Unknown Source)
>     at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$append_partition_by_name_with_environment_context.getResult(ThriftHiveMetastore.java:10192)
>     at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$append_partition_by_name_with_environment_context.getResult(ThriftHiveMetastore.java:10176)
>     at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>     at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>     at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:551)
>     at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:546)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
>     at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:546)
>     at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> NestedThrowablesStackTrace:
> org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "UNIQUEPARTITION"
>   Detail: Key ("PART_NAME", "TBL_ID")=(time=ingestion/bucket=hourly/date=2020-05-11/hour=11, 6015512) already exists.
>     at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2440)
>     at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2183)
>     at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:308)
>     at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:441)
>     at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:365)
>     at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:150)
>     at org.postgresql.jdbc.PgPreparedStatement.executeUpdate(PgPreparedStatement.java:127)
>     at com.jolbox.bonecp.PreparedStatementHandle.executeUpdate(PreparedStatementHandle.java:205)
>     at org.datanucleus.store.rdbms.ParamLoggingPreparedStatement.executeUpdate(ParamLoggingPreparedStatement.java:393)
>     at org.datanucleus.store.rdbms.SQLController.executeStatementUpdate(SQLController.java:431)
>     at org.datanucleus.store.rdbms.request.InsertRequest.execute(InsertRequest.java:413)
>     at org.datanucleus.store.rdbms.RDBMSPersistenceHandler.insertObjectInTable(RDBMSPersistenceHandler.java:162)
>     at org.datanucleus.store.rdbms.RDBMSPersistenceHandler.insertObject(RDBMSPersistenceHandler.java:138)
>     at org.datanucleus.state.StateManagerImpl.internalMakePersistent(StateManagerImpl.java:3363)
>     at org.datanucleus.state.StateManagerImpl.makePersistent(StateManagerImpl.java:3339)
>     at org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2079)
>     at org.datanucleus.ExecutionContextImpl.persistObjectWork(ExecutionContextImpl.java:1922)
>     at org.datanucleus.ExecutionContextImpl.persistObject(ExecutionContextImpl.java:1777)
>     at org.datanucleus.ExecutionContextThreadedImpl.persistObject(ExecutionContextThreadedImpl.java:217)
>     at org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:715)
>     at org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:740)
>     at org.apache.hadoop.hive.metastore.ObjectStore.addPartition(ObjectStore.java:1626)
>     at sun.reflect.GeneratedMethodAccessor255.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:103)
>     at com.sun.proxy.$Proxy11.addPartition(Unknown Source)
>     at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.append_partition_common(HiveMetaStore.java:2247)
>     at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.append_partition_by_name_with_environment_context(HiveMetaStore.java:4294)
>     at sun.reflect.GeneratedMethodAccessor214.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
>     at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
>     at com.sun.proxy.$Proxy13.append_partition_by_name_with_environment_context(Unknown Source)
>     at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$append_partition_by_name_with_environment_context.getResult(ThriftHiveMetastore.java:10192)
>     at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$append_partition_by_name_with_environment_context.getResult(ThriftHiveMetastore.java:10176)
>     at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>     at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>     at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:551)
>     at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:546)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
>     at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:546)
>     at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> 2020-05-11 13:43:49,994 INFO [pool-7-thread-2302]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(775)) - 2302: append_partition_by_name: db=sbd_dev_autodump tbl=sbd_dev_debug_hive_4 part=time=ingestion/bucket=hourly/date=2020-05-11/hour=11
> 2020-05-11 13:43:49,994 INFO [pool-7-thread-2302]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(319)) - ugi=sbd-fh-connect-autodumph...@sbd.corproot.net ip=10.222.76.1 cmd=append_partition_by_name: db=sbd_dev_autodump tbl=sbd_dev_debug_hive_4 part=time=ingestion/bucket=hourly/date=2020-05-11/hour=11
> 2020-05-11 13:43:50,039 ERROR [pool-7-thread-2302]: metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(203)) - AlreadyExistsException(message:Partition already exists:Partition(values:[ingestion, hourly, 2020-05-11, 11], dbName:sbd_dev_autodump, tableName:sbd_dev_debug_hive_4, createTime:0, lastAccessTime:0, sd:StorageDescriptor(cols:[FieldSchema(name:name, type:string, comment:null), FieldSchema(name:age, type:int, comment:null), FieldSchema(name:salary, type:int, comment:null)], location:hdfs://sbd/data/sbd/dev/sbd.dev.debug-hive-4/time=ingestion/bucket=hourly/date=2020-05-11/hour=11, inputFormat:org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.avro.AvroSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), parameters:null))
>     at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.append_partition_common(HiveMetaStore.java:2226)
>     at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.append_partition_by_name_with_environment_context(HiveMetaStore.java:4294)
>     at sun.reflect.GeneratedMethodAccessor214.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
>     at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
>     at com.sun.proxy.$Proxy13.append_partition_by_name_with_environment_context(Unknown Source)
>     at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$append_partition_by_name_with_environment_context.getResult(ThriftHiveMetastore.java:10192)
>     at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$append_partition_by_name_with_environment_context.getResult(ThriftHiveMetastore.java:10176)
>     at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>     at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>     at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:551)
>     at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:546)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
>     at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:546)
>     at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> {code}
>
> We're not sure of the best way to solve it:
> - The obvious fix would be to not delete the folder on rollback, but we do not know whether that would have side effects (other than leaving a useless folder behind in some cases).
> - Another way would be to consider the request successful if the partition already exists, and in that case not delete the folder.
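For illustration, here is a minimal sketch of the second option, reusing the approximate names from the paraphrase earlier in this thread; {{isDuplicatePartitionError()}} is a hypothetical helper for detecting the unique-constraint violation, not an existing Hive API:

{code:java}
// Hypothetical sketch of option 2 (not existing Hive code): treat losing
// the DB race as success, and never delete a directory that a committed
// partition may point to.
try {
  success = ms.addPartition(part) && ms.commitTransaction();
} catch (Exception e) {
  ms.rollbackTransaction();
  if (isDuplicatePartitionError(e)) { // hypothetical helper
    // A concurrent request already registered this partition and its
    // metadata points at partLocation, so keep the directory and return
    // the winner's partition instead of failing.
    return ms.getPartition(part.getDbName(), part.getTableName(), part.getValues());
  }
  if (madeDir) {
    // Genuine failure: no committed partition references the directory,
    // so cleaning it up is safe.
    wh.deleteDir(partLocation, true);
  }
  throw e;
}
{code}

The first option (never deleting on rollback) is simpler and also sidesteps the ambiguity of mkdirs() succeeding for an existing directory, at the cost of occasionally leaving an empty folder behind.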