[jira] [Commented] (HIVE-13014) RetryingMetaStoreClient is retrying too aggresievley
[ https://issues.apache.org/jira/browse/HIVE-13014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15832678#comment-15832678 ]

Eugene Koifman commented on HIVE-13014:
---

1. I have not tested performance, but since annotations can't change at runtime, I would expect Java/the JIT to handle it.
2. It probably does, but I wasn't aiming to do a comprehensive analysis of all metastore calls - just the ACID-related ones. The title sounds too broad.

> RetryingMetaStoreClient is retrying too aggresievley
>
>
> Key: HIVE-13014
> URL: https://issues.apache.org/jira/browse/HIVE-13014
> Project: Hive
> Issue Type: Bug
> Components: Metastore, Transactions
> Affects Versions: 1.0.0
> Reporter: Eugene Koifman
> Assignee: Eugene Koifman
> Priority: Critical
> Attachments: HIVE-13014.01.patch, HIVE-13014.02.patch,
> HIVE-13014.03.patch, HIVE-13014.04.patch, HIVE-13014.05.patch,
> HIVE-13014.06.patch, HIVE-13014.07.patch
>
>
> Not all metastore operations are idempotent. For example, commit_txn()
> consists of
> 1. request from client to server
> 2. server action
> 3. ack to client
> If the network connection is broken after (or during) 2 but before 3 happens,
> RetryingMetaStoreClient will retry the operation, thus causing an attempt to
> commit the same txn twice (sometimes concurrently).
> The 2nd attempt is guaranteed to fail and thus return an error to the caller
> (which doesn't know the operation is being retried), while the first attempt
> has actually succeeded. Thus the caller thinks the commit failed and will likely
> attempt to redo the transaction - not what we want in most cases.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
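The three-step commit_txn() sequence in the issue description quoted above can be reproduced with a toy model. The sketch below is only an illustration under stated assumptions: `Server`, `LossyLink`, `commitWithBlindRetry` and the error text are hypothetical names invented here, not Hive classes or messages. It shows the server applying the commit, the ack being lost, and a blind retry turning a successful commit into a reported failure.

```java
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

public class LostAckDemo {
    /** Toy server (hypothetical): committing the same txn twice is an error. */
    static class Server {
        private final Set<Long> committed = new HashSet<>();

        void commitTxn(long txnId) {
            if (!committed.add(txnId)) {
                throw new IllegalStateException("txn " + txnId + " is already committed");
            }
        }

        boolean isCommitted(long txnId) {
            return committed.contains(txnId);
        }
    }

    /** Toy transport (hypothetical): applies the first request but drops its ack. */
    static class LossyLink {
        private final Server server;
        private boolean dropNextAck = true;

        LossyLink(Server server) {
            this.server = server;
        }

        void commitTxn(long txnId) throws IOException {
            server.commitTxn(txnId);          // step 2: server action succeeds
            if (dropNextAck) {                // step 3: the ack never reaches the client
                dropNextAck = false;
                throw new IOException("connection lost before ack");
            }
        }
    }

    /** Blind retry: re-sends once on IOException without asking whether that is safe. */
    static String commitWithBlindRetry(LossyLink link, long txnId) {
        try {
            link.commitTxn(txnId);
            return "committed";
        } catch (IOException first) {
            try {
                link.commitTxn(txnId);        // second attempt hits an already committed txn
                return "committed";
            } catch (Exception second) {
                return "failed: " + second.getMessage();
            }
        }
    }

    public static void main(String[] args) {
        Server server = new Server();
        String outcome = commitWithBlindRetry(new LossyLink(server), 42L);
        System.out.println(outcome + " / committed on server: " + server.isCommitted(42L));
    }
}
```

In this model the caller is told the commit failed even though the server holds the committed transaction, which is the mismatch the issue describes.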
[jira] [Commented] (HIVE-13014) RetryingMetaStoreClient is retrying too aggresievley
[ https://issues.apache.org/jira/browse/HIVE-13014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15832666#comment-15832666 ]

Alan Gates commented on HIVE-13014:
---

In general the patch looks fine. I have a couple of questions:
# What's the performance impact of looking up the annotations on the method every time through the retry handler? Is it enough that we should build a map of methods to retriability so that subsequent lookups become O(1)?
# Why does this not apply to other metastore operations, like create table? That would also seem to be a case where a first attempt that timed out but actually succeeded could be masked by a failed second attempt.
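The "map of methods to retriability" raised in the first question could look roughly like the sketch below. It is not taken from the patch: the `NoRetry` annotation, the `Client` interface and the `LOOKUPS` counter are hypothetical and exist only to show that the reflective annotation check runs once per method, with later calls served from a `ConcurrentHashMap`.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class RetriabilityCache {
    /** Hypothetical marker annotation: the call must not be re-sent. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface NoRetry {}

    /** Hypothetical stand-in for a metastore client interface. */
    interface Client {
        String getTable(String name);

        @NoRetry
        void commitTxn(long txnId);
    }

    /** Method -> "safe to retry?", filled lazily on first use. */
    private static final Map<Method, Boolean> CACHE = new ConcurrentHashMap<>();

    /** Counts reflective annotation checks, only to make the caching visible. */
    static final AtomicInteger LOOKUPS = new AtomicInteger();

    static boolean isRetriable(Method method) {
        return CACHE.computeIfAbsent(method, m -> {
            LOOKUPS.incrementAndGet();
            return !m.isAnnotationPresent(NoRetry.class);
        });
    }

    public static void main(String[] args) throws NoSuchMethodException {
        Method commit = Client.class.getMethod("commitTxn", long.class);
        Method getTable = Client.class.getMethod("getTable", String.class);
        for (int i = 0; i < 3; i++) {
            isRetriable(commit);
            isRetriable(getTable);
        }
        System.out.println("commitTxn retriable: " + isRetriable(commit));
        System.out.println("getTable retriable: " + isRetriable(getTable));
        System.out.println("reflective lookups: " + LOOKUPS.get());
    }
}
```

Whether such a cache is worth having depends on measurements that this thread does not provide; the sketch only shows the shape of the idea.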
[jira] [Commented] (HIVE-13014) RetryingMetaStoreClient is retrying too aggresievley
[ https://issues.apache.org/jira/browse/HIVE-13014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15832394#comment-15832394 ]

Eugene Koifman commented on HIVE-13014:
---

No related test failures.
[jira] [Commented] (HIVE-13014) RetryingMetaStoreClient is retrying too aggresievley
[ https://issues.apache.org/jira/browse/HIVE-13014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15832372#comment-15832372 ]

Hive QA commented on HIVE-13014:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12848366/HIVE-13014.07.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 10968 tests executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159)
org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[cascade_dbdrop] (batchId=226)
org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[generatehfiles_require_family_path] (batchId=226)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=135)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] (batchId=136)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[escape1] (batchId=139)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[escape2] (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[limit_pushdown3] (batchId=144)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part] (batchId=149)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3073/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3073/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3073/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12848366 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-13014) RetryingMetaStoreClient is retrying too aggresievley
[ https://issues.apache.org/jira/browse/HIVE-13014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15829978#comment-15829978 ]

Hive QA commented on HIVE-13014:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12848115/HIVE-13014.06.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 79 failed/errored test(s), 10963 tests executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=234)
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_joins] (batchId=218)
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_predicate_pushdown] (batchId=218)
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_single_sourced_multi_insert] (batchId=218)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[ctas] (batchId=230)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_dynamic_partitions] (batchId=230)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_table] (batchId=230)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_directory] (batchId=230)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions] (batchId=230)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_table] (batchId=230)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[write_final_output_blobstore] (batchId=230)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_table_stats] (batchId=48)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[analyze_tbl_part] (batchId=44)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input19] (batchId=79)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_overwrite_directory] (batchId=25)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample5] (batchId=52)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[serde_opencsv] (batchId=68)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[specialChar] (batchId=22)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_math_funcs] (batchId=19)
org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[dboutput] (batchId=221)
org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[fileformat_base64] (batchId=221)
org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[udf_row_sequence] (batchId=221)
org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[url_hook] (batchId=221)
org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[case_with_row_sequence] (batchId=224)
org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[invalid_row_sequence] (batchId=224)
org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[serde_regex] (batchId=224)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_partition_dynamic] (batchId=158)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_partition_static] (batchId=156)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_values] (batchId=157)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_unencrypted_tbl] (batchId=159)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_load_data_to_encrypted_tables] (batchId=157)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=157)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_select_read_only_encrypted_tbl] (batchId=159)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_select_read_only_unencrypted_tbl] (batchId=159)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_unencrypted_nonhdfs_external_tables] (batchId=157)
org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[cascade_dbdrop] (batchId=225)
org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[generatehfiles_require_family_path] (batchId=225)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters] (batchId=137)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=135)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] (batchId=136)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[escape1] (batchId=139)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[escape2] (batchId=154)
org.apache.hadoop.hive.cli.TestMiniL
[jira] [Commented] (HIVE-13014) RetryingMetaStoreClient is retrying too aggresievley
[ https://issues.apache.org/jira/browse/HIVE-13014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15829393#comment-15829393 ]

Hive QA commented on HIVE-13014:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12848115/HIVE-13014.06.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 81 failed/errored test(s), 10963 tests executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=234)
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_joins] (batchId=218)
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_predicate_pushdown] (batchId=218)
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_single_sourced_multi_insert] (batchId=218)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[ctas] (batchId=230)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_dynamic_partitions] (batchId=230)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_into_table] (batchId=230)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_directory] (batchId=230)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_dynamic_partitions] (batchId=230)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[insert_overwrite_table] (batchId=230)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[write_final_output_blobstore] (batchId=230)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_table_stats] (batchId=48)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[analyze_tbl_part] (batchId=44)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input19] (batchId=79)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_overwrite_directory] (batchId=25)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sample5] (batchId=52)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[serde_opencsv] (batchId=68)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[specialChar] (batchId=22)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_math_funcs] (batchId=19)
org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[dboutput] (batchId=221)
org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[fileformat_base64] (batchId=221)
org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[udf_row_sequence] (batchId=221)
org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver[url_hook] (batchId=221)
org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[case_with_row_sequence] (batchId=224)
org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[invalid_row_sequence] (batchId=224)
org.apache.hadoop.hive.cli.TestContribNegativeCliDriver.testCliDriver[serde_regex] (batchId=224)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_partition_dynamic] (batchId=158)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_partition_static] (batchId=156)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_values] (batchId=157)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_unencrypted_tbl] (batchId=159)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_load_data_to_encrypted_tables] (batchId=157)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=157)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_select_read_only_encrypted_tbl] (batchId=159)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_select_read_only_unencrypted_tbl] (batchId=159)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_unencrypted_nonhdfs_external_tables] (batchId=157)
org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[cascade_dbdrop] (batchId=225)
org.apache.hadoop.hive.cli.TestHBaseNegativeCliDriver.testCliDriver[generatehfiles_require_family_path] (batchId=225)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters] (batchId=137)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=135)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] (batchId=136)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[escape1] (batchId=139)
org.apache.hadoop.h
[jira] [Commented] (HIVE-13014) RetryingMetaStoreClient is retrying too aggresievley
[ https://issues.apache.org/jira/browse/HIVE-13014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15827561#comment-15827561 ]

Hive QA commented on HIVE-13014:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12847947/HIVE-13014.04.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 375 failed/errored test(s), 10824 tests executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=234)
org.apache.hadoop.hive.accumulo.TestAccumuloConnectionParameters.testMissingPassword (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloConnectionParameters.testMissingUserName (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testDropTableWithoutDeleteLeavesTableIntact (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testEmptyIteratorPushdownValue (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testExternalNonExistentTableFails (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testMissingColumnMappingFails (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testNonBooleanIteratorPushdownValue (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testNonExternalExistentTable (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testNonNullLocation (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testPreCreateTable (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testRollbackCreateTableDeletesExistentTable (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testRollbackCreateTableDoesntDeleteExternalExistentTable (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testRollbackCreateTableOnNonExistentTable (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testTableJobPropertiesCallsInputAndOutputMethods (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testTablePropertiesPassedToInputJobProperties (batchId=165)
org.apache.hadoop.hive.accumulo.TestAccumuloStorageHandler.testTablePropertiesPassedToOutputJobProperties (batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testISEIsPropagated (batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testISEIsPropagatedWithReflection (batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testTokenMerge (batchId=165)
org.apache.hadoop.hive.accumulo.TestHiveAccumuloHelper.testTokenToConfFromUser (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormat (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormatWithAuthorizations (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormatWithEmptyColumns (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureAccumuloInputFormatWithIterators (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testConfigureMockAccumuloInputFormat (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableInputFormat.testIteratorNotInSplitsCompensation (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableOutputFormat.testBasicConfiguration (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableOutputFormat.testMockInstance (batchId=165)
org.apache.hadoop.hive.accumulo.mr.TestHiveAccumuloTableOutputFormat.testSaslConfiguration (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testEmptyListRangeGeneratorOutput (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testManyRangesGeneratorOutput (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testNullRangeGeneratorOutput (batchId=165)
org.apache.hadoop.hive.accumulo.predicate.TestAccumuloPredicateHandler.testSingleRangeGeneratorOutput (batchId=165)
org.apache.hadoop.hive.accumulo.serde.TestAccumuloRowSerializer.testBufferResetBeforeUse (batchId=165)
org.apache.hadoop.hive.cli.TestCliDriverMethods.testThatCliDriverPrintsHeaderForCommandsWithSchema (batchId=166)
org.apache.hadoop.hive.cli.TestCliDriverMethods.testThatCliDriverPrintsNoHeaderForCommandsWithNoSchema (batchId=166)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part] (batchId=148)
org.apache.hadoop.hive.common.TestBlobStorageUtils.testValidAndInvalidFileSystems (batchId=240)
org.apache.hadoop.hive.io.TestHadoopFileStatus.testHadoopFileStatusAclEntries (batchId=190)
org.apache.hadoop.hive.llap.cache.TestIn
[jira] [Commented] (HIVE-13014) RetryingMetaStoreClient is retrying too aggresievley
[ https://issues.apache.org/jira/browse/HIVE-13014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15826572#comment-15826572 ]

Eugene Koifman commented on HIVE-13014:
---

The patch makes a few methods safe to retry (which they were not before) and annotates others to indicate their retry semantics.
The worst case to avoid is when the server-side op succeeds (and commits against the metastore RDBMS) but the remote caller doesn't know this and retries an op that cannot be retried.

[~alangates] could you review please?
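One way to picture "annotates others to indicate retry semantics" is a retrying proxy that checks a marker annotation before re-sending a call. The sketch below is an assumption-laden illustration, not the code in the attached patches: `NoRetry`, `Client`, `FlakyClient` and `wrap` are hypothetical names, and the real patch may use different annotations and a different handler.

```java
import java.io.IOException;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

public class RetrySketch {
    /** Hypothetical marker annotation: never re-send this call after a failure. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface NoRetry {}

    /** Hypothetical stand-in for a metastore client interface. */
    interface Client {
        String getTable(String name) throws Exception;

        @NoRetry
        void commitTxn(long txnId) throws Exception;
    }

    /** Test double: getTable fails once, commitTxn always fails. */
    static class FlakyClient implements Client {
        int getTableCalls = 0;
        int commitCalls = 0;

        public String getTable(String name) throws Exception {
            if (++getTableCalls == 1) {
                throw new IOException("connection reset");
            }
            return "table:" + name;
        }

        public void commitTxn(long txnId) throws Exception {
            commitCalls++;
            throw new IOException("connection reset");
        }
    }

    /** Wraps a client so that only methods without @NoRetry are attempted more than once. */
    static Client wrap(Client delegate, int maxAttempts) {
        InvocationHandler handler = (proxy, method, args) -> {
            boolean retriable = !method.isAnnotationPresent(NoRetry.class);
            int attempts = retriable ? Math.max(1, maxAttempts) : 1;
            Throwable last = null;
            for (int i = 0; i < attempts; i++) {
                try {
                    return method.invoke(delegate, args);
                } catch (InvocationTargetException e) {
                    last = e.getCause();   // unwrap the exception thrown by the delegate
                }
            }
            throw last;
        };
        return (Client) Proxy.newProxyInstance(
                Client.class.getClassLoader(), new Class<?>[] {Client.class}, handler);
    }

    public static void main(String[] args) throws Exception {
        FlakyClient flaky = new FlakyClient();
        Client client = wrap(flaky, 3);
        System.out.println(client.getTable("t") + " after " + flaky.getTableCalls + " calls");
        try {
            client.commitTxn(1L);
        } catch (IOException e) {
            System.out.println("commitTxn surfaced after " + flaky.commitCalls + " call: " + e.getMessage());
        }
    }
}
```

The sketch leaves out backoff, reconnects and exception classification, all of which a production retry handler would need; it only shows the annotation deciding whether a second attempt is made.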
[jira] [Commented] (HIVE-13014) RetryingMetaStoreClient is retrying too aggresievley
[ https://issues.apache.org/jira/browse/HIVE-13014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15308473#comment-15308473 ]

Jesus Camacho Rodriguez commented on HIVE-13014:

[~ekoifman], I am removing 2.1.0 as target because the RC will be created tomorrow. Please feel free to commit to branch-2.1 anyway and fix for 2.1.0 if this happens before the release, or let me know if this is a Blocker. Thanks
[jira] [Commented] (HIVE-13014) RetryingMetaStoreClient is retrying too aggresievley
[ https://issues.apache.org/jira/browse/HIVE-13014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15300499#comment-15300499 ]

Eugene Koifman commented on HIVE-13014:
---

[~jcamachorodriguez] it should if possible
[jira] [Commented] (HIVE-13014) RetryingMetaStoreClient is retrying too aggresievley
[ https://issues.apache.org/jira/browse/HIVE-13014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15299914#comment-15299914 ]

Jesus Camacho Rodriguez commented on HIVE-13014:

[~ekoifman], this is marked as Critical. Will it go into 2.1.0? Thanks