[jira] [Commented] (IMPALA-13143) TestCatalogdHA.test_catalogd_failover_with_sync_ddl times out expecting query failure
[ https://issues.apache.org/jira/browse/IMPALA-13143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17853328#comment-17853328 ] ASF subversion and git services commented on IMPALA-13143: -- Commit bafd1903069163f38812d7fa42f9c4d2f7218fcf in impala's branch refs/heads/master from wzhou-code [ https://gitbox.apache.org/repos/asf?p=impala.git;h=bafd19030 ] IMPALA-13143: Fix flaky test_catalogd_failover_with_sync_ddl The test_catalogd_failover_with_sync_ddl test, which was added to custom_cluster/test_catalogd_ha.py in IMPALA-13134, failed on s3. The test relies on specific timing, with a sleep injected via a debug action so that the DDL query is still running when catalogd failover is triggered. The failures were caused by catalogd restarting slowly on s3, so the query finished before catalogd failover was triggered. This patch fixes the issue by increasing the sleep time for s3 builds and other slow builds. Testing: - Ran the test 100 times in a loop on s3.
Change-Id: I15bb6aae23a2f544067f993533e322969372ebd5 Reviewed-on: http://gerrit.cloudera.org:8080/21491 Reviewed-by: Riza Suminto Tested-by: Impala Public Jenkins > TestCatalogdHA.test_catalogd_failover_with_sync_ddl times out expecting query > failure > - > > Key: IMPALA-13143 > URL: https://issues.apache.org/jira/browse/IMPALA-13143 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 4.5.0 >Reporter: Joe McDonnell >Assignee: Wenzhe Zhou >Priority: Critical > Labels: broken-build, flaky > > The new TestCatalogdHA.test_catalogd_failover_with_sync_ddl test is failing > intermittently with: > {noformat} > custom_cluster/test_catalogd_ha.py:472: in > test_catalogd_failover_with_sync_ddl > self.wait_for_state(handle, QueryState.EXCEPTION, 30, client=client) > common/impala_test_suite.py:1216: in wait_for_state > self.wait_for_any_state(handle, [expected_state], timeout, client) > common/impala_test_suite.py:1234: in wait_for_any_state > raise Timeout(timeout_msg) > E Timeout: query '9d49ab6360f6cbc5:4826a796' did not reach one of > the expected states [5], last known state 4{noformat} > This means the query succeeded even though we expected it to fail. This is > currently limited to s3 jobs. In a different test, we saw issues because s3 > is slower (see IMPALA-12616). > This test was introduced by IMPALA-13134: > https://github.com/apache/impala/commit/70b7b6a78d49c30933d79e0a1c2a725f7e0a3e50 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
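The timing fix above amounts to scaling the injected sleep with how slow the target storage is. A minimal shell sketch of that idea; the filesystem names and durations are illustrative assumptions, not the values actually used in test_catalogd_ha.py:

```shell
# Illustrative sketch: pick a longer debug-action sleep on slow storage
# (s3 and similar object stores) so the DDL query is still running when
# catalogd failover is triggered. Names and durations are assumptions.
ddl_sleep_ms() {
  case "$1" in
    s3|abfs|gcs) echo 10000 ;;  # slow object stores: give failover more time
    *)           echo 3000  ;;  # local HDFS builds: default sleep
  esac
}
```

The key property is simply that the sleep on slow builds exceeds the worst-case catalogd restart time, so the failover always races ahead of the query.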
[jira] [Commented] (IMPALA-13134) DDL hang with SYNC_DDL enabled when Catalogd is changed to standby status
[ https://issues.apache.org/jira/browse/IMPALA-13134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17853329#comment-17853329 ] ASF subversion and git services commented on IMPALA-13134: -- Commit bafd1903069163f38812d7fa42f9c4d2f7218fcf in impala's branch refs/heads/master from wzhou-code [ https://gitbox.apache.org/repos/asf?p=impala.git;h=bafd19030 ] IMPALA-13143: Fix flaky test_catalogd_failover_with_sync_ddl Change-Id: I15bb6aae23a2f544067f993533e322969372ebd5 Reviewed-on: http://gerrit.cloudera.org:8080/21491 Reviewed-by: Riza Suminto Tested-by: Impala Public Jenkins > DDL hang with SYNC_DDL enabled when Catalogd is changed to standby status > - > > Key: IMPALA-13134 > URL: https://issues.apache.org/jira/browse/IMPALA-13134 > Project: IMPALA > Issue Type: Bug > Components: Backend, Catalog >Reporter: Wenzhe Zhou >Assignee: Wenzhe Zhou >Priority: Major > Fix For: Impala 4.5.0 > > > Catalogd waits for the SYNC_DDL version when it processes a DDL with SYNC_DDL > enabled. If the status of Catalogd is changed from active to standby while > CatalogServiceCatalog.waitForSyncDdlVersion() is running, the standby catalogd > does not receive catalog topic updates from the statestore. This causes the > catalogd thread to wait indefinitely and the DDL query to hang.
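The general shape of the remedy is a wait that re-checks the catalogd's role instead of blocking forever. A hedged shell sketch with stub status checks standing in for the real CatalogServiceCatalog state (all names and the timeout are illustrative, not the actual fix):

```shell
# Illustrative stubs standing in for real catalogd state checks.
is_active() { [ "$CATALOGD_ROLE" = "active" ]; }
ddl_version_reached() { false; }  # pretend the SYNC_DDL version never arrives

# Bounded wait that bails out instead of hanging when catalogd loses
# active status (a standby catalogd stops receiving topic updates, so
# the awaited version can never arrive).
wait_for_sync_ddl() {
  local timeout_s="$1" waited=0
  until ddl_version_reached; do
    is_active || return 1            # became standby: abort the wait
    [ "$waited" -ge "$timeout_s" ] && return 2   # plain timeout
    sleep 1; waited=$((waited + 1))
  done
}
```

Either exit path lets the DDL query fail with an error rather than hang indefinitely.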
[jira] [Assigned] (IMPALA-13146) Javascript tests sometimes fail to download NodeJS
[ https://issues.apache.org/jira/browse/IMPALA-13146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe McDonnell reassigned IMPALA-13146: -- Assignee: Joe McDonnell > Javascript tests sometimes fail to download NodeJS > -- > > Key: IMPALA-13146 > URL: https://issues.apache.org/jira/browse/IMPALA-13146 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 4.5.0 >Reporter: Joe McDonnell >Assignee: Joe McDonnell >Priority: Critical > Labels: broken-build, flaky > > For automated tests, sometimes the Javascript tests fail to download NodeJS: > {noformat} > 01:37:16 Fetching NodeJS v16.20.2-linux-x64 binaries ... > 01:37:16 % Total % Received % Xferd Average Speed Time Time Time Current > 01:37:16 Dload Upload Total Spent Left Speed > 01:37:16 > 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0 > 0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0 > 0 0 0 0 0 0 0 0 --:--:-- 0:00:02 --:--:-- 0 > 0 21.5M 0 902 0 0 293 0 21:23:04 0:00:03 21:23:01 293 > ... > 30 21.5M 30 6776k 0 0 50307 0 0:07:28 0:02:17 0:05:11 23826 > 01:39:34 curl: (18) transfer closed with 15617860 bytes remaining to read{noformat} > If this keeps happening, we should mirror the NodeJS binary on the > native-toolchain s3 bucket.
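Until the binary is mirrored, transient failures like the curl error above are usually handled by retrying (curl's own --retry/--retry-all-errors options are one route). A hedged sketch of a generic retry helper that a fetch script could wrap around the download; all names and counts are illustrative:

```shell
# Retry a command up to $1 times; returns 0 on the first success, 1 if
# every attempt fails. No delay between attempts here for brevity; a real
# fetch script would back off (or just use curl's --retry options).
retry() {
  local attempts="$1"; shift
  local i=1
  until "$@"; do
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
  done
}

# Illustrative usage (URL variable is hypothetical):
#   retry 5 curl -fL -o node.tar.xz "$NODEJS_URL"
```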
[jira] [Created] (IMPALA-13147) Add support for limiting the concurrency of link jobs
Joe McDonnell created IMPALA-13147: -- Summary: Add support for limiting the concurrency of link jobs Key: IMPALA-13147 URL: https://issues.apache.org/jira/browse/IMPALA-13147 Project: IMPALA Issue Type: Improvement Components: Infrastructure Affects Versions: Impala 4.5.0 Reporter: Joe McDonnell Link jobs can use a lot of memory due to the amount of debug info. The level of concurrency that is useful for compilation can be too high for linking. Running a link-heavy command like buildall.sh -skiptests can run out of memory from linking all of the backend tests / benchmarks. It would be useful to be able to limit the number of concurrent link jobs. There are two basic approaches: When using the ninja generator for CMake, ninja supports having job pools with limited parallelism. CMake has support for mapping link tasks to their own pool. Here is an example: {noformat} set(CMAKE_JOB_POOLS compilation_pool=24 link_pool=8) set(CMAKE_JOB_POOL_COMPILE compilation_pool) set(CMAKE_JOB_POOL_LINK link_pool){noformat} The makefile generator does not have equivalent functionality, but we could do a more limited version where buildall.sh can split the -skiptests build into two make invocations. The first does all the compilation with full parallelism (equivalent to -notests), and then the second make invocation builds the backend tests / benchmarks with reduced parallelism.
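For the makefile-generator fallback described above, the reduced link parallelism could also be derived from available memory rather than hard-coded. A hedged sketch, assuming roughly 8 GB per link job (the per-job figure and the commented target names are assumptions, not measured values or real build targets):

```shell
# Derive a link-job count from available memory, assuming each link of a
# debug-info-heavy binary needs roughly 8 GB. The per-job figure is an
# assumption for illustration only.
link_jobs() {
  local mem_gb="$1" per_job_gb=8
  local n=$(( mem_gb / per_job_gb ))
  [ "$n" -lt 1 ] && n=1   # always allow at least one link job
  echo "$n"
}

# Two-phase build sketch (illustrative target names):
#   make -j"$(nproc)" notests        # compile everything at full parallelism
#   make -j"$(link_jobs 64)" tests   # link-heavy targets at reduced parallelism
```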
[jira] [Commented] (IMPALA-13096) Cleanup Parser.jj for Calcite planner to only use supported syntax
[ https://issues.apache.org/jira/browse/IMPALA-13096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17853235#comment-17853235 ] ASF subversion and git services commented on IMPALA-13096: -- Commit 141f38197be2ca23757cb8b3f283cdb9dd62de47 in impala's branch refs/heads/master from Steve Carlin [ https://gitbox.apache.org/repos/asf?p=impala.git;h=141f38197 ] IMPALA-12935: First pass on Calcite planner functions This commit handles the first pass on getting functions to work through the Calcite planner. Only basic functions will work with this commit. Implicit conversions for parameters are not yet supported. Custom UDFs are also not supported yet. The ImpalaOperatorTable is used at validation time to check for existence of the function name for Impala. At first, it will check Calcite operators for the existence of the function name (a TODO, IMPALA-13096, is that we need to remove non-supported names from the parser file). It is preferable to use the Calcite Operator since Calcite does some optimizations based on the Calcite Operator class. If the name is not found within the Calcite Operators, a check is done within the BuiltinsDb (TODO: IMPALA-13095 handle UDFs) for the function. If found, an SqlOperator class is generated on the fly to handle this function. The validation process for Calcite includes a call into the operator method "inferReturnType". This method will validate that there exists a function that will handle the operands, and if so, return the "return type" of the function. In this commit, we will assume that the Calcite operators will match Impala functionality. In later commits, there will be overrides where we will use Impala validation for operators where Calcite's validation isn't good enough. After validation is complete, the functions will be in a Calcite format.
After the rest of compilation (relnode conversion, optimization) is complete, the function needs to be converted back into Impala form (the Expr object) to eventually get it into its thrift request. In this commit, all functions are converted into Expr starting in the ImpalaProjectRel, since this is the RelNode where functions do their thing. The RexCallConverter and RexLiteralConverter get called via the CreateExprVisitor for this conversion. Since Calcite is providing the analysis portion of the planning, there is no need to go through Impala's Analyzer object. However, the Impala planner requires Expr objects to be analyzed. To get around this, the AnalyzedFunctionCallExpr and AnalyzedNullLiteral objects exist, which analyze the expression in the constructor. While this could potentially be combined with the existing FunctionCallExpr and NullLiteral objects, this fits in with the general plan to avoid changing "fe" Impala code as much as we can until much later in the commit cycle. Also, there will be other Analyzed*Expr classes created in the future, but this commit is intended for basic function call expressions only. One minor change to the parser is added with this commit. The Calcite parser does not recognize the "string" datatype, so this has been added in Parser.jj and config.fmpp. Change-Id: I2dd4e402d69ee10547abeeafe893164ffd789b88 Reviewed-on: http://gerrit.cloudera.org:8080/21357 Reviewed-by: Michael Smith Tested-by: Impala Public Jenkins > Cleanup Parser.jj for Calcite planner to only use supported syntax > -- > > Key: IMPALA-13096 > URL: https://issues.apache.org/jira/browse/IMPALA-13096 > Project: IMPALA > Issue Type: Sub-task >Reporter: Steve Carlin >Priority: Major >
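The commit message above describes a two-stage lookup: the Calcite operator table is consulted first (preferred, since Calcite can optimize around its own operator classes), with BuiltinsDb as a fallback. A toy shell sketch of that resolution order; the name lists are illustrative stand-ins, not the real operator tables:

```shell
# Toy stand-ins for the Calcite operator table and Impala's BuiltinsDb.
CALCITE_OPS="upper lower abs"
BUILTINS="decode nvl"

# Resolve a function name in the order the commit describes:
# Calcite operators first, then BuiltinsDb, else unknown.
resolve_fn() {
  local name="$1"
  case " $CALCITE_OPS " in *" $name "*) echo calcite; return ;; esac
  case " $BUILTINS " in *" $name "*) echo builtinsdb; return ;; esac
  echo unknown   # would surface as a "function not found" validation error
}
```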
[jira] [Commented] (IMPALA-13095) Handle UDFs in Calcite planner
[ https://issues.apache.org/jira/browse/IMPALA-13095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17853236#comment-17853236 ] ASF subversion and git services commented on IMPALA-13095: -- Commit 141f38197be2ca23757cb8b3f283cdb9dd62de47 in impala's branch refs/heads/master from Steve Carlin [ https://gitbox.apache.org/repos/asf?p=impala.git;h=141f38197 ] IMPALA-12935: First pass on Calcite planner functions Change-Id: I2dd4e402d69ee10547abeeafe893164ffd789b88 Reviewed-on: http://gerrit.cloudera.org:8080/21357 Reviewed-by: Michael Smith Tested-by: Impala Public Jenkins > Handle UDFs in Calcite planner > -- > > Key: IMPALA-13095 > URL: https://issues.apache.org/jira/browse/IMPALA-13095 > Project: IMPALA > Issue Type: Sub-task >Reporter: Steve Carlin >Priority: Major >
[jira] [Commented] (IMPALA-12935) Allow function parsing for Impala Calcite planner
[ https://issues.apache.org/jira/browse/IMPALA-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17853234#comment-17853234 ] ASF subversion and git services commented on IMPALA-12935: -- Commit 141f38197be2ca23757cb8b3f283cdb9dd62de47 in impala's branch refs/heads/master from Steve Carlin [ https://gitbox.apache.org/repos/asf?p=impala.git;h=141f38197 ] IMPALA-12935: First pass on Calcite planner functions Change-Id: I2dd4e402d69ee10547abeeafe893164ffd789b88 Reviewed-on: http://gerrit.cloudera.org:8080/21357 Reviewed-by: Michael Smith Tested-by: Impala Public Jenkins > Allow function parsing for Impala Calcite planner > - > > Key: IMPALA-12935 > URL: https://issues.apache.org/jira/browse/IMPALA-12935 > Project: IMPALA > Issue Type: Sub-task >Reporter: Steve Carlin >Priority: Major > > We need the ability to parse and validate Impala functions using the Calcite > planner. > This commit is not intended to work for all functions, or even most > functions. It will work as a base to be reviewed, and at least some > functions will work. More complicated functions will be added in a later > commit.
[jira] [Created] (IMPALA-13146) Javascript tests sometimes fail to download NodeJS
Joe McDonnell created IMPALA-13146: -- Summary: Javascript tests sometimes fail to download NodeJS Key: IMPALA-13146 URL: https://issues.apache.org/jira/browse/IMPALA-13146 Project: IMPALA Issue Type: Bug Components: Infrastructure Affects Versions: Impala 4.5.0 Reporter: Joe McDonnell For automated tests, sometimes the Javascript tests fail to download NodeJS: {noformat} 01:37:16 Fetching NodeJS v16.20.2-linux-x64 binaries ... 01:37:16 % Total % Received % Xferd Average Speed Time Time Time Current 01:37:16 Dload Upload Total Spent Left Speed 01:37:16 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0 0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0 0 0 0 0 0 0 0 0 --:--:-- 0:00:02 --:--:-- 0 0 21.5M 0 902 0 0 293 0 21:23:04 0:00:03 21:23:01 293 ... 30 21.5M 30 6776k 0 0 50307 0 0:07:28 0:02:17 0:05:11 23826 01:39:34 curl: (18) transfer closed with 15617860 bytes remaining to read{noformat} If this keeps happening, we should mirror the NodeJS binary on the native-toolchain s3 bucket.
[jira] [Resolved] (IMPALA-13130) Under heavy load, Impala does not prioritize data stream operations
[ https://issues.apache.org/jira/browse/IMPALA-13130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Smith resolved IMPALA-13130. Fix Version/s: Impala 4.5.0 Resolution: Fixed > Under heavy load, Impala does not prioritize data stream operations > --- > > Key: IMPALA-13130 > URL: https://issues.apache.org/jira/browse/IMPALA-13130 > Project: IMPALA > Issue Type: Bug >Reporter: Michael Smith >Assignee: Michael Smith >Priority: Major > Fix For: Impala 4.5.0 > > > Under heavy load - where Impala reaches max memory for the DataStreamService > and applies backpressure via > https://github.com/apache/impala/blob/4.4.0/be/src/rpc/impala-service-pool.cc#L191-L199 > - DataStreamService does not differentiate between types of requests and may > reject requests that could help reduce load. > The DataStreamService deals with TransmitData, PublishFilter, UpdateFilter, > UpdateFilterFromRemote, and EndDataStream. It seems like we should prioritize > completing EndDataStream, especially under heavy load, to complete work and > release resources more quickly.
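The prioritization suggested above can be pictured as a simple admission rule: under memory backpressure, keep admitting RPCs whose completion releases resources and shed the ones that add load. A hedged shell sketch of that rule (RPC names are from the issue; the logic is an illustration of the idea, not the actual impala-service-pool implementation):

```shell
# Decide whether to admit an RPC under memory pressure. EndDataStream is
# always admitted because completing it finishes work and frees resources;
# the other data-stream RPCs are rejected while pressure is high. This
# encodes the idea from the issue, not the shipped fix.
admit_rpc() {
  local rpc="$1" pressure="$2"
  if [ "$pressure" = "high" ] && [ "$rpc" != "EndDataStream" ]; then
    echo reject
  else
    echo accept
  fi
}
```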
[jira] [Assigned] (IMPALA-13143) TestCatalogdHA.test_catalogd_failover_with_sync_ddl times out expecting query failure
[ https://issues.apache.org/jira/browse/IMPALA-13143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenzhe Zhou reassigned IMPALA-13143: Assignee: Wenzhe Zhou > TestCatalogdHA.test_catalogd_failover_with_sync_ddl times out expecting query > failure > - > > Key: IMPALA-13143 > URL: https://issues.apache.org/jira/browse/IMPALA-13143 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 4.5.0 >Reporter: Joe McDonnell >Assignee: Wenzhe Zhou >Priority: Critical > Labels: broken-build, flaky > > The new TestCatalogdHA.test_catalogd_failover_with_sync_ddl test is failing > intermittently with: > {noformat} > custom_cluster/test_catalogd_ha.py:472: in > test_catalogd_failover_with_sync_ddl > self.wait_for_state(handle, QueryState.EXCEPTION, 30, client=client) > common/impala_test_suite.py:1216: in wait_for_state > self.wait_for_any_state(handle, [expected_state], timeout, client) > common/impala_test_suite.py:1234: in wait_for_any_state > raise Timeout(timeout_msg) > E Timeout: query '9d49ab6360f6cbc5:4826a796' did not reach one of > the expected states [5], last known state 4{noformat} > This means the query succeeded even though we expected it to fail. This is > currently limited to s3 jobs. In a different test, we saw issues because s3 > is slower (see IMPALA-12616). > This test was introduced by IMPALA-13134: > https://github.com/apache/impala/commit/70b7b6a78d49c30933d79e0a1c2a725f7e0a3e50
[jira] [Created] (IMPALA-13145) Upgrade mold linker to 2.31.0
Joe McDonnell created IMPALA-13145: -- Summary: Upgrade mold linker to 2.31.0 Key: IMPALA-13145 URL: https://issues.apache.org/jira/browse/IMPALA-13145 Project: IMPALA Issue Type: Improvement Components: Infrastructure Affects Versions: Impala 4.5.0 Reporter: Joe McDonnell Mold 2.31.0 claims performance improvements and a reduction in the memory needed for linking. See [https://github.com/rui314/mold/releases/tag/v2.31.0] and [https://github.com/rui314/mold/commit/53ebcd80d888778cde16952270f73343f090f342] We should move to that version as some developers are seeing issues with high memory usage for linking.
[jira] [Commented] (IMPALA-12967) Testcase fails at test_migrated_table_field_id_resolution due to "Table does not exist"
[ https://issues.apache.org/jira/browse/IMPALA-12967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17853224#comment-17853224 ] Joe McDonnell commented on IMPALA-12967: There is a separate symptom where this test fails with a Disk I/O error. It is probably somewhat related, so we need to decide whether to include that symptom here. See IMPALA-13144. > Testcase fails at test_migrated_table_field_id_resolution due to "Table does > not exist" > --- > > Key: IMPALA-12967 > URL: https://issues.apache.org/jira/browse/IMPALA-12967 > Project: IMPALA > Issue Type: Bug > Components: Backend >Reporter: Yida Wu >Assignee: Quanlong Huang >Priority: Major > Labels: broken-build > > Testcase test_migrated_table_field_id_resolution fails at exhaustive release > build with following messages: > *Regression* > {code:java} > query_test.test_iceberg.TestIcebergTable.test_migrated_table_field_id_resolution[protocol: > beeswax | exec_option: {'test_replan': 1, 'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshold': 0, 'disable_codegen': True, > 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: > parquet/none] (from pytest) > {code} > *Error Message* > {code:java} > query_test/test_iceberg.py:266: in test_migrated_table_field_id_resolution > "iceberg_migrated_alter_test_orc", "orc") common/file_utils.py:68: in > create_iceberg_table_from_directory file_format)) > common/impala_connection.py:215: in execute > fetch_profile_after_close=fetch_profile_after_close) > beeswax/impala_beeswax.py:191: in execute handle = > self.__execute_query(query_string.strip(), user=user) > beeswax/impala_beeswax.py:384: in __execute_query > self.wait_for_finished(handle) beeswax/impala_beeswax.py:405: in > wait_for_finished raise ImpalaBeeswaxException("Query aborted:" + > error_log, None) E ImpalaBeeswaxException: ImpalaBeeswaxException: E > Query aborted:ImpalaRuntimeException: Error making 'createTable' RPC to Hive > Metastore: 
E CAUSED BY: IcebergTableLoadingException: Table does not exist > at location: > hdfs://localhost:20500/test-warehouse/iceberg_migrated_alter_test_orc > Stacktrace > query_test/test_iceberg.py:266: in test_migrated_table_field_id_resolution > "iceberg_migrated_alter_test_orc", "orc") > common/file_utils.py:68: in create_iceberg_table_from_directory > file_format)) > common/impala_connection.py:215: in execute > fetch_profile_after_close=fetch_profile_after_close) > beeswax/impala_beeswax.py:191: in execute > handle = self.__execute_query(query_string.strip(), user=user) > beeswax/impala_beeswax.py:384: in __execute_query > self.wait_for_finished(handle) > beeswax/impala_beeswax.py:405: in wait_for_finished > raise ImpalaBeeswaxException("Query aborted:" + error_log, None) > E ImpalaBeeswaxException: ImpalaBeeswaxException: > EQuery aborted:ImpalaRuntimeException: Error making 'createTable' RPC to > Hive Metastore: > E CAUSED BY: IcebergTableLoadingException: Table does not exist at > location: > hdfs://localhost:20500/test-warehouse/iceberg_migrated_alter_test_orc > {code} > *Standard Error* > {code:java} > SET > client_identifier=query_test/test_iceberg.py::TestIcebergTable::()::test_migrated_table_field_id_resolution[protocol:beeswax|exec_option:{'test_replan':1;'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':True;'abort_on_error':1;'exec_single_; > SET sync_ddl=False; > -- executing against localhost:21000 > DROP DATABASE IF EXISTS `test_migrated_table_field_id_resolution_b59d79db` > CASCADE; > -- 2024-04-02 00:56:55,137 INFO MainThread: Started query > f34399a8b7cddd67:031a3b96 > SET > client_identifier=query_test/test_iceberg.py::TestIcebergTable::()::test_migrated_table_field_id_resolution[protocol:beeswax|exec_option:{'test_replan':1;'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':True;'abort_on_error':1;'exec_single_; > SET sync_ddl=False; > -- executing against localhost:21000 > CREATE 
DATABASE `test_migrated_table_field_id_resolution_b59d79db`; > -- 2024-04-02 00:56:57,302 INFO MainThread: Started query > 94465af69907eac5:e33f17e0 > -- 2024-04-02 00:56:57,353 INFO MainThread: Created database > "test_migrated_table_field_id_resolution_b59d79db" for test ID > "query_test/test_iceberg.py::TestIcebergTable::()::test_migrated_table_field_id_resolution[protocol: > beeswax | exec_option: {'test_replan': 1, 'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshold': 0, 'disable_codegen': True, > 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: > parquet/none]" > Picked up
[jira] [Commented] (IMPALA-13144) TestIcebergTable.test_migrated_table_field_id_resolution fails with Disk I/O error
[ https://issues.apache.org/jira/browse/IMPALA-13144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17853223#comment-17853223 ] Joe McDonnell commented on IMPALA-13144: We need to decide whether we want to track this with IMPALA-12967 (which was originally about "Table does not exist at location" on the same test) or keep it separate. > TestIcebergTable.test_migrated_table_field_id_resolution fails with Disk I/O > error > -- > > Key: IMPALA-13144 > URL: https://issues.apache.org/jira/browse/IMPALA-13144 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 4.5.0 >Reporter: Joe McDonnell >Priority: Critical > Labels: broken-build, flaky > > A couple test jobs hit a failure on > TestIcebergTable.test_migrated_table_field_id_resolution: > {noformat} > query_test/test_iceberg.py:270: in test_migrated_table_field_id_resolution > vector, unique_database) > common/impala_test_suite.py:725: in run_test_case > result = exec_fn(query, user=test_section.get('USER', '').strip() or None) > common/impala_test_suite.py:660: in __exec_in_impala > result = self.__execute_query(target_impalad_client, query, user=user) > common/impala_test_suite.py:1013: in __execute_query > return impalad_client.execute(query, user=user) > common/impala_connection.py:216: in execute > fetch_profile_after_close=fetch_profile_after_close) > beeswax/impala_beeswax.py:191: in execute > handle = self.__execute_query(query_string.strip(), user=user) > beeswax/impala_beeswax.py:384: in __execute_query > self.wait_for_finished(handle) > beeswax/impala_beeswax.py:405: in wait_for_finished > raise ImpalaBeeswaxException("Query aborted:" + error_log, None) > E ImpalaBeeswaxException: ImpalaBeeswaxException: > EQuery aborted:Disk I/O error on > impala-ec2-centos79-m6i-4xlarge-xldisk-153e.vpc.cloudera.com:27000: Failed to > open HDFS file > hdfs://localhost:20500/test-warehouse/iceberg_migrated_alter_test/00_0 > E Error(2): No such file or 
directory > E Root cause: RemoteException: File does not exist: > /test-warehouse/iceberg_migrated_alter_test/00_0 > E at > org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:87) > E at > org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:77) > E at > org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:159) > E at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:2040) > E at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:738) > E at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:454) > E at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > E at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:533) > E at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070) > E at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:994) > E at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:922) > E at java.security.AccessController.doPrivileged(Native Method) > E at javax.security.auth.Subject.doAs(Subject.java:422) > E at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899) > E at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2899){noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-13144) TestIcebergTable.test_migrated_table_field_id_resolution fails with Disk I/O error
Joe McDonnell created IMPALA-13144: -- Summary: TestIcebergTable.test_migrated_table_field_id_resolution fails with Disk I/O error Key: IMPALA-13144 URL: https://issues.apache.org/jira/browse/IMPALA-13144 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 4.5.0 Reporter: Joe McDonnell A couple test jobs hit a failure on TestIcebergTable.test_migrated_table_field_id_resolution: {noformat} query_test/test_iceberg.py:270: in test_migrated_table_field_id_resolution vector, unique_database) common/impala_test_suite.py:725: in run_test_case result = exec_fn(query, user=test_section.get('USER', '').strip() or None) common/impala_test_suite.py:660: in __exec_in_impala result = self.__execute_query(target_impalad_client, query, user=user) common/impala_test_suite.py:1013: in __execute_query return impalad_client.execute(query, user=user) common/impala_connection.py:216: in execute fetch_profile_after_close=fetch_profile_after_close) beeswax/impala_beeswax.py:191: in execute handle = self.__execute_query(query_string.strip(), user=user) beeswax/impala_beeswax.py:384: in __execute_query self.wait_for_finished(handle) beeswax/impala_beeswax.py:405: in wait_for_finished raise ImpalaBeeswaxException("Query aborted:" + error_log, None) E ImpalaBeeswaxException: ImpalaBeeswaxException: EQuery aborted:Disk I/O error on impala-ec2-centos79-m6i-4xlarge-xldisk-153e.vpc.cloudera.com:27000: Failed to open HDFS file hdfs://localhost:20500/test-warehouse/iceberg_migrated_alter_test/00_0 E Error(2): No such file or directory E Root cause: RemoteException: File does not exist: /test-warehouse/iceberg_migrated_alter_test/00_0 E at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:87) E at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:77) E at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:159) E at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:2040) E at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:738) E at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:454) E at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) E at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:533) E at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070) E at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:994) E at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:922) E at java.security.AccessController.doPrivileged(Native Method) E at javax.security.auth.Subject.doAs(Subject.java:422) E at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899) E at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2899){noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-13143) TestCatalogdHA.test_catalogd_failover_with_sync_ddl times out expecting query failure
Joe McDonnell created IMPALA-13143: -- Summary: TestCatalogdHA.test_catalogd_failover_with_sync_ddl times out expecting query failure Key: IMPALA-13143 URL: https://issues.apache.org/jira/browse/IMPALA-13143 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 4.5.0 Reporter: Joe McDonnell The new TestCatalogdHA.test_catalogd_failover_with_sync_ddl test is failing intermittently with: {noformat} custom_cluster/test_catalogd_ha.py:472: in test_catalogd_failover_with_sync_ddl self.wait_for_state(handle, QueryState.EXCEPTION, 30, client=client) common/impala_test_suite.py:1216: in wait_for_state self.wait_for_any_state(handle, [expected_state], timeout, client) common/impala_test_suite.py:1234: in wait_for_any_state raise Timeout(timeout_msg) E Timeout: query '9d49ab6360f6cbc5:4826a796' did not reach one of the expected states [5], last known state 4{noformat} This means the query succeeded even though we expected it to fail. This is currently limited to s3 jobs. In a different test, we saw issues because s3 is slower (see IMPALA-12616). This test was introduced by IMPALA-13134: https://github.com/apache/impala/commit/70b7b6a78d49c30933d79e0a1c2a725f7e0a3e50 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-12616) test_restart_catalogd_while_handling_rpc_response* tests fail not reaching expected states
[ https://issues.apache.org/jira/browse/IMPALA-12616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe McDonnell resolved IMPALA-12616. Fix Version/s: Impala 4.5.0 Resolution: Fixed I think the s3 slowness version of this is fixed, so I'm going to resolve this. > test_restart_catalogd_while_handling_rpc_response* tests fail not reaching > expected states > -- > > Key: IMPALA-12616 > URL: https://issues.apache.org/jira/browse/IMPALA-12616 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 1.4.2 >Reporter: Andrew Sherman >Assignee: Daniel Becker >Priority: Critical > Fix For: Impala 4.5.0 > > > There are failures in both > custom_cluster.test_restart_services.TestRestart.test_restart_catalogd_while_handling_rpc_response_with_timeout > and > custom_cluster.test_restart_services.TestRestart.test_restart_catalogd_while_handling_rpc_response_with_max_iters, > both look the same: > {code:java} > custom_cluster/test_restart_services.py:232: in > test_restart_catalogd_while_handling_rpc_response_with_timeout > self.wait_for_state(handle, self.client.QUERY_STATES["FINISHED"], > max_wait_time) > common/impala_test_suite.py:1181: in wait_for_state > self.wait_for_any_state(handle, [expected_state], timeout, client) > common/impala_test_suite.py:1199: in wait_for_any_state > raise Timeout(timeout_msg) > E Timeout: query '6a4e0bad9b511ccf:bf93de68' did not reach one of > the expected states [4], last known state 5 > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-12322) return wrong timestamp when scan kudu timestamp with timezone
[ https://issues.apache.org/jira/browse/IMPALA-12322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17853203#comment-17853203 ] Csaba Ringhofer commented on IMPALA-12322: -- Thanks for the feedback, [~eyizoha]. I have uploaded a patch that adds a new query option: https://gerrit.cloudera.org/#/c/21492/ > return wrong timestamp when scan kudu timestamp with timezone > - > > Key: IMPALA-12322 > URL: https://issues.apache.org/jira/browse/IMPALA-12322 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 4.1.1 > Environment: impala 4.1.1 >Reporter: daicheng >Assignee: Zihao Ye >Priority: Major > Attachments: image-2022-04-24-00-01-05-746-1.png, > image-2022-04-24-00-01-05-746.png, image-2022-04-24-00-01-37-520.png, > image-2022-04-24-00-03-14-467-1.png, image-2022-04-24-00-03-14-467.png, > image-2022-04-24-00-04-16-240-1.png, image-2022-04-24-00-04-16-240.png, > image-2022-04-24-00-04-52-860-1.png, image-2022-04-24-00-04-52-860.png, > image-2022-04-24-00-05-52-086-1.png, image-2022-04-24-00-05-52-086.png, > image-2022-04-24-00-07-09-776-1.png, image-2022-04-24-00-07-09-776.png, > image-2023-07-28-20-31-09-457.png, image-2023-07-28-22-27-38-521.png, > image-2023-07-28-22-29-40-083.png, image-2023-07-28-22-36-17-460.png, > image-2023-07-28-22-36-37-884.png, image-2023-07-28-22-38-19-728.png > > > The Impala version is 3.1.0-cdh6.1. > I have set the system timezone to Asia/Shanghai: > !image-2022-04-24-00-01-37-520.png! > !image-2022-04-24-00-01-05-746.png! > Here is the bug: > *Step 1* > I have a Parquet file with two columns like below, and read it with impala-shell > and Spark (timezone=shanghai): > !image-2022-04-24-00-03-14-467.png|width=1016,height=154! > !image-2022-04-24-00-04-16-240.png|width=944,height=367! > The results are both exactly right. > *Step 2* > Create a Kudu table with impala-shell: > CREATE TABLE default.test_{_}test{_}_test_time2 (id BIGINT, t TIMESTAMP, PRIMARY KEY (id)) STORED AS KUDU; > Note: Kudu version: 1.8 > Insert 2 rows into the table with Spark: > !image-2022-04-24-00-04-52-860.png|width=914,height=279! > *Step 3* > Read it with Spark (timezone=shanghai); Spark reads the Kudu table with the kudu-client API. Here is the result: > !image-2022-04-24-00-05-52-086.png|width=914,height=301! > The result is still exactly right. > But read it with impala-shell: > !image-2022-04-24-00-07-09-776.png|width=915,height=154! > The result is 8 hours late. > *Conclusion* > It seems like the Impala timezone setting didn't work when the Kudu column type is timestamp, but it works fine with Parquet files. I don't know why?
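The 8-hour offset described above is consistent with a reader that ignores the session timezone. This is an illustrative sketch (Python, not Impala or Kudu code): Kudu stores TIMESTAMP as microseconds since the UTC epoch, so a writer in Asia/Shanghai (UTC+8) converts local wall-clock time to UTC; a reader that then renders the stored UTC instant without converting back shows a value 8 hours behind local time.

```python
# Illustrative sketch only: shows how a timezone-unaware read of a
# UTC-microseconds timestamp appears 8 hours late for Asia/Shanghai.
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

shanghai = ZoneInfo("Asia/Shanghai")

# The writer inserts local wall-clock 2022-04-24 00:00:00 (Shanghai).
local_dt = datetime(2022, 4, 24, 0, 0, 0, tzinfo=shanghai)
micros_in_kudu = int(local_dt.timestamp() * 1_000_000)  # stored as UTC micros

# A timezone-aware reader converts back to Shanghai time: correct.
correct = datetime.fromtimestamp(micros_in_kudu / 1_000_000, tz=shanghai)

# A reader that renders the raw UTC instant: appears 8 hours behind.
naive = datetime.fromtimestamp(micros_in_kudu / 1_000_000, tz=timezone.utc)

print(correct.strftime("%Y-%m-%d %H:%M:%S"))  # 2022-04-24 00:00:00
print(naive.strftime("%Y-%m-%d %H:%M:%S"))    # 2022-04-23 16:00:00
```

Both values name the same instant; only the rendering differs, which matches the report that Spark (timezone-aware) reads the value correctly while the other path shows it 8 hours late.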
[jira] [Commented] (IMPALA-12616) test_restart_catalogd_while_handling_rpc_response* tests fail not reaching expected states
[ https://issues.apache.org/jira/browse/IMPALA-12616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17853196#comment-17853196 ] ASF subversion and git services commented on IMPALA-12616: -- Commit 1935f9e1a199c958c5fb12ad53277fa720d6ae5c in impala's branch refs/heads/master from Joe McDonnell [ https://gitbox.apache.org/repos/asf?p=impala.git;h=1935f9e1a ] IMPALA-12616: Fix test_restart_services.py::TestRestart tests for S3 The test_restart_catalogd_while_handling_rpc_response* tests from custom_cluster/test_restart_services.py have been failing consistently on s3. The alter table statement is expected to succeed, but instead it fails with: "CatalogException: Detected catalog service ID changes" This manifests as a timeout waiting for the statement to reach the finished state. The test relies on specific timing with a sleep injected via a debug action. The failure stems from the catalog being slower on s3. The alter table wakes up before the catalog service ID change has fully completed, and it fails when it sees the catalog service ID change. This increases two sleep times: 1. This increases the sleep time before restarting the catalogd from 0.5 seconds to 5 seconds. This gives the catalogd longer to receive the message about the alter table and respond back to the impalad. 2. This increases the WAIT_BEFORE_PROCESSING_CATALOG_UPDATE sleep from 10 seconds to 30 seconds so the alter table statement doesn't wake up until the catalog service ID change is finalized. The test is verifying that the right messages are in the impalad logs, so we know this is still testing the same condition. This modifies the tests to use wait_for_finished_timeout() rather than wait_for_state(). This bails out immediately if the query fails rather than waiting unnecessarily for the full timeout. This also clears the query options so that later statements don't inherit the debug_action that the alter table statement used. 
Testing: - Ran the tests 100x in a loop on s3 - Ran the tests 100x in a loop on HDFS Change-Id: Ieb5699b8fb0b2ad8bad4ac30922a7b4d7fa17d29 Reviewed-on: http://gerrit.cloudera.org:8080/21485 Tested-by: Impala Public Jenkins Reviewed-by: Daniel Becker > test_restart_catalogd_while_handling_rpc_response* tests fail not reaching > expected states > -- > > Key: IMPALA-12616 > URL: https://issues.apache.org/jira/browse/IMPALA-12616 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 1.4.2 >Reporter: Andrew Sherman >Assignee: Daniel Becker >Priority: Critical > > There are failures in both > custom_cluster.test_restart_services.TestRestart.test_restart_catalogd_while_handling_rpc_response_with_timeout > and > custom_cluster.test_restart_services.TestRestart.test_restart_catalogd_while_handling_rpc_response_with_max_iters, > both look the same: > {code:java} > custom_cluster/test_restart_services.py:232: in > test_restart_catalogd_while_handling_rpc_response_with_timeout > self.wait_for_state(handle, self.client.QUERY_STATES["FINISHED"], > max_wait_time) > common/impala_test_suite.py:1181: in wait_for_state > self.wait_for_any_state(handle, [expected_state], timeout, client) > common/impala_test_suite.py:1199: in wait_for_any_state > raise Timeout(timeout_msg) > E Timeout: query '6a4e0bad9b511ccf:bf93de68' did not reach one of > the expected states [4], last known state 5 > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
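The commit above switches the tests from wait_for_state() to wait_for_finished_timeout() so a failed query aborts the wait immediately instead of burning the full timeout. A simplified sketch of that fail-fast polling pattern (hypothetical names and state codes taken from the test logs, not the actual impala_test_suite code):

```python
# Simplified sketch of a fail-fast wait: bail out as soon as the query hits
# a terminal failure state rather than sleeping through the whole timeout.
import time

FINISHED, EXCEPTION = 4, 5  # state codes as they appear in the test logs

def wait_for_finished_timeout(get_state, timeout_s, poll_interval_s=0.1):
    """Return True if FINISHED within timeout; raise immediately on EXCEPTION."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        state = get_state()
        if state == FINISHED:
            return True
        if state == EXCEPTION:
            # Fail fast: no point waiting out the timeout once the
            # query has already failed.
            raise RuntimeError("query failed before reaching FINISHED")
        time.sleep(poll_interval_s)
    return False

# Example: a query that fails on the third poll aborts the wait right away.
states = iter([1, 3, EXCEPTION])
try:
    wait_for_finished_timeout(lambda: next(states), timeout_s=5)
except RuntimeError as e:
    print(e)
```

A plain wait_for_state(FINISHED, timeout) would keep polling a failed query until the deadline, which is exactly the wasted wait the commit removes.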
[jira] [Created] (IMPALA-13142) Documentation for Impala StateStore HA
Sanjana Malhotra created IMPALA-13142: - Summary: Documentation for Impala StateStore HA Key: IMPALA-13142 URL: https://issues.apache.org/jira/browse/IMPALA-13142 Project: IMPALA Issue Type: Documentation Reporter: Sanjana Malhotra Assignee: Sanjana Malhotra IMPALA-12156
[jira] [Commented] (IMPALA-13137) Add additional client fetch metrics columns to the queries page
[ https://issues.apache.org/jira/browse/IMPALA-13137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17853096#comment-17853096 ] Surya Hebbar commented on IMPALA-13137: --- It was confirmed in the meeting that the expected column is the {{ClientFetchWaitTimer}} value, not the difference between "First row fetched" and "Last row fetched". > Add additional client fetch metrics columns to the queries page > --- > > Key: IMPALA-13137 > URL: https://issues.apache.org/jira/browse/IMPALA-13137 > Project: IMPALA > Issue Type: New Feature > Components: Backend, be >Reporter: Surya Hebbar >Assignee: Surya Hebbar >Priority: Major > Attachments: completed_query.png, in_flight_query_1.png, > in_flight_query_2.png, in_flight_query_3.png, very_short_fetch_timer.png > > > To help users better understand query execution times, it would be > helpful to add the following columns to the queries page: > * First row fetched time - time taken for the client to fetch the first row > * Client fetch wait time - time taken for the client to fetch all rows > Additional details - > https://jira.cloudera.com/browse/DWX-18295
[jira] [Updated] (IMPALA-13141) Partition transactional table is not updated on alter partition when hms_event_incremental_refresh_transactional_table is disabled
[ https://issues.apache.org/jira/browse/IMPALA-13141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Venugopal Reddy K updated IMPALA-13141: --- Description: A partition of a transactional table is not updated on alter partition when hms_event_incremental_refresh_transactional_table is disabled.
*Observations:*
1. In the case of AlterPartitionEvent, this issue occurs when hms_event_incremental_refresh_transactional_table is disabled.
2. In the case of BatchPartitionEvent (when more than one AlterPartitionEvent is batched together), this issue occurs without disabling hms_event_incremental_refresh_transactional_table.
*Steps to reproduce:*
1. Create a partitioned table and add some partitions from Hive (note: this step can be done from Impala too):
{code:java}
0: jdbc:hive2://localhost:11050> create table s(i int, j int, p int);
0: jdbc:hive2://localhost:11050> insert into s values(1,10,100),(2,20,200);
{code}
{code:java}
0: jdbc:hive2://localhost:11050> create table test(i int, j int) partitioned by(p int) tblproperties ('transactional'='true', 'transactional_properties'='insert_only');
0: jdbc:hive2://localhost:11050> set hive.exec.dynamic.partition.mode=nonstrict;
0: jdbc:hive2://localhost:11050> insert into test partition(p) select * from s;
0: jdbc:hive2://localhost:11050> show partitions test;
+------------+
| partition  |
+------------+
| p=100      |
| p=200      |
+------------+
0: jdbc:hive2://localhost:11050> desc formatted test partition(p=100);
+----------------------------------+------------------------------------------------------------+------------+
| col_name                         | data_type                                                  | comment    |
+----------------------------------+------------------------------------------------------------+------------+
| i                                | int                                                        |            |
| j                                | int                                                        |            |
|                                  | NULL                                                       | NULL       |
| # Partition Information          | NULL                                                       | NULL       |
| # col_name                       | data_type                                                  | comment    |
| p                                | int                                                        |            |
|                                  | NULL                                                       | NULL       |
| # Detailed Partition Information | NULL                                                       | NULL       |
| Partition Value:                 | [100]                                                      | NULL       |
| Database:                        | default                                                    | NULL       |
| Table:                           | test                                                       | NULL       |
| CreateTime:                      | Fri Jun 07 14:21:17 IST 2024                               | NULL       |
| LastAccessTime:                  | UNKNOWN                                                    | NULL       |
| Location:                        | hdfs://localhost:20500/test-warehouse/managed/test/p=100   | NULL       |
| Partition Parameters:            | NULL                                                       | NULL       |
|                                  | numFiles                                                   | 1          |
|                                  | totalSize                                                  | 5          |
|                                  | transient_lastDdlTime                                      | 1717750277 |
|                                  | NULL                                                       | NULL       |
| # Storage Information            | NULL                                                       | NULL       |
| SerDe Library:                   | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe         | NULL       |
| InputFormat:                     | org.apache.hadoop.mapred.TextInputFormat                   | NULL       |
| OutputFormat:                    | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | NULL       |
| Compressed:                      | No                                                         | NULL       |
| Num Buckets:                     | -1                                                         | NULL       |
| Bucket Columns:                  | []                                                         | NULL       |
| Sort Columns:                    | []
[jira] [Created] (IMPALA-13141) Partition transactional table is not updated on alter partition when hms_event_incremental_refresh_transactional_table is disabled
Venugopal Reddy K created IMPALA-13141: -- Summary: Partition transactional table is not updated on alter partition when hms_event_incremental_refresh_transactional_table is disabled Key: IMPALA-13141 URL: https://issues.apache.org/jira/browse/IMPALA-13141 Project: IMPALA Issue Type: Bug Reporter: Venugopal Reddy K A partition of a transactional table is not updated on alter partition when hms_event_incremental_refresh_transactional_table is disabled.
*Observations:*
1. In the case of AlterPartitionEvent, this issue occurs when hms_event_incremental_refresh_transactional_table is disabled.
2. In the case of BatchPartitionEvent (when more than one AlterPartitionEvent is batched together), this issue occurs without disabling hms_event_incremental_refresh_transactional_table.
*Steps to reproduce:*
1. Create a partitioned table and add some partitions from Hive (note: this step can be done from Impala too):
{code:java}
0: jdbc:hive2://localhost:11050> create table s(i int, j int, p int);
0: jdbc:hive2://localhost:11050> insert into s values(1,10,100),(2,20,200);
{code}
{code:java}
0: jdbc:hive2://localhost:11050> create table test(i int, j int) partitioned by(p int) tblproperties ('transactional'='true', 'transactional_properties'='insert_only');
0: jdbc:hive2://localhost:11050> set hive.exec.dynamic.partition.mode=nonstrict;
0: jdbc:hive2://localhost:11050> insert into test partition(p) select * from s;
0: jdbc:hive2://localhost:11050> show partitions test;
+------------+
| partition  |
+------------+
| p=100      |
| p=200      |
+------------+
0: jdbc:hive2://localhost:11050> desc formatted test partition(p=100);
+----------------------------------+------------------------------------------------------------+------------+
| col_name                         | data_type                                                  | comment    |
+----------------------------------+------------------------------------------------------------+------------+
| i                                | int                                                        |            |
| j                                | int                                                        |            |
|                                  | NULL                                                       | NULL       |
| # Partition Information          | NULL                                                       | NULL       |
| # col_name                       | data_type                                                  | comment    |
| p                                | int                                                        |            |
|                                  | NULL                                                       | NULL       |
| # Detailed Partition Information | NULL                                                       | NULL       |
| Partition Value:                 | [100]                                                      | NULL       |
| Database:                        | default                                                    | NULL       |
| Table:                           | test                                                       | NULL       |
| CreateTime:                      | Fri Jun 07 14:21:17 IST 2024                               | NULL       |
| LastAccessTime:                  | UNKNOWN                                                    | NULL       |
| Location:                        | hdfs://localhost:20500/test-warehouse/managed/test/p=100   | NULL       |
| Partition Parameters:            | NULL                                                       | NULL       |
|                                  | numFiles                                                   | 1          |
|                                  | totalSize                                                  | 5          |
|                                  | transient_lastDdlTime                                      | 1717750277 |
|                                  | NULL                                                       | NULL       |
| # Storage Information            | NULL                                                       | NULL       |
| SerDe Library:                   | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe         | NULL       |
| InputFormat:                     | org.apache.hadoop.mapred.TextInputFormat                   | NULL       |
| OutputFormat:                    | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | NULL       |
| Compressed:                      | No                                                         | NULL       |
| Num Buckets:                     | -1                                                         | NUL
[jira] [Assigned] (IMPALA-13140) Add backend flag to disable small string optimization
[ https://issues.apache.org/jira/browse/IMPALA-13140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltán Borók-Nagy reassigned IMPALA-13140: -- Assignee: Zoltán Borók-Nagy > Add backend flag to disable small string optimization > - > > Key: IMPALA-13140 > URL: https://issues.apache.org/jira/browse/IMPALA-13140 > Project: IMPALA > Issue Type: Sub-task >Reporter: Zoltán Borók-Nagy >Assignee: Zoltán Borók-Nagy >Priority: Critical > > We could have a backend flag that would make SmallableString::Smallify() a > no-op.
[jira] [Created] (IMPALA-13140) Add backend flag to disable small string optimization
Zoltán Borók-Nagy created IMPALA-13140: -- Summary: Add backend flag to disable small string optimization Key: IMPALA-13140 URL: https://issues.apache.org/jira/browse/IMPALA-13140 Project: IMPALA Issue Type: Sub-task Reporter: Zoltán Borók-Nagy We could have a backend flag that would make SmallableString::Smallify() a no-op.
[jira] [Commented] (IMPALA-13130) Under heavy load, Impala does not prioritize data stream operations
[ https://issues.apache.org/jira/browse/IMPALA-13130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17853074#comment-17853074 ] ASF subversion and git services commented on IMPALA-13130: -- Commit 3f827bfc2447d8c11a4f09bcb96e86c53b92d753 in impala's branch refs/heads/master from Michael Smith [ https://gitbox.apache.org/repos/asf?p=impala.git;h=3f827bfc2 ] IMPALA-13130: Prioritize EndDataStream messages Prioritize EndDataStream messages over other types handled by DataStreamService, and avoid rejecting them when memory limit is reached. They take very little memory (~75 bytes) and will usually help reduce memory use by closing out in-progress operations. Adds the 'data_stream_sender_eos_timeout_ms' flag to control EOS timeouts. Defaults to 1 hour, and can be disabled by setting to -1. Adds unit tests ensuring EOS are processed even if mem limit is reached and ahead of TransmitData messages in the queue. Change-Id: I2829e1ab5bcde36107e10bff5fe629c5ee60f3e8 Reviewed-on: http://gerrit.cloudera.org:8080/21476 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > Under heavy load, Impala does not prioritize data stream operations > --- > > Key: IMPALA-13130 > URL: https://issues.apache.org/jira/browse/IMPALA-13130 > Project: IMPALA > Issue Type: Bug >Reporter: Michael Smith >Assignee: Michael Smith >Priority: Major > > Under heavy load - where Impala reaches max memory for the DataStreamService > and applies backpressure via > https://github.com/apache/impala/blob/4.4.0/be/src/rpc/impala-service-pool.cc#L191-L199 > - DataStreamService does not differentiate between types of requests and may > reject requests that could help reduce load. > The DataStreamService deals with TransmitData, PublishFilter, UpdateFilter, > UpdateFilterFromRemote, and EndDataStream. It seems like we should prioritize > completing EndDataStream, especially under heavy load, to complete work and > release resources more quickly. 
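The EndDataStream prioritization described in the commit above can be sketched as a small two-queue model. This is a conceptual Python sketch, not Impala's C++ service-pool code; the class and method names are invented for illustration. EOS messages are tiny (~75 bytes per the commit message), so they bypass the memory-based rejection and are dequeued ahead of TransmitData:

```python
# Conceptual sketch: EOS messages get their own fast path -- exempt from
# memory-based rejection and served before queued TransmitData messages.
from collections import deque

class DataStreamQueue:
    def __init__(self, mem_limit_bytes):
        self.mem_limit = mem_limit_bytes
        self.mem_used = 0
        self.eos_queue = deque()   # always accepted, served first
        self.data_queue = deque()  # subject to backpressure

    def offer(self, msg_type, size_bytes):
        """Return True if accepted, False if rejected for memory pressure."""
        if msg_type == "EndDataStream":
            # EOS is tiny and usually *reduces* memory use, so never reject it.
            self.eos_queue.append((msg_type, size_bytes))
            return True
        if self.mem_used + size_bytes > self.mem_limit:
            return False  # backpressure: reject TransmitData etc.
        self.mem_used += size_bytes
        self.data_queue.append((msg_type, size_bytes))
        return True

    def take(self):
        """Dequeue the next message, EOS first."""
        if self.eos_queue:
            return self.eos_queue.popleft()
        msg = self.data_queue.popleft()
        self.mem_used -= msg[1]
        return msg

q = DataStreamQueue(mem_limit_bytes=100)
q.offer("TransmitData", 90)
assert not q.offer("TransmitData", 90)  # rejected: over the memory limit
assert q.offer("EndDataStream", 75)     # EOS accepted despite the limit
print(q.take()[0])  # EndDataStream -- served ahead of the queued TransmitData
```

The design point this illustrates: under memory pressure, accepting and prioritizing EOS lets in-progress exchanges close out and release resources, rather than being stuck behind (or rejected like) large TransmitData payloads.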
[jira] [Updated] (IMPALA-12569) Harden long string testing
[ https://issues.apache.org/jira/browse/IMPALA-12569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltán Borók-Nagy updated IMPALA-12569: --- Priority: Critical (was: Major) > Harden long string testing > -- > > Key: IMPALA-12569 > URL: https://issues.apache.org/jira/browse/IMPALA-12569 > Project: IMPALA > Issue Type: Sub-task > Components: Backend, Infrastructure >Reporter: Zoltán Borók-Nagy >Priority: Critical > > With the small string optimization, [~csringhofer] pointed out that most of our > test data has small strings. New features are typically tested on the > existing test tables (e.g. alltypes, which only has small strings), or they > add new tests with usually small strings only. The latter is hard to prevent. > Therefore long strings might have less test coverage if we don't pay > enough attention. > To make the situation better, we could: > # Add long string data to the string column of the alltypes table and > complextypestbl and update the tests > # Add a backend flag that makes StringValue.Smallify() a no-op, and create a > test job (probably with an ASAN build) that runs the tests with that flag > turned on.