[jira] [Commented] (IMPALA-12402) Add some configurations for CatalogdMetaProvider's cache_
[ https://issues.apache.org/jira/browse/IMPALA-12402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17758821#comment-17758821 ] Maxwell Guo commented on IMPALA-12402: -- How can I assign this Jira to myself? I can't find the button. > Add some configurations for CatalogdMetaProvider's cache_ > - > > Key: IMPALA-12402 > URL: https://issues.apache.org/jira/browse/IMPALA-12402 > Project: IMPALA > Issue Type: Improvement > Components: fe >Reporter: Maxwell Guo >Priority: Minor > Labels: pull-request-available > > When the cluster contains many dbs and tables (such as more than > 10 tables) and we restart the impalad, CatalogdMetaProvider's local > cache_ needs to go through some loading process. > As we know, Google's Guava cache has its concurrencyLevel set to 4 by > default, > but if there are many tables the loading process will need more time and > increase the probability of lock contention, see > [here|https://github.com/google/guava/blob/master/guava/src/com/google/common/cache/CacheBuilder.java#L437]. > > So we propose to add some configurations here, the first being the concurrency > level of the cache. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
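For illustration, a minimal Java sketch of the kind of knob being proposed: exposing Guava's concurrencyLevel (default 4) when building the cache. The method and parameter names here are hypothetical and not taken from the actual patch.

{code:java}
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;
import java.util.concurrent.TimeUnit;

public final class CatalogMetaCacheSketch {
  /**
   * Builds a metadata cache with a configurable concurrency level. Both
   * parameters are hypothetical knobs that a patch could expose as startup
   * flags; Guava's default concurrencyLevel is 4.
   */
  public static Cache<String, byte[]> newCache(int concurrencyLevel, long maxEntries) {
    return CacheBuilder.newBuilder()
        // More segments -> less lock contention while many tables load at once.
        .concurrencyLevel(concurrencyLevel)
        .maximumSize(maxEntries)
        .expireAfterAccess(60, TimeUnit.MINUTES)
        .recordStats()
        .build();
  }

  public static void main(String[] args) {
    Cache<String, byte[]> cache = newCache(16, 100_000);
    cache.put("functional.alltypes", new byte[0]);
    System.out.println(cache.stats());
  }
}
{code}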
[jira] [Updated] (IMPALA-12402) Add some configurations for CatalogdMetaProvider's cache_
[ https://issues.apache.org/jira/browse/IMPALA-12402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxwell Guo updated IMPALA-12402: - Flags: Patch Labels: pull-request-available (was: ) > Add some configurations for CatalogdMetaProvider's cache_ > - > > Key: IMPALA-12402 > URL: https://issues.apache.org/jira/browse/IMPALA-12402 > Project: IMPALA > Issue Type: Improvement > Components: fe >Reporter: Maxwell Guo >Priority: Minor > Labels: pull-request-available > > When the cluster contains many dbs and tables (such as more than > 10 tables) and we restart the impalad, CatalogdMetaProvider's local > cache_ needs to go through some loading process. > As we know, Google's Guava cache has its concurrencyLevel set to 4 by > default, > but if there are many tables the loading process will need more time and > increase the probability of lock contention, see > [here|https://github.com/google/guava/blob/master/guava/src/com/google/common/cache/CacheBuilder.java#L437]. > > So we propose to add some configurations here, the first being the concurrency > level of the cache. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-12402) Add some configurations for CatalogdMetaProvider's cache_
[ https://issues.apache.org/jira/browse/IMPALA-12402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxwell Guo updated IMPALA-12402: - Summary: Add some configurations for CatalogdMetaProvider's cache_ (was: Add some configurations for CatalogMetaProvider's cache_) > Add some configurations for CatalogdMetaProvider's cache_ > - > > Key: IMPALA-12402 > URL: https://issues.apache.org/jira/browse/IMPALA-12402 > Project: IMPALA > Issue Type: Improvement > Components: fe >Reporter: Maxwell Guo >Priority: Minor > > When the cluster contains many dbs and tables (such as more than > 10 tables) and we restart the impalad, CatalogdMetaProvider's local > cache_ needs to go through some loading process. > As we know, Google's Guava cache has its concurrencyLevel set to 4 by > default, > but if there are many tables the loading process will need more time and > increase the probability of lock contention, see > [here|https://github.com/google/guava/blob/master/guava/src/com/google/common/cache/CacheBuilder.java#L437]. > > So we propose to add some configurations here, the first being the concurrency > level of the cache. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-12402) Add some configurations for CatalogMetaProvider's cache_
[ https://issues.apache.org/jira/browse/IMPALA-12402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maxwell Guo updated IMPALA-12402: - Language: java Target Version: Impala 4.2.0 > Add some configurations for CatalogMetaProvider's cache_ > > > Key: IMPALA-12402 > URL: https://issues.apache.org/jira/browse/IMPALA-12402 > Project: IMPALA > Issue Type: Improvement > Components: fe >Reporter: Maxwell Guo >Priority: Minor > > When the cluster contains many dbs and tables (such as more than > 10 tables) and we restart the impalad, CatalogdMetaProvider's local > cache_ needs to go through some loading process. > As we know, Google's Guava cache has its concurrencyLevel set to 4 by > default, > but if there are many tables the loading process will need more time and > increase the probability of lock contention, see > [here|https://github.com/google/guava/blob/master/guava/src/com/google/common/cache/CacheBuilder.java#L437]. > > So we propose to add some configurations here, the first being the concurrency > level of the cache. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-12402) Add some configurations for CatalogMetaProvider's cache_
Maxwell Guo created IMPALA-12402: Summary: Add some configurations for CatalogMetaProvider's cache_ Key: IMPALA-12402 URL: https://issues.apache.org/jira/browse/IMPALA-12402 Project: IMPALA Issue Type: Improvement Components: fe Reporter: Maxwell Guo When the cluster contains many dbs and tables (such as more than 10 tables) and we restart the impalad, CatalogdMetaProvider's local cache_ needs to go through some loading process. As we know, Google's Guava cache has its concurrencyLevel set to 4 by default, but if there are many tables the loading process will need more time and increase the probability of lock contention, see [here|https://github.com/google/guava/blob/master/guava/src/com/google/common/cache/CacheBuilder.java#L437]. So we propose to add some configurations here, the first being the concurrency level of the cache. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-12401) Support more info types for HS2 GetInfo() API
Quanlong Huang created IMPALA-12401: --- Summary: Support more info types for HS2 GetInfo() API Key: IMPALA-12401 URL: https://issues.apache.org/jira/browse/IMPALA-12401 Project: IMPALA Issue Type: Improvement Components: Backend Reporter: Quanlong Huang Impalad coordinators can act as HiveServer2 since they implement the HS2 APIs. Currently, we only support 3 info types for the HS2 GetInfo() API: CLI_SERVER_NAME, CLI_DBMS_NAME, CLI_DBMS_VER. https://github.com/apache/impala/blob/11a9861ec695fe62b39095940514b28a8c684484/be/src/service/impala-hs2-server.cc#L468-L474 We can add more to be compatible with Hive, e.g. CLI_MAX_COLUMN_NAME_LEN, CLI_MAX_TABLE_NAME_LEN, CLI_MAX_SCHEMA_NAME_LEN, CLI_ODBC_KEYWORDS. https://github.com/apache/hive/blob/4903585a34ae44bb3fec4207b5acab63f6bfc8c1/service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java#L501-L508 Note that CLI_ODBC_KEYWORDS is a new value of the enum TGetInfoType added in HIVE-17765, which is not in our common/thrift/hive-1-api/TCLIService.thrift. We can add CLI_ODBC_KEYWORDS and other new types to our TCLIService.thrift file. New tests can be added in tests/hs2/test_hs2.py. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
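For context, a rough Java sketch of the Hive-compatible behaviour being proposed. The thrift package, the generated TGetInfoValue setters, and the returned limits are assumptions based on Hive's TCLIService.thrift and HiveSessionImpl; Impala's actual change would live in be/src/service/impala-hs2-server.cc.

{code:java}
// Hypothetical sketch only; package and setter names follow the usual
// thrift-generated conventions and may differ in Impala's copy of the IDL.
import org.apache.hive.service.rpc.thrift.TGetInfoType;
import org.apache.hive.service.rpc.thrift.TGetInfoValue;

public class GetInfoSketch {
  static TGetInfoValue getInfo(TGetInfoType type) {
    TGetInfoValue value = new TGetInfoValue();
    switch (type) {
      case CLI_MAX_COLUMN_NAME_LEN:
      case CLI_MAX_TABLE_NAME_LEN:
      case CLI_MAX_SCHEMA_NAME_LEN:
        value.setLenValue(128);  // assumed limit, mirroring what Hive reports
        break;
      case CLI_ODBC_KEYWORDS:
        // Needs CLI_ODBC_KEYWORDS (HIVE-17765) added to the local TCLIService.thrift.
        value.setStringValue("ADD,ALTER,AND,AS");  // truncated keyword list for the sketch
        break;
      default:
        throw new IllegalArgumentException("Unsupported info type: " + type);
    }
    return value;
  }
}
{code}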
[jira] [Commented] (IMPALA-10086) SqlCastException when comparing char with varchar
[ https://issues.apache.org/jira/browse/IMPALA-10086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17758776#comment-17758776 ] Michael Smith commented on IMPALA-10086: https://gerrit.cloudera.org/c/18001/ > SqlCastException when comparing char with varchar > - > > Key: IMPALA-10086 > URL: https://issues.apache.org/jira/browse/IMPALA-10086 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 4.0.0 >Reporter: Tim Armstrong >Assignee: Bruno Pusztahazi >Priority: Minor > Labels: newbie, ramp-up > > {noformat} > [localhost:21000] default> select 'expected 2',count(*) from ax where cast(t > as string) = cast('a ' as varchar(10)); > +--+--+ > | 'expected 2' | count(*) | > +--+--+ > | expected 2 | 2| > +--+--+ > Fetched 1 row(s) in 0.44s > [localhost:21000] default> create table chartbl (c char(10)); > +-+ > | summary | > +-+ > | Table has been created. | > +-+ > Fetched 1 row(s) in 0.23s > [localhost:21000] default> select * from chartbl where c = cast('test' as > varchar(10)); > ERROR: SqlCastException: targetType=VARCHAR(*) type=VARCHAR(10) > {noformat} > Also using the functional dataset: > {noformat} > [localhost:21000] functional> select * from chars_tiny where cs = vc; > ERROR: SqlCastException: targetType=VARCHAR(*) type=VARCHAR(5) > {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-12395) Planner overestimates scan cardinality for queries using count star optimization
[ https://issues.apache.org/jira/browse/IMPALA-12395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17758774#comment-17758774 ] Riza Suminto commented on IMPALA-12395: --- Filed patch at: https://gerrit.cloudera.org/c/20406/ > Planner overestimates scan cardinality for queries using count star > optimization > > > Key: IMPALA-12395 > URL: https://issues.apache.org/jira/browse/IMPALA-12395 > Project: IMPALA > Issue Type: Bug > Components: fe >Reporter: David Rorke >Assignee: Riza Suminto >Priority: Major > > The scan cardinality estimate for count(*) queries doesn't account for the > fact that the count(*) optimization only scans metadata and not the actual > columns. > Scan for a count(*) query on Parquet store_sales: > > {noformat} > Operator #Hosts #Inst Avg Time Max Time #Rows Est. #Rows Peak Mem Est. Peak > Mem Detail > - > 00:SCAN S3 6 72 8s131ms 8s496ms 2.71K 8.64B 128.00 KB 88.00 MB > tpcds_3000_string_parquet_managed.store_sales > {noformat} > > This is a problem with all file/table formats that implement count(*) > optimizations (Parquet and also probably ORC and Iceberg). > This problem is more serious than it was in the past because with > IMPALA-12091 we now rely on scan cardinality estimates for executor group > assignments so count(*) queries are likely to get assigned to a larger > executor group than needed. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-12400) Test expected executors used for planning when no executor groups are healthy
Abhishek Rawat created IMPALA-12400: --- Summary: Test expected executors used for planning when no executor groups are healthy Key: IMPALA-12400 URL: https://issues.apache.org/jira/browse/IMPALA-12400 Project: IMPALA Issue Type: Test Reporter: Abhishek Rawat The planner uses expected executors from the 'num_expected_executors' and 'expected_executor_group_sets' configs when no executor groups are healthy. It would be good to write a test case. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Comment Edited] (IMPALA-12382) Coordinator could schedule fragments on gracefully shutdown executors
[ https://issues.apache.org/jira/browse/IMPALA-12382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17758687#comment-17758687 ] Wenzhe Zhou edited comment on IMPALA-12382 at 8/24/23 6:38 PM: --- If the executor is removed from the cluster membership by the statestore when receiving an un-registering request, it could affect running queries. Coordinators cancel the queries which are running on failed executors (as evidenced by their absence from the membership list). See [ImpalaServer::CancelQueriesOnFailedBackends()|https://github.com/apache/impala/blob/master/be/src/service/impala-server.cc#L2365-L2375]. It seems we already have a [mechanism|https://github.com/apache/impala/blob/master/be/src/service/impala-server.h#L124-L126] to avoid scheduling new tasks on executors which are shutting down, by marking the executor as being in the "quiescing" state. was (Author: wzhou): If the executor is removed from the cluster membership by statestore when receiving un-registering request, it could affect running queries. Coordinators cancel the queries which are running on failed executors (as evidenced by their absence from the membership list). See [ImpalaServer::CancelQueriesOnFailedBackends()|https://github.com/apache/impala/blob/master/be/src/service/impala-server.cc#L2365-L2375]. > Coordinator could schedule fragments on gracefully shutdown executors > - > > Key: IMPALA-12382 > URL: https://issues.apache.org/jira/browse/IMPALA-12382 > Project: IMPALA > Issue Type: Improvement >Reporter: Abhishek Rawat >Assignee: Wenzhe Zhou >Priority: Critical > > Statestore does failure detection based on consecutive heartbeat failures. > This is by default configured to be 10 (statestore_max_missed_heartbeats) at > 1 second intervals (statestore_heartbeat_frequency_ms). This could however > take much longer than 10 seconds overall, especially if statestore is busy > and due to rpc timeout duration. > In the following example it took 50 seconds for failure detection: > {code:java} > I0817 12:32:06.824721 86 statestore.cc:1157] Unable to send heartbeat > message to subscriber > impa...@impala-executor-001-5.impala-executor.impala-1692115218-htqx.svc.cluster.local:27010, > received error: RPC Error: Client for 10.80.199.159:23000 hit an unexpected > exception: No more data to read., type: > N6apache6thrift9transport19TTransportExceptionE, rpc: > N6impala18THeartbeatResponseE, send: done > I0817 12:32:06.824741 86 failure-detector.cc:91] 1 consecutive heartbeats > failed for > 'impa...@impala-executor-001-5.impala-executor.impala-1692115218-htqx.svc.cluster.local:27010'. > State is OK > . > . > . > I0817 12:32:56.800251 83 statestore.cc:1157] Unable to send heartbeat > message to subscriber > impa...@impala-executor-001-5.impala-executor.impala-1692115218-htqx.svc.cluster.local:27010, > received error: RPC Error: Client for 10.80.199.159:23000 hit an unexpected > exception: No more data to read., type: > N6apache6thrift9transport19TTransportExceptionE, rpc: > N6impala18THeartbeatResponseE, send: done > I0817 12:32:56.800267 83 failure-detector.cc:91] 10 consecutive heartbeats > failed for > 'impa...@impala-executor-001-5.impala-executor.impala-1692115218-htqx.svc.cluster.local:27010'. 
> State is FAILED > I0817 12:32:56.800276 83 statestore.cc:1168] Subscriber > 'impa...@impala-executor-001-5.impala-executor.impala-1692115218-htqx.svc.cluster.local:27010' > has failed, disconnected or re-registered (last known registration ID: > c84bf70f03acda2b:b34a812c5e96e687){code} > As a result there is a window when statestore is determining node failure and > coordinator might schedule fragments on that particular executor(s). The exec > RPC will fail and if transparent query retries is enabled, coordinator will > immediately retry the query and it will fail again. > Ideally in such situations coordinator should be notified sooner about a > failed executor. Statestore could send priority topic update to coordinator > when it enters failure detection logic. This should reduce the chances of > coordinator scheduling query fragment on a failed executor. > The other argument could be to tune the heartbeat frequency and interval > parameters. But it's hard to find a configuration which works for all cases. > And so, while the default values are reasonable, under certain conditions > they can be unreasonable, as seen in the above example. > It might make sense to specially handle the case where executors are > shut down gracefully; in that case the statestore shouldn't do failure > detection and should instead fail these executors immediately. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IMPALA-12382) Coordinator could schedule fragments on gracefully shutdown executors
[ https://issues.apache.org/jira/browse/IMPALA-12382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17758687#comment-17758687 ] Wenzhe Zhou commented on IMPALA-12382: -- If the executor is removed from the cluster membership by statestore when receiving un-registering request, it could affect running queries. Coordinators cancel the queries which are running on failed executors (as evidenced by their absence from the membership list). See [ImpalaServer::CancelQueriesOnFailedBackends()|https://github.com/apache/impala/blob/master/be/src/service/impala-server.cc#L2365-L2375]. > Coordinator could schedule fragments on gracefully shutdown executors > - > > Key: IMPALA-12382 > URL: https://issues.apache.org/jira/browse/IMPALA-12382 > Project: IMPALA > Issue Type: Improvement >Reporter: Abhishek Rawat >Assignee: Wenzhe Zhou >Priority: Critical > > Statestore does failure detection based on consecutive heartbeat failures. > This is by default configured to be 10 (statestore_max_missed_heartbeats) at > 1 second intervals (statestore_heartbeat_frequency_ms). This could however > take much longer than 10 seconds overall, especially if statestore is busy > and due to rpc timeout duration. > In the following example it took 50 seconds for failure detection: > {code:java} > I0817 12:32:06.824721 86 statestore.cc:1157] Unable to send heartbeat > message to subscriber > impa...@impala-executor-001-5.impala-executor.impala-1692115218-htqx.svc.cluster.local:27010, > received error: RPC Error: Client for 10.80.199.159:23000 hit an unexpected > exception: No more data to read., type: > N6apache6thrift9transport19TTransportExceptionE, rpc: > N6impala18THeartbeatResponseE, send: done > I0817 12:32:06.824741 86 failure-detector.cc:91] 1 consecutive heartbeats > failed for > 'impa...@impala-executor-001-5.impala-executor.impala-1692115218-htqx.svc.cluster.local:27010'. > State is OK > . > . > . > I0817 12:32:56.800251 83 statestore.cc:1157] Unable to send heartbeat > message to subscriber > impa...@impala-executor-001-5.impala-executor.impala-1692115218-htqx.svc.cluster.local:27010, > received error: RPC Error: Client for 10.80.199.159:23000 hit an unexpected > exception: No more data to read., type: > N6apache6thrift9transport19TTransportExceptionE, rpc: > N6impala18THeartbeatResponseE, send: done > I0817 12:32:56.800267 83 failure-detector.cc:91] 10 consecutive heartbeats > failed for > 'impa...@impala-executor-001-5.impala-executor.impala-1692115218-htqx.svc.cluster.local:27010'. > State is FAILED > I0817 12:32:56.800276 83 statestore.cc:1168] Subscriber > 'impa...@impala-executor-001-5.impala-executor.impala-1692115218-htqx.svc.cluster.local:27010' > has failed, disconnected or re-registered (last known registration ID: > c84bf70f03acda2b:b34a812c5e96e687){code} > As a result there is a window when statestore is determining node failure and > coordinator might schedule fragments on that particular executor(s). The exec > RPC will fail and if transparent query retries is enabled, coordinator will > immediately retry the query and it will fail again. > Ideally in such situations coordinator should be notified sooner about a > failed executor. Statestore could send priority topic update to coordinator > when it enters failure detection logic. This should reduce the chances of > coordinator scheduling query fragment on a failed executor. > The other argument could be to tune the heartbeat frequency and interval > parameters. 
But it's hard to find a configuration which works for all cases. > And so, while the default values are reasonable, under certain conditions > they can be unreasonable, as seen in the above example. > It might make sense to specially handle the case where executors are > shut down gracefully; in that case the statestore shouldn't do failure > detection and should instead fail these executors immediately. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-11669) Make Thrift max message size configuration
[ https://issues.apache.org/jira/browse/IMPALA-11669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17758655#comment-17758655 ] ASF subversion and git services commented on IMPALA-11669: -- Commit 81844499b51da092567c510202a4b7de81ecd8af in impala's branch refs/heads/master from Joe McDonnell [ https://gitbox.apache.org/repos/asf?p=impala.git;h=81844499b ] IMPALA-12366: Use 2GB as the default for thrift_rpc_max_message_size Thrift 0.16 implemented a limit on the max message size. In IMPALA-11669, we added the thrift_rpc_max_message_size parameter and set the default size to 1GB. Some existing clusters have needed to tune this parameter higher because their workloads use message sizes larger than 1GB (e.g. for metadata updates). Historically, Impala has been able to send and receive 2GB messages, so this changes the default value for thrift_rpc_max_message_size to 2GB (INT_MAX). This can be reduced in future when Impala can guarantee that messages work properly when split up into smaller batches. TestGracefulShutdown::test_shutdown_idle started failing with this change, because it is producing a different error message for one of the negative tests. ClientRequestState::ExecShutdownRequest() appends some extra explanation when it sees a "Network error" KRPC error, and the test expects that extra explanation. This modifies ClientRequestState::ExecShutdownRequest() to provide the extra explanation for the new error ("Timed out") as well. Testing: - Ran GVO Change-Id: Ib624201b683966a9feefb8fe45985f3d52d869fc Reviewed-on: http://gerrit.cloudera.org:8080/20394 Tested-by: Impala Public Jenkins Reviewed-by: Riza Suminto Reviewed-by: Michael Smith > Make Thrift max message size configuration > -- > > Key: IMPALA-11669 > URL: https://issues.apache.org/jira/browse/IMPALA-11669 > Project: IMPALA > Issue Type: Task > Components: Backend >Affects Versions: Impala 4.2.0 >Reporter: Joe McDonnell >Assignee: Riza Suminto >Priority: Critical > Fix For: Impala 4.2.0 > > > With the upgrade to Thrift 0.16, Thrift now has a protection against > malicious message in the form of a maximum size for messages. This is > currently set to 100MB by default. Impala should add the ability to override > this default value. In particular, it seems like communication between > coordinators and the catalogd may need a larger value. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-11957) Implement Regression functions : regr_slope(), regr_intercept() and regr_r2()
[ https://issues.apache.org/jira/browse/IMPALA-11957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17758653#comment-17758653 ] ASF subversion and git services commented on IMPALA-11957: -- Commit 20a9d2669c69f8e5b0a5c0b9487fa0212a00ad9c in impala's branch refs/heads/master from pranav.lodha [ https://gitbox.apache.org/repos/asf?p=impala.git;h=20a9d2669 ] IMPALA-11957: Implement Regression functions: regr_slope(), regr_intercept() and regr_r2() The linear regression functions fit an ordinary-least-squares regression line to a set of number pairs. They can be used both as aggregate and analytic functions. regr_slope() takes two arguments of numeric type and returns the slope of the line. regr_intercept() takes two arguments of numeric type and returns the y-intercept of the regression line. regr_r2() takes two arguments of numeric type and returns the coefficient of determination (also called R-squared or goodness of fit) for the regression. Testing: The functions are extensively tested and cross-checked with Hive. The tests can be found in aggregation.test. Change-Id: Iab6bd84ae3e0c02ec924c30183308123b951caa3 Reviewed-on: http://gerrit.cloudera.org:8080/19569 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > Implement Regression functions : regr_slope(), regr_intercept() and regr_r2() > - > > Key: IMPALA-11957 > URL: https://issues.apache.org/jira/browse/IMPALA-11957 > Project: IMPALA > Issue Type: Sub-task >Reporter: Pranav Yogi Lodha >Assignee: Pranav Yogi Lodha >Priority: Major > > The linear regression functions fit an ordinary-least-squares regression line > to a set of number pairs which can be used both as aggregate and analytic > functions. > * regr_slope() takes two arguments of numeric type and returns the slope of > the line. > * regr_intercept() takes two arguments of numeric type and returns the > y-intercept of the regression line. > * regr_r2() takes two arguments of numeric type and returns the coefficient > of determination (also called R-squared or goodness of fit) for the > regression. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
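For reference, these are the standard ordinary-least-squares quantities the three functions compute, for N pairs (x_i, y_i) with means x-bar and y-bar; argument order and the NULL/zero-variance edge cases are whatever the implementation defines.

{noformat}
\mathrm{regr\_slope}(y,x)     = \frac{\sum_i (x_i-\bar{x})(y_i-\bar{y})}{\sum_i (x_i-\bar{x})^2}
\mathrm{regr\_intercept}(y,x) = \bar{y} - \mathrm{regr\_slope}(y,x)\,\bar{x}
\mathrm{regr\_r2}(y,x)        = \frac{\left(\sum_i (x_i-\bar{x})(y_i-\bar{y})\right)^2}{\sum_i (x_i-\bar{x})^2 \,\sum_i (y_i-\bar{y})^2}
{noformat}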
[jira] [Commented] (IMPALA-12390) Enable performance related clang-tidy checks
[ https://issues.apache.org/jira/browse/IMPALA-12390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17758657#comment-17758657 ] ASF subversion and git services commented on IMPALA-12390: -- Commit d96341ed537a3e321d5fa6a0235ab06b5d9169a2 in impala's branch refs/heads/master from Joe McDonnell [ https://gitbox.apache.org/repos/asf?p=impala.git;h=d96341ed5 ] IMPALA-12393: Fix inconsistent hash for TimestampValue in DictEncoder Currently, DictEncoder uses the default hash function for TimestampValue, which means it is hashing the entire TimestampValue struct. This can be inconsistent, because TimestampValue contains some padding that may not be zero in some cases. For TimestampValues that are part of a Tuple, the padding is zero, so this is mainly present in test cases. This was discovered when fixing a Clang Tidy performance-for-range-copy warning by iterating with a const reference rather than making a copy of the value. DictTest.TestTimestamps became flaky with that change, because the hash was no longer consistent. The copy must have had consistent content for the padding through the iteration, but the const reference did not. This adds a template specialization of the Hash function for TimestampValue. The specialization uses TimestampValue::Hash(), which hashes only the non-padding pieces of the struct. This also includes the change to dict-test.cc that uncovered the issue. This fix is mostly to unblock IMPALA-12390. Testing: - Ran dict-test in a loop for a few hundred iterations - Hand tested inserting many timestamps into a Parquet table with dictionary encoding and verified that the performance didn't change. Change-Id: Iad86e9b0f645311c3389cf2804dcc1a346ff10a9 Reviewed-on: http://gerrit.cloudera.org:8080/20396 Tested-by: Impala Public Jenkins Reviewed-by: Daniel Becker Reviewed-by: Michael Smith > Enable performance related clang-tidy checks > > > Key: IMPALA-12390 > URL: https://issues.apache.org/jira/browse/IMPALA-12390 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Affects Versions: Impala 4.3.0 >Reporter: Joe McDonnell >Assignee: Joe McDonnell >Priority: Major > > clang-tidy has several performance-related checks that seem like they would > be useful to enforce. Here are some examples: > {noformat} > /home/joemcdonnell/upstream/Impala/be/src/runtime/types.h:313:25: warning: > loop variable is copied but only used as const reference; consider making it > a const reference [performance-for-range-copy] > for (ColumnType child_type : col_type.children) { > ~~ ^ > const & > /home/joemcdonnell/upstream/Impala/be/src/catalog/catalog-util.cc:168:34: > warning: 'find' called with a string literal consisting of a single > character; consider using the more effective overload accepting a character > [performance-faster-string-find] > int pos = object_name.find("."); > ^~~~ > '.' 
> /home/joemcdonnell/upstream/Impala/be/src/util/decimal-util.h:55:53: warning: > the parameter 'b' is copied for each invocation but only used as a const > reference; consider making it a const reference > [performance-unnecessary-value-param] > static int256_t SafeMultiply(int256_t a, int256_t b, bool may_overflow) { > ^ > const & > /home/joemcdonnell/upstream/Impala/be/src/codegen/llvm-codegen.cc:847:5: > warning: 'push_back' is called inside a loop; consider pre-allocating the > vector capacity before the loop [performance-inefficient-vector-operation] > arguments.push_back(args_[i].type); > ^{noformat} > In all, they seem to flag things that developers wouldn't ordinarily notice, > and it doesn't seem to have too many false positives. We should look into > enabling these. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-12366) If Thrift messages are between 1GB and 2GB, the max message size will trigger
[ https://issues.apache.org/jira/browse/IMPALA-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17758654#comment-17758654 ] ASF subversion and git services commented on IMPALA-12366: -- Commit 81844499b51da092567c510202a4b7de81ecd8af in impala's branch refs/heads/master from Joe McDonnell [ https://gitbox.apache.org/repos/asf?p=impala.git;h=81844499b ] IMPALA-12366: Use 2GB as the default for thrift_rpc_max_message_size Thrift 0.16 implemented a limit on the max message size. In IMPALA-11669, we added the thrift_rpc_max_message_size parameter and set the default size to 1GB. Some existing clusters have needed to tune this parameter higher because their workloads use message sizes larger than 1GB (e.g. for metadata updates). Historically, Impala has been able to send and receive 2GB messages, so this changes the default value for thrift_rpc_max_message_size to 2GB (INT_MAX). This can be reduced in future when Impala can guarantee that messages work properly when split up into smaller batches. TestGracefulShutdown::test_shutdown_idle started failing with this change, because it is producing a different error message for one of the negative tests. ClientRequestState::ExecShutdownRequest() appends some extra explanation when it sees a "Network error" KRPC error, and the test expects that extra explanation. This modifies ClientRequestState::ExecShutdownRequest() to provide the extra explanation for the new error ("Timed out") as well. Testing: - Ran GVO Change-Id: Ib624201b683966a9feefb8fe45985f3d52d869fc Reviewed-on: http://gerrit.cloudera.org:8080/20394 Tested-by: Impala Public Jenkins Reviewed-by: Riza Suminto Reviewed-by: Michael Smith > If Thrift messages are between 1GB and 2GB, the max message size will trigger > - > > Key: IMPALA-12366 > URL: https://issues.apache.org/jira/browse/IMPALA-12366 > Project: IMPALA > Issue Type: Bug > Components: Backend, Frontend >Affects Versions: Impala 4.3.0 >Reporter: Joe McDonnell >Priority: Major > Fix For: Impala 4.3.0 > > > In a user cluster, we ran into a circumstance where a Thrift message was > greater than 1GB (which is the value for thrift_rpc_max_message_size). The > issue was alleviated by changing the value of thrift_rpc_max_message_size to > 32-bit int max (~2GB). We may want to simply ship with > thrift_rpc_max_message_size=2GB. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-12393) DictEncoder uses inconsistent hash function for TimestampValue
[ https://issues.apache.org/jira/browse/IMPALA-12393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17758656#comment-17758656 ] ASF subversion and git services commented on IMPALA-12393: -- Commit d96341ed537a3e321d5fa6a0235ab06b5d9169a2 in impala's branch refs/heads/master from Joe McDonnell [ https://gitbox.apache.org/repos/asf?p=impala.git;h=d96341ed5 ] IMPALA-12393: Fix inconsistent hash for TimestampValue in DictEncoder Currently, DictEncoder uses the default hash function for TimestampValue, which means it is hashing the entire TimestampValue struct. This can be inconsistent, because TimestampValue contains some padding that may not be zero in some cases. For TimestampValues that are part of a Tuple, the padding is zero, so this is mainly present in test cases. This was discovered when fixing a Clang Tidy performance-for-range-copy warning by iterating with a const reference rather than making a copy of the value. DictTest.TestTimestamps became flaky with that change, because the hash was no longer consistent. The copy must have had consistent content for the padding through the iteration, but the const reference did not. This adds a template specialization of the Hash function for TimestampValue. The specialization uses TimestampValue::Hash(), which hashes only the non-padding pieces of the struct. This also includes the change to dict-test.cc that uncovered the issue. This fix is mostly to unblock IMPALA-12390. Testing: - Ran dict-test in a loop for a few hundred iterations - Hand tested inserting many timestamps into a Parquet table with dictionary encoding and verified that the performance didn't change. Change-Id: Iad86e9b0f645311c3389cf2804dcc1a346ff10a9 Reviewed-on: http://gerrit.cloudera.org:8080/20396 Tested-by: Impala Public Jenkins Reviewed-by: Daniel Becker Reviewed-by: Michael Smith > DictEncoder uses inconsistent hash function for TimestampValue > -- > > Key: IMPALA-12393 > URL: https://issues.apache.org/jira/browse/IMPALA-12393 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 4.3.0 >Reporter: Joe McDonnell >Assignee: Joe McDonnell >Priority: Major > Fix For: Impala 4.3.0 > > > DictEncoder currently uses this hash function for TimestampValue: > {noformat} > template > inline uint32_t DictEncoder::Hash(const T& value) const { > return HashUtil::Hash(&value, sizeof(value), 0); > }{noformat} > TimestampValue has some padding, and nothing ensures that the padding is > cleared. This means that identical TimestampValue objects can hash to > different values. > This came up when fixing a Clang-Tidy performance check. This line in > dict-test.cc changed from iterating over values to iterating over const > references. > {noformat} > DictEncoder encoder(&pool, fixed_buffer_byte_size, > &track_encoder); > encoder.UsedbyTest(); > << > for (InternalType i: values) encoder.Put(i); > = > for (const InternalType& i: values) encoder.Put(i); > > > bytes_alloc = encoder.DictByteSize(); > EXPECT_EQ(track_encoder.consumption(), bytes_alloc); > EXPECT_EQ(encoder.num_entries(), values_set.size()); <{noformat} > The test became flaky, with the encoder.num_entries() being larger than the > values_set.size() for TimestampValue. This happened because the hash values > didn't match even for identical entries and the dictionary would have > multiple copies of the same value. When iterating over a plain non-reference > TimestampValue, each TimestampValue is being copied to a temporary value. 
> Maybe in this circumstance the padding stays the same between iterations. > It's possible this would come up when writing Parquet data files. > One fix would be to use TimestampValue's Hash function, which ignores the > padding: > {noformat} > template<> > inline uint32_t DictEncoder::Hash(const TimestampValue& > value) const { > return value.Hash(); > }{noformat} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-12366) If Thrift messages are between 1GB and 2GB, the max message size will trigger
[ https://issues.apache.org/jira/browse/IMPALA-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Smith resolved IMPALA-12366. Fix Version/s: Impala 4.3.0 Resolution: Fixed > If Thrift messages are between 1GB and 2GB, the max message size will trigger > - > > Key: IMPALA-12366 > URL: https://issues.apache.org/jira/browse/IMPALA-12366 > Project: IMPALA > Issue Type: Bug > Components: Backend, Frontend >Affects Versions: Impala 4.3.0 >Reporter: Joe McDonnell >Priority: Major > Fix For: Impala 4.3.0 > > > In a user cluster, we ran into a circumstance where a Thrift message was > greater than 1GB (which is the value for thrift_rpc_max_message_size). The > issue was alleviated by changing the value of thrift_rpc_max_message_size to > 32-bit int max (~2GB). We may want to simply ship with > thrift_rpc_max_message_size=2GB. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-12393) DictEncoder uses inconsistent hash function for TimestampValue
[ https://issues.apache.org/jira/browse/IMPALA-12393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Smith resolved IMPALA-12393. Fix Version/s: Impala 4.3.0 Resolution: Fixed > DictEncoder uses inconsistent hash function for TimestampValue > -- > > Key: IMPALA-12393 > URL: https://issues.apache.org/jira/browse/IMPALA-12393 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 4.3.0 >Reporter: Joe McDonnell >Assignee: Joe McDonnell >Priority: Major > Fix For: Impala 4.3.0 > > > DictEncoder currently uses this hash function for TimestampValue: > {noformat} > template > inline uint32_t DictEncoder::Hash(const T& value) const { > return HashUtil::Hash(&value, sizeof(value), 0); > }{noformat} > TimestampValue has some padding, and nothing ensures that the padding is > cleared. This means that identical TimestampValue objects can hash to > different values. > This came up when fixing a Clang-Tidy performance check. This line in > dict-test.cc changed from iterating over values to iterating over const > references. > {noformat} > DictEncoder encoder(&pool, fixed_buffer_byte_size, > &track_encoder); > encoder.UsedbyTest(); > << > for (InternalType i: values) encoder.Put(i); > = > for (const InternalType& i: values) encoder.Put(i); > > > bytes_alloc = encoder.DictByteSize(); > EXPECT_EQ(track_encoder.consumption(), bytes_alloc); > EXPECT_EQ(encoder.num_entries(), values_set.size()); <{noformat} > The test became flaky, with the encoder.num_entries() being larger than the > values_set.size() for TimestampValue. This happened because the hash values > didn't match even for identical entries and the dictionary would have > multiple copies of the same value. When iterating over a plain non-reference > TimestampValue, each TimestampValue is being copied to a temporary value. > Maybe in this circumstance the padding stays the same between iterations. > It's possible this would come up when writing Parquet data files. > One fix would be to use TimestampValue's Hash function, which ignores the > padding: > {noformat} > template<> > inline uint32_t DictEncoder::Hash(const TimestampValue& > value) const { > return value.Hash(); > }{noformat} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-11460) Enable ASYNC CODEGEN by default
[ https://issues.apache.org/jira/browse/IMPALA-11460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17758551#comment-17758551 ] Daniel Becker commented on IMPALA-11460: I ran a benchmark and added the results in {{Async_benchmark.txt}}. The codegen cache was turned off for both the async and the sync case. It is a TPC-H benchmark with scale factor 2. The small scale factor was chosen because async codegen is most useful for small, fast queries. Overall the benchmark shows a significant improvement (-28.65%), but TPCH-Q1 had a regression of +7.87%. > Enable ASYNC CODEGEN by default > --- > > Key: IMPALA-11460 > URL: https://issues.apache.org/jira/browse/IMPALA-11460 > Project: IMPALA > Issue Type: Improvement >Reporter: Abhishek Rawat >Priority: Major > Attachments: Async_benchmark.txt > > > Would be good to do some additional testing and address any gaps and enable > the feature by default. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-11460) Enable ASYNC CODEGEN by default
[ https://issues.apache.org/jira/browse/IMPALA-11460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Becker updated IMPALA-11460: --- Attachment: Async_benchmark.txt > Enable ASYNC CODEGEN by default > --- > > Key: IMPALA-11460 > URL: https://issues.apache.org/jira/browse/IMPALA-11460 > Project: IMPALA > Issue Type: Improvement >Reporter: Abhishek Rawat >Priority: Major > Attachments: Async_benchmark.txt > > > Would be good to do some additional testing and address any gaps and enable > the feature by default. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-5081) Expose IR optimization level via query option
[ https://issues.apache.org/jira/browse/IMPALA-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17758478#comment-17758478 ] Daniel Becker commented on IMPALA-5081: --- Having a way to invalidate the cache would possibly also be useful in testing. How difficult would it be to do? If it's complicated or opens up the possibility of subtle errors, we should not do it. > Expose IR optimization level via query option > - > > Key: IMPALA-5081 > URL: https://issues.apache.org/jira/browse/IMPALA-5081 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Reporter: Michael Ho >Assignee: Michael Smith >Priority: Minor > Labels: codegen > > Certain queries may spend a lot of time in IR optimization. Currently, > there is a start-up option to disable optimization in LLVM. However, it may > be inconvenient for users to have to restart the entire Impala cluster just > to use that option. This JIRA aims at exploring exposing a query option that > lets users choose the optimization level for a given query (e.g. we can have a > level which only has a dead code elimination pass, or no optimization at > all). -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-12399) Pass eventTypeSkipList with OPEN_TXN in NotificationEventRequest to avoid receiving OPEN_TXN events from HMS
[ https://issues.apache.org/jira/browse/IMPALA-12399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Quanlong Huang updated IMPALA-12399: Epic Link: IMPALA-11532 > Pass eventTypeSkipList with OPEN_TXN in NotificationEventRequest to avoid > receiving OPEN_TXN events from HMS > > > Key: IMPALA-12399 > URL: https://issues.apache.org/jira/browse/IMPALA-12399 > Project: IMPALA > Issue Type: Improvement > Components: Catalog >Reporter: Venugopal Reddy K >Priority: Major > > Notification events like OPEN_TXN are ignored by catalogd's > {{MetastoreEventsProcessor}}. So, we can pass eventTypeSkipList with > OPEN_TXN in NotificationEventRequest while invoking get_next_notification() > to avoid reading such notification messages from HMS and then ignoring them on > catalogd. Since OPEN_TXN events are more frequent (received even upon a describe > table operation from beeline), we can significantly reduce unwanted > processing on both HMS and catalogd. Catalogd reads events in batches of > EVENTS_BATCH_SIZE_PER_RPC, so skipping such unnecessary events can help it catch > up on events faster. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-12399) Pass eventTypeSkipList with OPEN_TXN in NotificationEventRequest to avoid receiving OPEN_TXN events from HMS
[ https://issues.apache.org/jira/browse/IMPALA-12399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Venugopal Reddy K updated IMPALA-12399: --- Description: Notification events like OPEN_TXN are ignored by catalogd's {{MetastoreEventsProcessor}}. So, we can pass eventTypeSkipList with OPEN_TXN in NotificationEventRequest while invoking get_next_notification() to avoid reading such notification messages from HMS and then ignoring them on catalogd. Since OPEN_TXN events are more frequent (received even upon a describe table operation from beeline), we can significantly reduce unwanted processing on both HMS and catalogd. Catalogd reads events in batches of EVENTS_BATCH_SIZE_PER_RPC, so skipping such unnecessary events can help it catch up on events faster. (was: Notification events like OPEN_TXN are ignored on catalogd. So, we can pass eventTypeSkipList with OPEN_TXN in NotificationEventRequest while invoking get_next_notification() to avoid reading such notification messages from HMS and then ignoring on catalogd. OPEN_TXN event being more frequent(received even upon describe table operation from beeline), we can significantly reduce unwanted processing on both HMS and catalogd. Catalogd reads events in batches of EVENTS_BATCH_SIZE_PER_RPC, skipping such unnecessary events can help catchup the events faster.) > Pass eventTypeSkipList with OPEN_TXN in NotificationEventRequest to avoid > receiving OPEN_TXN events from HMS > > > Key: IMPALA-12399 > URL: https://issues.apache.org/jira/browse/IMPALA-12399 > Project: IMPALA > Issue Type: Improvement > Components: Catalog >Reporter: Venugopal Reddy K >Priority: Major > > Notification events like OPEN_TXN are ignored by catalogd's > {{MetastoreEventsProcessor}}. So, we can pass eventTypeSkipList with > OPEN_TXN in NotificationEventRequest while invoking get_next_notification() > to avoid reading such notification messages from HMS and then ignoring them on > catalogd. Since OPEN_TXN events are more frequent (received even upon a describe > table operation from beeline), we can significantly reduce unwanted > processing on both HMS and catalogd. Catalogd reads events in batches of > EVENTS_BATCH_SIZE_PER_RPC, so skipping such unnecessary events can help it catch > up on events faster. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-12399) Pass eventTypeSkipList with OPEN_TXN in NotificationEventRequest to avoid receiving OPEN_TXN events from HMS
Venugopal Reddy K created IMPALA-12399: -- Summary: Pass eventTypeSkipList with OPEN_TXN in NotificationEventRequest to avoid receiving OPEN_TXN events from HMS Key: IMPALA-12399 URL: https://issues.apache.org/jira/browse/IMPALA-12399 Project: IMPALA Issue Type: Improvement Components: Catalog Reporter: Venugopal Reddy K Notification events like OPEN_TXN are ignored on catalogd. So, we can pass eventTypeSkipList with OPEN_TXN in NotificationEventRequest while invoking get_next_notification() to avoid reading such notification messages from HMS and then ignoring them on catalogd. Since OPEN_TXN events are more frequent (received even upon a describe table operation from beeline), we can significantly reduce unwanted processing on both HMS and catalogd. Catalogd reads events in batches of EVENTS_BATCH_SIZE_PER_RPC, so skipping such unnecessary events can help it catch up on events faster. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
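To illustrate the proposal, a minimal Java sketch of building the polling request with the skip list. It assumes an HMS version whose thrift IDL already exposes eventTypeSkipList; the generated setter name is inferred from that field name rather than taken from the actual patch.

{code:java}
import java.util.Collections;
import org.apache.hadoop.hive.metastore.api.NotificationEventRequest;

public class SkipOpenTxnSketch {
  /** Builds the request; it would then be sent on the existing get_next_notification() path. */
  static NotificationEventRequest buildRequest(long lastSyncedEventId, int batchSize) {
    NotificationEventRequest req = new NotificationEventRequest();
    req.setLastEvent(lastSyncedEventId);  // resume from the last processed event id
    req.setMaxEvents(batchSize);          // e.g. EVENTS_BATCH_SIZE_PER_RPC
    // Assumption: the thrift-generated setter for eventTypeSkipList follows the usual naming.
    req.setEventTypeSkipList(Collections.singletonList("OPEN_TXN"));
    return req;
  }
}
{code}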