[jira] [Created] (IMPALA-13182) Support Uploading additional JARs to CDW Impala

2024-06-25 Thread gaurav singh (Jira)
gaurav singh created IMPALA-13182:
-

 Summary: Support Uploading additional JARs to CDW Impala
 Key: IMPALA-13182
 URL: https://issues.apache.org/jira/browse/IMPALA-13182
 Project: IMPALA
  Issue Type: Improvement
  Components: Backend
Reporter: gaurav singh
Assignee: gaurav singh


Support uploading additional JARs to CDW Impala for use cases such as:
 * Adding the Snowflake JAR to connect to a Snowflake catalog. This is to 
support reading remote Iceberg catalog tables (a Snowflake catalog in this case).
 * Adding a JAR to connect to custom on-prem S3 storage - 
[https://cloudera.slack.com/archives/CC9DCQ3GD/p1712826317729039]

Hive LLAP supports this today - 
[https://docs.cloudera.com/data-warehouse/cloud/bi-tools/topics/dw-hive-upload-additional-jars.html]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-12800) Queries with many nested inline views see performance issues with ExprSubstitutionMap

2024-06-25 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith resolved IMPALA-12800.

Resolution: Fixed

> Queries with many nested inline views see performance issues with 
> ExprSubstitutionMap
> -
>
> Key: IMPALA-12800
> URL: https://issues.apache.org/jira/browse/IMPALA-12800
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 4.3.0
>Reporter: Joe McDonnell
>Assignee: Michael Smith
>Priority: Critical
> Fix For: Impala 4.5.0
>
> Attachments: impala12800repro.sql, impala12800schema.sql, 
> long_query_jstacks.tar.gz
>
>
> A user running a query with many layers of inline views saw a large amount of 
> time spent in analysis. 
>  
> {noformat}
> - Authorization finished (ranger): 7s518ms (13.134ms)
> - Value transfer graph computed: 7s760ms (241.953ms)
> - Single node plan created: 2m47s (2m39s)
> - Distributed plan created: 2m47s (7.430ms)
> - Lineage info computed: 2m47s (39.017ms)
> - Planning finished: 2m47s (672.518ms){noformat}
> In reproducing it locally, we found that most of the stacks end up in 
> ExprSubstitutionMap.
>  
> Here are the main stacks seen while running jstack every 3 seconds during a 
> 75 second execution:
> Location 1: (ExprSubstitutionMap::compose -> contains -> indexOf -> Expr 
> equals) (4 samples)
> {noformat}
>    java.lang.Thread.State: RUNNABLE
>     at org.apache.impala.analysis.Expr.equals(Expr.java:1008)
>     at java.util.ArrayList.indexOf(ArrayList.java:323)
>     at java.util.ArrayList.contains(ArrayList.java:306)
>     at 
> org.apache.impala.analysis.ExprSubstitutionMap.compose(ExprSubstitutionMap.java:120){noformat}
> Location 2:  (ExprSubstitutionMap::compose -> verify -> Expr equals) (9 
> samples)
> {noformat}
>    java.lang.Thread.State: RUNNABLE
>     at org.apache.impala.analysis.Expr.equals(Expr.java:1008)
>     at 
> org.apache.impala.analysis.ExprSubstitutionMap.verify(ExprSubstitutionMap.java:173)
>     at 
> org.apache.impala.analysis.ExprSubstitutionMap.compose(ExprSubstitutionMap.java:126){noformat}
> Location 3: (ExprSubstitutionMap::combine -> verify -> Expr equals) (5 
> samples)
> {noformat}
>    java.lang.Thread.State: RUNNABLE
>     at org.apache.impala.analysis.Expr.equals(Expr.java:1008)
>     at 
> org.apache.impala.analysis.ExprSubstitutionMap.verify(ExprSubstitutionMap.java:173)
>     at 
> org.apache.impala.analysis.ExprSubstitutionMap.combine(ExprSubstitutionMap.java:143){noformat}
> Location 4:  (TupleIsNullPredicate.wrapExprs ->  Analyzer.isTrueWithNullSlots 
> -> FeSupport.EvalPredicate -> Thrift serialization) (4 samples)
> {noformat}
>    java.lang.Thread.State: RUNNABLE
>     at java.lang.StringCoding.encode(StringCoding.java:364)
>     at java.lang.String.getBytes(String.java:941)
>     at 
> org.apache.thrift.protocol.TBinaryProtocol.writeString(TBinaryProtocol.java:227)
>     at 
> org.apache.impala.thrift.TClientRequest$TClientRequestStandardScheme.write(TClientRequest.java:532)
>     at 
> org.apache.impala.thrift.TClientRequest$TClientRequestStandardScheme.write(TClientRequest.java:467)
>     at org.apache.impala.thrift.TClientRequest.write(TClientRequest.java:394)
>     at 
> org.apache.impala.thrift.TQueryCtx$TQueryCtxStandardScheme.write(TQueryCtx.java:3034)
>     at 
> org.apache.impala.thrift.TQueryCtx$TQueryCtxStandardScheme.write(TQueryCtx.java:2709)
>     at org.apache.impala.thrift.TQueryCtx.write(TQueryCtx.java:2400)
>     at org.apache.thrift.TSerializer.serialize(TSerializer.java:84)
>     at 
> org.apache.impala.service.FeSupport.EvalExprWithoutRowBounded(FeSupport.java:206)
>     at 
> org.apache.impala.service.FeSupport.EvalExprWithoutRow(FeSupport.java:194)
>     at org.apache.impala.service.FeSupport.EvalPredicate(FeSupport.java:275)
>     at 
> org.apache.impala.analysis.Analyzer.isTrueWithNullSlots(Analyzer.java:2888)
>     at 
> org.apache.impala.analysis.TupleIsNullPredicate.requiresNullWrapping(TupleIsNullPredicate.java:181)
>     at 
> org.apache.impala.analysis.TupleIsNullPredicate.wrapExpr(TupleIsNullPredicate.java:147)
>     at 
> org.apache.impala.analysis.TupleIsNullPredicate.wrapExprs(TupleIsNullPredicate.java:136){noformat}
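Taken together, stacks 1-3 point at linear `ArrayList.contains`/`indexOf` scans inside `ExprSubstitutionMap.compose`, which makes composition quadratic in the number of mapped expressions. A toy Python sketch (illustrative only, not Impala's actual implementation) of that pattern next to a hash-indexed alternative with the same semantics:

```python
# Illustrative model of substitution-map composition: each map is a list of
# (lhs, rhs) pairs. Not the actual Impala code.

def compose_linear(f, g):
    """Compose f then g using linear membership tests, mirroring
    ArrayList.contains -> indexOf (O(n*m) overall)."""
    lhs = [l for l, _ in f]
    result = []
    for l, r in f:
        result.append((l, dict(g).get(r, r)))  # substitute f's rhs through g
    for l, r in g:
        if l not in lhs:                       # O(n) scan per g entry
            result.append((l, r))
    return result

def compose_hashed(f, g):
    """Same semantics with a one-time hash index: O(n + m) expected."""
    g_map = dict(g)
    lhs_set = {l for l, _ in f}
    result = [(l, g_map.get(r, r)) for l, r in f]
    result.extend((l, r) for l, r in g if l not in lhs_set)
    return result

f = [("a", "b"), ("c", "d")]
g = [("b", "x"), ("e", "y")]
assert compose_linear(f, g) == compose_hashed(f, g) \
    == [("a", "x"), ("c", "d"), ("b", "x"), ("e", "y")]
```

With many nested inline views the maps being composed grow large, so the linear scans (and the `verify` pass re-checking every entry) dominate planning time.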






[jira] [Assigned] (IMPALA-13167) Impala's coordinator could not be connected after a restart in custom cluster test in the ASAN build on ARM

2024-06-25 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-13167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith reassigned IMPALA-13167:
--

Assignee: Jason Fehr  (was: Fang-Yu Rao)

> Impala's coordinator could not be connected after a restart in custom cluster 
> test in the ASAN build on ARM
> ---
>
> Key: IMPALA-13167
> URL: https://issues.apache.org/jira/browse/IMPALA-13167
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Fang-Yu Rao
>Assignee: Jason Fehr
>Priority: Minor
>  Labels: broken-build
> Fix For: Impala 4.5.0
>
>
> In an internal Jenkins run, we found that it's possible that Impala's 
> coordinator could not be connected after a restart that occurred after the 
> coordinator hit a DCHECK during the custom cluster test in the ASAN build on 
> ARM.
> Specifically, in that Jenkins run, we found that Impala's coordinator hit the 
> DCHECK in [RuntimeProfile::EventSequence::Start(int64_t 
> start_time_ns)|https://github.com/apache/impala/blob/master/be/src/util/runtime-profile-counters.h#L656]
>  while running a query in 
> [ranger_column_masking_complex_types.test|https://github.com/apache/impala/blob/master/testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_complex_types.test#L724-L732]
>  that was run by 
> [test_column_masking()|https://github.com/apache/impala/blob/master/tests/authorization/test_ranger.py#L1916].
>  This is a known issue as described in IMPALA-4631.
> Since Impala daemons and the catalog server are restarted for each test in 
> test_ranger.py, the next test run after test_column_masking() should most 
> likely pass. However, that was not the case. We found that for the 
> following few tests (e.g., test_block_metadata_update()) in test_ranger.py, 
> Impala's pytest framework was not able to connect to the coordinator with the 
> following error and hence those tests failed.
> {code:java}
> -- 2024-06-18 08:49:43,350 INFO MainThread: Starting cluster with 
> command: 
> /data/jenkins/workspace/impala-asf-master-core-asan-arm/repos/Impala/bin/start-impala-cluster.py
>  '--state_store_args=--statestore_update_frequency_ms=50 
> --statestore_priority_update_frequency_ms=50 
> --statestore_heartbeat_frequency_ms=50' --cluster_size=3 --num_coordinators=3 
> --log_dir=/data/jenkins/workspace/impala-asf-master-core-asan-arm/repos/Impala/logs/custom_cluster_tests
>  --log_level=1 '--impalad_args=--server-name=server1 
> --ranger_service_type=hive --ranger_app_id=impala 
> --authorization_provider=ranger ' '--state_store_args=None ' 
> '--catalogd_args=--server-name=server1 --ranger_service_type=hive 
> --ranger_app_id=impala --authorization_provider=ranger ' 
> --impalad_args=--default_query_options=
> 08:49:43 MainThread: Found 0 impalad/0 statestored/0 catalogd process(es)
> 08:49:43 MainThread: Starting State Store logging to 
> /data/jenkins/workspace/impala-asf-master-core-asan-arm/repos/Impala/logs/custom_cluster_tests/statestored.INFO
> 08:49:43 MainThread: Starting Catalog Service logging to 
> /data/jenkins/workspace/impala-asf-master-core-asan-arm/repos/Impala/logs/custom_cluster_tests/catalogd.INFO
> 08:49:44 MainThread: Starting Impala Daemon logging to 
> /data/jenkins/workspace/impala-asf-master-core-asan-arm/repos/Impala/logs/custom_cluster_tests/impalad.INFO
> 08:49:44 MainThread: Starting Impala Daemon logging to 
> /data/jenkins/workspace/impala-asf-master-core-asan-arm/repos/Impala/logs/custom_cluster_tests/impalad_node1.INFO
> 08:49:44 MainThread: Starting Impala Daemon logging to 
> /data/jenkins/workspace/impala-asf-master-core-asan-arm/repos/Impala/logs/custom_cluster_tests/impalad_node2.INFO
> 08:49:47 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 08:49:47 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 08:49:47 MainThread: Getting num_known_live_backends from 
> impala-ec2-rhel88-m7g-4xlarge-ondemand-1d18.vpc.cloudera.com:25000
> 08:49:47 MainThread: Debug webpage not yet available: 
> HTTPConnectionPool(host='impala-ec2-rhel88-m7g-4xlarge-ondemand-1d18.vpc.cloudera.com',
>  port=25000): Max retries exceeded with url: /backends?json (Caused by 
> NewConnectionError(' 0x8d176750>: Failed to establish a new connection: [Errno 111] Connection 
> refused',))
> 08:49:49 MainThread: Debug webpage did not become available in expected time.
> 08:49:49 MainThread: Waiting for num_known_live_backends=3. Current value: 
> None
> 08:49:50 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 08:49:50 MainThread: Getting num_known_live_backends from 
> impala-ec2-rhel88-m7g-4xlarge-ondemand-1d18.vpc.cloudera.com:25000
> 08:49:50 MainThread: Waiting for num_known_live_backends=3. Current 

[jira] [Resolved] (IMPALA-13167) Impala's coordinator could not be connected after a restart in custom cluster test in the ASAN build on ARM

2024-06-25 Thread Michael Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-13167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Smith resolved IMPALA-13167.

Fix Version/s: Impala 4.5.0
   Resolution: Fixed

> Impala's coordinator could not be connected after a restart in custom cluster 
> test in the ASAN build on ARM
> ---
>
> Key: IMPALA-13167
> URL: https://issues.apache.org/jira/browse/IMPALA-13167
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Fang-Yu Rao
>Assignee: Fang-Yu Rao
>Priority: Minor
>  Labels: broken-build
> Fix For: Impala 4.5.0
>
>
> In an internal Jenkins run, we found that it's possible that Impala's 
> coordinator could not be connected after a restart that occurred after the 
> coordinator hit a DCHECK during the custom cluster test in the ASAN build on 
> ARM.
> Specifically, in that Jenkins run, we found that Impala's coordinator hit the 
> DCHECK in [RuntimeProfile::EventSequence::Start(int64_t 
> start_time_ns)|https://github.com/apache/impala/blob/master/be/src/util/runtime-profile-counters.h#L656]
>  while running a query in 
> [ranger_column_masking_complex_types.test|https://github.com/apache/impala/blob/master/testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_complex_types.test#L724-L732]
>  that was run by 
> [test_column_masking()|https://github.com/apache/impala/blob/master/tests/authorization/test_ranger.py#L1916].
>  This is a known issue as described in IMPALA-4631.
> Since Impala daemons and the catalog server are restarted for each test in 
> test_ranger.py, the next test run after test_column_masking() should most 
> likely pass. However, that was not the case. We found that for the 
> following few tests (e.g., test_block_metadata_update()) in test_ranger.py, 
> Impala's pytest framework was not able to connect to the coordinator with the 
> following error and hence those tests failed.
> {code:java}
> -- 2024-06-18 08:49:43,350 INFO MainThread: Starting cluster with 
> command: 
> /data/jenkins/workspace/impala-asf-master-core-asan-arm/repos/Impala/bin/start-impala-cluster.py
>  '--state_store_args=--statestore_update_frequency_ms=50 
> --statestore_priority_update_frequency_ms=50 
> --statestore_heartbeat_frequency_ms=50' --cluster_size=3 --num_coordinators=3 
> --log_dir=/data/jenkins/workspace/impala-asf-master-core-asan-arm/repos/Impala/logs/custom_cluster_tests
>  --log_level=1 '--impalad_args=--server-name=server1 
> --ranger_service_type=hive --ranger_app_id=impala 
> --authorization_provider=ranger ' '--state_store_args=None ' 
> '--catalogd_args=--server-name=server1 --ranger_service_type=hive 
> --ranger_app_id=impala --authorization_provider=ranger ' 
> --impalad_args=--default_query_options=
> 08:49:43 MainThread: Found 0 impalad/0 statestored/0 catalogd process(es)
> 08:49:43 MainThread: Starting State Store logging to 
> /data/jenkins/workspace/impala-asf-master-core-asan-arm/repos/Impala/logs/custom_cluster_tests/statestored.INFO
> 08:49:43 MainThread: Starting Catalog Service logging to 
> /data/jenkins/workspace/impala-asf-master-core-asan-arm/repos/Impala/logs/custom_cluster_tests/catalogd.INFO
> 08:49:44 MainThread: Starting Impala Daemon logging to 
> /data/jenkins/workspace/impala-asf-master-core-asan-arm/repos/Impala/logs/custom_cluster_tests/impalad.INFO
> 08:49:44 MainThread: Starting Impala Daemon logging to 
> /data/jenkins/workspace/impala-asf-master-core-asan-arm/repos/Impala/logs/custom_cluster_tests/impalad_node1.INFO
> 08:49:44 MainThread: Starting Impala Daemon logging to 
> /data/jenkins/workspace/impala-asf-master-core-asan-arm/repos/Impala/logs/custom_cluster_tests/impalad_node2.INFO
> 08:49:47 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 08:49:47 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 08:49:47 MainThread: Getting num_known_live_backends from 
> impala-ec2-rhel88-m7g-4xlarge-ondemand-1d18.vpc.cloudera.com:25000
> 08:49:47 MainThread: Debug webpage not yet available: 
> HTTPConnectionPool(host='impala-ec2-rhel88-m7g-4xlarge-ondemand-1d18.vpc.cloudera.com',
>  port=25000): Max retries exceeded with url: /backends?json (Caused by 
> NewConnectionError(' 0x8d176750>: Failed to establish a new connection: [Errno 111] Connection 
> refused',))
> 08:49:49 MainThread: Debug webpage did not become available in expected time.
> 08:49:49 MainThread: Waiting for num_known_live_backends=3. Current value: 
> None
> 08:49:50 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 08:49:50 MainThread: Getting num_known_live_backends from 
> impala-ec2-rhel88-m7g-4xlarge-ondemand-1d18.vpc.cloudera.com:25000
> 08:49:50 MainThread: Waiting for num_known_live_backends=3. 

[jira] [Created] (IMPALA-13181) Disable tuple caching for locations that have a limit

2024-06-25 Thread Joe McDonnell (Jira)
Joe McDonnell created IMPALA-13181:
--

 Summary: Disable tuple caching for locations that have a limit
 Key: IMPALA-13181
 URL: https://issues.apache.org/jira/browse/IMPALA-13181
 Project: IMPALA
  Issue Type: Bug
  Components: Frontend
Affects Versions: Impala 4.5.0
Reporter: Joe McDonnell


Statements that use a limit are non-deterministic unless there is a sort. 
Locations with limits should be marked ineligible for tuple caching.

As an example, for a hash join, suppose the build side has a limit. This means 
that the build side could vary from run to run. Correctness requires that all 
nodes agree on the contents of the build side. The variability introduced by the 
limit is a problem, because if one node hits the cache and another does not, 
there is no guarantee that they agree on the contents of the build side.

Concrete example: 
{noformat}
select a.l_orderkey from (select l_orderkey from tpch_parquet.lineitem limit 
10) a, tpch_parquet.orders b where a.l_orderkey = b.o_orderkey;{noformat}
There are times when limits are deterministic or the non-determinism is 
harmless. It is safer to ban it completely at first. In a future change, this 
rule can be relaxed to allow caching in those cases.
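The hazard can be shown with a toy sketch (illustrative Python): without a total order before the limit, two runs over the same data may keep different rows, so a cached build side on one node need not match a freshly computed one on another.

```python
# Sketch only: models a scan whose row arrival order differs between runs
# (as it can across nodes or file layouts), so LIMIT without ORDER BY is
# non-deterministic.

def scan(rows, order):
    return [rows[i] for i in order]

rows = list(range(100))
run1 = scan(rows, order=list(range(100)))          # one arrival order
run2 = scan(rows, order=list(range(99, -1, -1)))   # another arrival order

build1 = run1[:10]   # "limit 10" on node A (e.g. a cached build side)
build2 = run2[:10]   # "limit 10" on node B (cache miss, recomputed)
assert set(build1) != set(build2)                  # the build sides disagree

# With a total order before the limit, every run agrees:
assert sorted(run1)[:10] == sorted(run2)[:10]
```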






[jira] [Commented] (IMPALA-13168) Add README file for setting up Trino

2024-06-25 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-13168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17859989#comment-17859989
 ] 

ASF subversion and git services commented on IMPALA-13168:
--

Commit a6f285cdd5c9e94d720cbbb3d517482768ec00bb in impala's branch 
refs/heads/master from Daniel Becker
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=a6f285cdd ]

IMPALA-13168: Add README file for setting up Trino

The Impala repository contains scripts that make it easy to set up Trino
in the development environment. This commit adds the TRINO-README.md
file that describes how they can be used.

Change-Id: Ic9fea891074223475a57c8f49f788924a0929b12
Reviewed-on: http://gerrit.cloudera.org:8080/21538
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Add README file for setting up Trino
> 
>
> Key: IMPALA-13168
> URL: https://issues.apache.org/jira/browse/IMPALA-13168
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Daniel Becker
>Assignee: Daniel Becker
>Priority: Major
>
> The Impala repository contains scripts that make it easy to set up Trino in 
> the development environment. We should add a README file that describes how 
> they can be used.






[jira] [Created] (IMPALA-13179) Disable tuple caching when using non-deterministic functions

2024-06-25 Thread Joe McDonnell (Jira)
Joe McDonnell created IMPALA-13179:
--

 Summary: Disable tuple caching when using non-deterministic 
functions
 Key: IMPALA-13179
 URL: https://issues.apache.org/jira/browse/IMPALA-13179
 Project: IMPALA
  Issue Type: Bug
  Components: Frontend
Affects Versions: Impala 4.5.0
Reporter: Joe McDonnell


Some functions are non-deterministic, so tuple caching needs to detect those 
functions and avoid caching at locations that are non-deterministic.

There are two different pieces:
 # Correctness: If the key is constant but the results can be variable, then 
that is a correctness issue. That can happen for genuinely random functions 
like uuid(). It can happen when timestamp functions like now() are evaluated at 
runtime.
 # Performance: The frontend does constant-folding of functions that don't vary 
during execution, so something like now() might be replaced by a hard-coded 
integer. This means that the key contains something that varies frequently. 
That can be a performance issue, because we may cache things that cannot be 
reused. This doesn't have the same correctness issue.

This ticket is focused on the correctness piece. If uuid()/now()/etc. are referenced 
and would be evaluated at runtime, the location should be ineligible for tuple 
caching.
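A minimal sketch of that correctness check (illustrative Python; the function set and the expression-tree representation are hypothetical stand-ins, not Impala's planner classes):

```python
# Illustrative only: an expression is a nested tuple (fn_name, child, ...);
# leaves are literals or slot references. The non-deterministic set is an
# assumed example, not Impala's actual list.

NON_DETERMINISTIC_FNS = {"uuid", "rand", "random", "now", "unix_timestamp"}

def is_cache_eligible(expr):
    """Return False if any function in the tree would produce a value that
    can vary at runtime, making the location ineligible for tuple caching."""
    if not isinstance(expr, tuple):
        return True                      # literal or slot ref: deterministic
    fn, *children = expr
    if fn in NON_DETERMINISTIC_FNS:
        return False                     # runtime-evaluated, result can vary
    return all(is_cache_eligible(c) for c in children)

assert is_cache_eligible(("concat", ("upper", "col_a"), "suffix"))
assert not is_cache_eligible(("eq", "col_ts", ("now",)))
```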






[jira] [Resolved] (IMPALA-12541) Compile toolchain GCC with --enable-linker-build-id to add Build ID to binaries

2024-06-25 Thread Joe McDonnell (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-12541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell resolved IMPALA-12541.

Fix Version/s: Impala 4.5.0
   Resolution: Fixed

> Compile toolchain GCC with --enable-linker-build-id to add Build ID to 
> binaries
> ---
>
> Key: IMPALA-12541
> URL: https://issues.apache.org/jira/browse/IMPALA-12541
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 4.4.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Major
> Fix For: Impala 4.5.0
>
>
> A "Build ID" is a unique identifier for binaries (which is a hash of the 
> contents). Producing OS packages with separate debug symbols requires each 
> binary to have a Build ID. This is particularly important for libstdc++, 
> because it is produced during the native-toolchain build rather than the 
> regular Impala build. To turn on Build IDs, one can configure that at GCC 
> build time by specifying "--enable-linker-build-id". This causes GCC to tell 
> the linker to compute the Build ID.
> Breakpad will also use the Build ID when resolving symbols.






[jira] [Commented] (IMPALA-12541) Compile toolchain GCC with --enable-linker-build-id to add Build ID to binaries

2024-06-25 Thread Joe McDonnell (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17859971#comment-17859971
 ] 

Joe McDonnell commented on IMPALA-12541:


{noformat}
commit e78b0ef34241218cda7eac3b526cb6a824596df1
Author: Joe McDonnell 
Date:   Fri Nov 3 14:18:47 2023 -0700

    IMPALA-12541: Build GCC with --enable-linker-build-id
    
    This builds GCC with --enable-linker-build-id so that
    binaries have Build ID specified. Build ID is needed to
    produce OS packages with separate debuginfo. This is
    particularly important for libstdc++, because it is
    not built as part of the regular Impala build.
    
    Testing:
     - Verified that resulting binaries have .note.gnu.build-id
    
    Change-Id: Ieb2017ba1a348a9e9e549fa3268635afa94ae6d0
    Reviewed-on: http://gerrit.cloudera.org:8080/21469
    Reviewed-by: Michael Smith 
    Reviewed-by: Laszlo Gaal 
    Tested-by: Joe McDonnell 
{noformat}

> Compile toolchain GCC with --enable-linker-build-id to add Build ID to 
> binaries
> ---
>
> Key: IMPALA-12541
> URL: https://issues.apache.org/jira/browse/IMPALA-12541
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 4.4.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Major
>
> A "Build ID" is a unique identifier for binaries (which is a hash of the 
> contents). Producing OS packages with separate debug symbols requires each 
> binary to have a Build ID. This is particularly important for libstdc++, 
> because it is produced during the native-toolchain build rather than the 
> regular Impala build. To turn on Build IDs, one can configure that at GCC 
> build time by specifying "--enable-linker-build-id". This causes GCC to tell 
> the linker to compute the Build ID.
> Breakpad will also use the Build ID when resolving symbols.






[jira] [Commented] (IMPALA-13121) Move the toolchain to a newer version of ccache

2024-06-25 Thread Joe McDonnell (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-13121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17859962#comment-17859962
 ] 

Joe McDonnell commented on IMPALA-13121:


{noformat}
commit b9167e985c69fd321e9e25e5ae0c7747682f06f6
Author: Joe McDonnell 
Date:   Fri May 31 15:20:20 2024 -0700

    IMPALA-13121: Switch to ccache 3.7.12
    
    The docker images currently build and use ccache 3.3.3.
    Recently, we ran into a case where debuginfo was being
    generated even though the cflags ended with -g0. The
    ccache release history has this note for 3.3.5:
     - Fixed a regression where the original order of
       debug options could be lost.
    
    This upgrades ccache to 3.7.12 to address this issue.
    
    Ccache 3.7.12 is the last ccache release that builds
    using autotools. Ccache 4 moves to build with CMake.
    Adding a CMake dependency would be complicated at this
    stage, because some of the older OSes don't provide a
    new enough CMake in the package repositories. Since we
    don't really need the new features of Ccache 4+, this
    sticks with 3.7.12 for now.
    
    This reenables the check_ccache_works() logic in
    assert-dependencies-present.py.
    
    Testing:
     - Built docker images and ran a toolchain build
     - The newer ccache resolves the unexpected debuginfo issue
    
    Change-Id: I90d751445daa0dc298b634c1049d637a14afac40
    Reviewed-on: http://gerrit.cloudera.org:8080/21473
    Reviewed-by: Michael Smith 
    Reviewed-by: Laszlo Gaal 
    Tested-by: Joe McDonnell 
{noformat}

> Move the toolchain to a newer version of ccache
> ---
>
> Key: IMPALA-13121
> URL: https://issues.apache.org/jira/browse/IMPALA-13121
> Project: IMPALA
>  Issue Type: Task
>  Components: Infrastructure
>Affects Versions: Impala 4.5.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Major
>
> The native-toolchain currently uses ccache 3.3.3. In a recent change adding 
> debug info, I ran into a case where the debug level was not what I expected. 
> I had added a -g0 at the end to turn off debug information for the cmake 
> build, but it still ended up with debug info.
> The release notes for ccache 3.3.5 says this:
>  * Fixed a regression where the original order of debug options could be 
> lost. This reverts the “Improved parsing of {{-g*}} options” feature in 
> ccache 3.3.
> [https://ccache.dev/releasenotes.html#_ccache_3_3_5]
> I think I may have been hitting that. We should upgrade ccache to a more 
> recent version.






[jira] [Resolved] (IMPALA-13121) Move the toolchain to a newer version of ccache

2024-06-25 Thread Joe McDonnell (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-13121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell resolved IMPALA-13121.

Fix Version/s: Impala 4.5.0
   Resolution: Fixed

> Move the toolchain to a newer version of ccache
> ---
>
> Key: IMPALA-13121
> URL: https://issues.apache.org/jira/browse/IMPALA-13121
> Project: IMPALA
>  Issue Type: Task
>  Components: Infrastructure
>Affects Versions: Impala 4.5.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Major
> Fix For: Impala 4.5.0
>
>
> The native-toolchain currently uses ccache 3.3.3. In a recent change adding 
> debug info, I ran into a case where the debug level was not what I expected. 
> I had added a -g0 at the end to turn off debug information for the cmake 
> build, but it still ended up with debug info.
> The release notes for ccache 3.3.5 says this:
>  * Fixed a regression where the original order of debug options could be 
> lost. This reverts the “Improved parsing of {{-g*}} options” feature in 
> ccache 3.3.
> [https://ccache.dev/releasenotes.html#_ccache_3_3_5]
> I think I may have been hitting that. We should upgrade ccache to a more 
> recent version.






[jira] [Resolved] (IMPALA-13146) Javascript tests sometimes fail to download NodeJS

2024-06-25 Thread Joe McDonnell (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-13146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell resolved IMPALA-13146.

Fix Version/s: Impala 4.5.0
   Resolution: Fixed

> Javascript tests sometimes fail to download NodeJS
> --
>
> Key: IMPALA-13146
> URL: https://issues.apache.org/jira/browse/IMPALA-13146
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 4.5.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Critical
>  Labels: broken-build, flaky
> Fix For: Impala 4.5.0
>
>
> For automated tests, sometimes the Javascript tests fail to download NodeJS:
> {noformat}
> 01:37:16 Fetching NodeJS v16.20.2-linux-x64 binaries ...
> 01:37:16   % Total% Received % Xferd  Average Speed   TimeTime 
> Time  Current
> 01:37:16  Dload  Upload   Total   Spent
> Left  Speed
> 01:37:16 
>   0 00 00 0  0  0 --:--:-- --:--:-- --:--:-- 0
>   0 00 00 0  0  0 --:--:--  0:00:01 --:--:-- 0
>   0 00 00 0  0  0 --:--:--  0:00:02 --:--:-- 0
>   0 21.5M0   9020 0293  0 21:23:04  0:00:03 21:23:01   293
> ...
>  30 21.5M   30 6776k    0     0  50307      0  0:07:28  0:02:17  0:05:11 23826
> 01:39:34 curl: (18) transfer closed with 15617860 bytes remaining to 
> read{noformat}
> If this keeps happening, we should mirror the NodeJS binary on the 
> native-toolchain s3 bucket.
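Until the binary is mirrored, retrying with resume is one way to survive transfers that close early, as with the curl error 18 above. A hedged Python sketch of that retry logic, with the fetch step injected so it can be shown without the network (the real script would fetch with a Range header or `curl -C -`):

```python
# Sketch of retry-with-resume for a flaky large download. The fetch function
# is injected; in practice it would issue an HTTP request from `offset`.

def download_with_resume(fetch, total_size, max_attempts=5):
    """fetch(offset) returns whatever bytes it managed to read starting at
    `offset` (possibly short, as when the server closes the transfer).
    Returns the full payload or raises after max_attempts tries."""
    buf = b""
    for _ in range(max_attempts):
        if len(buf) >= total_size:
            break
        buf += fetch(len(buf))           # resume where the last attempt died
    if len(buf) < total_size:
        raise IOError("download incomplete after retries")
    return buf

# Simulated flaky server: closes the transfer after 100 bytes every time.
payload = bytes(range(200)) * 2          # 400-byte stand-in for the tarball
flaky = lambda off: payload[off:off + 100]
assert download_with_resume(flaky, len(payload)) == payload
```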






[jira] [Commented] (IMPALA-12771) Impala catalogd events-skipped may mark the wrong number

2024-06-25 Thread Maxwell Guo (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-12771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17859856#comment-17859856
 ] 

Maxwell Guo commented on IMPALA-12771:
--

ping [~stigahuang][~mylogi...@gmail.com]

> Impala catalogd events-skipped may mark the wrong number
> 
>
> Key: IMPALA-12771
> URL: https://issues.apache.org/jira/browse/IMPALA-12771
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Reporter: Maxwell Guo
>Assignee: Maxwell Guo
>Priority: Minor
>
> See the description of [event-skipped 
> metric|https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java#L237]
>  
> {code:java}
>  // total number of events which are skipped because of the flag setting or
>   // in case of [CREATE|DROP] events on [DATABASE|TABLE|PARTITION] which were 
> ignored
>   // because the [DATABASE|TABLE|PARTITION] was already [PRESENT|ABSENT] in 
> the catalogd.
> {code}
>  
> For CREATE and DROP events on a Database/Table/Partition (AddPartition is 
> also included), when the database or table is not found in the cache, we skip 
> processing the event and increment the events-skipped metric by 1.
> But I found some issues here for ALTER TABLE and Reload events:
> * Reload events are not covered in the description of events-skipped, but the 
> metric is incremented by 1 when the event is an old event;
> * Besides, if the table is in the blacklist, the metric is also incremented.
> In summary, I think the description is inconsistent with the actual 
> implementation.
> So can we also mark the events-skipped metric for alter partition events, and 
> modify the description to cover all the skipped events?
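The mismatch being reported can be modeled with a toy sketch (illustrative Python; the event kinds and blacklist handling are simplified stand-ins for catalogd's real logic): the counter also advances for cases the doc comment never mentions, such as blacklisted tables.

```python
# Toy model of events-skipped accounting; not catalogd's real types.

events_skipped = 0

def process_event(event, cache, blacklist):
    global events_skipped
    kind, table = event
    if table in blacklist:               # +1 case NOT in the doc comment
        events_skipped += 1
        return "skipped:blacklist"
    if kind == "CREATE_TABLE" and table in cache:
        events_skipped += 1              # documented: already present
        return "skipped:present"
    if kind == "DROP_TABLE" and table not in cache:
        events_skipped += 1              # documented: already absent
        return "skipped:absent"
    return "processed"

cache, blacklist = {"t1"}, {"bad"}
assert process_event(("CREATE_TABLE", "t1"), cache, blacklist) == "skipped:present"
assert process_event(("RELOAD", "bad"), cache, blacklist) == "skipped:blacklist"
assert events_skipped == 2
```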






[jira] [Created] (IMPALA-13178) Flush the metadata cache to remote storage instead of just invalidating them in full GCs

2024-06-25 Thread Quanlong Huang (Jira)
Quanlong Huang created IMPALA-13178:
---

 Summary: Flush the metadata cache to remote storage instead of 
just invalidating them in full GCs
 Key: IMPALA-13178
 URL: https://issues.apache.org/jira/browse/IMPALA-13178
 Project: IMPALA
  Issue Type: Improvement
  Components: Catalog
Reporter: Quanlong Huang
Assignee: Quanlong Huang


When invalidate_tables_on_memory_pressure is enabled, catalogd will invalidate 
10% (configured by invalidate_tables_fraction_on_memory_pressure) of the tables 
if the old gen usage of JVM still exceeds 60% (configured by 
invalidate_tables_gc_old_gen_full_threshold) after a full GC.

Later if the table is used again, catalogd will try to load its metadata. The 
loading process could also lead to OOM (see IMPALA-13117).

On the other hand, the metadata might not have changed, so it is wasteful to 
evict and reload it again. Fetching all the partitions from HMS and listing 
files on the storage are expensive. It'd be better to flush out the metadata 
cache of a table instead of just invalidating it. If there are no more 
invalidations (either implicit ones from HMS event processing or explicit ones 
from user commands) on the table, we can reuse the flushed metadata.

They can be flushed to the remote storage (e.g. HDFS/Ozone/S3) so catalogd has 
unlimited space to use. We can consider just flushing out the 
encodedFileDescriptors (the file metadata) and incremental stats which are 
usually the majority of the metadata cache. Or use a well-defined format (e.g. 
Iceberg manifest files) so we can incrementally flush the metadata even with 
catalog changes (DDL/DMLs).
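The flush/reload path proposed above could look roughly like this. This is a minimal sketch only: a local temp directory stands in for the remote HDFS/Ozone/S3 store (a real implementation would go through the Hadoop FileSystem API), and all class and method names are hypothetical:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

/** Hypothetical "flush instead of invalidate" path for catalogd. */
public class MetadataFlushSketch {
  private final Path spillDir;  // stand-in for a remote storage location

  MetadataFlushSketch(Path spillDir) { this.spillDir = spillDir; }

  /** On memory pressure: persist the encoded file descriptors, then evict. */
  Path flushTable(String tableName, byte[] encodedFileDescriptors)
      throws IOException {
    Path target = spillDir.resolve(tableName + ".fds");
    Files.write(target, encodedFileDescriptors);
    return target;  // the cache can now drop its in-memory copy
  }

  /** On next access: reuse the flushed bytes if no invalidation happened. */
  byte[] reloadTable(String tableName) throws IOException {
    return Files.readAllBytes(spillDir.resolve(tableName + ".fds"));
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("catalog-spill");
    MetadataFlushSketch sketch = new MetadataFlushSketch(dir);
    byte[] fds = "encoded-file-descriptors".getBytes(StandardCharsets.UTF_8);
    sketch.flushTable("db.tbl", fds);
    byte[] restored = sketch.reloadTable("db.tbl");
    System.out.println(Arrays.equals(restored, fds)
        ? "flushed metadata reused" : "mismatch");
  }
}
```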






[jira] [Updated] (IMPALA-13142) Documentation for Impala StateStore & Catalogd HA

2024-06-25 Thread Sanjana Malhotra (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-13142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjana Malhotra updated IMPALA-13142:
--
Summary: Documentation for Impala StateStore & Catalogd HA  (was: 
Documentation for Impala StateStore HA)

> Documentation for Impala StateStore & Catalogd HA
> -
>
> Key: IMPALA-13142
> URL: https://issues.apache.org/jira/browse/IMPALA-13142
> Project: IMPALA
>  Issue Type: Documentation
>Reporter: Sanjana Malhotra
>Assignee: Sanjana Malhotra
>Priority: Major
>
> IMPALA-12156






[jira] [Commented] (IMPALA-13117) Improve the heap usage during metadata loading and DDL/DML executions

2024-06-25 Thread Quanlong Huang (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-13117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17859823#comment-17859823
 ] 

Quanlong Huang commented on IMPALA-13117:
-

Ideally the overhead of metadata loading, i.e. temp objects created during 
metadata loading, should be negligible compared to the HdfsTable itself. 
However, a heap dump taken during metadata loading reveals that we are holding 
the FileDescriptor objects until the parallel file metadata loading finishes.

!Selection_125.png|width=561,height=365!

Note that the table has a small-files issue, so the memory space is mostly 
occupied by file metadata. Each FileDescriptor object takes 256B, while the 
encodedFileDescriptor (the byte array inside it) takes only 160B.

The FileDescriptors are unwrapped after all the loads on all partitions are 
finished:
[https://github.com/apache/impala/blob/6632fd00e17867c9f8f40d6905feafa049368a98/fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java#L161]
[https://github.com/apache/impala/blob/6632fd00e17867c9f8f40d6905feafa049368a98/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java#L1585-L1586]

This introduces a ~60% memory overhead during metadata loading compared to the 
space actually needed to cache the metadata. We should unwrap the 
FileDescriptors promptly, just after they are generated.
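The suggested fix, encoding each descriptor into its compact form as soon as it is produced instead of holding the rich wrapper until every partition's load has finished, can be sketched as follows (stand-in types and methods, not Impala's real FileDescriptor API):

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Sketch of eager unwrapping during parallel file metadata loading. */
public class EagerEncodeSketch {
  /** Stand-in for the ~160B encoded form the cache ultimately keeps. */
  static byte[] encode(String fileName, long length) {
    return (fileName + "|" + length).getBytes(StandardCharsets.UTF_8);
  }

  /** Listing loop that never accumulates the larger rich wrappers. */
  static List<byte[]> loadPartition(List<String> listing) {
    List<byte[]> encoded = new ArrayList<>(listing.size());
    for (String fileName : listing) {
      // Any temporary wrapper built here becomes eligible for GC right
      // after this call, rather than after all partitions complete.
      encoded.add(encode(fileName, 1024L));
    }
    return encoded;
  }

  public static void main(String[] args) {
    List<byte[]> fds =
        loadPartition(List.of("part-00000.c000", "part-00001.c000"));
    System.out.println(fds.size() + " descriptors encoded");
  }
}
```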

> Improve the heap usage during metadata loading and DDL/DML executions
> -
>
> Key: IMPALA-13117
> URL: https://issues.apache.org/jira/browse/IMPALA-13117
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Reporter: Quanlong Huang
>Assignee: Quanlong Huang
>Priority: Critical
>  Labels: catalog-2024
> Attachments: Selection_125.png
>
>
> The JVM heap size of catalogd is not used only by the metadata cache. The 
> in-progress metadata loading threads and DDL/DML executions also create temp 
> objects, which introduce spikes in the heap usage. We should improve the 
> heap usage in this part, especially when the metadata loading is slow due to 
> external slowness (e.g. listing files on S3).
> CC [~mylogi...@gmail.com] 






[jira] [Updated] (IMPALA-13117) Improve the heap usage during metadata loading and DDL/DML executions

2024-06-25 Thread Quanlong Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-13117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang updated IMPALA-13117:

Attachment: Selection_125.png

> Improve the heap usage during metadata loading and DDL/DML executions
> -
>
> Key: IMPALA-13117
> URL: https://issues.apache.org/jira/browse/IMPALA-13117
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Reporter: Quanlong Huang
>Assignee: Quanlong Huang
>Priority: Critical
>  Labels: catalog-2024
> Attachments: Selection_125.png
>
>
> The JVM heap size of catalogd is not used only by the metadata cache. The 
> in-progress metadata loading threads and DDL/DML executions also create temp 
> objects, which introduce spikes in the heap usage. We should improve the 
> heap usage in this part, especially when the metadata loading is slow due to 
> external slowness (e.g. listing files on S3).
> CC [~mylogi...@gmail.com] 






[jira] [Updated] (IMPALA-13177) Compress encodedFileDescriptors inside the same partition

2024-06-25 Thread Quanlong Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-13177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang updated IMPALA-13177:

Description: 
File names under a table usually share some substrings, e.g. query id, job id, 
task id, etc. We can compress them to save some memory space. Especially in the 
case of a small-files issue, the memory footprint of the metadata cache is 
mostly occupied by encodedFileDescriptors.

An experiment shows that an HdfsTable with 67708 partitions and 3167561 files 
on S3 takes 605MB, 80% of which is spent in encodedFileDescriptors. Each 
encodedFileDescriptor is a byte array that takes 160B. Code:
[https://github.com/apache/impala/blob/6632fd00e17867c9f8f40d6905feafa049368a98/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java#L723]

Files of that table are created by Spark jobs. Here are some file names inside 
the same partition:
{noformat}
part-0-14015d2b-b534-4747-8c42-c83a7af0f006-71fda97e-a41d-488f-aa15-6fd9112b6c5b.c000
part-1-14015d2b-b534-4747-8c42-c83a7af0f006-71fda97e-a41d-488f-aa15-6fd9112b6c5b.c000
part-2-14015d2b-b534-4747-8c42-c83a7af0f006-71fda97e-a41d-488f-aa15-6fd9112b6c5b.c000
part-3-14015d2b-b534-4747-8c42-c83a7af0f006-71fda97e-a41d-488f-aa15-6fd9112b6c5b.c000
part-4-14015d2b-b534-4747-8c42-c83a7af0f006-71fda97e-a41d-488f-aa15-6fd9112b6c5b.c000
part-5-14015d2b-b534-4747-8c42-c83a7af0f006-71fda97e-a41d-488f-aa15-6fd9112b6c5b.c000
part-6-14015d2b-b534-4747-8c42-c83a7af0f006-71fda97e-a41d-488f-aa15-6fd9112b6c5b.c000
part-7-14015d2b-b534-4747-8c42-c83a7af0f006-71fda97e-a41d-488f-aa15-6fd9112b6c5b.c000
part-8-14015d2b-b534-4747-8c42-c83a7af0f006-71fda97e-a41d-488f-aa15-6fd9112b6c5b.c000
part-9-14015d2b-b534-4747-8c42-c83a7af0f006-71fda97e-a41d-488f-aa15-6fd9112b6c5b.c000
part-00010-14015d2b-b534-4747-8c42-c83a7af0f006-71fda97e-a41d-488f-aa15-6fd9112b6c5b.c000
part-00011-14015d2b-b534-4747-8c42-c83a7af0f006-71fda97e-a41d-488f-aa15-6fd9112b6c5b.c000
part-00012-14015d2b-b534-4747-8c42-c83a7af0f006-71fda97e-a41d-488f-aa15-6fd9112b6c5b.c000
part-00013-14015d2b-b534-4747-8c42-c83a7af0f006-71fda97e-a41d-488f-aa15-6fd9112b6c5b.c000
part-00014-14015d2b-b534-4747-8c42-c83a7af0f006-71fda97e-a41d-488f-aa15-6fd9112b6c5b.c000
part-00015-14015d2b-b534-4747-8c42-c83a7af0f006-71fda97e-a41d-488f-aa15-6fd9112b6c5b.c000
 {noformat}
By compressing the encodedFileDescriptors inside the same partition, we should 
be able to save a significant amount of memory in this case. Compressing all of 
them inside the same table might be even better, but it would impact 
performance when coordinators load specific partitions from catalogd.

We can consider doing this only for partitions whose number of files exceeds a 
threshold (e.g. 10).
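A quick way to gauge the potential savings is to deflate the concatenated file names of one partition with java.util.zip. This only measures how compressible the shared substrings are; the actual change would compress the stored descriptors themselves:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;

/** Measures DEFLATE savings on repetitive Spark-style file names. */
public class PartitionNameCompression {
  static byte[] deflate(byte[] raw) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (DeflaterOutputStream dos = new DeflaterOutputStream(
            bos, new Deflater(Deflater.BEST_COMPRESSION))) {
      dos.write(raw);
    }  // close() finishes the deflater and flushes the trailer
    return bos.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    // File names modeled on the partition listing above: only the
    // part number varies, the two UUIDs are shared by every file.
    String suffix = "-14015d2b-b534-4747-8c42-c83a7af0f006"
        + "-71fda97e-a41d-488f-aa15-6fd9112b6c5b.c000\n";
    StringBuilder names = new StringBuilder();
    for (int i = 0; i < 16; i++) {
      names.append(String.format("part-%05d%s", i, suffix));
    }
    byte[] raw = names.toString().getBytes(StandardCharsets.UTF_8);
    byte[] packed = deflate(raw);
    System.out.println(raw.length + " -> " + packed.length + " bytes");
  }
}
```

Because only the part number differs between names, DEFLATE's back-references collapse the repeated UUID suffixes, so the compressed size is a small fraction of the raw size.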

  was:
File names under a table usually share some substrings, e.g. query id, job id, 
task id, etc. We can compress them to save some memory space. Especially in the 
case of small files issue, the memory footprint of the metadata cache is 
occupied by encodedFileDescriptors.

An experiment shows that an HdfsTable with 67708 partitions and 3167561 files 
on S3 takes 605MB. 80% of it is spent in encodedFileDescriptors. Each 
encodedFileDescriptor is a byte array that takes 160B. Codes:
[https://github.com/apache/impala/blob/6632fd00e17867c9f8f40d6905feafa049368a98/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java#L723]

Files of that table are created by Spark jobs. An example file name: 
part-6-f7e5265d-5a63-4477-8954-ac6cbaef553b-face6153-588c-4b44-a277-2836396bc57a.c000
Here are some file names inside the same partition:
!Selection_124.png|width=410,height=172!

By compressing the encodedFileDescriptors inside the same partition, we should 
be able to save a significant memory space in this case. Compressing all of 
them inside the same table might be even better, but it impacts the performance 
when coordinator loading specific partitions from catalogd.

We can consider only do this for partitions whose number of files exceeds a 
threshold (e.g. 10).


> Compress encodedFileDescriptors inside the same partition
> -
>
> Key: IMPALA-13177
> URL: https://issues.apache.org/jira/browse/IMPALA-13177
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Reporter: Quanlong Huang
>Assignee: Quanlong Huang
>Priority: Critical
>  Labels: catalog-2024
> Attachments: Selection_124.png
>
>
> File names under a table usually share some substrings, e.g. query id, job 
> id, task id, etc. We can compress them to save some memory space. Especially 
> in the case of a small-files issue, the memory footprint of the metadata 
> cache is mostly occupied by encodedFileDescriptors.
> An experiment shows that an HdfsTable with 67708 partitions and 3167561 files 
> on S3