[jira] [Resolved] (IMPALA-8154) Disable auth_to_local by default
[ https://issues.apache.org/jira/browse/IMPALA-8154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Ho resolved IMPALA-8154. Resolution: Fixed Fix Version/s: Impala 3.2.0 > Disable auth_to_local by default > > > Key: IMPALA-8154 > URL: https://issues.apache.org/jira/browse/IMPALA-8154 > Project: IMPALA > Issue Type: Bug > Components: Distributed Exec >Affects Versions: Impala 2.12.0, Impala 3.1.0 >Reporter: Michael Ho >Assignee: Michael Ho >Priority: Major > Fix For: Impala 3.2.0 > > > Before KRPC, the local name mapping was derived entirely from the principal name. > However, when KRPC is enabled, Impala starts to use the system auth_to_local > rules, because "use_system_auth_to_local" is enabled by default. This can cause a > regression in cases where localauth is configured in krb5.conf, and it may > break connections between impalads after [this > commit|https://github.com/apache/impala/commit/5c541b960491ba91533712144599fb3b6d99521d] > The fix is to disable use_system_auth_to_local by default. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
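For context, an auth_to_local mapping of the kind that can trigger this regression looks like the following krb5.conf fragment. This is illustrative only; the realm name and rule are hypothetical and not taken from the report:

{noformat}
[realms]
  EXAMPLE.COM = {
    # Strip the host component and realm from two-part service principals,
    # e.g. impala/host1.example.com@EXAMPLE.COM -> impala
    auth_to_local = RULE:[2:$1@$0](.*@EXAMPLE\.COM)s/@.*//
    auth_to_local = DEFAULT
  }
{noformat}

With use_system_auth_to_local enabled, KRPC applies rules like these instead of deriving the local name purely from the principal, which is why the mapping seen by one impalad can differ from what another expects.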
[jira] [Resolved] (IMPALA-8214) Bad plan in load_nested.py
[ https://issues.apache.org/jira/browse/IMPALA-8214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong resolved IMPALA-8214. --- Resolution: Fixed Fix Version/s: Impala 3.2.0 > Bad plan in load_nested.py > -- > > Key: IMPALA-8214 > URL: https://issues.apache.org/jira/browse/IMPALA-8214 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.1.0 >Reporter: Tim Armstrong >Assignee: Tim Armstrong >Priority: Major > Fix For: Impala 3.2.0 > > > The plan for the below SQL, which is executed without stats, has the larger > input on the build side of the join and does a broadcast join, which is very > suboptimal. This causes high memory consumption when loading larger scale > factors, and generally makes the loading process slower than necessary. We > should flip the join and make it a shuffle join. > https://github.com/apache/impala/blob/d481cd4/testdata/bin/load_nested.py#L123 > {code} > tmp_customer_sql = r""" > SELECT > c_custkey, c_name, c_address, c_nationkey, c_phone, c_acctbal, > c_mktsegment, > c_comment, > GROUP_CONCAT( > CONCAT( > CAST(o_orderkey AS STRING), '\003', > CAST(o_orderstatus AS STRING), '\003', > CAST(o_totalprice AS STRING), '\003', > CAST(o_orderdate AS STRING), '\003', > CAST(o_orderpriority AS STRING), '\003', > CAST(o_clerk AS STRING), '\003', > CAST(o_shippriority AS STRING), '\003', > CAST(o_comment AS STRING), '\003', > CAST(lineitems_string AS STRING) > ), '\002' > ) orders_string > FROM {source_db}.customer > LEFT JOIN tmp_orders_string ON c_custkey = o_custkey > WHERE c_custkey % {chunks} = {chunk_idx} > GROUP BY 1, 2, 3, 4, 5, 6, 7, 8""".format(**sql_params) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
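As a sketch of the kind of change the description suggests, Impala's join hints can override the planner's choice when stats are missing. This shows only the hint syntax, not necessarily the actual patch, and the column list is abbreviated from the query above:

{code}
SELECT STRAIGHT_JOIN   -- respect the FROM-clause join order
  c_custkey, c_name, orders_string
FROM {source_db}.customer
  LEFT JOIN /* +SHUFFLE */ tmp_orders_string ON c_custkey = o_custkey
{code}

The /* +SHUFFLE */ hint requests a partitioned (shuffle) join instead of a broadcast join, which avoids replicating the large tmp_orders_string input to every node.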
[jira] [Commented] (IMPALA-7161) Bootstrap's handling of JAVA_HOME needs improvement
[ https://issues.apache.org/jira/browse/IMPALA-7161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16773654#comment-16773654 ] ASF subversion and git services commented on IMPALA-7161: - Commit 6dabf9d9d3f6b4dc9ba287bbc8d712186589597f in impala's branch refs/heads/2.x from Joe McDonnell [ https://gitbox.apache.org/repos/asf?p=impala.git;h=6dabf9d ] IMPALA-7161: Fix impala-config.sh's handling of JAVA_HOME It is common for developers to specify JAVA_HOME in bin/impala-config-local.sh, so wait until after it is sourced to validate JAVA_HOME. Also, try harder to auto-detect the system's JAVA_HOME in case it has not been specified in the environment. Here is a run through of different scenarios: 1. Not set in environment, not set in impala-config-local.sh: Didn't work before, now tries to autodetect by looking for javac on the PATH 2. Set in environment, not set in impala-config-local.sh: No change 3. Not set in environment, set in impala-config-local.sh: Didn't work before, now works 4. Set in environment and set in impala-config-local.sh: This used to be potentially inconsistent (i.e. JAVA comes from the environment's JAVA_HOME, but JAVA_HOME is overwritten by impala-config-local.sh), now it always uses the value from impala-config-local.sh. Change-Id: Idf3521b4f44fdbdc841a90fd00c477c9423a75bb Reviewed-on: http://gerrit.cloudera.org:8080/10702 Reviewed-by: Philip Zeyliger Tested-by: Impala Public Jenkins > Bootstrap's handling of JAVA_HOME needs improvement > --- > > Key: IMPALA-7161 > URL: https://issues.apache.org/jira/browse/IMPALA-7161 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 2.13.0, Impala 3.1.0 >Reporter: Joe McDonnell >Assignee: Joe McDonnell >Priority: Major > Fix For: Impala 3.1.0 > > > bin/bootstrap_system.sh installs the Java SDK and sets JAVA_HOME in the > current shell. It also adds a command to the bin/impala-config-local.sh to > export JAVA_HOME there. This doesn't do the job. 
> bin/impala-config.sh tests for JAVA_HOME at the very start of the script, > before it has sourced bin/impala-config-local.sh. So, the user doesn't have a > way of developing over the long term without manually setting up JAVA_HOME. > bin/impala-config.sh also doesn't detect the system JAVA_HOME. For Ubuntu > 16.04, this is fairly simple, and if a developer has their system JDK set up > appropriately, it would make sense to use it. For example: > > {noformat} > # If javac exists, then the system has a Java SDK (JRE does not have javac). > # Follow the symbolic links and use this to determine the system's JAVA_HOME. > if [ -L /usr/bin/javac ]; then > SYSTEM_JAVA_HOME=$(readlink -f /usr/bin/javac | sed "s:bin/javac::") > fi > export JAVA_HOME="${JAVA_HOME:-${SYSTEM_JAVA_HOME}}"{noformat} > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IMPALA-8222) Timeout calculation in stress test doesn't make sense
[ https://issues.apache.org/jira/browse/IMPALA-8222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong resolved IMPALA-8222. --- Resolution: Fixed Fix Version/s: Impala 3.2.0 > Timeout calculation in stress test doesn't make sense > - > > Key: IMPALA-8222 > URL: https://issues.apache.org/jira/browse/IMPALA-8222 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Reporter: Tim Armstrong >Assignee: Tim Armstrong >Priority: Major > Fix For: Impala 3.2.0 > > > There is some logic in the stress test that tries to guess what a reasonable > timeout for a query is. There are enough fudge factors that the false > positive rate is fairly low, but it also doesn't provide much useful coverage > unless a query is stuck. But an overall job timeout achieves the same thing. > Some specific issues that the current logic has (and which are tricky to > solve): > * The number of concurrent queries is calculated at query submission time. > E.g. a query that starts before a large batch of other queries is submitted > will be given a short timeout multiplier. > * There is no guarantee that performance degrades linearly. E.g. if runtime > filters arrive late, we can see much larger perf hits. > We should consider removing the timeout enforcement or at least revisit it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IMPALA-7119) HBase tests failing with RetriesExhausted and "RuntimeException: couldn't retrieve HBase table"
[ https://issues.apache.org/jira/browse/IMPALA-7119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16773655#comment-16773655 ] ASF subversion and git services commented on IMPALA-7119: - Commit 9fdb93987cf13f346ad56c1b273a1e0fed86fd10 in impala's branch refs/heads/2.x from Joe McDonnell [ https://gitbox.apache.org/repos/asf?p=impala.git;h=9fdb939 ] IMPALA-7119: Restart whole minicluster when HDFS replication stalls After loading data, we wait for HDFS to replicate all of the blocks appropriately. If this takes too long, we restart HDFS. However, HBase can fail if HDFS is restarted and HBase is unable to write its logs. In general, there is no real reason to keep HBase and the other minicluster components running while restarting HDFS. This changes the HDFS health check to restart the whole minicluster and Impala rather than just HDFS. Testing: - Tested with a modified version that always does the restart in the HDFS health check and verified that the tests pass Change-Id: I58ffe301708c78c26ee61aa754a06f46c224c6e2 Reviewed-on: http://gerrit.cloudera.org:8080/10665 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > HBase tests failing with RetriesExhausted and "RuntimeException: couldn't > retrieve HBase table" > --- > > Key: IMPALA-7119 > URL: https://issues.apache.org/jira/browse/IMPALA-7119 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 2.13.0 >Reporter: Tim Armstrong >Assignee: Joe McDonnell >Priority: Major > Labels: broken-build, flaky > Fix For: Impala 3.1.0 > > > 64820211a2d30238093f1c4cd03bc268e3a01638 > {noformat} > > metadata.test_compute_stats.TestHbaseComputeStats.test_hbase_compute_stats_incremental[exec_option: > {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000, > 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, > 'exec_single_node_rows_threshold': 0} | table_format: hbase/none] > > 
metadata.test_compute_stats.TestHbaseComputeStats.test_hbase_compute_stats[exec_option: > {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000, > 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, > 'exec_single_node_rows_threshold': 0} | table_format: hbase/none] > query_test.test_mt_dop.TestMtDop.test_mt_dop[mt_dop: 1 | exec_option: > {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, > 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, > 'exec_single_node_rows_threshold': 0} | table_format: hbase/none] > query_test.test_mt_dop.TestMtDop.test_compute_stats[mt_dop: 1 | > exec_option: {'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, > 'abort_on_error': 1, 'debug_action': None, 'exec_single_node_rows_threshold': > 0} | table_format: hbase/none] > > query_test.test_hbase_queries.TestHBaseQueries.test_hbase_scan_node[exec_option: > {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, > 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, > 'exec_single_node_rows_threshold': 0} | table_format: hbase/none] > query_test.test_queries.TestHdfsQueries.test_file_partitions[exec_option: > {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, > 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, > 'exec_single_node_rows_threshold': 0} | table_format: hbase/none] > query_test.test_mt_dop.TestMtDop.test_mt_dop[mt_dop: 0 | exec_option: > {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, > 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, > 'exec_single_node_rows_threshold': 0} | table_format: hbase/none] > query_test.test_observability.TestObservability.test_scan_summary > query_test.test_mt_dop.TestMtDop.test_compute_stats[mt_dop: 0 | > exec_option: {'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, > 
'abort_on_error': 1, 'debug_action': None, 'exec_single_node_rows_threshold': > 0} | table_format: hbase/none] > failure.test_failpoints.TestFailpoints.test_failpoints[table_format: > hbase/none | exec_option: {'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, > 'abort_on_error': 1, 'debug_action': None, 'exec_single_node_rows_threshold': > 0} | mt_dop: 4 | location: GETNEXT_SCANNER | action: FAIL | query: select 1 > from alltypessmall order by id limit 100] > failure.test_failpoints.TestFailpoints.test_failpoints[table_format: > hbase/none | exec_option: {'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshold': 0,
[jira] [Commented] (IMPALA-6662) Make stress test resilient to hangs due to client crashes
[ https://issues.apache.org/jira/browse/IMPALA-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16773657#comment-16773657 ] ASF subversion and git services commented on IMPALA-6662: - Commit 95414528199011716af0c55ac9c11eb69fb442b7 in impala's branch refs/heads/master from Tim Armstrong [ https://gitbox.apache.org/repos/asf?p=impala.git;h=9541452 ] IMPALA-6662: Make stress test resilient to hangs due to client crashes Thanks to Sailesh Mukil for the initial version of this patch. The concurrent_select.py process starts multiple sub processes (called query runners), to run the queries. It also starts 2 threads called the query producer thread and the query consumer thread. The query producer thread adds queries to a query queue and the query consumer thread pulls off the queue and feeds the queries to the query runners. The query runner, once it gets queries, does the following: ... with _submit_query_lock: increment(num_queries_started) run_query()  # One runner crashes here. increment(num_queries_finished) ... One of the runners crashes inside run_query(), thereby never incrementing num_queries_finished. Another thread that's supposed to check for memory leaks (but actually doesn't), periodically acquires '_submit_query_lock' and waits for the number of running queries to reach 0 before releasing the lock. However, in the above case, the number of running queries will never reach 0 because one of the query runners hasn't incremented 'num_queries_finished' and exited. Therefore, the poll_mem_usage() function will hold the lock indefinitely, causing no new queries to be submitted and preventing the stress test from completing. This patch fixes the problem by changing the global trackers of num_queries_started and num_queries_finished, etc. to a per QueryRunner basis. Anytime we want to find the total number of queries started/finished/cancelled, etc., we aggregate the values from all the runners.
We synchronize access by adding a new lock called the _query_runners_lock. In _wait_for_test_to_finish(), we periodically check if a QueryRunner has died, and if it has, we make sure to update the num_queries_finished to num_queries_started, since it may have died before updating the 'finished' value, and we also count the error. Other changes: * Boilerplate code is reduced by storing all metrics in a dictionary keyed by the metric name, instead of stamping out the code for 10+ variables. * Added more comments and debug strings * Reformatted some code. Testing: Ran the stress test with the new patch locally and against a cluster. Change-Id: I525bf13e0f3dd660c0d9f5c2bf6eb292e7ebb8af Reviewed-on: http://gerrit.cloudera.org:8080/12521 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > Make stress test resilient to hangs due to client crashes > - > > Key: IMPALA-6662 > URL: https://issues.apache.org/jira/browse/IMPALA-6662 > Project: IMPALA > Issue Type: Improvement > Components: Infrastructure >Reporter: Sailesh Mukil >Assignee: Tim Armstrong >Priority: Major > Fix For: Impala 3.2.0 > > > The concurrent_select.py process starts multiple sub processes (called query > runners), to run the queries. It also starts 2 threads called the query > producer thread and the query consumer thread. The query producer thread adds > queries to a query queue and the query consumer thread pulls off the queue > and feeds the queries to the query runners. > The query runner, once it gets queries, does the following: > {code:java} > (pseudo code. Real code here: > https://github.com/apache/impala/blob/d49f629c447ea59ad73ceeb0547fde4d41c651d1/tests/stress/concurrent_select.py#L583-L595) > with _submit_query_lock: > increment(num_queries_started) > run_query()# One runner crashes here. > increment(num_queries_finished) > {code} > One of the runners crashes inside run_query(), thereby never incrementing > num_queries_finished.
> Another thread that's supposed to check for memory leaks (but actually > doesn't), periodically acquires '_submit_query_lock' and waits for the number > of running queries to reach 0 before releasing the lock: > https://github.com/apache/impala/blob/d49f629c447ea59ad73ceeb0547fde4d41c651d1/tests/stress/concurrent_select.py#L449-L511 > However, in the above case, the number of running queries will never reach 0 > because one of the query runners hasn't incremented 'num_queries_finished' > and exited. Therefore, the poll_mem_usage() function will hold the lock > indefinitely, causing no new queries to be submitted, nor the stress test to > complete running. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
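The per-runner metric scheme the commit message describes can be sketched as follows. This is a hypothetical simplification, not the actual concurrent_select.py code: each runner owns its own counters, totals are aggregated under a lock, and a dead runner's counters are reconciled so a crash cannot wedge a global count.

```python
import threading

class QueryRunner:
    def __init__(self):
        # Metrics keyed by name, mirroring the dictionary-based approach
        # described in the commit message.
        self.metrics = {"queries_started": 0, "queries_finished": 0}

    def record_start(self):
        self.metrics["queries_started"] += 1

    def record_finish(self):
        self.metrics["queries_finished"] += 1

# Guards reads and reconciliation across all runners.
_query_runners_lock = threading.Lock()

def total_metric(runners, name):
    """Aggregate one metric across all runners."""
    with _query_runners_lock:
        return sum(r.metrics[name] for r in runners)

def reconcile_dead_runner(runner):
    """If a runner died mid-query, count its started-but-unfinished
    queries as finished so the test can keep making progress."""
    with _query_runners_lock:
        runner.metrics["queries_finished"] = runner.metrics["queries_started"]
```

Because no single global counter depends on every runner surviving, a crashed runner only requires one reconciliation step instead of blocking submission forever.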
[jira] [Commented] (IMPALA-7450) catalogd should use thread names to make jstack more readable
[ https://issues.apache.org/jira/browse/IMPALA-7450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16773658#comment-16773658 ] ASF subversion and git services commented on IMPALA-7450: - Commit 540278d57f9d44917c47ea070169c084cdf6dd61 in impala's branch refs/heads/master from Todd Lipcon [ https://gitbox.apache.org/repos/asf?p=impala.git;h=540278d ] IMPALA-7450. Set thread name during refresh/load operations This adds a small utility class for annotating the current thread's name during potentially long-running operations such as refresh/load. With this change, jstack output now includes useful thread names like: During startup: "main [invalidating metadata - 128/428 dbs complete]" While loading a fresh table: "pool-4-thread-12 [Loading metadata for: foo_db.foo_table] [Loading metadata for all partition(s) of foo_db.foo_table]" Pool refreshing metadata for a particular path: "pool-23-thread-5 [Refreshing file metadata for path: hdfs://nameservice1/path/to/partdir..." Tests: Verified the patch manually by jstacking a catalogd while performing some workload. Also added a simple unit test to verify the thread renaming behavior. Change-Id: Ic7c850d6bb2eedc375ee567c19eb17add335f60c Reviewed-on: http://gerrit.cloudera.org:8080/11228 Reviewed-by: Bharath Vissapragada Tested-by: Impala Public Jenkins > catalogd should use thread names to make jstack more readable > - > > Key: IMPALA-7450 > URL: https://issues.apache.org/jira/browse/IMPALA-7450 > Project: IMPALA > Issue Type: Improvement > Components: Catalog >Reporter: Todd Lipcon >Assignee: Todd Lipcon >Priority: Major > Labels: supportability > > Currently when long refresh or DDL operations are being processed, it's hard > to understand what's going on when looking at a jstack. We should have such > potentially-long-running operations temporarily modify the current thread's > name to indicate what action is being taken so we can debug more easily. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
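The thread-naming trick above is Java, but the idea is language-agnostic: temporarily annotate the current thread's name for the duration of a long-running operation so stack dumps are self-describing. A minimal Python analogue (hypothetical, not the Impala utility itself):

```python
import threading
from contextlib import contextmanager

@contextmanager
def thread_name_annotation(annotation):
    """Append an annotation to the current thread's name while a
    long-running operation is in progress, then restore the old name."""
    t = threading.current_thread()
    saved = t.name
    t.name = "%s [%s]" % (saved, annotation)
    try:
        yield
    finally:
        # Always restore, even if the operation raises.
        t.name = saved
```

Usage mirrors the catalogd examples: wrap a refresh or load call in `with thread_name_annotation("Loading metadata for: foo_db.foo_table"):` and any stack dump taken during the operation shows the annotated name.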
[jira] [Resolved] (IMPALA-6662) Make stress test resilient to hangs due to client crashes
[ https://issues.apache.org/jira/browse/IMPALA-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong resolved IMPALA-6662. --- Resolution: Fixed Fix Version/s: Impala 3.2.0 Thanks [~sailesh] for doing most of the work here. > Make stress test resilient to hangs due to client crashes > - > > Key: IMPALA-6662 > URL: https://issues.apache.org/jira/browse/IMPALA-6662 > Project: IMPALA > Issue Type: Improvement > Components: Infrastructure >Reporter: Sailesh Mukil >Assignee: Tim Armstrong >Priority: Major > Fix For: Impala 3.2.0 > > > The concurrent_select.py process starts multiple sub processes (called query > runners), to run the queries. It also starts 2 threads called the query > producer thread and the query consumer thread. The query producer thread adds > queries to a query queue and the query consumer thread pulls off the queue > and feeds the queries to the query runners. > The query runner, once it gets queries, does the following: > {code:java} > (pseudo code. Real code here: > https://github.com/apache/impala/blob/d49f629c447ea59ad73ceeb0547fde4d41c651d1/tests/stress/concurrent_select.py#L583-L595) > with _submit_query_lock: > increment(num_queries_started) > run_query()# One runner crashes here. > increment(num_queries_finished) > {code} > One of the runners crashes inside run_query(), thereby never incrementing > num_queries_finished. > Another thread that's supposed to check for memory leaks (but actually > doesn't), periodically acquires '_submit_query_lock' and waits for the number > of running queries to reach 0 before releasing the lock: > https://github.com/apache/impala/blob/d49f629c447ea59ad73ceeb0547fde4d41c651d1/tests/stress/concurrent_select.py#L449-L511 > However, in the above case, the number of running queries will never reach 0 > because one of the query runners hasn't incremented 'num_queries_finished' > and exited.
Therefore, the poll_mem_usage() function will hold the lock > indefinitely, causing no new queries to be submitted, nor the stress test to > complete running. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IMPALA-8236) Adding new TUnit values to runtime profile is broken
[ https://issues.apache.org/jira/browse/IMPALA-8236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16773605#comment-16773605 ] Tim Armstrong commented on IMPALA-8236: --- Yes, it would appear so. > Adding new TUnit values to runtime profile is broken > > > Key: IMPALA-8236 > URL: https://issues.apache.org/jira/browse/IMPALA-8236 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.12.0, Impala 3.1.0 >Reporter: Tim Armstrong >Priority: Major > Labels: observability > > See IMPALA-8235 for context. > The problem this tracks is that there's no way to add a new value to the > TUnit enum, which is required in TCounter, without breaking existing readers > (at least Java readers), which will fail to decode TCounter values with an > enum value they don't recognise. > The workaround is to only use the following units in anything that is > serialised into a TCounter: > {noformat} > UNIT, > UNIT_PER_SECOND, > CPU_TICKS, > BYTES, > BYTES_PER_SECOND, > TIME_NS, > DOUBLE_VALUE > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IMPALA-8236) Adding new TUnit values to runtime profile is broken
[ https://issues.apache.org/jira/browse/IMPALA-8236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16773600#comment-16773600 ] Lars Volker commented on IMPALA-8236: - Doesn't this also apply to old readers for other structs that require a TUnit, e.g. TSummaryStatsCounter or TTimeSeriesCounter? > Adding new TUnit values to runtime profile is broken > > > Key: IMPALA-8236 > URL: https://issues.apache.org/jira/browse/IMPALA-8236 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.12.0, Impala 3.1.0 >Reporter: Tim Armstrong >Priority: Major > Labels: observability > > See IMPALA-8235 for context. > The problem this tracks is that there's no way to add a new value to the > TUnit enum, which is required in TCounter, without breaking existing readers > (at least Java readers), which will fail to decode TCounter values with an > enum value they don't recognise. > The workaround is to only use the following units in anything that is > serialised into a TCounter: > {noformat} > UNIT, > UNIT_PER_SECOND, > CPU_TICKS, > BYTES, > BYTES_PER_SECOND, > TIME_NS, > DOUBLE_VALUE > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IMPALA-8236) Adding new TUnit values to runtime profile is broken
[ https://issues.apache.org/jira/browse/IMPALA-8236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong updated IMPALA-8236: -- Affects Version/s: Impala 2.12.0 Impala 3.1.0 > Adding new TUnit values to runtime profile is broken > > > Key: IMPALA-8236 > URL: https://issues.apache.org/jira/browse/IMPALA-8236 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.12.0, Impala 3.1.0 >Reporter: Tim Armstrong >Priority: Major > Labels: observability > > See IMPALA-8235 for context. > The problem this tracks is that there's no way to add a new value to the > TUnit enum, which is required in TCounter, without breaking existing readers > (at least Java readers), which will fail to decode TCounter values with an > enum value they don't recognise. > The workaround is to only use the following units in anything that is > serialised into a TCounter: > {noformat} > UNIT, > UNIT_PER_SECOND, > CPU_TICKS, > BYTES, > BYTES_PER_SECOND, > TIME_NS, > DOUBLE_VALUE > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IMPALA-8236) Adding new TUnit values to runtime profile is broken
[ https://issues.apache.org/jira/browse/IMPALA-8236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong updated IMPALA-8236: -- Labels: observability (was: ) > Adding new TUnit values to runtime profile is broken > > > Key: IMPALA-8236 > URL: https://issues.apache.org/jira/browse/IMPALA-8236 > Project: IMPALA > Issue Type: Bug > Components: Backend >Reporter: Tim Armstrong >Priority: Major > Labels: observability > > See IMPALA-8235 for context. > The problem this tracks is that there's no way to add a new value to the > TUnit enum, which is required in TCounter, without breaking existing readers > (at least Java readers), which will fail to decode TCounter values with an > enum value they don't recognise. > The workaround is to only use the following units in anything that is > serialised into a TCounter: > {noformat} > UNIT, > UNIT_PER_SECOND, > CPU_TICKS, > BYTES > BYTES_PER_SECOND, > TIME_NS, > DOUBLE_VALUE > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-8236) Adding new TUnit values to runtime profile is broken
Tim Armstrong created IMPALA-8236: - Summary: Adding new TUnit values to runtime profile is broken Key: IMPALA-8236 URL: https://issues.apache.org/jira/browse/IMPALA-8236 Project: IMPALA Issue Type: Bug Components: Backend Reporter: Tim Armstrong See IMPALA-8235 for context. The problem this tracks is that there's no way to add a new value to the TUnit enum, which is required in TCounter, without breaking existing readers (at least Java readers), which will fail to decode TCounter values with an enum value they don't recognise. The workaround is to only use the following units in anything that is serialised into a TCounter: {noformat} UNIT, UNIT_PER_SECOND, CPU_TICKS, BYTES BYTES_PER_SECOND, TIME_NS, DOUBLE_VALUE {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
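The failure mode described above can be modeled outside Thrift entirely. Below is a minimal Python sketch (not Impala code; the unit table and function name are illustrative) of a strict generated reader that maps an unknown enum wire value to null and then enforces the required field:

```python
# Minimal model of the strict-reader failure described above (illustrative,
# not Impala code): a generated reader maps an unknown TUnit wire value to
# None, then rejects the TCounter because 'unit' is a required field.
KNOWN_UNITS = {0: "UNIT", 1: "UNIT_PER_SECOND", 2: "CPU_TICKS", 3: "BYTES",
               4: "BYTES_PER_SECOND", 5: "TIME_NS", 6: "DOUBLE_VALUE"}

def decode_counter(name, unit_wire_value):
    unit = KNOWN_UNITS.get(unit_wire_value)  # unknown enum value -> None
    if unit is None:
        # Mirrors the "required field unset" exception thrown by old readers.
        raise ValueError("TCounter: required field 'unit' was not present")
    return {"name": name, "unit": unit}

decode_counter("BytesRead", 3)   # a unit from the safe list decodes fine
# decode_counter("New", 99)      # a newly added unit would raise ValueError
```

Writers that stick to the units in the workaround list never emit a wire value outside an old reader's table, which is why the workaround holds.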
[jira] [Commented] (IMPALA-8229) Resolve CMake errors when open Impala project by CLion on Mac
[ https://issues.apache.org/jira/browse/IMPALA-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16773580#comment-16773580 ] Quanlong Huang commented on IMPALA-8229: [~stakiar], I'd like to share my steps. When I say "work" I just mean CLion is able to provide code navigation, e.g. jump to definitions, show call hierarchy, jump to the implementation of interfaces, etc. CLion on Mac is unable to build or debug Impala. There are still some red lines (errors), but they don't affect reading the code. My change is big and tedious: just keep resolving CMake errors until CLion is able to update symbols and build the index. It only needs to be done once, and then you can check out your branch. The next time you open CLion it will reuse the symbols unless you reload the CMake project. I need some time to do this again since I have reverted the dirty changes I made... I will share a working branch for CLion later. > Resolve CMake errors when open Impala project by CLion on Mac > - > > Key: IMPALA-8229 > URL: https://issues.apache.org/jira/browse/IMPALA-8229 > Project: IMPALA > Issue Type: Improvement >Reporter: Quanlong Huang >Priority: Major > > I'm happy to develop Impala in CLion on Mac. It might encourage more people > to join the community since it can significantly lower the threshold for > Impala development. > My normal workflow is > * Understand relative codes in CLion and make changes. > * Generate patches, then debug on my remote Ubuntu machine. > To make CLion works, I comment out some requirements (e.g. llvm) in > CMakeLists.txt, modify impala-config.sh (e.g. for JNI, versions for Darwin), > then copy the generated sources of thrift and some header files of > native-toolchain from my Ubuntu machine. However, it's not an elegant > solution. We need more efforts for this. > Creating the ticket first to see if anyone needs this too.
[jira] [Updated] (IMPALA-8229) Resolve CMake errors when open Impala project by CLion on Mac
[ https://issues.apache.org/jira/browse/IMPALA-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Quanlong Huang updated IMPALA-8229: --- Description: I'm happy to develop Impala in CLion on Mac. It might encourage more people to join the community since it can significantly lower the threshold for Impala development. My normal workflow is * Understand relative codes in CLion and make changes. * Generate patches, then debug on my remote Ubuntu machine. To make CLion works, I comment out some requirements (e.g. llvm) in CMakeLists.txt, modify impala-config.sh (e.g. for JNI, versions for Darwin), then copy the generated sources of thrift and some header files of native-toolchain from my Ubuntu machine. However, it's not an elegant solution. We need more efforts for this. Creating the ticket first to see if anyone needs this too. was: I'm happy to develop Impala in CLion in Mac. It might encourage more people to join the community since it can significantly lower the threshold for Impala development. My normal workflow is * Understand relative codes in CLion and make changes. * Generate patches, then debug on my remote Ubuntu machine. To make CLion works, I comment out some requirements (e.g. llvm) in CMakeLists.txt, modify impala-config.sh (e.g. for JNI, versions for Darwin), then copy the generated sources of thrift and some header files of native-toolchain from my Ubuntu machine. However, it's not an elegant solution. We need more efforts for this. Creating the ticket first to see if anyone needs this too. Summary: Resolve CMake errors when open Impala project by CLion on Mac (was: Resolve CMake errors when open Impala project by CLion in Mac) > Resolve CMake errors when open Impala project by CLion on Mac > - > > Key: IMPALA-8229 > URL: https://issues.apache.org/jira/browse/IMPALA-8229 > Project: IMPALA > Issue Type: Improvement >Reporter: Quanlong Huang >Priority: Major > > I'm happy to develop Impala in CLion on Mac. 
It might encourage more people > to join the community since it can significantly lower the threshold for > Impala development. > My normal workflow is > * Understand relative codes in CLion and make changes. > * Generate patches, then debug on my remote Ubuntu machine. > To make CLion works, I comment out some requirements (e.g. llvm) in > CMakeLists.txt, modify impala-config.sh (e.g. for JNI, versions for Darwin), > then copy the generated sources of thrift and some header files of > native-toolchain from my Ubuntu machine. However, it's not an elegant > solution. We need more efforts for this. > Creating the ticket first to see if anyone needs this too. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-6812) Kudu scans not returning all rows
[ https://issues.apache.org/jira/browse/IMPALA-6812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16773557#comment-16773557 ] ASF subversion and git services commented on IMPALA-6812: - Commit f9c2a67566facbbbd43a1b615d869f5b4decca50 in impala's branch refs/heads/2.x from Thomas Tauber-Marshall [ https://gitbox.apache.org/repos/asf?p=impala.git;h=f9c2a67 ] IMPALA-6812: Fix flaky Kudu scan tests Many of our Kudu related tests have been flaky with the symptom that scans appear to not return rows that were just inserted. This occurs because our default Kudu scan level of READ_LATEST doesn't make any consistency guarantees. This patch adds a query option 'kudu_read_mode', which overrides the startup flag of the same name, and then set that option to READ_AT_SNAPSHOT for all tests with Kudu inserts and scans, which should give us more consistent test results. Testing: - Passed a full exhaustive run. Does not appear to increase time to run by any significant amount. Change-Id: I70df84f2cbc663107f2ad029565d3c15bdfbd47c Reviewed-on: http://gerrit.cloudera.org:8080/10503 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins Reviewed-on: http://gerrit.cloudera.org:8080/12513 Reviewed-by: Thomas Marshall > Kudu scans not returning all rows > - > > Key: IMPALA-6812 > URL: https://issues.apache.org/jira/browse/IMPALA-6812 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 3.0, Impala 2.13.0 >Reporter: Tianyi Wang >Assignee: Thomas Tauber-Marshall >Priority: Critical > Labels: broken-build > Fix For: Impala 3.1.0 > > > In a 2.x exhaustive build, test_column_storage_attributes failed: > {noformat} > Error Message > query_test/test_kudu.py:383: in test_column_storage_attributes assert > cursor.fetchall() == \ E assert [] == [(26, True, 0, 0, 0, 0, ...)] E > Right contains more items, first extra item: (26, True, 0, 0, 0, 0, ...) 
E > Use -v to get the full diff > Stacktrace > query_test/test_kudu.py:383: in test_column_storage_attributes > assert cursor.fetchall() == \ > E assert [] == [(26, True, 0, 0, 0, 0, ...)] > E Right contains more items, first extra item: (26, True, 0, 0, 0, 0, ...) > E Use -v to get the full diff > {noformat} > The last alter column query in the log is: > {noformat} > alter table test_column_storage_attributes_b9040aa.storage_attrs alter > column decimal_col > set encoding DICT_ENCODING compression NO_COMPRESSION > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
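The fix described in the commit above is driven from test SQL. A hedged sketch of how a test pins snapshot consistency (the option name and value come from the commit message; the table name is hypothetical):

```sql
-- Sketch: force snapshot-consistent Kudu scans so a scan sees rows the same
-- test just inserted. 'kudu_read_mode' is the query option added by the
-- commit above; my_kudu_tbl is a hypothetical table.
SET kudu_read_mode=READ_AT_SNAPSHOT;
INSERT INTO my_kudu_tbl VALUES (26, true);
SELECT count(*) FROM my_kudu_tbl;
```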
[jira] [Commented] (IMPALA-8191) TestBreakpadExhaustive.test_minidump_creation fails to kill cluster
[ https://issues.apache.org/jira/browse/IMPALA-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16773558#comment-16773558 ] ASF subversion and git services commented on IMPALA-8191: - Commit c1274fafb04de1b9b7c3a17e209814b8c4346311 in impala's branch refs/heads/master from Lars Volker [ https://gitbox.apache.org/repos/asf?p=impala.git;h=c1274fa ] IMPALA-8191: Wait for additional breakpad processes during test The Breakpad signal handler forks off a process to write a minidump. During the breakpad tests we send signals to the Impala daemons and then wait for all processes to go away. Prior to this change we did this by waiting on the PID returned by process.get_pid(). It is determined by iterating over psutil.get_pid_list() which is an ordered list of PIDs running on the system. We return the first process in the list with a matching command line. In cases where the PID space rolled over, this could have been the forked off breakpad process and we'd wait on that one. During the subsequent check that all processes are indeed gone, we could then pick up the original Impala daemon that had forked off to write the minidump and was still in the process of shutting down. To fix this, we wait for every process twice. Processes are identified by their command and iterating through them twice makes sure we catch both the original daemon and it's breakpad child. This change also contains improvements to the logging of processes in our tests. This should make it easier to identify similar issues in the future. Testing: I ran the breakpad tests in exhaustive mode. I didn't try to exercise it around a PID roll-over, but we shouldn't see the issue in IMPALA-8191 again. 
Change-Id: Ia4dcc5fecb9b5f38ae1504aae40f099837cf1bca Reviewed-on: http://gerrit.cloudera.org:8080/12501 Reviewed-by: Lars Volker Tested-by: Impala Public Jenkins > TestBreakpadExhaustive.test_minidump_creation fails to kill cluster > --- > > Key: IMPALA-8191 > URL: https://issues.apache.org/jira/browse/IMPALA-8191 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.2.0 >Reporter: Andrew Sherman >Assignee: Lars Volker >Priority: Critical > Labels: breakpad, broken-build, flaky-test > Fix For: Impala 3.2.0 > > > h3. Error Message > {quote} > assert not [, > ] + where > [, > ] = > .impalads + > where = > .cluster > {quote} > h3. Stacktrace > {quote} > custom_cluster/test_breakpad.py:183: in test_minidump_creation > self.kill_cluster(SIGSEGV) custom_cluster/test_breakpad.py:81: in > kill_cluster signal is SIGUSR1 or self.assert_all_processes_killed() > custom_cluster/test_breakpad.py:121: in assert_all_processes_killed assert > not self.cluster.impalads E assert not > [, > ] E + where > [, > ] = > .impalads E + > where = > .cluster > {quote} > See [IMPALA-8114] for a similar bug -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
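The two-pass wait described in the commit message can be sketched as follows; `find_pid_by_cmdline` and `wait_for_exit` are hypothetical stand-ins for the psutil-based helpers in the real test code:

```python
# Sketch of the fix above: wait on every daemon's command line twice, so
# that a breakpad child forked with the same command line is also reaped.
# Both helper callables are hypothetical stand-ins for psutil-based code.
def kill_and_wait(cmdlines, find_pid_by_cmdline, wait_for_exit):
    # The first pass catches the original daemon; after it exits, the second
    # pass catches the minidump-writing child that shares its command line.
    for _ in range(2):
        for cmd in cmdlines:
            pid = find_pid_by_cmdline(cmd)
            if pid is not None:
                wait_for_exit(pid)
```

Matching by command line rather than by a single cached PID also sidesteps the PID roll-over ambiguity the commit message describes.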
[jira] [Created] (IMPALA-8235) AdmissionControlTimeSinceLastUpdate TIME_MS counter breaks some profile consumers
Tim Armstrong created IMPALA-8235: - Summary: AdmissionControlTimeSinceLastUpdate TIME_MS counter breaks some profile consumers Key: IMPALA-8235 URL: https://issues.apache.org/jira/browse/IMPALA-8235 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 3.2.0 Reporter: Tim Armstrong Assignee: Tim Armstrong The issue is: * type is a required field for TCounter with enum type TUnit * Some readers (e.g. Java) that encounter an unknown enum value replace it with null, then throw an exception because it is a required field This means that the workflow for adding a new TUnit is basically broken, since we can't add a unit in Impala and then update readers later. I think we should switch to TIME_NS then reconsider the workflow here. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IMPALA-7694) Add CPU resource utilization (user, system, iowait) timelines to profiles
[ https://issues.apache.org/jira/browse/IMPALA-7694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16773563#comment-16773563 ] ASF subversion and git services commented on IMPALA-7694: - Commit 257fa0c68bb4e64880a64844d8d4023c54645230 in impala's branch refs/heads/master from Lars Volker [ https://gitbox.apache.org/repos/asf?p=impala.git;h=257fa0c ] IMPALA-8209: Include fragment instance ID in memz/ breakdown The change for IMPALA-7694 had accidentally removed the fragment instance ID from the memz/ breakdown. This change puts it back and adds a test to make sure it's there. This change also pads query IDs with zeros when printing them in the backend. Change-Id: I73bf06bf95c88186b16fd03243de9bac946c5cc8 Reviewed-on: http://gerrit.cloudera.org:8080/12524 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > Add CPU resource utilization (user, system, iowait) timelines to profiles > - > > Key: IMPALA-7694 > URL: https://issues.apache.org/jira/browse/IMPALA-7694 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Affects Versions: Impala 3.1.0 >Reporter: Lars Volker >Assignee: Lars Volker >Priority: Major > Labels: observability, supportability > Fix For: Impala 3.2.0 > > > We often struggle to determine why a query was slow, in particular if it was > caused by other tasks on the same machine using resources. To help with this > we should include timelines for system resource utilization to the profiles. > These should eventually include CPU and disk and network I/O. If it is too > expensive to include these in all queries we should add a flag to add these > to a percentage of queries, and a query option to force-enable them. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-8209) Fragment instance ID no longer displayed on /memz
[ https://issues.apache.org/jira/browse/IMPALA-8209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16773562#comment-16773562 ] ASF subversion and git services commented on IMPALA-8209: - Commit 257fa0c68bb4e64880a64844d8d4023c54645230 in impala's branch refs/heads/master from Lars Volker [ https://gitbox.apache.org/repos/asf?p=impala.git;h=257fa0c ] IMPALA-8209: Include fragment instance ID in memz/ breakdown The change for IMPALA-7694 had accidentally removed the fragment instance ID from the memz/ breakdown. This change puts it back and adds a test to make sure it's there. This change also pads query IDs with zeros when printing them in the backend. Change-Id: I73bf06bf95c88186b16fd03243de9bac946c5cc8 Reviewed-on: http://gerrit.cloudera.org:8080/12524 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > Fragment instance ID no longer displayed on /memz > - > > Key: IMPALA-8209 > URL: https://issues.apache.org/jira/browse/IMPALA-8209 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.2.0 >Reporter: Lars Volker >Assignee: Lars Volker >Priority: Major > Labels: observability > Fix For: Impala 3.2.0 > > > The change for IMPALA-7694 dropped the fragment instance ID from the > runtime-state profile. However, this also removed it from the /memz page. We > should add it back there. > https://gerrit.cloudera.org/#/c/12069/8/be/src/runtime/runtime-state.cc@76 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-8191) TestBreakpadExhaustive.test_minidump_creation fails to kill cluster
[ https://issues.apache.org/jira/browse/IMPALA-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16773559#comment-16773559 ] ASF subversion and git services commented on IMPALA-8191: - Commit c1274fafb04de1b9b7c3a17e209814b8c4346311 in impala's branch refs/heads/master from Lars Volker [ https://gitbox.apache.org/repos/asf?p=impala.git;h=c1274fa ] IMPALA-8191: Wait for additional breakpad processes during test The Breakpad signal handler forks off a process to write a minidump. During the breakpad tests we send signals to the Impala daemons and then wait for all processes to go away. Prior to this change we did this by waiting on the PID returned by process.get_pid(). It is determined by iterating over psutil.get_pid_list() which is an ordered list of PIDs running on the system. We return the first process in the list with a matching command line. In cases where the PID space rolled over, this could have been the forked off breakpad process and we'd wait on that one. During the subsequent check that all processes are indeed gone, we could then pick up the original Impala daemon that had forked off to write the minidump and was still in the process of shutting down. To fix this, we wait for every process twice. Processes are identified by their command and iterating through them twice makes sure we catch both the original daemon and it's breakpad child. This change also contains improvements to the logging of processes in our tests. This should make it easier to identify similar issues in the future. Testing: I ran the breakpad tests in exhaustive mode. I didn't try to exercise it around a PID roll-over, but we shouldn't see the issue in IMPALA-8191 again. 
Change-Id: Ia4dcc5fecb9b5f38ae1504aae40f099837cf1bca Reviewed-on: http://gerrit.cloudera.org:8080/12501 Reviewed-by: Lars Volker Tested-by: Impala Public Jenkins > TestBreakpadExhaustive.test_minidump_creation fails to kill cluster > --- > > Key: IMPALA-8191 > URL: https://issues.apache.org/jira/browse/IMPALA-8191 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.2.0 >Reporter: Andrew Sherman >Assignee: Lars Volker >Priority: Critical > Labels: breakpad, broken-build, flaky-test > Fix For: Impala 3.2.0 > > > h3. Error Message > {quote} > assert not [, > ] + where > [, > ] = > .impalads + > where = > .cluster > {quote} > h3. Stacktrace > {quote} > custom_cluster/test_breakpad.py:183: in test_minidump_creation > self.kill_cluster(SIGSEGV) custom_cluster/test_breakpad.py:81: in > kill_cluster signal is SIGUSR1 or self.assert_all_processes_killed() > custom_cluster/test_breakpad.py:121: in assert_all_processes_killed assert > not self.cluster.impalads E assert not > [, > ] E + where > [, > ] = > .impalads E + > where = > .cluster > {quote} > See [IMPALA-8114] for a similar bug -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-8214) Bad plan in load_nested.py
[ https://issues.apache.org/jira/browse/IMPALA-8214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16773560#comment-16773560 ] ASF subversion and git services commented on IMPALA-8214: - Commit c659b78198a767b91c293cbaf77f5c8b269fba39 in impala's branch refs/heads/master from Tim Armstrong [ https://gitbox.apache.org/repos/asf?p=impala.git;h=c659b78 ] IMPALA-8214: Fix bad plan in load_nested.py The previous plan had the larger input on the build side of the join and did a broadcast join, which is very suboptimal. This speeds up data loading on my minicluster - 18s vs 31s and has a more significant impact on a real cluster, where queries execute much faster, the memory requirement is significantly reduced and the data loading can potentially be broken up into fewer chunks. I also considered computing stats on the table to let Impala generate the same plan, but this achieves the same goal more efficiently. Testing: Run core tests. Resource estimates in planner tests changed slightly because of the different distribution of data. Change-Id: I55e0ca09590a90ba530efe4e8f8bf587dde3eeeb Reviewed-on: http://gerrit.cloudera.org:8080/12519 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > Bad plan in load_nested.py > -- > > Key: IMPALA-8214 > URL: https://issues.apache.org/jira/browse/IMPALA-8214 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.1.0 >Reporter: Tim Armstrong >Assignee: Tim Armstrong >Priority: Major > > The plan for the below SQL, which is executed without stats, has the larger > input on the build side of the join and does a broadcast join, which is very > suboptimal. This causes high memory consumption when loading larger scale > factors, and generally makes the loading process slower than necessary. We > should flip the join and make it a shuffle join. 
> https://github.com/apache/impala/blob/d481cd4/testdata/bin/load_nested.py#L123 > {code} > tmp_customer_sql = r""" > SELECT > c_custkey, c_name, c_address, c_nationkey, c_phone, c_acctbal, > c_mktsegment, > c_comment, > GROUP_CONCAT( > CONCAT( > CAST(o_orderkey AS STRING), '\003', > CAST(o_orderstatus AS STRING), '\003', > CAST(o_totalprice AS STRING), '\003', > CAST(o_orderdate AS STRING), '\003', > CAST(o_orderpriority AS STRING), '\003', > CAST(o_clerk AS STRING), '\003', > CAST(o_shippriority AS STRING), '\003', > CAST(o_comment AS STRING), '\003', > CAST(lineitems_string AS STRING) > ), '\002' > ) orders_string > FROM {source_db}.customer > LEFT JOIN tmp_orders_string ON c_custkey = o_custkey > WHERE c_custkey % {chunks} = {chunk_idx} > GROUP BY 1, 2, 3, 4, 5, 6, 7, 8""".format(**sql_params) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
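The flip described above can be expressed directly in the loading SQL. A hedged sketch using Impala's documented hint syntax (column list abbreviated):

```sql
-- STRAIGHT_JOIN makes the planner keep the join order as written, and the
-- /* +shuffle */ hint requests a partitioned (shuffle) join instead of a
-- broadcast: the two changes the fix above wants when no stats exist.
SELECT STRAIGHT_JOIN c_custkey, o_orderkey
FROM customer
  LEFT JOIN /* +shuffle */ tmp_orders_string ON c_custkey = o_custkey;
```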
[jira] [Commented] (IMPALA-7917) Decouple Sentry from Impala
[ https://issues.apache.org/jira/browse/IMPALA-7917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16773561#comment-16773561 ] ASF subversion and git services commented on IMPALA-7917: - Commit 290e48f5d703202d927da5ae188e1bb5d79e7bfe in impala's branch refs/heads/master from Fredy Wijaya [ https://gitbox.apache.org/repos/asf?p=impala.git;h=290e48f ] IMPALA-7917 (Part 1): Decouple Sentry from Impala The first part of this patch is to provide an initial work to decouple Sentry from Impala by creating a generic authorization provider interface that Sentry implements. The idea is to allow more authorization providers in the future. The patch updates the following: - Renamed Authorizeable to Authorizable to fix typographical error. - Moved any clases that uses Sentry specific code to org.apache.impala.authorization.sentry package and created interfaces when necessary. - Moved all generic authorization related classes to org.apache.impala.authorization package. - Minor clean up on authorization related code. In this patch, switching the authorization provider implementation still requires updating the code in many different places. A follow up patch will make it easy to switch an authorization provider implementation. This patch has no functionality change. 
Testing: - Ran all FE tests - Ran all E2E authorization tests Change-Id: If1fd1df0b38ddd7cfa41299e95f5827f8a9e9c1f Reviewed-on: http://gerrit.cloudera.org:8080/12020 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > Decouple Sentry from Impala > --- > > Key: IMPALA-7917 > URL: https://issues.apache.org/jira/browse/IMPALA-7917 > Project: IMPALA > Issue Type: Sub-task > Components: Catalog, Frontend >Reporter: Fredy Wijaya >Assignee: Fredy Wijaya >Priority: Major > > The task will will involve decoupling Sentry from Impala, such as moving > Sentry specific code to a separate package/module and provide interfaces that > allows the flexibility to switch to a different authorization provider. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-8234) TUnit enum (part of profile format) was reordered
Tim Armstrong created IMPALA-8234: - Summary: TUnit enum (part of profile format) was reordered Key: IMPALA-8234 URL: https://issues.apache.org/jira/browse/IMPALA-8234 Project: IMPALA Issue Type: Bug Components: Backend Reporter: Tim Armstrong Assignee: Lars Volker IMPALA-7694 inserted BASIS_POINTS into TUnit, which will change the assigned enum values of subsequent units. I think those are fairly rarely used, but it would be good to fix this before the release to avoid any disruption. Maybe it's a good idea to explicitly number the enum elements to make it clear what the wire format is. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
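The explicit-numbering idea suggested above would look like this in the Thrift IDL (the values shown are illustrative, not necessarily the real wire values):

```thrift
// With explicit values, inserting a member such as BASIS_POINTS can no
// longer silently renumber the members after it; a new member must take a
// fresh value at the end. Values here are illustrative.
enum TUnit {
  UNIT = 0,
  UNIT_PER_SECOND = 1,
  CPU_TICKS = 2,
  BYTES = 3,
  BYTES_PER_SECOND = 4,
  TIME_NS = 5,
  DOUBLE_VALUE = 6,
  BASIS_POINTS = 7  // appended with an explicit, fresh value
}
```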
[jira] [Resolved] (IMPALA-8209) Fragment instance ID no longer displayed on /memz
[ https://issues.apache.org/jira/browse/IMPALA-8209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Volker resolved IMPALA-8209. - Resolution: Fixed Fix Version/s: Impala 3.2.0 > Fragment instance ID no longer displayed on /memz > - > > Key: IMPALA-8209 > URL: https://issues.apache.org/jira/browse/IMPALA-8209 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.2.0 >Reporter: Lars Volker >Assignee: Lars Volker >Priority: Major > Labels: observability > Fix For: Impala 3.2.0 > > > The change for IMPALA-7694 dropped the fragment instance ID from the > runtime-state profile. However, this also removed it from the /memz page. We > should add it back there. > https://gerrit.cloudera.org/#/c/12069/8/be/src/runtime/runtime-state.cc@76 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-8112) test_cancel_select with debug action failed with unexpected error
[ https://issues.apache.org/jira/browse/IMPALA-8112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Sherman resolved IMPALA-8112. Resolution: Cannot Reproduce Test code is OK; the underlying problem seems to be a Kudu issue, logged as IMPALA-8190 > test_cancel_select with debug action failed with unexpected error > - > > Key: IMPALA-8112 > URL: https://issues.apache.org/jira/browse/IMPALA-8112 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.2.0 >Reporter: Michael Brown >Assignee: Andrew Sherman >Priority: Major > Labels: flaky > > Stacktrace > {noformat} > query_test/test_cancellation.py:241: in test_cancel_select > self.execute_cancel_test(vector) > query_test/test_cancellation.py:213: in execute_cancel_test > assert 'Cancelled' in str(thread.fetch_results_error) > E assert 'Cancelled' in "ImpalaBeeswaxException:\n INNER EXCEPTION: 'beeswaxd.ttypes.BeeswaxException'>\n MESSAGE: Unable to open Kudu table: > Network error: recv error from 0.0.0.0:0: Transport endpoint is not connected > (error 107)\n" > E+ where "ImpalaBeeswaxException:\n INNER EXCEPTION: 'beeswaxd.ttypes.BeeswaxException'>\n MESSAGE: Unable to open Kudu table: > Network error: recv error from 0.0.0.0:0: Transport endpoint is not connected > (error 107)\n" = str(ImpalaBeeswaxException()) > E+where ImpalaBeeswaxException() = 140481071658752)>.fetch_results_error > {noformat} > Standard Error > {noformat} > SET > client_identifier=query_test/test_cancellation.py::TestCancellationParallel::()::test_cancel_select[protocol:beeswax|table_format:kudu/none|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'debug_action; > -- executing against localhost:21000 > use tpch_kudu; > -- 2019-01-18 17:50:03,100 INFO MainThread: Started query > 4e4b3ab4cc7d:11efc3f5 > SET > 
client_identifier=query_test/test_cancellation.py::TestCancellationParallel::()::test_cancel_select[protocol:beeswax|table_format:kudu/none|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'debug_action; > SET batch_size=0; > SET num_nodes=0; > SET disable_codegen_rows_threshold=0; > SET disable_codegen=False; > SET abort_on_error=1; > SET cpu_limit_s=10; > SET debug_action=0:GETNEXT:WAIT|COORD_CANCEL_QUERY_FINSTANCES_RPC:FAIL; > SET exec_single_node_rows_threshold=0; > SET buffer_pool_limit=0; > -- executing async: localhost:21000 > select l_returnflag from lineitem; > -- 2019-01-18 17:50:03,139 INFO MainThread: Started query > fa4ddb9e62a01240:54c86ad > SET > client_identifier=query_test/test_cancellation.py::TestCancellationParallel::()::test_cancel_select[protocol:beeswax|table_format:kudu/none|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'debug_action; > -- connecting to: localhost:21000 > -- fetching results from: object at 0x6235e90> > -- getting state for operation: > > -- canceling operation: object at 0x6235e90> > -- 2019-01-18 17:50:08,196 INFO Thread-4: Starting new HTTP connection > (1): localhost > -- closing query for operation handle: > > {noformat} > [~asherman] please take a look since it looks like you touched code around > this area last. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Work started] (IMPALA-8233) Do not re-download Ranger if it is already downloaded
[ https://issues.apache.org/jira/browse/IMPALA-8233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-8233 started by Fredy Wijaya. > Do not re-download Ranger if it is already downloaded > - > > Key: IMPALA-8233 > URL: https://issues.apache.org/jira/browse/IMPALA-8233 > Project: IMPALA > Issue Type: Sub-task > Components: Infrastructure >Reporter: Fredy Wijaya >Assignee: Fredy Wijaya >Priority: Major > > Similar to other packages, we should not re-download Ranger if it's already > downloaded. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IMPALA-8233) Do not re-download Ranger if it is already downloaded
Fredy Wijaya created IMPALA-8233: Summary: Do not re-download Ranger if it is already downloaded Key: IMPALA-8233 URL: https://issues.apache.org/jira/browse/IMPALA-8233 Project: IMPALA Issue Type: Sub-task Components: Infrastructure Reporter: Fredy Wijaya Assignee: Fredy Wijaya Similar to other packages, we should not re-download Ranger if it's already downloaded. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
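The requested skip-if-already-downloaded behavior is a small guard; a minimal Python sketch of the pattern (function name, URL, and paths are illustrative, not Impala's actual bootstrap scripts):

```python
import os
import urllib.request

def fetch_once(url: str, dest: str) -> bool:
    """Download url to dest unless dest already exists.
    Returns True only if a download actually happened."""
    if os.path.exists(dest):
        return False  # already downloaded: skip the network round trip
    tmp = dest + ".part"
    urllib.request.urlretrieve(url, tmp)
    os.rename(tmp, dest)  # atomic publish: a half-download never looks complete
    return True
```

Downloading to a `.part` temp file and renaming at the end guards against a partially downloaded archive being mistaken for a complete one on the next run, which matters once re-downloads are skipped.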
[jira] [Commented] (IMPALA-5050) Add support to read TIMESTAMP_MILLIS and TIMESTAMP_MICROS to the parquet scanner
[ https://issues.apache.org/jira/browse/IMPALA-5050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16773510#comment-16773510 ] Alex Rodoni commented on IMPALA-5050: - Thank you [~csringhofer]! > Add support to read TIMESTAMP_MILLIS and TIMESTAMP_MICROS to the parquet > scanner > > > Key: IMPALA-5050 > URL: https://issues.apache.org/jira/browse/IMPALA-5050 > Project: IMPALA > Issue Type: New Feature > Components: Backend >Affects Versions: Impala 2.9.0 >Reporter: Lars Volker >Assignee: Csaba Ringhofer >Priority: Major > Fix For: Impala 3.2.0 > > > This requires updating {{parquet.thrift}} to a version that includes the > {{TIMESTAMP_MICROS}} logical type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
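For context, both logical types annotate an int64 count since the Unix epoch (milliseconds for TIMESTAMP_MILLIS, microseconds for TIMESTAMP_MICROS); a pure-Python decoding sketch (illustrative only, Impala's scanner does this in C++):

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def decode_timestamp_micros(v: int) -> datetime:
    # TIMESTAMP_MICROS: int64 microseconds since the Unix epoch (UTC)
    return EPOCH + timedelta(microseconds=v)

def decode_timestamp_millis(v: int) -> datetime:
    # TIMESTAMP_MILLIS: int64 milliseconds since the Unix epoch (UTC)
    return EPOCH + timedelta(milliseconds=v)

assert decode_timestamp_micros(0) == EPOCH
```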
[jira] [Assigned] (IMPALA-6900) Invalidate metadata operation is ignored at a coordinator if catalog is empty
[ https://issues.apache.org/jira/browse/IMPALA-6900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] bharath v reassigned IMPALA-6900: - Assignee: (was: Vuk Ercegovac) > Invalidate metadata operation is ignored at a coordinator if catalog is empty > - > > Key: IMPALA-6900 > URL: https://issues.apache.org/jira/browse/IMPALA-6900 > Project: IMPALA > Issue Type: Bug > Components: Catalog >Affects Versions: Impala 3.0, Impala 2.12.0 >Reporter: Dimitris Tsirogiannis >Priority: Blocker > > The following workflow may cause an impalad that issued an invalidate > metadata to falsely consider that the operation has taken effect, thus > causing subsequent queries to fail due to unresolved references > to tables or databases. > Steps to reproduce: > # Start an impala cluster connecting to an empty HMS (no databases). > # Create a database "db" in HMS outside of Impala (e.g. using Hive). > # Run INVALIDATE METADATA through Impala. > # Run "use db" statement in Impala. > > The while condition in the code snippet below causes the > WaitForMinCatalogUpdate function to prematurely return even though INVALIDATE > METADATA has not taken effect: > {code:java} > void ImpalaServer::WaitForMinCatalogUpdate(..) { > ... > VLOG_QUERY << "Waiting for minimum catalog object version: " << min_req_catalog_object_version << " current version: " << min_catalog_object_version; > while (catalog_update_info_.min_catalog_object_version < > min_req_catalog_object_version && catalog_update_info_.catalog_service_id == > catalog_service_id) { > catalog_version_update_cv_.Wait(unique_lock); > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
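A minimal Python model of the same wait pattern (class and field names mirror the snippet above but are otherwise made up) shows that correctness rests entirely on the predicate: if min_catalog_object_version already satisfies the comparison at entry, the wait returns immediately, which is the premature return described in this bug:

```python
import threading

class CatalogWaiter:
    """Toy model of ImpalaServer::WaitForMinCatalogUpdate's wait loop."""

    def __init__(self, service_id: str):
        self.cv = threading.Condition()
        self.min_catalog_object_version = 0
        self.catalog_service_id = service_id

    def wait_for_min_catalog_update(self, min_req_version: int, service_id: str):
        # Block until the minimum catalog object version reaches the
        # requested one, or the catalog service id changes. If the predicate
        # is already false on entry, this returns without waiting at all.
        with self.cv:
            while (self.min_catalog_object_version < min_req_version
                   and self.catalog_service_id == service_id):
                self.cv.wait()

    def publish(self, version: int):
        # Simulates a catalog topic update arriving at the coordinator.
        with self.cv:
            self.min_catalog_object_version = version
            self.cv.notify_all()
```

The wait loop itself is a standard condition-variable predicate wait; the bug is that the version field can already compare as satisfied (e.g. with an empty catalog) before the invalidation has actually propagated.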
[jira] [Resolved] (IMPALA-7594) TestAutomaticCatalogInvalidation.test_local_catalog and test_v1_catalog still hitting timeout
[ https://issues.apache.org/jira/browse/IMPALA-7594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] bharath v resolved IMPALA-7594. --- Resolution: Duplicate I think IMPALA-7870 fixed this. I don't think we ever ran into this after IMPALA-7870 was fixed. > TestAutomaticCatalogInvalidation.test_local_catalog and test_v1_catalog still > hitting timeout > - > > Key: IMPALA-7594 > URL: https://issues.apache.org/jira/browse/IMPALA-7594 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.1.0 >Reporter: Tim Armstrong >Assignee: Tianyi Wang >Priority: Critical > Labels: flaky > > Similar to IMPALA-7580, I hit this on a build of one of my patches when > running under ASAN. The branch had the fix for IMPALA-7580 in it (it's based > off 038af345933fde4fbcc9bc524f4ca93bfc08c633): > {noformat} > custom_cluster.test_automatic_invalidation.TestAutomaticCatalogInvalidation.test_local_catalog > (from pytest) > Failing for the past 2 builds (Since Failed#3242 ) > Took 30 sec. 
> add description > Error Message > assert 1537339209.989275 < 1537339209.65928 + where 1537339209.989275 = > () +where = time.time > Stacktrace > custom_cluster/test_automatic_invalidation.py:70: in test_local_catalog > self._run_test(cursor) > custom_cluster/test_automatic_invalidation.py:58: in _run_test > assert time.time() < timeout > E assert 1537339209.989275 < 1537339209.65928 > E+ where 1537339209.989275 = () > E+where = time.time > Standard Error > -- 2018-09-18 23:39:39,498 INFO MainThread: Starting cluster with > command: > /data/jenkins/workspace/impala-private-parameterized/repos/Impala/bin/start-impala-cluster.py > --cluster_size=3 --num_coordinators=3 > --log_dir=/data/jenkins/workspace/impala-private-parameterized/repos/Impala/logs/custom_cluster_tests > --log_level=1 '--impalad_args="--invalidate_tables_timeout_s=20 > --use_local_catalog" ' > '--state_store_args="--statestore_update_frequency_ms=50 > --statestore_priority_update_frequency_ms=50 > --statestore_heartbeat_frequency_ms=50" ' > '--catalogd_args="--invalidate_tables_timeout_s=20 > --catalog_topic_mode=minimal" ' > 23:39:40 MainThread: Starting State Store logging to > /data/jenkins/workspace/impala-private-parameterized/repos/Impala/logs/custom_cluster_tests/statestored.INFO > 23:39:41 MainThread: Starting Catalog Service logging to > /data/jenkins/workspace/impala-private-parameterized/repos/Impala/logs/custom_cluster_tests/catalogd.INFO > 23:39:42 MainThread: Starting Impala Daemon logging to > /data/jenkins/workspace/impala-private-parameterized/repos/Impala/logs/custom_cluster_tests/impalad.INFO > 23:39:43 MainThread: Starting Impala Daemon logging to > /data/jenkins/workspace/impala-private-parameterized/repos/Impala/logs/custom_cluster_tests/impalad_node1.INFO > 23:39:44 MainThread: Starting Impala Daemon logging to > /data/jenkins/workspace/impala-private-parameterized/repos/Impala/logs/custom_cluster_tests/impalad_node2.INFO > 23:39:47 MainThread: Found 3 impalad/1 statestored/1 
catalogd process(es) > 23:39:47 MainThread: Getting num_known_live_backends from > impala-ec2-centos74-m5-4xlarge-ondemand-01b4.vpc.cloudera.com:25000 > 23:39:47 MainThread: Waiting for num_known_live_backends=3. Current value: 2 > 23:39:48 MainThread: Getting num_known_live_backends from > impala-ec2-centos74-m5-4xlarge-ondemand-01b4.vpc.cloudera.com:25000 > 23:39:48 MainThread: num_known_live_backends has reached value: 3 > 23:39:48 MainThread: Getting num_known_live_backends from > impala-ec2-centos74-m5-4xlarge-ondemand-01b4.vpc.cloudera.com:25001 > 23:39:48 MainThread: num_known_live_backends has reached value: 3 > 23:39:48 MainThread: Getting num_known_live_backends from > impala-ec2-centos74-m5-4xlarge-ondemand-01b4.vpc.cloudera.com:25002 > 23:39:48 MainThread: num_known_live_backends has reached value: 3 > 23:39:48 MainThread: Impala Cluster Running with 3 nodes (3 coordinators, 3 > executors). > -- 2018-09-18 23:39:48,471 INFO MainThread: Found 3 impalad/1 > statestored/1 catalogd process(es) > -- 2018-09-18 23:39:48,471 INFO MainThread: Getting metric: > statestore.live-backends from > impala-ec2-centos74-m5-4xlarge-ondemand-01b4.vpc.cloudera.com:25010 > -- 2018-09-18 23:39:48,472 INFO MainThread: Metric > 'statestore.live-backends' has reached desired value: 4 > -- 2018-09-18 23:39:48,472 INFO MainThread: Getting > num_known_live_backends from > impala-ec2-centos74-m5-4xlarge-ondemand-01b4.vpc.cloudera.com:25000 > -- 2018-09-18 23:39:48,473 INFO MainThread: num_known_live_backends has > reached value: 3 > -- 2018-09-18 23:39:48,473 INFO MainThread: Getting >
[jira] [Commented] (IMPALA-8178) Tests failing with “Could not allocate memory while trying to increase reservation” on EC filesystem
[ https://issues.apache.org/jira/browse/IMPALA-8178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16773419#comment-16773419 ] Joe McDonnell commented on IMPALA-8178: --- This is related to enabling the remote file handle cache in IMPALA-7265. At the moment, it looks like the JVM is allocating some native memory for each file handle. The issue does not show up when testing without the file handle cache. Erasure coding seems to use "remote" file handles on the minicluster. For now, it might make sense to disable the file handle cache for erasure coding. This is easy to reproduce. Create an erasure coded minicluster, start up impala, and then run "select count(*) from tpcds_parquet.store_sales". This increases memory consumption by 5+GB total across the three impalads. Using pprof, here are the top allocations:
{noformat}
Total: 3766.0 MB
  3657.3  97.1%  97.1%  3657.4  97.1%  JVM_FindSignal
    48.0   1.3%  98.4%    84.7   2.2%  SUNWprivate_1.1
    10.5   0.3%  98.7%    10.5   0.3%  inflate
     7.0   0.2%  98.9%     7.0   0.2%  __gnu_cxx::new_allocator::allocate
     6.8   0.2%  99.0%     6.8   0.2%  impala::SystemAllocator::AllocateViaMalloc
     6.5   0.2%  99.2%     6.5   0.2%  __gnu_cxx::new_allocator::allocate (inline)
     6.4   0.2%  99.4%     6.4   0.2%  Java_org_apache_hadoop_io_erasurecode_rawcoder_NativeRSRawDecoder_initImpl
     3.9   0.1%  99.5%     3.9   0.1%  Java_java_util_zip_ZipFile_getZipMessage
{noformat}
> Tests failing with “Could not allocate memory while trying to increase > reservation” on EC filesystem > > > Key: IMPALA-8178 > URL: > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.2.0 >Reporter: Andrew Sherman >Assignee: Tim Armstrong >Priority: Blocker > Labels: broken-build > > In tests run against an Erasure Coding filesystem, multiple tests failed with > memory allocation errors.
> In total 10 tests failed: > * query_test.test_scanners.TestParquet.test_decimal_encodings > * query_test.test_scanners.TestTpchScanRangeLengths.test_tpch_scan_ranges > * query_test.test_exprs.TestExprs.test_exprs [enable_expr_rewrites: 0] > * query_test.test_exprs.TestExprs.test_exprs [enable_expr_rewrites: 1] > * query_test.test_hbase_queries.TestHBaseQueries.test_hbase_scan_node > * query_test.test_scanners.TestParquet.test_def_levels > * > query_test.test_scanners.TestTextSplitDelimiters.test_text_split_across_buffers_delimiterquery_test.test_hbase_queries.TestHBaseQueries.test_hbase_filters > * query_test.test_hbase_queries.TestHBaseQueries.test_hbase_inline_views > * query_test.test_hbase_queries.TestHBaseQueries.test_hbase_top_n > The first failure looked like this on the client side: > {quote} > F > query_test/test_scanners.py::TestParquet::()::test_decimal_encodings[protocol: > beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0, > 'disable_codegen_rows_threshold': 0, 'disable_codegen': True, > 'abort_on_error': 1, 'debug_action': > '-1:OPEN:SET_DENY_RESERVATION_PROBABILITY@0.5', > 'exec_single_node_rows_threshold': 0} | table_format: parquet/none] > query_test/test_scanners.py:717: in test_decimal_encodings > self.run_test_case('QueryTest/parquet-decimal-formats', vector, > unique_database) > common/impala_test_suite.py:472: in run_test_case > result = self.__execute_query(target_impalad_client, query, user=user) > common/impala_test_suite.py:699: in __execute_query > return impalad_client.execute(query, user=user) > common/impala_connection.py:174: in execute > return self.__beeswax_client.execute(sql_stmt, user=user) > beeswax/impala_beeswax.py:183: in execute > handle = self.__execute_query(query_string.strip(), user=user) > beeswax/impala_beeswax.py:360: in __execute_query > self.wait_for_finished(handle) > beeswax/impala_beeswax.py:381: in wait_for_finished > raise ImpalaBeeswaxException("Query aborted:" + error_log, None) > E 
ImpalaBeeswaxException: ImpalaBeeswaxException: > EQuery aborted:ExecQueryFInstances rpc > query_id=6e44c3c949a31be2:f973c7ff failed: Failed to get minimum > memory reservation of 8.00 KB on daemon xxx.com:22001 for query > 6e44c3c949a31be2:f973c7ff due to following error: Memory limit > exceeded: Could not allocate memory while trying to increase reservation. > E Query(6e44c3c949a31be2:f973c7ff) could not allocate 8.00 KB > without exceeding limit. > E Error occurred on backend xxx.com:22001 > E Memory left in process limit: 1.19 GB > E Query(6e44c3c949a31be2:f973c7ff): Reservation=0 > ReservationLimit=9.60 GB OtherMemory=0 Total=0 Peak=0 > E Memory is likely oversubscribed. Reducing query concurrency or > configuring admission control may help avoid this
[jira] [Updated] (IMPALA-8222) Timeout calculation in stress test doesn't make sense
[ https://issues.apache.org/jira/browse/IMPALA-8222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong updated IMPALA-8222: -- Description: There is some logic in the stress test that tries to guess what a reasonable timeout for a query is. There are enough fudge factors that the false positive rate is fairly low, but it also doesn't provide much useful coverage unless a query is stuck. But an overall job timeout achieves the same thing. Some specific issues that the current logic has (and which are tricky to solve): * The number of concurrent queries is calculated at query submission time. E.g. a query that starts before a large batch of other queries is submitted will be given a short timeout multiplier. * There is no guarantee that performance degrades linearly. E.g. if runtime filters arrive late, we can see much larger perf hits. We should consider removing the timeout enforcement or at least revisit it. was: There is some logic in the stress test that tries to guess what a reasonable timeout for a query is. There are enough fudge factors that the false positive rate is fairly low, but it also doesn't provide much useful coverage unless a query is stuck. But an overall job timeout achieves the same thing. Some specific issues that the current logic has (and which are tricky to solve): * The number of concurrent queries is calculated at query submission time. E.g. a query that starts before a large batch of other queries is submitted will be given a short timeout multiplier. * There is no guarantee that performance degrades linearly. E.g. if runtime filters arrive late, we can see much larger perf hits. We should consider removing the timeout enforcement or at least revisiting it. 
> Timeout calculation in stress test doesn't make sense > - > > Key: IMPALA-8222 > URL: https://issues.apache.org/jira/browse/IMPALA-8222 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Reporter: Tim Armstrong >Assignee: Tim Armstrong >Priority: Major > > There is some logic in the stress test that tries to guess what a reasonable > timeout for a query is. There are enough fudge factors that the false > positive rate is fairly low, but it also doesn't provide much useful coverage > unless a query is stuck. But an overall job timeout achieves the same thing. > Some specific issues that the current logic has (and which are tricky to > solve): > * The number of concurrent queries is calculated at query submission time. > E.g. a query that starts before a large batch of other queries is submitted > will be given a short timeout multiplier. > * There is no guarantee that performance degrades linearly. E.g. if runtime > filters arrive late, we can see much larger perf hits. > We should consider removing the timeout enforcement or at least revisit it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IMPALA-8222) Timeout calculation in stress test doesn't make sense
[ https://issues.apache.org/jira/browse/IMPALA-8222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16773404#comment-16773404 ] Tim Armstrong commented on IMPALA-8222: --- I think we should actually just remove this - I looked at trying to fix the above issues directly and it just adds even more complexity (e.g. we'd have to track the query concurrency throughout the query's runtime). > Timeout calculation in stress test doesn't make sense > - > > Key: IMPALA-8222 > URL: https://issues.apache.org/jira/browse/IMPALA-8222 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Reporter: Tim Armstrong >Assignee: Tim Armstrong >Priority: Major > > There is some logic in the stress test that tries to guess what a reasonable > timeout for a query is. There are enough fudge factors that the false > positive rate is fairly low, but it also doesn't provide much useful coverage > unless a query is stuck. But an overall job timeout achieves the same thing. > Some specific issues that the current logic has (and which are tricky to > solve): > * The number of concurrent queries is calculated at query submission time. > E.g. a query that starts before a large batch of other queries is submitted > will be given a short timeout multiplier. > * There is no guarantee that performance degrades linearly. E.g. if runtime > filters arrive late, we can see much larger perf hits. > We should consider removing the timeout enforcement or at least revisiting it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IMPALA-8222) Timeout calculation in stress test doesn't make sense
[ https://issues.apache.org/jira/browse/IMPALA-8222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong updated IMPALA-8222: -- Description: There is some logic in the stress test that tries to guess what a reasonable timeout for a query is. There are enough fudge factors that the false positive rate is fairly low, but it also doesn't provide much useful coverage unless a query is stuck. But an overall job timeout achieves the same thing. Some specific issues that the current logic has (and which are tricky to solve): * The number of concurrent queries is calculated at query submission time. E.g. a query that starts before a large batch of other queries is submitted will be given a short timeout multiplier. * There is no guarantee that performance degrades linearly. E.g. if runtime filters arrive late, we can see much larger perf hits. We should consider removing the timeout enforcement or at least revisiting it. was: There is some logic in the stress test that tries to guess what a reasonable timeout for a query is. There are enough fudge factors that the false positive rate is fairly low, but it also doesn't provide much useful coverage unless a query is stuck. * The number of concurrent queries is calculated at query submission time. E.g. a query that starts before a large batch of other queries is submitted will be given a short timeout multiplier. * There is no guarantee that performance degrades linearly. E.g. if runtime filters arrive late, we can see much larger perf hits. We should consider removing the timeout enforcement or at least revisiting it. > Timeout calculation in stress test doesn't make sense > - > > Key: IMPALA-8222 > URL: https://issues.apache.org/jira/browse/IMPALA-8222 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Reporter: Tim Armstrong >Assignee: Tim Armstrong >Priority: Major > > There is some logic in the stress test that tries to guess what a reasonable > timeout for a query is. 
There are enough fudge factors that the false > positive rate is fairly low, but it also doesn't provide much useful coverage > unless a query is stuck. But an overall job timeout achieves the same thing. > Some specific issues that the current logic has (and which are tricky to > solve): > * The number of concurrent queries is calculated at query submission time. > E.g. a query that starts before a large batch of other queries is submitted > will be given a short timeout multiplier. > * There is no guarantee that performance degrades linearly. E.g. if runtime > filters arrive late, we can see much larger perf hits. > We should consider removing the timeout enforcement or at least revisiting it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
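The first issue listed above (concurrency sampled only at submission time) is easy to demonstrate with a toy model; the function below is an illustrative reconstruction, not the stress test's actual code:

```python
def timeout_for(base_runtime_s: float, concurrent_at_submit: int,
                fudge: float = 2.0) -> float:
    """Toy version of the stress test's timeout guess: scale the query's
    expected runtime by the concurrency observed at submission time."""
    return base_runtime_s * max(1, concurrent_at_submit) * fudge

# A query submitted while the cluster is idle gets a small multiplier...
lone = timeout_for(base_runtime_s=10, concurrent_at_submit=1)
# ...even if 50 queries arrive right after it and contend with it for its
# whole runtime, which this one-shot calculation never observes.
busy = timeout_for(base_runtime_s=10, concurrent_at_submit=50)
assert lone < busy  # same query, wildly different budgets by arrival order
```

Because the multiplier is frozen at submission, two identical queries can receive budgets that differ by the full concurrency factor depending only on arrival order, which is why the false positives are hard to eliminate without tracking concurrency over the query's whole runtime.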
[jira] [Commented] (IMPALA-3735) Add /fragments webpage
[ https://issues.apache.org/jira/browse/IMPALA-3735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16773327#comment-16773327 ] Tim Armstrong commented on IMPALA-3735: --- Abandoned review: https://gerrit.cloudera.org/#/c/3323/ > Add /fragments webpage > -- > > Key: IMPALA-3735 > URL: https://issues.apache.org/jira/browse/IMPALA-3735 > Project: IMPALA > Issue Type: Task > Components: Backend >Affects Versions: Impala 2.6.0 >Reporter: Henry Robinson >Assignee: Henry Robinson >Priority: Minor > Labels: observability > > It's hard at the moment to find out the following: > # What fragments are running for a particular query > # What fragments are running on a particular Impala backend > This makes diagnosing a particular class of bugs, where a query has hung > because of a single hung fragment, very painful. We should add > {{/fragments}}, and also a per-query fragment list to the query details page. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IMPALA-3735) Add /fragments webpage
[ https://issues.apache.org/jira/browse/IMPALA-3735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16773325#comment-16773325 ] Tim Armstrong commented on IMPALA-3735: --- It looks like IMPALA-6190 solved #1 but #2 isn't solved. I ran into a situation recently where I had a 20 node cluster with queries being spread across all coordinators and I wanted to figure out the coordinator of queries that were running on each backend. E.g. if a fragment is still running, find its coordinator so I can see if it's making progress. It's possible by grepping logs, but I couldn't figure out how to do it via debug pages aside from looking at all of the potential coordinators. It would also be convenient if the page included a link to the query on the coordinator (I think that's tricky to get the hostname and webpage ports right in general, but if it's right in typical cases that would be useful). > Add /fragments webpage > -- > > Key: IMPALA-3735 > URL: https://issues.apache.org/jira/browse/IMPALA-3735 > Project: IMPALA > Issue Type: Task > Components: Backend >Affects Versions: Impala 2.6.0 >Reporter: Henry Robinson >Assignee: Henry Robinson >Priority: Minor > Labels: observability > > It's hard at the moment to find out the following: > # What fragments are running for a particular query > # What fragments are running on a particular Impala backend > This makes diagnosing a particular class of bugs, where a query has hung > because of a single hung fragment, very painful. We should add > {{/fragments}}, and also a per-query fragment list to the query details page. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (IMPALA-8215) Don't set numTrue = 1
[ https://issues.apache.org/jira/browse/IMPALA-8215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong reassigned IMPALA-8215: - Assignee: wuchang > Don't set numTrue = 1 > - > > Key: IMPALA-8215 > URL: https://issues.apache.org/jira/browse/IMPALA-8215 > Project: IMPALA > Issue Type: Sub-task > Components: Frontend >Reporter: Tim Armstrong >Assignee: wuchang >Priority: Major > Labels: newbie, ramp-up > > See the parent task - there's an obvious bug where we set numTrues = 1 for no > obvious reason. We should change the code and update any tests. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (IMPALA-8205) Illegal statistics for numFalse and numTrue
[ https://issues.apache.org/jira/browse/IMPALA-8205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong reassigned IMPALA-8205: - Assignee: wuchang > Illegal statistics for numFalse and numTrue > --- > > Key: IMPALA-8205 > URL: https://issues.apache.org/jira/browse/IMPALA-8205 > Project: IMPALA > Issue Type: Bug >Reporter: wuchang >Assignee: wuchang >Priority: Major > Labels: impala, numFalse, numTrue, statistics > > When Impala computes statistics, it sets *numFalse = -1* and *numTrue = 1* when > the statistic is missing; > *-1* for *numFalse* can break query engines like Presto, and there are > already PRs reporting and hotfixing it: > [presto-11859|https://github.com/prestodb/presto/pull/11859] > *1* for *numTrue* is also unreasonable, because we cannot tell whether it > indicates a real numTrue statistic or a missing one; > Also, *nullCount* previously used -1 to indicate its absence, which > likewise caused problems for Presto. Presto had to add a hotfix for > it ([presto-11549|https://github.com/prestodb/presto/pull/11549]). Fortunately, > Impala has already fixed that bug; > These statistics should be set to null when absent, instead of -1 > and 1. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
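The fix the report asks for, representing absent statistics as null rather than the sentinels -1 and 1, can be sketched as follows. This is an illustrative snippet, not Impala code; the function name is invented.

```python
from typing import Optional

# Illustrative only: encode boolean-column stats so that "missing" is an
# explicit None instead of the ambiguous sentinels numFalse=-1 / numTrue=1,
# which a consumer like Presto cannot tell apart from real counts.
def encode_bool_stats(num_true: Optional[int], num_false: Optional[int]) -> dict:
    return {"numTrue": num_true, "numFalse": num_false}

# Missing stats stay recognizably missing, real counts pass through unchanged.
assert encode_bool_stats(None, None) == {"numTrue": None, "numFalse": None}
assert encode_bool_stats(7, 3) == {"numTrue": 7, "numFalse": 3}
```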
[jira] [Deleted] (IMPALA-8216) Don't set numTrue = 1
[ https://issues.apache.org/jira/browse/IMPALA-8216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong deleted IMPALA-8216: -- > Don't set numTrue = 1 > - > > Key: IMPALA-8216 > URL: https://issues.apache.org/jira/browse/IMPALA-8216 > Project: IMPALA > Issue Type: Sub-task >Reporter: Tim Armstrong >Priority: Major > Labels: newbie, ramp-up > > See the parent task - there's an obvious bug where we set numTrues = 1 for no > obvious reason. We should change the code and update any tests. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IMPALA-8207) Fix query loading in run-workload.py
[ https://issues.apache.org/jira/browse/IMPALA-8207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Tauber-Marshall resolved IMPALA-8207. Resolution: Fixed Fix Version/s: Impala 3.2.0 > Fix query loading in run-workload.py > > > Key: IMPALA-8207 > URL: https://issues.apache.org/jira/browse/IMPALA-8207 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.2.0 >Reporter: Thomas Tauber-Marshall >Assignee: Thomas Tauber-Marshall >Priority: Major > Fix For: Impala 3.2.0 > > > The code that run-workload.py uses to retrieve the queries for particular > workloads has not been kept up to date with changes to the contents of the > testdata/workload/* directories, resulting in it picking up and running > various queries that were not really intended to be part of the workloads. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IMPALA-8230) ERROR: AnalysisException: No matching function with signature: trunc(DOUBLE, SMALLINT).
[ https://issues.apache.org/jira/browse/IMPALA-8230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16773218#comment-16773218 ] Jim Apple commented on IMPALA-8230: --- [~qinzl_1], thank you for the bug report. Can you provide the query that produced it? > ERROR: AnalysisException: No matching function with signature: trunc(DOUBLE, > SMALLINT). > --- > > Key: IMPALA-8230 > URL: https://issues.apache.org/jira/browse/IMPALA-8230 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 2.12.0 >Reporter: qinzl_1 >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IMPALA-8191) TestBreakpadExhaustive.test_minidump_creation fails to kill cluster
[ https://issues.apache.org/jira/browse/IMPALA-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Volker updated IMPALA-8191: Labels: breakpad broken-build flaky-test (was: broken-build flaky-test) > TestBreakpadExhaustive.test_minidump_creation fails to kill cluster > --- > > Key: IMPALA-8191 > URL: https://issues.apache.org/jira/browse/IMPALA-8191 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.2.0 >Reporter: Andrew Sherman >Assignee: Lars Volker >Priority: Critical > Labels: breakpad, broken-build, flaky-test > Fix For: Impala 3.2.0 > > > h3. Error Message > {quote} > assert not [, > ] + where > [, > ] = > .impalads + > where = > .cluster > {quote} > h3. Stacktrace > {quote} > custom_cluster/test_breakpad.py:183: in test_minidump_creation > self.kill_cluster(SIGSEGV) custom_cluster/test_breakpad.py:81: in > kill_cluster signal is SIGUSR1 or self.assert_all_processes_killed() > custom_cluster/test_breakpad.py:121: in assert_all_processes_killed assert > not self.cluster.impalads E assert not > [, > ] E + where > [, > ] = > .impalads E + > where = > .cluster > {quote} > See [IMPALA-8114] for a similar bug -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IMPALA-8191) TestBreakpadExhaustive.test_minidump_creation fails to kill cluster
[ https://issues.apache.org/jira/browse/IMPALA-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Volker resolved IMPALA-8191. - Resolution: Fixed Fix Version/s: Impala 3.2.0 > TestBreakpadExhaustive.test_minidump_creation fails to kill cluster > --- > > Key: IMPALA-8191 > URL: https://issues.apache.org/jira/browse/IMPALA-8191 > Project: IMPALA > Issue Type: Bug >Reporter: Andrew Sherman >Assignee: Lars Volker >Priority: Critical > Labels: broken-build, flaky-test > Fix For: Impala 3.2.0 > > > h3. Error Message > {quote} > assert not [, > ] + where > [, > ] = > .impalads + > where = > .cluster > {quote} > h3. Stacktrace > {quote} > custom_cluster/test_breakpad.py:183: in test_minidump_creation > self.kill_cluster(SIGSEGV) custom_cluster/test_breakpad.py:81: in > kill_cluster signal is SIGUSR1 or self.assert_all_processes_killed() > custom_cluster/test_breakpad.py:121: in assert_all_processes_killed assert > not self.cluster.impalads E assert not > [, > ] E + where > [, > ] = > .impalads E + > where = > .cluster > {quote} > See [IMPALA-8114] for a similar bug -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IMPALA-8191) TestBreakpadExhaustive.test_minidump_creation fails to kill cluster
[ https://issues.apache.org/jira/browse/IMPALA-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Volker updated IMPALA-8191: Component/s: Infrastructure > TestBreakpadExhaustive.test_minidump_creation fails to kill cluster > --- > > Key: IMPALA-8191 > URL: https://issues.apache.org/jira/browse/IMPALA-8191 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Reporter: Andrew Sherman >Assignee: Lars Volker >Priority: Critical > Labels: broken-build, flaky-test > Fix For: Impala 3.2.0 > > > h3. Error Message > {quote} > assert not [, > ] + where > [, > ] = > .impalads + > where = > .cluster > {quote} > h3. Stacktrace > {quote} > custom_cluster/test_breakpad.py:183: in test_minidump_creation > self.kill_cluster(SIGSEGV) custom_cluster/test_breakpad.py:81: in > kill_cluster signal is SIGUSR1 or self.assert_all_processes_killed() > custom_cluster/test_breakpad.py:121: in assert_all_processes_killed assert > not self.cluster.impalads E assert not > [, > ] E + where > [, > ] = > .impalads E + > where = > .cluster > {quote} > See [IMPALA-8114] for a similar bug -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IMPALA-8216) Don't set numTrue = 1
[ https://issues.apache.org/jira/browse/IMPALA-8216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16773210#comment-16773210 ] Lars Volker commented on IMPALA-8216: - Is this a duplicate of IMPALA-8215? > Don't set numTrue = 1 > - > > Key: IMPALA-8216 > URL: https://issues.apache.org/jira/browse/IMPALA-8216 > Project: IMPALA > Issue Type: Sub-task > Components: Frontend >Reporter: Tim Armstrong >Priority: Major > Labels: newbie, ramp-up > > See the parent task - there's an obvious bug where we set numTrues = 1 for no > obvious reason. We should change the code and update any tests. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IMPALA-8191) TestBreakpadExhaustive.test_minidump_creation fails to kill cluster
[ https://issues.apache.org/jira/browse/IMPALA-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Volker updated IMPALA-8191: Affects Version/s: Impala 3.2.0 > TestBreakpadExhaustive.test_minidump_creation fails to kill cluster > --- > > Key: IMPALA-8191 > URL: https://issues.apache.org/jira/browse/IMPALA-8191 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.2.0 >Reporter: Andrew Sherman >Assignee: Lars Volker >Priority: Critical > Labels: broken-build, flaky-test > Fix For: Impala 3.2.0 > > > h3. Error Message > {quote} > assert not [, > ] + where > [, > ] = > .impalads + > where = > .cluster > {quote} > h3. Stacktrace > {quote} > custom_cluster/test_breakpad.py:183: in test_minidump_creation > self.kill_cluster(SIGSEGV) custom_cluster/test_breakpad.py:81: in > kill_cluster signal is SIGUSR1 or self.assert_all_processes_killed() > custom_cluster/test_breakpad.py:121: in assert_all_processes_killed assert > not self.cluster.impalads E assert not > [, > ] E + where > [, > ] = > .impalads E + > where = > .cluster > {quote} > See [IMPALA-8114] for a similar bug -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IMPALA-8232) Custom cluster tests should allow setting dfs.client settings for impalads
Sahil Takiar created IMPALA-8232: Summary: Custom cluster tests should allow setting dfs.client settings for impalads Key: IMPALA-8232 URL: https://issues.apache.org/jira/browse/IMPALA-8232 Project: IMPALA Issue Type: Improvement Components: Backend Reporter: Sahil Takiar Assignee: Sahil Takiar Right now, custom cluster tests only allow specifying impalad startup options, however, it would be nice if the tests could specify arbitrary HDFS client configs as well (e.g. {{dfs.client}} options). This would allow us to increase our test integration coverage with different HDFS client setups such as (1) disabling short-circuit reads (thus triggering the code path for a remote read) (requires setting {{dfs.client.read.shortcircuit}} to false), (2) enabling hedged reads (requires setting {{dfs.client.hedged.read.threadpool.size}} to a value greater than 0). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IMPALA-8232) Custom cluster tests should allow setting dfs.client settings for impalads
[ https://issues.apache.org/jira/browse/IMPALA-8232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar updated IMPALA-8232: - Labels: test (was: ) > Custom cluster tests should allow setting dfs.client settings for impalads > -- > > Key: IMPALA-8232 > URL: https://issues.apache.org/jira/browse/IMPALA-8232 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Reporter: Sahil Takiar >Assignee: Sahil Takiar >Priority: Major > Labels: test > > Right now, custom cluster tests only allow specifying impalad startup > options, however, it would be nice if the tests could specify arbitrary HDFS > client configs as well (e.g. {{dfs.client}} options). This would allow us to > increase our test integration coverage with different HDFS client setups such > as (1) disabling short-circuit reads (thus triggering the code path for a > remote read) (requires setting {{dfs.client.read.shortcircuit}} to false), > (2) enabling hedged reads (requires setting > {{dfs.client.hedged.read.threadpool.size}} to a value greater than 0). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
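The two scenarios named in the IMPALA-8232 description map to standard HDFS client properties, which could be expressed in an hdfs-site.xml fragment along these lines (the thread-pool size of 10 is an illustrative value, not a recommendation from the report):

```xml
<!-- Minimal hdfs-site.xml sketch for the two test scenarios mentioned above. -->
<configuration>
  <!-- (1) Disable short-circuit reads so reads exercise the remote-read path. -->
  <property>
    <name>dfs.client.read.shortcircuit</name>
    <value>false</value>
  </property>
  <!-- (2) Enable hedged reads by giving the hedged-read pool some threads
       (any value greater than 0 turns the feature on). -->
  <property>
    <name>dfs.client.hedged.read.threadpool.size</name>
    <value>10</value>
  </property>
</configuration>
```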
[jira] [Closed] (IMPALA-2125) Improve perf when reading timestamps from parquet files written by hive
[ https://issues.apache.org/jira/browse/IMPALA-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Csaba Ringhofer closed IMPALA-2125. --- Resolution: Duplicate > Improve perf when reading timestamps from parquet files written by hive > --- > > Key: IMPALA-2125 > URL: https://issues.apache.org/jira/browse/IMPALA-2125 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Affects Versions: Impala 2.2 >Reporter: casey >Priority: Minor > > This is for tracking purposes. The improvement is already committed -- > 29de99c9d25c49b73488d2f75bc3644ae9ff9325. > When using the flag -convert_legacy_hive_parquet_utc_timestamps=true, > depending on the query the runtime may be 10x longer (possibly more). The > commit above inlines some function calls which improves the 10x case to 5x. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IMPALA-2125) Improve perf when reading timestamps from parquet files written by hive
[ https://issues.apache.org/jira/browse/IMPALA-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16773094#comment-16773094 ] Csaba Ringhofer commented on IMPALA-2125: - I close this ticket as the global lock is no longer an issue since IMPALA-3307. There is some ongoing effort to make UTC->local conversions even faster: IMPALA-7085 > Improve perf when reading timestamps from parquet files written by hive > --- > > Key: IMPALA-2125 > URL: https://issues.apache.org/jira/browse/IMPALA-2125 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Affects Versions: Impala 2.2 >Reporter: casey >Priority: Minor > > This is for tracking purposes. The improvement is already committed -- > 29de99c9d25c49b73488d2f75bc3644ae9ff9325. > When using the flag -convert_legacy_hive_parquet_utc_timestamps=true, > depending on the query the runtime may be 10x longer (possibly more). The > commit above inlines some function calls which improves the 10x case to 5x. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IMPALA-5050) Add support to read TIMESTAMP_MILLIS and TIMESTAMP_MICROS to the parquet scanner
[ https://issues.apache.org/jira/browse/IMPALA-5050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16773066#comment-16773066 ] Csaba Ringhofer commented on IMPALA-5050: - [~arodoni_cloudera] I found a few points that should be extended: https://github.com/apache/impala/blob/b8a8edddcb727a28c2d15bdb3533a32454364ade/docs/topics/impala_parquet.xml#L1120 INT64 + OriginalType TIMESTAMP_MILLIS -> TIMESTAMP INT64 + OriginalType TIMESTAMP_MICROS -> TIMESTAMP INT64 + LogicalType TIMESTAMP -> TIMESTAMP Note that these columns can still be read as BIGINT too, so existing queries will work the same way as they used to. https://github.com/apache/impala/blob/b8a8edddcb727a28c2d15bdb3533a32454364ade/docs/shared/impala_common.xml#L2149 I think that these columns written by Sqoop can be read by Impala after this change, but I didn't verify this. https://github.com/apache/impala/blob/b8a8edddcb727a28c2d15bdb3533a32454364ade/docs/topics/impala_timestamp.xml#L197 It could be mentioned that Hive cannot write INT64 timestamps at the moment, but the implementation is in progress: HIVE-21216 https://github.com/apache/impala/blob/b8a8edddcb727a28c2d15bdb3533a32454364ade/docs/topics/impala_timestamp.xml#L218 It should be mentioned that convert_legacy_hive_parquet_utc_timestamps only affects INT96 timestamps. INT64 timestamps with only an OriginalType are assumed to always be UTC-normalized, so the UTC->local conversion will always be done. INT64 timestamps with a LogicalType specify in the Parquet metadata whether UTC->local conversion is necessary. 
> Add support to read TIMESTAMP_MILLIS and TIMESTAMP_MICROS to the parquet > scanner > > > Key: IMPALA-5050 > URL: https://issues.apache.org/jira/browse/IMPALA-5050 > Project: IMPALA > Issue Type: New Feature > Components: Backend >Affects Versions: Impala 2.9.0 >Reporter: Lars Volker >Assignee: Csaba Ringhofer >Priority: Major > Fix For: Impala 3.2.0 > > > This requires updating {{parquet.thrift}} to a version that includes the > {{TIMESTAMP_MICROS}} logical type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IMPALA-8229) Resolve CMake errors when open Impala project by CLion in Mac
[ https://issues.apache.org/jira/browse/IMPALA-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16773024#comment-16773024 ] Sahil Takiar commented on IMPALA-8229: -- I use CLion as well, but I run it on an Ubuntu machine. Getting it to work on a Mac would be nice, but I've tried several times and I could never get it working. Could you paste what changes to Impala you make to get CLion to work on Mac? > Resolve CMake errors when open Impala project by CLion in Mac > - > > Key: IMPALA-8229 > URL: https://issues.apache.org/jira/browse/IMPALA-8229 > Project: IMPALA > Issue Type: Improvement >Reporter: Quanlong Huang >Priority: Major > > I'm happy to develop Impala in CLion on Mac. It might encourage more people > to join the community since it can significantly lower the threshold for > Impala development. > My normal workflow is > * Understand relative codes in CLion and make changes. > * Generate patches, then debug on my remote Ubuntu machine. > To make CLion work, I comment out some requirements (e.g. llvm) in > CMakeLists.txt, modify impala-config.sh (e.g. for JNI, versions for Darwin), > then copy the generated sources of thrift and some header files of > native-toolchain from my Ubuntu machine. However, it's not an elegant > solution. We need more effort for this. > Creating the ticket first to see if anyone needs this too. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IMPALA-8231) Impala allows ambiguous datetime patterns with to_timestamp
[ https://issues.apache.org/jira/browse/IMPALA-8231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Kaszab updated IMPALA-8231: - Description: Impala allows e.g. having multiple year sections in a datetime pattern.
{code:java}
select to_timestamp('2018-21-01-01', 'yyyy-yy-MM-dd');
+------------------------------------------------+
| to_timestamp('2018-21-01-01', 'yyyy-yy-mm-dd') |
+------------------------------------------------+
| 2021-01-01 00:00:00                            |
+------------------------------------------------+
{code}
Here even the result is something weird:
{code:java}
select to_timestamp('21-2018-01-01', 'yy-yyyy-MM-dd');
+------------------------------------------------+
| to_timestamp('21-2018-01-01', 'yy-yyyy-mm-dd') |
+------------------------------------------------+
| 3918-01-01 00:00:00                            |
+------------------------------------------------+
{code}
I think having the mentioned patterns in a from_timestamp() is fine as that wouldn't make any inconsistencies in the result. However, in a to_timestamp() it's ambiguous which section to use for populating e.g. the year part of a timestamp. In that case I think returning an error is reasonable. +This proposal is in line with what Oracle does:+ Oracle forbids the same:
{code:java}
select to_timestamp('2018-19-11-19', 'YYYY-YY-MM-DD') from DUAL;
ORA-01812: year may only be specified once
{code}
But Oracle allows the same format for conversions the other way around:
{code:java}
select to_char( to_timestamp('2018-11-19', 'YYYY-MM-DD'), 'YYYY-YY-MM-DD') from DUAL;
2018-18-11-19
{code}
Note, that this issue is also true for any other datetime pattern element as there is no duplicate or conflict check during parsing.

was: Impala allows e.g. having multiple year sections in a datetime pattern.
{code:java}
select to_timestamp('2018-21-01-01', 'yyyy-yy-MM-dd');
+------------------------------------------------+
| to_timestamp('2018-21-01-01', 'yyyy-yy-mm-dd') |
+------------------------------------------------+
| 2021-01-01 00:00:00                            |
+------------------------------------------------+
{code}
Here even the result is something weird:
{code:java}
select to_timestamp('21-2018-01-01', 'yy-yyyy-MM-dd');
+------------------------------------------------+
| to_timestamp('21-2018-01-01', 'yy-yyyy-mm-dd') |
+------------------------------------------------+
| 3918-01-01 00:00:00                            |
+------------------------------------------------+
{code}
I think having the mentioned patterns in a from_timestamp() is fine as that wouldn't make any inconsistencies in the result. However, in a to_timestamp() it's ambiguous which section to use for populating e.g. the year part of a timestamp. In that case I think returning an error is reasonable. Oracle forbids the same:
{code:java}
select to_timestamp('2018-19-11-19', 'YYYY-YY-MM-DD') from DUAL;
ORA-01812: year may only be specified once
{code}
Note, that this issue is also true for any other datetime pattern element as there is no duplicate or conflict check during parsing.

> Impala allows ambiguous datetime patterns with to_timestamp > --- > > Key: IMPALA-8231 > URL: https://issues.apache.org/jira/browse/IMPALA-8231 >
[jira] [Created] (IMPALA-8231) Impala allows ambiguous datetime patterns with to_timestamp
Gabor Kaszab created IMPALA-8231: Summary: Impala allows ambiguous datetime patterns with to_timestamp Key: IMPALA-8231 URL: https://issues.apache.org/jira/browse/IMPALA-8231 Project: IMPALA Issue Type: Bug Components: Backend Affects Versions: Impala 3.1.0 Reporter: Gabor Kaszab Impala allows e.g. having multiple year sections in a datetime pattern.
{code:java}
select to_timestamp('2018-21-01-01', 'yyyy-yy-MM-dd');
+------------------------------------------------+
| to_timestamp('2018-21-01-01', 'yyyy-yy-mm-dd') |
+------------------------------------------------+
| 2021-01-01 00:00:00                            |
+------------------------------------------------+
{code}
Here even the result is something weird:
{code:java}
select to_timestamp('21-2018-01-01', 'yy-yyyy-MM-dd');
+------------------------------------------------+
| to_timestamp('21-2018-01-01', 'yy-yyyy-mm-dd') |
+------------------------------------------------+
| 3918-01-01 00:00:00                            |
+------------------------------------------------+
{code}
I think having the mentioned patterns in a from_timestamp() is fine as that wouldn't make any inconsistencies in the result. However, in a to_timestamp() it's ambiguous which section to use for populating e.g. the year part of a timestamp. In that case I think returning an error is reasonable. Oracle forbids the same:
{code:java}
select to_timestamp('2018-19-11-19', 'YYYY-YY-MM-DD') from DUAL;
ORA-01812: year may only be specified once
{code}
Note, that this issue is also true for any other datetime pattern element as there is no duplicate or conflict check during parsing. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
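The duplicate-section check proposed in the report (mirroring Oracle's ORA-01812 behavior) could look roughly like the sketch below. It is illustrative only, not Impala's actual datetime parser; the token table covers just a few fields.

```python
import re

# Map pattern tokens to the timestamp field they populate. Illustrative
# subset; a real parser would cover hours, minutes, etc.
FIELD_OF_TOKEN = {"yyyy": "year", "yy": "year", "MM": "month", "dd": "day"}

def check_pattern(pattern: str) -> None:
    """Raise ValueError if any timestamp field is specified more than once."""
    seen = set()
    # Longest token first so 'yyyy' is not consumed as two 'yy' tokens.
    for token in re.findall(r"yyyy|yy|MM|dd", pattern):
        field = FIELD_OF_TOKEN[token]
        if field in seen:
            raise ValueError(f"{field} may only be specified once")
        seen.add(field)

check_pattern("yyyy-MM-dd")        # fine: each field appears once
try:
    check_pattern("yyyy-yy-MM-dd") # ambiguous: year is specified twice
except ValueError as e:
    assert "year" in str(e)
```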
[jira] [Created] (IMPALA-8230) ERROR: AnalysisException: No matching function with signature: trunc(DOUBLE, SMALLINT).
qinzl_1 created IMPALA-8230:
Summary: ERROR: AnalysisException: No matching function with signature: trunc(DOUBLE, SMALLINT).
Key: IMPALA-8230
URL: https://issues.apache.org/jira/browse/IMPALA-8230
Project: IMPALA
Issue Type: Bug
Affects Versions: Impala 2.12.0
Reporter: qinzl_1
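For context, this error comes from function overload resolution: the analyzer searches the registered signatures for one that every argument type can be implicitly cast to, and raises AnalysisException when none matches. A rough sketch of that lookup (Python; the signature catalog and cast table below are invented for illustration and are not Impala's actual catalog):

```python
# Which types each argument type may implicitly widen to (invented subset).
WIDENING = {
    "SMALLINT": ["SMALLINT", "INT", "BIGINT", "DOUBLE"],
    "INT": ["INT", "BIGINT", "DOUBLE"],
    "DOUBLE": ["DOUBLE"],
    "TIMESTAMP": ["TIMESTAMP"],
    "STRING": ["STRING"],
}

# Assumed registered signatures: trunc() for timestamps, truncate() for numbers.
SIGNATURES = {
    "trunc": [("TIMESTAMP", "STRING")],
    "truncate": [("DOUBLE", "INT")],
}

def resolve(fn, args):
    """Return the first signature every argument can implicitly cast to."""
    for sig in SIGNATURES.get(fn, []):
        if len(sig) == len(args) and all(s in WIDENING[a] for a, s in zip(args, sig)):
            return sig
    raise Exception(f"No matching function with signature: {fn}({', '.join(args)})")

resolve("truncate", ("DOUBLE", "SMALLINT"))  # matches (DOUBLE, INT) via SMALLINT -> INT
# resolve("trunc", ("DOUBLE", "SMALLINT"))   # no numeric trunc() overload: raises
```

Under this model, casting the call to the numeric `truncate()` function (or registering a numeric `trunc()` overload) would be the two ways to make the query analyze.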
[jira] [Commented] (IMPALA-8212) Crash during startup in kudu::security::CanonicalizeKrb5Principal()
[ https://issues.apache.org/jira/browse/IMPALA-8212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16772744#comment-16772744 ]

Michael Ho commented on IMPALA-8212:

Looking at the stack trace of the crash, it seems that the Kudu code made calls to some Kerberos code which inadvertently modified {{g_krb5_ctx}}. As far as I understand, the assumption is that {{g_krb5_ctx}} is global and shared, and it should not be modified after initialization. However, the default initialization code {{krb5_init_context(&g_krb5_ctx)}} called by {{kudu::security::InitKrb5Ctx()}} only sets {{g_krb5_ctx->default_realm}} to 0. Upon the first call to {{krb5_parse_name()}}, the Kerberos library will call {{krb5_get_default_realm()}} to get the default realm, as the Sasl client we created didn't actually take the Kerberos realm as an argument. Apparently, {{krb5_get_default_realm()}} may modify {{g_krb5_ctx}} and it's not thread safe. As shown in the stack trace and the code below, {{context->default_realm}} is most likely {{NULL}}. So, if multiple negotiation threads get into the same code path of calling {{krb5_get_default_realm()}} at the same time, they may end up stepping on each other and corrupting {{g_krb5_ctx}}, leading to the crash we saw above or to error messages like the following:

{noformat}
0216 14:26:07.459600 (+ 296us) negotiation.cc:304] Negotiation complete: Runtime error: Server connection negotiation failed: server connection from X.X.X.X:37070: could not canonicalize krb5 principal: could not parse principal: Configuration file does not specify default realm
{noformat}

[~tlipcon] kindly pointed out that a similar issue was reported upstream against Kerberos in the past (http://krbdev.mit.edu/rt/Ticket/Display.html?id=2855).
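The race described above boils down to an unsynchronized check-then-set on a shared context. A minimal Python stand-in for the pattern (all names invented; the real code lives in the C Kerberos library), showing both the racy lazy initialization and a lock-based fix:

```python
import threading

class Krb5Context:
    """Toy stand-in for the shared, global krb5 context."""
    def __init__(self):
        self.default_realm = None
        self._lock = threading.Lock()

    def get_default_realm_unsafe(self):
        # Racy: two threads can both observe None and both write the field,
        # analogous to concurrent krb5_get_default_realm() calls mutating
        # the shared context.
        if self.default_realm is None:
            self.default_realm = "EXAMPLE.COM"
        return self.default_realm

    def get_default_realm_safe(self):
        # Serializing the check-then-set closes the race.
        with self._lock:
            if self.default_realm is None:
                self.default_realm = "EXAMPLE.COM"
            return self.default_realm
```

Another mitigation, consistent with the analysis above, is to resolve the default realm once during single-threaded startup so the lazy path is never taken by concurrent negotiation threads.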
{noformat}
krb5_error_code KRB5_CALLCONV
krb5_get_default_realm(krb5_context context, char **realm_out)
{
    krb5_error_code ret;

    *realm_out = NULL;

    if (context == NULL || context->magic != KV5M_CONTEXT)
        return KV5M_CONTEXT;

    if (context->default_realm == NULL) {
        /* <<<- non-thread-safe call */
        ret = get_default_realm(context, &context->default_realm);
        if (ret)
            return ret;
    }
    *realm_out = strdup(context->default_realm);
    return (*realm_out == NULL) ? ENOMEM : 0;
}
{noformat}

Stack trace:

{noformat}
#30
#31 0x048d0a53 in tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned long, int) ()
#32 0x048d0aec in tcmalloc::ThreadCache::ListTooLong(tcmalloc::ThreadCache::FreeList*, unsigned long) ()
#33 0x04a0b4c0 in tc_free ()
#34 0x7fb03f051720 in profile_iterator_free () from sysroot/lib64/libkrb5.so.3
#35 0x7fb03f0519a4 in profile_get_value () from sysroot/lib64/libkrb5.so.3
#36 0x7fb03f051a18 in profile_get_string () from sysroot/lib64/libkrb5.so.3
#37 0x7fb03f044dde in profile_default_realm () from sysroot/lib64/libkrb5.so.3
#38 0x7fb03f044509 in krb5_get_default_realm () from sysroot/lib64/libkrb5.so.3
#39 0x7fb03f0245e8 in krb5_parse_name_flags () from sysroot/lib64/libkrb5.so.3
#40 0x01ff7bbf in kudu::security::CanonicalizeKrb5Principal(std::string*) ()
#41 0x026ee4df in kudu::rpc::ServerNegotiation::AuthenticateBySasl(kudu::faststring*) ()
#42 0x026ea929 in kudu::rpc::ServerNegotiation::Negotiate() ()
#43 0x0271035b in kudu::rpc::DoServerNegotiation(kudu::rpc::Connection*, kudu::TriStateFlag, kudu::TriStateFlag, kudu::MonoTime const&) ()
#44 0x0271070d in kudu::rpc::Negotiation::RunNegotiation(scoped_refptr const&, kudu::TriStateFlag, kudu::TriStateFlag, kudu::MonoTime) ()
{noformat}

> Crash during startup in kudu::security::CanonicalizeKrb5Principal()
> ---
>
> Key: IMPALA-8212
> URL: https://issues.apache.org/jira/browse/IMPALA-8212
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 3.2.0
> Environment:
> CentOS Linux release 7.4.1708 (Core)
> Linux vc0512.halxg.cloudera.com 3.10.0-693.21.1.el7.x86_64 #1 SMP Wed Mar 7 19:03:37 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
> Reporter: Tim Armstrong
> Assignee: Michael Ho
> Priority: Blocker
> Labels: crash
> Attachments: gdb-core-60055.txt, gdb.txt, hs_err_pid60055.log, hs_err_pid65365.log, impalad.vc0512.halxg.cloudera.com.impala.log.INFO.20190218-140034.65365, impalad.vc0513.halxg.cloudera.com.impala.log.INFO.20190216-142536.60055
>
> I saw this crash twice while working on the stress test. It *seems* to happen when the stress infrastructure switches the service to a debug build, restarts the service, then starts running queries. I haven't seen it happen once the service is up and running for a