[jira] [Resolved] (IMPALA-966) Type errors are attributed to wrong expression with insert
[ https://issues.apache.org/jira/browse/IMPALA-966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alice Fan resolved IMPALA-966. -- Resolution: Fixed Fix Version/s: Impala 3.2.0 > Type errors are attributed to wrong expression with insert > -- > > Key: IMPALA-966 > URL: https://issues.apache.org/jira/browse/IMPALA-966 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 1.3 >Reporter: Henry Robinson >Assignee: Alice Fan >Priority: Minor > Fix For: Impala 3.2.0 > > > The type error below belongs to the second row to be inserted ({{sqrt()}} > returns {{DOUBLE}}), but the first expression, which is plainly {{FLOAT}}, gets > blamed for the error. > {code} > [localhost:21000] > insert overwrite alltypesnopart_insert(float_col) > values(CAST(1.0 AS FLOAT)), (sqrt(-1)); > Query: insert overwrite alltypesnopart_insert(float_col) values(CAST(1.0 AS > FLOAT)), (sqrt(-1)) > ERROR: AnalysisException: Possible loss of precision for target table > 'functional.alltypesnopart_insert'. > Expression 'cast(1.0 as float)' (type: DOUBLE) would need to be cast to FLOAT > for column 'float_col' > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
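A minimal Python sketch of the idea behind correct attribution (a toy two-type widening table, not Impala's actual analyzer): walk the VALUES expressions in order and blame the first one whose type would actually lose precision against the target column, rather than always reporting the first expression.

```python
# Sketch only: Impala's real type rules cover many more types and implicit
# casts; this toy table just distinguishes FLOAT from the wider DOUBLE.
def first_lossy_expr(expr_types, target_type):
    """Return the index of the first expression whose type would lose
    precision when cast to target_type, or None if every expression fits."""
    width = {"FLOAT": 1, "DOUBLE": 2}
    for i, t in enumerate(expr_types):
        if width[t] > width[target_type]:
            return i
    return None

# Mirrors the bug report: CAST(1.0 AS FLOAT) fits float_col; sqrt(-1) is
# DOUBLE, so the second row (index 1) is the one that should be blamed.
print(first_lossy_expr(["FLOAT", "DOUBLE"], "FLOAT"))  # -> 1
```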
[jira] [Commented] (IMPALA-8473) Refactor lineage publication mechanism to allow for different consumers
[ https://issues.apache.org/jira/browse/IMPALA-8473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840006#comment-16840006 ] radford nguyen commented on IMPALA-8473: [~LinaAtAustin], any reason you've changed your mind? The Impala team and I actually prefer the interface approach. > Refactor lineage publication mechanism to allow for different consumers > --- > > Key: IMPALA-8473 > URL: https://issues.apache.org/jira/browse/IMPALA-8473 > Project: IMPALA > Issue Type: Improvement > Components: Backend, Frontend >Reporter: radford nguyen >Assignee: radford nguyen >Priority: Critical > Attachments: ImpalaPostExecHook-infra.patch > > > The impetus for this change is to allow lineage to be consumed by Atlas via Kafka. > h3. Design Proposal > Move lineage logging from the be to the fe, where we can use the same plugin > approach as {{authorization_provider}} to let a downstream user provide > their own lineage consumers as runtime dependencies. > [~mad...@apache.org] has provided a fe patch (attached) with a suggested > mechanism for allowing multiple hooks to be registered with the fe. Hooks > would be invoked from the be at appropriate places, e.g. > [https://github.com/apache/impala/blob/c1b0a073938c144e9bf33901bd4df6dcda0f09ec/be/src/service/impala-server.cc#L466]. > The hooks should all be executed asynchronously, so the current thinking is > that this execution should happen in the fe, since the be does not know > what hooks are registered. In other words, the > {{ImpalaPostExecHookFactory.executeHooks}} method (see patch) should probably > use a thread-pool executor service (or something similar) to execute > all hooks in parallel and in a non-blocking manner, returning to the > be as soon as possible.
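The thread-pool dispatch proposed above can be sketched in a few lines of Python (the class and hook names here are illustrative, not Impala's final API): every registered hook is submitted to a pool, a failure in one hook is swallowed so it cannot affect the others, and the caller returns without blocking on any single slow consumer.

```python
from concurrent.futures import ThreadPoolExecutor

class HookDispatcher:
    """Hypothetical stand-in for the proposed fe-side executeHooks logic."""

    def __init__(self, hooks, max_workers=4):
        self._hooks = hooks
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def execute_hooks(self, lineage):
        """Fire all hooks asynchronously and return the futures immediately."""
        def run(hook):
            try:
                hook(lineage)
            except Exception:
                pass  # a real implementation would log and drop the error
        return [self._pool.submit(run, h) for h in self._hooks]

results = []
# Second hook always fails, demonstrating isolation between consumers.
dispatcher = HookDispatcher([results.append, lambda lineage: 1 / 0])
futures = dispatcher.execute_hooks("lineage-json")
for f in futures:
    f.result()  # wait only for the demo; the be would not wait
print(results)  # -> ['lineage-json']
```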
[jira] [Updated] (IMPALA-8550) Sentry refresh privileges has race conditions
[ https://issues.apache.org/jira/browse/IMPALA-8550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated IMPALA-8550: Description: Recently, I encountered a race condition in {{SentryProxy}}'s refreshSentryAuthorization loop. The race happens when the Sentry server is slow to update its information based on changes in HMS. Consider the following scenario: # An Impala session from user A creates a database/table. # AuthorizationManager calls updateDatabaseOwnerPrivilege [here|https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L1159]. Note that this adds the user privilege to the Catalog's cache out-of-band (without confirming that Sentry has added the privilege to its database). # Assume that Sentry is slow to update its database of roles/privileges. (Actually, depending on the timing of these events, it doesn't really matter, but the likelihood of the issue increases if Sentry is slow.) # The refreshSentryAuthorization loop is triggered at a configured interval [here|https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/authorization/sentry/SentryProxy.java#L174]. Since Sentry has not yet updated its database with the owner information, this loop removes the privilege from the Catalog. Any subsequent SQL statement that requires privileges will fail until Sentry is synced and the refresh loop adds the privilege back to the catalog cache. was: Recently, I encountered a race condition in \{{SentryProxy}}'s refreshSentryAuthorization loop. The race happens when Sentry server is slow to update its information based on changes in HMS. Consider the following scenario: # Impala session from user A creates a database/table.
# AuthorizationManager will updateDatabaseOwnerPrivilege [here|[https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L1159]] Note that this add adds the user privilege in Catalog's cache out-of-band (without confirming that Sentry has added this privilege in its database) # Assume that Sentry is slow to update its database of roles/privileges. (Actually depending on the timing of these events, it doesn't really matter but likely increases if Sentry is slow. # The refreshSentryAuthorization loop is triggered based on a configured interval [here|[https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/authorization/sentry/SentryProxy.java#L174]]. Since Sentry has not yet updated its database of the owner information, this loop will remove the privilege from Catalog. Any subsequent SQL which requires privileges will fail until Sentry is synced and refresh loop adds this privilege again the catalog cache. > Sentry refresh privileges has race conditions > - > > Key: IMPALA-8550 > URL: https://issues.apache.org/jira/browse/IMPALA-8550 > Project: IMPALA > Issue Type: Bug >Reporter: Vihang Karajgaonkar >Priority: Major > > Recently, I encountered a race condition in {{SentryProxy}}'s > refreshSentryAuthorization loop. The race happens when the Sentry server is slow > to update its information based on changes in HMS. Consider the following > scenario: > # An Impala session from user A creates a database/table. > # AuthorizationManager calls updateDatabaseOwnerPrivilege > [here|https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L1159]. > Note that this adds the user privilege to the Catalog's cache out-of-band > (without confirming that Sentry has added the privilege to its database). > # Assume that Sentry is slow to update its database of roles/privileges.
> (Actually, depending on the timing of these events, it doesn't really matter, > but the likelihood of the issue increases if Sentry is slow.) > # The refreshSentryAuthorization loop is triggered at a configured > interval > [here|https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/authorization/sentry/SentryProxy.java#L174]. > Since Sentry has not yet updated its database with the owner information, this > loop removes the privilege from the Catalog. Any subsequent SQL statement that > requires privileges will fail until Sentry is synced and the refresh loop adds > the privilege back to the catalog cache.
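One possible mitigation for the clobbering described above, sketched in Python (class and method names are hypothetical, not Impala's code): the refresh loop evicts a cached privilege only if Sentry omits it *and* the privilege was not granted out-of-band within a recent grace window, so a slow Sentry sync cannot immediately undo an owner grant.

```python
class PrivilegeRefresher:
    """Toy model of a refresh loop that tolerates a lagging Sentry."""

    def __init__(self, grace_ms):
        self._grace_ms = grace_ms
        self._local_grant_ms = {}  # privilege name -> out-of-band grant time

    def record_local_grant(self, priv, now_ms):
        # Called when the Catalog adds a privilege without Sentry confirmation.
        self._local_grant_ms[priv] = now_ms

    def can_remove(self, priv, sentry_privs, now_ms):
        """True if the refresh loop may safely evict priv from the cache."""
        if priv in sentry_privs:
            return False  # Sentry still reports it; nothing to remove
        granted = self._local_grant_ms.get(priv)
        if granted is not None and now_ms - granted <= self._grace_ms:
            return False  # recently granted out-of-band; Sentry likely unsynced
        return True

r = PrivilegeRefresher(grace_ms=5000)
r.record_local_grant("db1.owner", now_ms=0)
print(r.can_remove("db1.owner", set(), now_ms=1000))  # -> False (in grace window)
print(r.can_remove("db1.owner", set(), now_ms=6000))  # -> True (window expired)
```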
[jira] [Commented] (IMPALA-8550) Sentry refresh privileges has race conditions
[ https://issues.apache.org/jira/browse/IMPALA-8550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839908#comment-16839908 ] Fredy Wijaya commented on IMPALA-8550: -- Yeah, this is a known issue with the Sentry object ownership implementation. > Sentry refresh privileges has race conditions > - > > Key: IMPALA-8550 > URL: https://issues.apache.org/jira/browse/IMPALA-8550 > Project: IMPALA > Issue Type: Bug >Reporter: Vihang Karajgaonkar >Priority: Major > > Recently, I encountered a race condition in {{SentryProxy}}'s > refreshSentryAuthorization loop. The race happens when the Sentry server is slow > to update its information based on changes in HMS. Consider the following > scenario: > # An Impala session from user A creates a database/table. > # AuthorizationManager calls updateDatabaseOwnerPrivilege > [here|https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L1159]. > Note that this adds the user privilege to the Catalog's cache out-of-band > (without confirming that Sentry has added the privilege to its database). > # Assume that Sentry is slow to update its database of roles/privileges. > (Actually, depending on the timing of these events, it doesn't really matter, > but the likelihood of the issue increases if Sentry is slow.) > # The refreshSentryAuthorization loop is triggered at a configured > interval > [here|https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/authorization/sentry/SentryProxy.java#L174]. > Since Sentry has not yet updated its database with the owner information, this > loop removes the privilege from the Catalog. Any subsequent SQL statement that > requires privileges will fail until Sentry is synced and the refresh loop adds > the privilege back to the catalog cache.
[jira] [Commented] (IMPALA-8550) Sentry refresh privileges has race conditions
[ https://issues.apache.org/jira/browse/IMPALA-8550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839901#comment-16839901 ] Vihang Karajgaonkar commented on IMPALA-8550: - The easiest way to reproduce this race is to turn on {{test_owner_privileges}} in an HMS-3 environment. > Sentry refresh privileges has race conditions > - > > Key: IMPALA-8550 > URL: https://issues.apache.org/jira/browse/IMPALA-8550 > Project: IMPALA > Issue Type: Bug >Reporter: Vihang Karajgaonkar >Priority: Major > > Recently, I encountered a race condition in {{SentryProxy}}'s > refreshSentryAuthorization loop. The race happens when the Sentry server is slow > to update its information based on changes in HMS. Consider the following > scenario: > # An Impala session from user A creates a database/table. > # AuthorizationManager calls updateDatabaseOwnerPrivilege > [here|https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L1159]. > Note that this adds the user privilege to the Catalog's cache out-of-band > (without confirming that Sentry has added the privilege to its database). > # Assume that Sentry is slow to update its database of roles/privileges. > (Actually, depending on the timing of these events, it doesn't really matter, > but the likelihood of the issue increases if Sentry is slow.) > # The refreshSentryAuthorization loop is triggered at a configured > interval > [here|https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/authorization/sentry/SentryProxy.java#L174]. > Since Sentry has not yet updated its database with the owner information, this > loop removes the privilege from the Catalog. Any subsequent SQL statement that > requires privileges will fail until Sentry is synced and the refresh loop adds > the privilege back to the catalog cache.
[jira] [Created] (IMPALA-8550) Sentry refresh privileges has race conditions
Vihang Karajgaonkar created IMPALA-8550: --- Summary: Sentry refresh privileges has race conditions Key: IMPALA-8550 URL: https://issues.apache.org/jira/browse/IMPALA-8550 Project: IMPALA Issue Type: Bug Reporter: Vihang Karajgaonkar Recently, I encountered a race condition in {{SentryProxy}}'s refreshSentryAuthorization loop. The race happens when the Sentry server is slow to update its information based on changes in HMS. Consider the following scenario: # An Impala session from user A creates a database/table. # AuthorizationManager calls updateDatabaseOwnerPrivilege [here|https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java#L1159]. Note that this adds the user privilege to the Catalog's cache out-of-band (without confirming that Sentry has added the privilege to its database). # Assume that Sentry is slow to update its database of roles/privileges. (Actually, depending on the timing of these events, it doesn't really matter, but the likelihood of the issue increases if Sentry is slow.) # The refreshSentryAuthorization loop is triggered at a configured interval [here|https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/authorization/sentry/SentryProxy.java#L174]. Since Sentry has not yet updated its database with the owner information, this loop removes the privilege from the Catalog. Any subsequent SQL statement that requires privileges will fail until Sentry is synced and the refresh loop adds the privilege back to the catalog cache.
[jira] [Commented] (IMPALA-7288) Codegen crash in FinalizeModule()
[ https://issues.apache.org/jira/browse/IMPALA-7288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839842#comment-16839842 ] ASF subversion and git services commented on IMPALA-7288: - Commit aea18dd08f34caf5c659c4b71f7bc4d70d743739 in impala's branch refs/heads/2.x from Bikramjeet Vig [ https://gitbox.apache.org/repos/asf?p=impala.git;h=aea18dd ] IMPALA-7288: Fix Codegen Crash in FinalizeModule() (Addendum) In addition to previous fix for IMPALA-7288, this patch would prevent impala from crashing in case a code-path generates a malformed handcrafted function which it then tries to finalize. Ideally this would never happen since the code paths for generating handcrafted IRs would never generate a malformed function. Change-Id: Id09c6f59f677ba30145fb2081715f1a7d89fe20b Reviewed-on: http://gerrit.cloudera.org:8080/10944 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > Codegen crash in FinalizeModule() > - > > Key: IMPALA-7288 > URL: https://issues.apache.org/jira/browse/IMPALA-7288 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.12.0, Impala 3.1.0 >Reporter: Balazs Jeszenszky >Assignee: Bikramjeet Vig >Priority: Blocker > Fix For: Impala 2.13.0, Impala 3.1.0 > > > The following sequence crashes Impala 2.12 reliably: > {code} > CREATE TABLE test (c1 CHAR(6),c2 CHAR(6)); > select 1 from test t1, test t2 > where t1.c1 = FROM_TIMESTAMP(cast(t2.c2 as string), 'MMdd'); > {code} > hs_err_pid has: > {code} > # > # A fatal error has been detected by the Java Runtime Environment: > # > # SIGSEGV (0xb) at pc=0x03b36ce4, pid=28459, tid=0x7f2c49685700 > # > # JRE version: Java(TM) SE Runtime Environment (8.0_162-b12) (build > 1.8.0_162-b12) > # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.162-b12 mixed mode > linux-amd64 compressed oops) > # Problematic frame: > # C [impalad+0x3736ce4] llvm::Value::getContext() const+0x4 > {code} > Backtrace is: > {code} > #0 0x7f2cb217a5f7 in raise () from 
/lib64/libc.so.6 > #1 0x7f2cb217bce8 in abort () from /lib64/libc.so.6 > #2 0x7f2cb4de2f35 in os::abort(bool) () from > /usr/java/latest/jre/lib/amd64/server/libjvm.so > #3 0x7f2cb4f86f33 in VMError::report_and_die() () from > /usr/java/latest/jre/lib/amd64/server/libjvm.so > #4 0x7f2cb4de922f in JVM_handle_linux_signal () from > /usr/java/latest/jre/lib/amd64/server/libjvm.so > #5 0x7f2cb4ddf253 in signalHandler(int, siginfo*, void*) () from > /usr/java/latest/jre/lib/amd64/server/libjvm.so > #6 > #7 0x03b36ce4 in llvm::Value::getContext() const () > #8 0x03b36cff in llvm::Value::getValueName() const () > #9 0x03b36de9 in llvm::Value::getName() const () > #10 0x01ba6bb2 in impala::LlvmCodeGen::FinalizeModule (this=0x9b53980) > at > /usr/src/debug/impala-2.12.0-cdh5.15.0/be/src/codegen/llvm-codegen.cc:1076 > #11 0x018f5c0f in impala::FragmentInstanceState::Open (this=0xac0b400) > at > /usr/src/debug/impala-2.12.0-cdh5.15.0/be/src/runtime/fragment-instance-state.cc:255 > #12 0x018f3699 in impala::FragmentInstanceState::Exec (this=0xac0b400) > at > /usr/src/debug/impala-2.12.0-cdh5.15.0/be/src/runtime/fragment-instance-state.cc:80 > #13 0x019028c3 in impala::QueryState::ExecFInstance (this=0x9c6ad00, > fis=0xac0b400) > at > /usr/src/debug/impala-2.12.0-cdh5.15.0/be/src/runtime/query-state.cc:410 > #14 0x0190113c in impala::QueryStateoperator()(void) > const (__closure=0x7f2c49684be8) > at > /usr/src/debug/impala-2.12.0-cdh5.15.0/be/src/runtime/query-state.cc:350 > #15 0x019034dd in > boost::detail::function::void_function_obj_invoker0, > void>::invoke(boost::detail::function::function_buffer &) > (function_obj_ptr=...) > at > /usr/src/debug/impala-2.12.0-cdh5.15.0/toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:153 > {code} > Crash is at > https://github.com/cloudera/Impala/blob/cdh5-2.12.0_5.15.0/be/src/codegen/llvm-codegen.cc#L1070-L1079. > The repro steps seem to be quite specific. 
[jira] [Commented] (IMPALA-6086) Use of permanent function should require SELECT privilege on DB
[ https://issues.apache.org/jira/browse/IMPALA-6086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839838#comment-16839838 ] ASF subversion and git services commented on IMPALA-6086: - Commit 2e720ace8b285ae6a3b6b5ebc63dcfd04a763ca1 in impala's branch refs/heads/2.x from Zoram Thanga [ https://gitbox.apache.org/repos/asf?p=impala.git;h=2e720ac ] IMPALA-6086: Use of permanent function should require SELECT privilege on DB To use a permanent UDF should require at least SELECT privilege on the database. Functions that have constant arguments get constant-folded into string literals, losing their privilege requests in the process. This patch saves the privilege requests found during the first phase of query analysis, where all the objects and the privileges required to access them are identified. The requests are added back to the new analyzer created for re-analysis post expression rewrite. Testing: New FE test cases have been added to AuthorizationStmtTest. Manual tests were also done to identify the bug, as well as to test the fix. Ran exhaustive and covering tests. Change-Id: Iee70f15e4c04f7daaed9cac2400ec626e1fb0e57 Reviewed-on: http://gerrit.cloudera.org:8080/10850 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > Use of permanent function should require SELECT privilege on DB > --- > > Key: IMPALA-6086 > URL: https://issues.apache.org/jira/browse/IMPALA-6086 > Project: IMPALA > Issue Type: Bug > Components: Catalog, Security >Affects Versions: Impala 2.9.0, Impala 3.1.0 >Reporter: Zoram Thanga >Assignee: Zoram Thanga >Priority: Minor > Labels: security > Fix For: Impala 3.1.0 > > > A user that has no privilege on a database should not be able to execute any > permanent functions in that database. This is currently possible, and should > be fixed, so that the user must have SELECT privilege to execute permanent > functions. 
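The IMPALA-6086 commit message above describes saving the privilege requests found during the first analysis pass and re-adding them to the analyzer used after expression rewrites. A minimal Python sketch of that idea (class and method names are illustrative, not Impala's frontend API):

```python
class Analyzer:
    """Toy analyzer that only tracks registered privilege requests."""

    def __init__(self):
        self.privilege_requests = []

    def register_privilege_request(self, req):
        self.privilege_requests.append(req)

# First pass: analyzing a call like udf_db.fn('x') registers SELECT on udf_db.
first_pass = Analyzer()
first_pass.register_privilege_request(("SELECT", "udf_db"))

# Rewrite phase: the constant-argument UDF call is folded into a string
# literal, erasing the expression that carried the privilege request.

# Re-analysis: replay the saved requests into the fresh analyzer so the
# SELECT-on-database requirement survives the rewrite.
second_pass = Analyzer()
for req in first_pass.privilege_requests:
    second_pass.register_privilege_request(req)

print(second_pass.privilege_requests)  # -> [('SELECT', 'udf_db')]
```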
[jira] [Commented] (IMPALA-8072) Clean up config files in docker containers
[ https://issues.apache.org/jira/browse/IMPALA-8072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839844#comment-16839844 ] ASF subversion and git services commented on IMPALA-8072: - Commit d12675af59f2ac74db4fae09d41b720bfd72fe4b in impala's branch refs/heads/master from Tim Armstrong [ https://gitbox.apache.org/repos/asf?p=impala.git;h=d12675a ] IMPALA-8072: addendum: don't require fe rebuild for config Previously config changes wouldn't be picked up by containers until maven copied the files from fe/src/test/resources to fe/target/test-classes. This makes it more convenient - after running ./bin/create-test-configuration.sh new configs are picked up by any newly-run containers. Change-Id: I18f9f90667b1d16cf97d3e3f9fac400980d5b733 Reviewed-on: http://gerrit.cloudera.org:8080/13288 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > Clean up config files in docker containers > -- > > Key: IMPALA-8072 > URL: https://issues.apache.org/jira/browse/IMPALA-8072 > Project: IMPALA > Issue Type: Sub-task > Components: Infrastructure >Affects Versions: Impala 3.2.0 >Reporter: Tim Armstrong >Assignee: Tim Armstrong >Priority: Major > Labels: docker > Fix For: Impala 3.3.0 > > > Currently the docker containers include a bunch of config files copied > indiscriminately from the dev environment. Mostly these aren't valid for a > production container and it's expected that the real config files will be > mounted at /opt/impala/conf. > We should instead include a more reasonable set of default configs (e.g. for > admission control), plus placeholders for other config files that may need to > be overridden with site-specific configs. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-7201) Support DDL in LocalCatalog using existing catalogd
[ https://issues.apache.org/jira/browse/IMPALA-7201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839837#comment-16839837 ] ASF subversion and git services commented on IMPALA-7201: - Commit 3afde5d99e7bc434358b813457d83cac4a6f086c in impala's branch refs/heads/2.x from Todd Lipcon [ https://gitbox.apache.org/repos/asf?p=impala.git;h=3afde5d ] IMPALA-7201. Support DDL with LocalCatalog enabled This fixes a couple issues with DDL commands when LocalCatalog is enabled: - updateCatalogCache() gets called after any DDL. Instead of throwing an exception, we can just no-op this by returning some fake result. - In order to support 'drop database' we need to properly implement the various function-related calls such that they don't throw exceptions. This changes them to be stubbed out as having no functions. - Fixes for 'alter view' and 'drop view' so that the underlying target table gets loaded by the catalogd before attempting the operation. Without this, in the LocalCatalog case, the catalogd would only have an IncompleteTable and these operations would fail with "unexpected table type" errors. With this patch I was able to run 'run-tests.py -k views' and 3/4 passed. The one that failed depends on HBase tables, not yet implemented. Change-Id: Ic39c97a5f5ad145e03b96d1a470dc2dfa6ec71a5 Reviewed-on: http://gerrit.cloudera.org:8080/10806 Reviewed-by: Todd Lipcon Tested-by: Todd Lipcon > Support DDL in LocalCatalog using existing catalogd > --- > > Key: IMPALA-7201 > URL: https://issues.apache.org/jira/browse/IMPALA-7201 > Project: IMPALA > Issue Type: Sub-task >Reporter: Todd Lipcon >Assignee: Todd Lipcon >Priority: Major > Fix For: Impala 3.1.0 > > > Need some changes to ensure that create table, create view, drop view, etc, > can work. The initial implementation will still RPC out to catalogd, which > will perform the mutations. 
At some point we may want to move this work to > the impalad itself, but for now keeping the code with as little change as > possible is preferred. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-7228) Add tpcds-unmodified to single-node-perf-run
[ https://issues.apache.org/jira/browse/IMPALA-7228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839839#comment-16839839 ] ASF subversion and git services commented on IMPALA-7228: - Commit 43e54501cece5e4d4a2f8c483465dd81d8b6a115 in impala's branch refs/heads/2.x from njanarthanan [ https://gitbox.apache.org/repos/asf?p=impala.git;h=43e5450 ] IMPALA-7228: Add tpcds-unmodified to single-node-perf-run Description: tpcds-unmodified workload was added as a part of IMPALA-6819. This change allows tpcds-unmodified workload to be available for the single node perf run. Testing: Ran single node perf run using the following parameters and the test run was successful --iterations 2 --scale 2 --table_formats "parquet/none" \ --num_impalads 1 --workload "tpcds-unmodified" \ --load --query_names "TPCDS-Q17.*" --start_minicluster Change-Id: I511661c586cd55e3240ccbea9c499b9c3fc98440 Reviewed-on: http://gerrit.cloudera.org:8080/10931 Reviewed-by: Impala Public Jenkins Reviewed-by: Jim Apple Tested-by: Impala Public Jenkins > Add tpcds-unmodified to single-node-perf-run > > > Key: IMPALA-7228 > URL: https://issues.apache.org/jira/browse/IMPALA-7228 > Project: IMPALA > Issue Type: Task > Components: Perf Investigation >Affects Versions: Impala 3.1.0 >Reporter: Jim Apple >Assignee: nithya >Priority: Minor > Labels: newbie > > IMPALA-6819 added the tpcds-unmodified workload. 
This doesn't work with > single-node-perf-run yet: > {noformat} > Traceback (most recent call last): > File "./bin/single_node_perf_run.py", line 334, in > main() > File "./bin/single_node_perf_run.py", line 324, in main > perf_ab_test(options, args) > File "./bin/single_node_perf_run.py", line 231, in perf_ab_test > datasets = set([WORKLOAD_TO_DATASET[workload] for workload in workloads]) > KeyError: 'tpcds-unmodified' > {noformat} > cc: [~njanarthanan] -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
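The {{KeyError}} in the traceback above comes from the new workload missing from the script's workload-to-dataset map. A minimal Python sketch of the kind of fix involved (the map contents and helper are illustrative, not the actual {{single_node_perf_run.py}} code): register the workload, and fail with a clear message for any workload that is still unknown.

```python
# Hypothetical slice of the workload registry; the real script maps many more.
WORKLOAD_TO_DATASET = {
    "tpch": "tpch",
    "tpcds": "tpcds",
    "tpcds-unmodified": "tpcds-unmodified",  # newly registered workload
}

def datasets_for(workloads):
    """Resolve workloads to datasets, raising a readable error instead of
    a bare KeyError when a workload has not been registered."""
    missing = [w for w in workloads if w not in WORKLOAD_TO_DATASET]
    if missing:
        raise ValueError("Unknown workload(s): %s" % ", ".join(missing))
    return {WORKLOAD_TO_DATASET[w] for w in workloads}

print(sorted(datasets_for(["tpcds-unmodified"])))  # -> ['tpcds-unmodified']
```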
[jira] [Commented] (IMPALA-7140) Build out support for HDFS tables and views in LocalCatalog
[ https://issues.apache.org/jira/browse/IMPALA-7140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839836#comment-16839836 ] ASF subversion and git services commented on IMPALA-7140: - Commit cb4755421b3437808037feca5c29d95f446aab93 in impala's branch refs/heads/2.x from Todd Lipcon [ https://gitbox.apache.org/repos/asf?p=impala.git;h=cb47554 ] IMPALA-7140 (part 8): support views in LocalCatalog This adds basic support for loading views in LocalCatalog. Tested with a small unit test and also verified from the shell that I can select from a view. Change-Id: Ib3516b9ceff6dce12ded68d93afde09728627e08 Reviewed-on: http://gerrit.cloudera.org:8080/10805 Tested-by: Impala Public Jenkins Reviewed-by: Todd Lipcon > Build out support for HDFS tables and views in LocalCatalog > --- > > Key: IMPALA-7140 > URL: https://issues.apache.org/jira/browse/IMPALA-7140 > Project: IMPALA > Issue Type: Sub-task > Components: Catalog, Frontend >Reporter: Todd Lipcon >Assignee: Todd Lipcon >Priority: Major > Fix For: Impala 3.1.0 > > > This subtask tracks the work to build out basic read-only support for HDFS > tables and views in the LocalCatalog implementation: > - loading table schemas > - loading partitions > - loading file information from HDFS > This work will be broken up into a number of patches to keep each piece > reviewable. Once this subtask is complete we should be able to plan most > simple read-only queries. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-6819) Add new performance test workloads
[ https://issues.apache.org/jira/browse/IMPALA-6819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839840#comment-16839840 ] ASF subversion and git services commented on IMPALA-6819: - Commit 43e54501cece5e4d4a2f8c483465dd81d8b6a115 in impala's branch refs/heads/2.x from njanarthanan [ https://gitbox.apache.org/repos/asf?p=impala.git;h=43e5450 ] IMPALA-7228: Add tpcds-unmodified to single-node-perf-run Description: tpcds-unmodified workload was added as a part of IMPALA-6819. This change allows tpcds-unmodified workload to be available for the single node perf run. Testing: Ran single node perf run using the following parameters and the test run was successful --iterations 2 --scale 2 --table_formats "parquet/none" \ --num_impalads 1 --workload "tpcds-unmodified" \ --load --query_names "TPCDS-Q17.*" --start_minicluster Change-Id: I511661c586cd55e3240ccbea9c499b9c3fc98440 Reviewed-on: http://gerrit.cloudera.org:8080/10931 Reviewed-by: Impala Public Jenkins Reviewed-by: Jim Apple Tested-by: Impala Public Jenkins > Add new performance test workloads > --- > > Key: IMPALA-6819 > URL: https://issues.apache.org/jira/browse/IMPALA-6819 > Project: IMPALA > Issue Type: Task > Components: Infrastructure >Reporter: nithya >Assignee: nithya >Priority: Major > > Add additional workloads to impala-asf rep > Workloads that will be added > {code:java} > [targeted-perf] > [tpcds-unmodified] > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-6810) query_test::test_runtime_filters.py::test_row_filters fails when run against an external cluster
[ https://issues.apache.org/jira/browse/IMPALA-6810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839841#comment-16839841 ] ASF subversion and git services commented on IMPALA-6810: - Commit d0a6239be30b64f1193394236de0564bebe9f696 in impala's branch refs/heads/2.x from Michael Brown [ https://gitbox.apache.org/repos/asf?p=impala.git;h=d0a6239 ] IMPALA-6810: runtime_row_filters.test: omit pool name in pattern Some downstream tests run this with a fair-scheduler.xml set that, while not changing admission control behavior, does change the name of the pool. Omit the pool name to permit that downstream test to succeed. Testing: - local with change in minicluster - downstream in environment as well Change-Id: I3fe6beb169dc6bfefabde9dc7a4632c1a5e63fa7 Reviewed-on: http://gerrit.cloudera.org:8080/10942 Reviewed-by: Michael Brown Tested-by: Impala Public Jenkins > query_test::test_runtime_filters.py::test_row_filters fails when run against > an external cluster > > > Key: IMPALA-6810 > URL: https://issues.apache.org/jira/browse/IMPALA-6810 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 2.12.0 >Reporter: David Knupp >Assignee: Michael Brown >Priority: Critical > Labels: admission-control, resource-management > Fix For: Impala 3.1.0 > > > Presumably this test has been passing when run against the local > mini-cluster. When run against an external cluster, however, the test fails > with an AssertionError because the exception string is different than > expected. > The expected string is: > _ImpalaBeeswaxException: INNER EXCEPTION: 'beeswaxd.ttypes.BeeswaxException'> MESSAGE: Rejected query from pool > {color:red}*default-pool*{color}: minimum memory reservation is greater than > memory available to the query for buffer reservations. Increase the > buffer_pool_limit to 290.00 MB. 
See the query profile for more information > about the per-node memory requirements._ > The actual string is: > _ImpalaBeeswaxException: INNER EXCEPTION: 'beeswaxd.ttypes.BeeswaxException'> MESSAGE: Rejected query from pool > {color:red}*root.jenkins*{color}: minimum memory reservation is greater than > memory available to the query for buffer reservations. Increase the > buffer_pool_limit to 290.00 MB. See the query profile for more information > about the per-node memory requirements._ > {noformat} > Stacktrace > query_test/test_runtime_filters.py:168: in test_row_filters > test_file_vars={'$RUNTIME_FILTER_WAIT_TIME_MS' : str(WAIT_TIME_MS)}) > common/impala_test_suite.py:401: in run_test_case > self.__verify_exceptions(test_section['CATCH'], str(e), use_db) > common/impala_test_suite.py:279: in __verify_exceptions > (expected_str, actual_str) > E AssertionError: Unexpected exception string. Expected: > ImpalaBeeswaxException: INNER EXCEPTION: 'beeswaxd.ttypes.BeeswaxException'> MESSAGE: Rejected query from pool > default-pool: minimum memory reservation is greater than memory available to > the query for buffer reservations. Increase the buffer_pool_limit to 290.00 > MB. See the query profile for more information about the per-node memory > requirements. > E Not found in actual: ImpalaBeeswaxException: INNER EXCEPTION: 'beeswaxd.ttypes.BeeswaxException'> MESSAGE: Rejected query from pool > root.jenkins: minimum memory reservation is greater than memory available to > the query for buffer reservations. Increase the buffer_pool_limit to 290.00 > MB. See the query profile for more information about the per-node memory > requirements. > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
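The fix for this test ("omit pool name in pattern") boils down to matching the rejection message without pinning down the pool name. A minimal sketch of that idea, using Python's stdlib `re` (the exact pattern used in runtime_row_filters.test is not reproduced here):

```python
import re

# Match the admission-control rejection regardless of which pool rejected the
# query, so both "default-pool" (minicluster) and "root.jenkins" (external
# cluster with a custom fair-scheduler.xml) satisfy the expected-error check.
PATTERN = re.compile(
    r"Rejected query from pool \S+: minimum memory reservation is greater "
    r"than memory available to the query for buffer reservations"
)

minicluster_msg = ("Rejected query from pool default-pool: minimum memory "
                   "reservation is greater than memory available to the query "
                   "for buffer reservations. Increase the buffer_pool_limit "
                   "to 290.00 MB.")
external_msg = minicluster_msg.replace("default-pool", "root.jenkins")
assert PATTERN.search(minicluster_msg) and PATTERN.search(external_msg)
```

The design choice is to keep the assertion on the admission-control behavior itself while treating the pool name, which is environment configuration, as a wildcard.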
[jira] [Updated] (IMPALA-7734) Catalog memz page shows useless memory breakdown
[ https://issues.apache.org/jira/browse/IMPALA-7734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikramjeet Vig updated IMPALA-7734: --- Description: If you look at catalogd memz, e.g. at localhost:25020/memz, it has a bogus breakdown. The catalogd does not use the MemTracker infrastructure since the vast majority of its memory consumption is in the JVM. {noformat} Breakdown : Total=0 Peak=0 Untracked Memory: Total=0 {noformat} Reported by [~alanj_impala_5a78] Update: It is the same for statestored as well and apart from "breakdown" section, the "Memory consumption / limit" part is also redundant. Should remove both parts from the memz pages of both daemons was: If you look at catalogd memz, e.g. at localhost:25020/memz, it has a bogus breakdown. The catalogd does not use the MemTracker infrastructure since the vast majority of its memory consumption is in the JVM. {noformat} Breakdown : Total=0 Peak=0 Untracked Memory: Total=0 {noformat} Reported by [~alanj_impala_5a78] > Catalog memz page shows useless memory breakdown > > > Key: IMPALA-7734 > URL: https://issues.apache.org/jira/browse/IMPALA-7734 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 3.0, Impala 2.12.0, Impala 3.1.0 >Reporter: Tim Armstrong >Priority: Minor > Labels: newbie, observability, ramp-up, supportability > > If you look at catalogd memz, e.g. at localhost:25020/memz, it has a bogus > breakdown. The catalogd does not use the MemTracker infrastructure since the > vast majority of its memory consumption is in the JVM. > {noformat} > Breakdown > : Total=0 Peak=0 > Untracked Memory: Total=0 > {noformat} > Reported by [~alanj_impala_5a78] > Update: It is the same for statestored as well and apart from "breakdown" > section, the "Memory consumption / limit" part is also redundant.
Should > remove both parts from the memz pages of both daemons > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-7734) Catalog and Statestore memz page shows useless memory breakdown
[ https://issues.apache.org/jira/browse/IMPALA-7734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikramjeet Vig updated IMPALA-7734: --- Summary: Catalog and Statestore memz page shows useless memory breakdown (was: Catalog memz page shows useless memory breakdown) > Catalog and Statestore memz page shows useless memory breakdown > --- > > Key: IMPALA-7734 > URL: https://issues.apache.org/jira/browse/IMPALA-7734 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 3.0, Impala 2.12.0, Impala 3.1.0 >Reporter: Tim Armstrong >Priority: Minor > Labels: newbie, observability, ramp-up, supportability > > If you look at catalogd memz, e.g. at localhost:25020/memz, it has a bogus > breakdown. The catalogd does not use the MemTracker infrastructure since the > vast majority of its memory consumption is in the JVM. > {noformat} > Breakdown > : Total=0 Peak=0 > Untracked Memory: Total=0 > {noformat} > Reported by [~alanj_impala_5a78] > Update: It is the same for statestored as well and apart from "breakdown" > section, the "Memory consumption / limit" part is also redundant. Should > remove both parts from the memz pages of both daemons > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-8400) Implement Ranger audit event handler
[ https://issues.apache.org/jira/browse/IMPALA-8400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fredy Wijaya updated IMPALA-8400: - Summary: Implement Ranger audit event handler (was: Ranger audit log should be done atomically) > Implement Ranger audit event handler > > > Key: IMPALA-8400 > URL: https://issues.apache.org/jira/browse/IMPALA-8400 > Project: IMPALA > Issue Type: Sub-task > Components: Catalog, Frontend >Reporter: Fredy Wijaya >Assignee: Fredy Wijaya >Priority: Critical > > The current implementation logs the audit log per request. We should consider > doing the audit log atomically. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-8371) Unified backend tests need to return appropriate return code
[ https://issues.apache.org/jira/browse/IMPALA-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe McDonnell resolved IMPALA-8371. --- Resolution: Fixed Fix Version/s: Impala 3.3.0 > Unified backend tests need to return appropriate return code > > > Key: IMPALA-8371 > URL: https://issues.apache.org/jira/browse/IMPALA-8371 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.3.0 >Reporter: Joe McDonnell >Assignee: Joe McDonnell >Priority: Blocker > Labels: broken-build > Fix For: Impala 3.3.0 > > > The scripts generated by bin/gen-backend-test-script.sh need to return the > return code from the call to the unified backend executable. The JUnitXML > contains a failure, which Jenkins and other tools can process, but the return > code must match up for scripts to be able to loop the test, etc. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
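The behavior this fix requires, a wrapper that propagates the child's return code instead of swallowing it, can be sketched as follows. The generated wrappers are shell scripts; this Python sketch shows the same contract, and the binary name is hypothetical:

```python
import subprocess
import sys

def run_unified_test(binary, args=()):
    """Run a test executable and return its exit code unchanged — the contract
    the generated wrapper scripts need. JUnitXML already records failures for
    Jenkins, but callers that loop the test only see the exit code."""
    result = subprocess.run([binary, *args])
    return result.returncode

# Demonstrate propagation with a child process that exits non-zero.
rc = run_unified_test(sys.executable, ["-c", "import sys; sys.exit(7)"])
assert rc == 7
```

In the actual shell wrappers, the equivalent is capturing `$?` right after invoking the unified backend executable and ending the script with that value.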
[jira] [Commented] (IMPALA-2658) Extend the NDV function to accept a precision
[ https://issues.apache.org/jira/browse/IMPALA-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839721#comment-16839721 ] Peter Ebert commented on IMPALA-2658: - I took a 2nd look and I don't think it makes sense to go lower than a precision of 6 (2^6=64), that's only 64 bytes of memory for the registers. I'm not confident that going above 16 was tested much in the research, but I think it's reasonable to allow users to try to go higher (I don't recall much precision improvement above 16, but it may vary depending on dataset and hash quality). > Extend the NDV function to accept a precision > - > > Key: IMPALA-2658 > URL: https://issues.apache.org/jira/browse/IMPALA-2658 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Affects Versions: Impala 2.2.4 >Reporter: Peter Ebert >Priority: Minor > Labels: ramp-up > Attachments: Comparison of HLL Memory usage, Query Duration and > Accuracy.jpg > > > Hyperloglog algorithm used by NDV defaults to a precision of 10. Being able > to set this precision would have two benefits: > # Lower precision sizes can speed up the performance, as a precision of 9 has > 1/2 the number of registers as 10 (exponential) and may be just as accurate > depending on expected cardinality. > # Higher precision can help with very large cardinalities (100 million to > billion range) and will typically provide more accurate data. Those who are > presenting estimates to end users will likely be willing to trade some > performance cost for more accuracy, while still outperforming the naive > approach by a large margin. > Propose adding the overloaded function NDV(expression, int precision) > with accepted range between 4 and 18 inclusive. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
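The sizing argument in the comment and the issue description can be sketched with the standard HyperLogLog formulas (this is generic HLL arithmetic, not Impala's exact implementation): precision p gives m = 2^p one-byte registers, and the relative standard error is roughly 1.04/sqrt(m).

```python
import math

# Standard HLL sizing arithmetic (assumed one byte per register, as the
# comment's "64 bytes" figure implies; Impala's internals may differ).
def hll_registers(p):
    return 2 ** p

def hll_std_error(p):
    return 1.04 / math.sqrt(hll_registers(p))

assert hll_registers(6) == 64            # proposed floor: ~64 bytes of registers
assert hll_registers(10) == 1024         # the default precision of 10
# Dropping from precision 10 to 9 halves the register count, as the issue notes.
assert hll_registers(9) * 2 == hll_registers(10)
# Higher precision buys accuracy: error shrinks from ~3.25% at p=10.
assert hll_std_error(16) < hll_std_error(10)
```

At p=10 the error is about 1.04/32 ≈ 3.25%; at p=16 it is about 0.4%, which is why higher precision helps in the 100M+ cardinality range at the cost of 2^16 registers.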
[jira] [Commented] (IMPALA-7734) Catalog memz page shows useless memory breakdown
[ https://issues.apache.org/jira/browse/IMPALA-7734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839697#comment-16839697 ] Bikramjeet Vig commented on IMPALA-7734: [~ngangam], will do. Thanks > Catalog memz page shows useless memory breakdown > > > Key: IMPALA-7734 > URL: https://issues.apache.org/jira/browse/IMPALA-7734 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 3.0, Impala 2.12.0, Impala 3.1.0 >Reporter: Tim Armstrong >Priority: Minor > Labels: newbie, observability, ramp-up, supportability > > If you look at catalogd memz, e.g. at localhost:25020/memz, it has a bogus > breakdown. The catalogd does not use the MemTracker infrastructure since the > vast majority of its memory consumption is in the JVM. > {noformat} > Breakdown > : Total=0 Peak=0 > Untracked Memory: Total=0 > {noformat} > Reported by [~alanj_impala_5a78] -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-7734) Catalog memz page shows useless memory breakdown
[ https://issues.apache.org/jira/browse/IMPALA-7734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839686#comment-16839686 ] Naveen Gangam commented on IMPALA-7734: --- [~bikramjeet.vig] Sorry, this was meant to be a rampup jira for me in Impala. But I have been re-assigned back to hive. If there are any takers, could you please re-assign it? Thanks > Catalog memz page shows useless memory breakdown > > > Key: IMPALA-7734 > URL: https://issues.apache.org/jira/browse/IMPALA-7734 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 3.0, Impala 2.12.0, Impala 3.1.0 >Reporter: Tim Armstrong >Priority: Minor > Labels: newbie, observability, ramp-up, supportability > > If you look at catalogd memz, e.g. at localhost:25020/memz, it has a bogus > breakdown. The catalogd does not use the MemTracker infrastructure since the > vast majority of its memory consumption is in the JVM. > {noformat} > Breakdown > : Total=0 Peak=0 > Untracked Memory: Total=0 > {noformat} > Reported by [~alanj_impala_5a78] -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-2658) Extend the NDV function to accept a precision
[ https://issues.apache.org/jira/browse/IMPALA-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839687#comment-16839687 ] Bikramjeet Vig commented on IMPALA-2658: Thanks [~PeterEbert] > Extend the NDV function to accept a precision > - > > Key: IMPALA-2658 > URL: https://issues.apache.org/jira/browse/IMPALA-2658 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Affects Versions: Impala 2.2.4 >Reporter: Peter Ebert >Priority: Minor > Labels: ramp-up > Attachments: Comparison of HLL Memory usage, Query Duration and > Accuracy.jpg > > > Hyperloglog algorithm used by NDV defaults to a precision of 10. Being able > to set this precision would have two benefits: > # Lower precision sizes can speed up the performance, as a precision of 9 has > 1/2 the number of registers as 10 (exponential) and may be just as accurate > depending on expected cardinality. > # Higher precision can help with very large cardinalities (100 million to > billion range) and will typically provide more accurate data. Those who are > presenting estimates to end users will likely be willing to trade some > performance cost for more accuracy, while still outperforming the naive > approach by a large margin. > Propose adding the overloaded function NDV(expression, int precision) > with accepted range between 4 and 18 inclusive. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-8528) Refactor authorization code from AnalysisContext to AuthorizationChecker
[ https://issues.apache.org/jira/browse/IMPALA-8528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839685#comment-16839685 ] ASF subversion and git services commented on IMPALA-8528: - Commit 5a23bacdba9f199948b6a971aebca30586c360a5 in impala's branch refs/heads/master from Fredy Wijaya [ https://gitbox.apache.org/repos/asf?p=impala.git;h=5a23bac ] IMPALA-8528: Refactor authorization check in AnalysisContext This patch moves the authorization check logic from AnalysisContext into BaseAuthorizationChecker to consolidate the logic into a single place. This patch also converts AuthorizationChecker into an interface. The existing implementation code of AuthorizationChecker is now moved to BaseAuthorizationChecker. This patch has no functionality change. Testing: - Ran FE tests - Ran E2E authorization tests Change-Id: I3bc3a11220dae0f49ef3e73d9ff27a90e9d4a71c Reviewed-on: http://gerrit.cloudera.org:8080/13285 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > Refactor authorization code from AnalysisContext to AuthorizationChecker > > > Key: IMPALA-8528 > URL: https://issues.apache.org/jira/browse/IMPALA-8528 > Project: IMPALA > Issue Type: Sub-task > Components: Frontend >Reporter: Fredy Wijaya >Assignee: Fredy Wijaya >Priority: Critical > Fix For: Impala 3.3.0 > > > Currently the authorization code is scattered in a few places, such as > AnalysisContext and AuthorizationChecker. This makes it difficult to add > things such as doing pre and post authorization check for audit logging, etc. > We need to consolidate the authorization code into a single place and perhaps > make AuthorizationChecker an interface and create a > BaseAuthorizationChecker that contains many useful authorization methods. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-7734) Catalog memz page shows useless memory breakdown
[ https://issues.apache.org/jira/browse/IMPALA-7734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839670#comment-16839670 ] Bikramjeet Vig commented on IMPALA-7734: [~ngangam] [~ychena] Is either of you working on this? > Catalog memz page shows useless memory breakdown > > > Key: IMPALA-7734 > URL: https://issues.apache.org/jira/browse/IMPALA-7734 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 3.0, Impala 2.12.0, Impala 3.1.0 >Reporter: Tim Armstrong >Priority: Minor > Labels: newbie, observability, ramp-up, supportability > > If you look at catalogd memz, e.g. at localhost:25020/memz, it has a bogus > breakdown. The catalogd does not use the MemTracker infrastructure since the > vast majority of its memory consumption is in the JVM. > {noformat} > Breakdown > : Total=0 Peak=0 > Untracked Memory: Total=0 > {noformat} > Reported by [~alanj_impala_5a78] -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-8549) Add support for scanning DEFLATE text files
[ https://issues.apache.org/jira/browse/IMPALA-8549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar updated IMPALA-8549: - Labels: ramp-up (was: ) > Add support for scanning DEFLATE text files > --- > > Key: IMPALA-8549 > URL: https://issues.apache.org/jira/browse/IMPALA-8549 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Reporter: Sahil Takiar >Priority: Minor > Labels: ramp-up > > Several Hadoop tools (e.g. Hive, MapReduce, etc.) support reading and writing > text files stored using zlib / deflate (results in files such as > {{00_0.deflate}}). Impala currently does not support reading {{.deflate}} > files and returns errors such as: {{ERROR: Scanner plugin 'DEFLATE' is not > one of the enabled plugins: 'LZO'}}. > Moreover, the default compression codec in Hadoop is zlib / deflate (see > {{o.a.h.io.compress.DefaultCodec}}). So when writing to a text table in Hive, > if users set {{hive.exec.compress.output}} to true, then {{.deflate}} files > will be written by default. > Impala does support zlib / deflate with other file formats though: Avro, > RCFiles, SequenceFiles (see > https://impala.apache.org/docs/build/html/topics/impala_file_formats.html). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-8549) Add support for scanning DEFLATE text files
[ https://issues.apache.org/jira/browse/IMPALA-8549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839657#comment-16839657 ] Sahil Takiar commented on IMPALA-8549: -- Thanks Tim. Looking through {{be/src/util/codec.cc}} and {{be/src/util/decompress.cc}} it seems we already have support for creating {{.deflate}} files; and we have test data for {{.deflate}} Avro and Sequence files already. For reference, as part of adding support for {{.deflate}} files, the following test changes need to be made: * {{TestCompressedFormats}} in {{test_compressed_formats.py}} needs to be updated to test text {{.deflate}} files; right now the test says: "# Deflate-compressed (['def']) text files (or at least text files with a compressed extension) have not been tested yet." ** I think getting this test to work requires adding the database {{functional_text_def}} (similar to {{functional_seq_def}}) to the dataload as well > Add support for scanning DEFLATE text files > --- > > Key: IMPALA-8549 > URL: https://issues.apache.org/jira/browse/IMPALA-8549 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Reporter: Sahil Takiar >Priority: Minor > > Several Hadoop tools (e.g. Hive, MapReduce, etc.) support reading and writing > text files stored using zlib / deflate (results in files such as > {{00_0.deflate}}). Impala currently does not support reading {{.deflate}} > files and returns errors such as: {{ERROR: Scanner plugin 'DEFLATE' is not > one of the enabled plugins: 'LZO'}}. > Moreover, the default compression codec in Hadoop is zlib / deflate (see > {{o.a.h.io.compress.DefaultCodec}}). So when writing to a text table in Hive, > if users set {{hive.exec.compress.output}} to true, then {{.deflate}} files > will be written by default. > Impala does support zlib / deflate with other file formats though: Avro, > RCFiles, SequenceFiles (see > https://impala.apache.org/docs/build/html/topics/impala_file_formats.html). 
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-8473) Refactor lineage publication mechanism to allow for different consumers
[ https://issues.apache.org/jira/browse/IMPALA-8473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839623#comment-16839623 ] Na Li commented on IMPALA-8473: --- [~radford-nguyen] please keep your current approach and use an abstract class for "ImpalaPostExecHook". > Refactor lineage publication mechanism to allow for different consumers > --- > > Key: IMPALA-8473 > URL: https://issues.apache.org/jira/browse/IMPALA-8473 > Project: IMPALA > Issue Type: Improvement > Components: Backend, Frontend >Reporter: radford nguyen >Assignee: radford nguyen >Priority: Critical > Attachments: ImpalaPostExecHook-infra.patch > > > Impetus for this change is to allow lineage to be consumed by Atlas via Kafka. > h3. Design Proposal > Move lineage logging from be to fe, where we can make use of the same plugin > approach as {{authorization_provider}} to allow a downstream user to provide > their own lineage consumers as runtime dependencies. > [~mad...@apache.org] has provided a fe patch (attached) with suggested > mechanism for allowing multiple hooks to be registered with the fe. Hooks > would be invoked from the be at appropriate places, e.g. > [https://github.com/apache/impala/blob/c1b0a073938c144e9bf33901bd4df6dcda0f09ec/be/src/service/impala-server.cc#L466]. > The hooks should all be executed asynchronously, so the current thinking is > that this execution should happen in the fe, since the be does not know about > what hooks are registered. IOW, the > {{ImpalaPostExecHookFactory.executeHooks}} method (see patch) should probably > make use of a thread-pool executor service (or something similar) in order to > execute all hooks in parallel and in a non-blocking manner, returning to the > be asap. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
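The thread-pool dispatch described in the design proposal can be sketched as follows. This is only an illustration of the idea in Python; the real {{executeHooks}} would live in the Java frontend, and the names here ({{execute_hooks}}, the pool size) are illustrative, not from the patch.

```python
from concurrent.futures import ThreadPoolExecutor

# Illustrative sketch: submit every registered hook to a pool and return
# immediately, so the caller (the be, in the proposal) never blocks on a
# slow lineage consumer. Pool size of 4 is an arbitrary assumption.
_hook_pool = ThreadPoolExecutor(max_workers=4)

def execute_hooks(hooks, lineage_event):
    """Run all hooks in parallel; return futures without waiting on them."""
    return [_hook_pool.submit(hook, lineage_event) for hook in hooks]

# Demo: two hooks that record the event they receive.
calls = []
futures = execute_hooks([calls.append, calls.append], {"query_id": "abc"})
for f in futures:
    f.result()  # only this demo waits; the dispatching caller would not
assert calls == [{"query_id": "abc"}, {"query_id": "abc"}]
```

The key design point is that failures and latency in one consumer (e.g. a Kafka publish for Atlas) are isolated to its future rather than stalling query teardown.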
[jira] [Updated] (IMPALA-8376) Add per-directory limits for scratch disk usage
[ https://issues.apache.org/jira/browse/IMPALA-8376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong updated IMPALA-8376: -- Description: I think we'd want to use a similar syntax to the cache sizes specified for the data cache. > Add per-directory limits for scratch disk usage > --- > > Key: IMPALA-8376 > URL: https://issues.apache.org/jira/browse/IMPALA-8376 > Project: IMPALA > Issue Type: Sub-task > Components: Backend >Reporter: Tim Armstrong >Priority: Major > Labels: resource-management > > I think we'd want to use a similar syntax to the cache sizes specified for > the data cache. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
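One way the "similar syntax to the cache sizes" could look is `<dir>:<capacity>` pairs per scratch directory. The flag format below is purely hypothetical (the issue does not settle it); this sketch only illustrates parsing such a spec:

```python
# Hypothetical per-directory scratch limit syntax, e.g.
# "/tmp/scratch1:500GB,/tmp/scratch2" — a directory without a suffix is
# unlimited. Binary units assumed.
UNITS = {"KB": 2**10, "MB": 2**20, "GB": 2**30, "TB": 2**40}

def parse_scratch_dirs(spec):
    """Parse a comma-separated dir spec into (path, byte_limit_or_None)."""
    out = []
    for entry in spec.split(","):
        if ":" in entry:
            path, cap = entry.rsplit(":", 1)
            number, unit = cap[:-2], cap[-2:].upper()
            out.append((path, int(number) * UNITS[unit]))
        else:
            out.append((entry, None))
    return out

parsed = parse_scratch_dirs("/tmp/scratch1:500GB,/tmp/scratch2")
assert parsed == [("/tmp/scratch1", 500 * 2**30), ("/tmp/scratch2", None)]
```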
[jira] [Resolved] (IMPALA-8414) Warning caused by not skipping header of /proc/net/dev
[ https://issues.apache.org/jira/browse/IMPALA-8414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Volker resolved IMPALA-8414. - Resolution: Fixed Fix Version/s: Impala 3.3.0 > Warning caused by not skipping header of /proc/net/dev > -- > > Key: IMPALA-8414 > URL: https://issues.apache.org/jira/browse/IMPALA-8414 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.3.0 >Reporter: Lars Volker >Assignee: Lars Volker >Priority: Major > Fix For: Impala 3.3.0 > > > [This fix|https://gerrit.cloudera.org/#/c/12954/] for IMPALA-8395 does not > skip the first two header lines of /proc/net/dev, causing warnings like this: > {noformat} > W0414 17:58:49.836887 32683 system-state-info.cc:192] Failed to parse > interface name in line: Inter-| Receive >| Transmit > W0414 17:59:49.940279 32683 system-state-info.cc:192] Failed to parse > interface name in line: Inter-| Receive >| Transmit > W0414 18:00:50.077952 32683 system-state-info.cc:192] Failed to parse > interface name in line: Inter-| Receive >| Transmit > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (IMPALA-8484) Add support to run queries on disjoint executor groups
[ https://issues.apache.org/jira/browse/IMPALA-8484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Volker reassigned IMPALA-8484: --- Assignee: Lars Volker > Add support to run queries on disjoint executor groups > -- > > Key: IMPALA-8484 > URL: https://issues.apache.org/jira/browse/IMPALA-8484 > Project: IMPALA > Issue Type: New Feature >Affects Versions: Impala 3.3.0 >Reporter: Lars Volker >Assignee: Lars Volker >Priority: Major > Labels: scalability > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-8549) Add support for scanning DEFLATE text files
[ https://issues.apache.org/jira/browse/IMPALA-8549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839608#comment-16839608 ] Tim Armstrong commented on IMPALA-8549: --- [~stakiar] the code that marks it as unsupported predates my involvement so I'm not sure why it wasn't supported (I stopped short of shaving that particular yak). I believe deflate is a variant of gzip. Looking at the Hadoop code, the implementations only differ in headers: https://github.com/apache/hadoop/tree/a55d6bba71c81c1c4e9d8cd11f55c78f10a548b0/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress. It would make sense to just implement it. > Add support for scanning DEFLATE text files > --- > > Key: IMPALA-8549 > URL: https://issues.apache.org/jira/browse/IMPALA-8549 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Reporter: Sahil Takiar >Priority: Minor > > Several Hadoop tools (e.g. Hive, MapReduce, etc.) support reading and writing > text files stored using zlib / deflate (results in files such as > {{00_0.deflate}}). Impala currently does not support reading {{.deflate}} > files and returns errors such as: {{ERROR: Scanner plugin 'DEFLATE' is not > one of the enabled plugins: 'LZO'}}. > Moreover, the default compression codec in Hadoop is zlib / deflate (see > {{o.a.h.io.compress.DefaultCodec}}). So when writing to a text table in Hive, > if users set {{hive.exec.compress.output}} to true, then {{.deflate}} files > will be written by default. > Impala does support zlib / deflate with other file formats though: Avro, > RCFiles, SequenceFiles (see > https://impala.apache.org/docs/build/html/topics/impala_file_formats.html). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
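The "differ only in headers" point is easy to demonstrate with Python's stdlib: both `.deflate` (zlib-wrapped, RFC 1950, as the issue says Hadoop's DefaultCodec produces) and `.gz` (gzip-wrapped, RFC 1952) carry the same DEFLATE payload in different wrappers. This is a generic illustration, not Impala scanner code:

```python
import gzip
import zlib

data = b"0,foo\n1,bar\n" * 100

deflate_bytes = zlib.compress(data)  # zlib wrapper (RFC 1950), as in .deflate files
gzip_bytes = gzip.compress(data)     # gzip wrapper (RFC 1952), as in .gz files

# Same DEFLATE algorithm underneath; only the header/trailer framing differs.
assert zlib.decompress(deflate_bytes) == data
assert gzip.decompress(gzip_bytes) == data
```

This is why a scanner that already inflates gzip text files mostly needs header handling, plus the `.deflate` extension mapping, to support these files.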
[jira] [Work stopped] (IMPALA-7613) Support round(DECIMAL) with non-constant second argument
[ https://issues.apache.org/jira/browse/IMPALA-7613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-7613 stopped by Abhishek Rawat. -- > Support round(DECIMAL) with non-constant second argument > > > Key: IMPALA-7613 > URL: https://issues.apache.org/jira/browse/IMPALA-7613 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 3.0, Impala 2.12.0 >Reporter: Tim Armstrong >Assignee: Abhishek Rawat >Priority: Major > Labels: decimal, ramp-up > > Sometimes users want to round to a precision that is data-driven (e.g. using > a lookup table). They can't currently do this with decimal. I think we could > support this by just using the input decimal type as the output type when the > second argument is non-constant. > {noformat} > select round(l_tax, l_linenumber) from tpch.lineitem limit 5; > Query: select round(l_tax, l_linenumber) from tpch.lineitem limit 5 > Query submitted at: 2018-09-24 11:03:10 (Coordinator: > http://tarmstrong-box:25000) > ERROR: AnalysisException: round() must be called with a constant second > argument. > {noformat} > Motivated by a user trying to do something like this; > http://community.cloudera.com/t5/Interactive-Short-cycle-SQL/Impala-round-function-does-not-return-expected-result/m-p/80200#M4906 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Work started] (IMPALA-8537) Negative values reported for tmp-file-mgr.scratch-space-bytes-used under heavy spilling load
[ https://issues.apache.org/jira/browse/IMPALA-8537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-8537 started by Abhishek Rawat. -- > Negative values reported for tmp-file-mgr.scratch-space-bytes-used under > heavy spilling load > > > Key: IMPALA-8537 > URL: https://issues.apache.org/jira/browse/IMPALA-8537 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.3.0 >Reporter: David Rorke >Assignee: Abhishek Rawat >Priority: Major > Attachments: bad_spill_metrics.json > > > I'm running a workload that does a lot of spilling and noticed the value > reported for tmp-file-mgr.scratch-space-bytes-used is negative on all nodes. > Some details of the workload and cluster configuration: > * Generating a 10 TB TPC-DS partitioned parquet data set (very large sort). > * 30 impalads, each with 48 GB RAM and 14 scratch directories (each on a > separate drive) > * Rough estimate (based on query metrics) of total cumulative aggregate > memory spilled across the cluster since restart is 6.5 TB. > Snapshot of the bad metrics attached. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-8527) Maven hangs on jenkins.impala.io talking to repository.apache.org
[ https://issues.apache.org/jira/browse/IMPALA-8527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong resolved IMPALA-8527. --- Resolution: Fixed > Maven hangs on jenkins.impala.io talking to repository.apache.org > - > > Key: IMPALA-8527 > URL: https://issues.apache.org/jira/browse/IMPALA-8527 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.3.0 >Reporter: Tim Armstrong >Assignee: Tim Armstrong >Priority: Blocker > Labels: broken-build > Fix For: Impala 3.3.0 > > > We're seeing most precommit builds failing because mvn gets stuck talking to > repository.apache.org. See IMPALA-8516. > I'm going to see if we can avoid it by pruning down our Maven repository > dependencies - we should be able to get all the artifacts from other mirrors. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Work started] (IMPALA-8450) Add support for zstd and lz4 in parquet
[ https://issues.apache.org/jira/browse/IMPALA-8450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-8450 started by Abhishek Rawat. -- > Add support for zstd and lz4 in parquet > --- > > Key: IMPALA-8450 > URL: https://issues.apache.org/jira/browse/IMPALA-8450 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Reporter: Tim Armstrong >Assignee: Abhishek Rawat >Priority: Major > Labels: parquet > > PARQUET-970 added these codecs to the format. We have LZ4 in the toolchain > already and I just added zstd: https://gerrit.cloudera.org/#/c/13079/ > These codecs probably offer a better trade-off of density and speed than > snappy or gzip. > https://github.com/apache/arrow/pull/807/files might be a useful crib sheet > for how to add a compressor. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Assigned] (IMPALA-8548) Include Documentation About Ordinal Substitution
[ https://issues.apache.org/jira/browse/IMPALA-8548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Rodoni reassigned IMPALA-8548: --- Assignee: Alex Rodoni > Include Documentation About Ordinal Substitution > - > > Key: IMPALA-8548 > URL: https://issues.apache.org/jira/browse/IMPALA-8548 > Project: IMPALA > Issue Type: Documentation > Components: Docs >Affects Versions: Impala 2.0, Impala 3.0 >Reporter: David Mollitor >Assignee: Alex Rodoni >Priority: Minor > > Update Impala docs to include information on the 'ordinal substitution' > feature. > > [https://github.com/apache/impala/blob/master/docs/shared/impala_common.xml#L1104] -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-8549) Add support for scanning DEFLATE text files
[ https://issues.apache.org/jira/browse/IMPALA-8549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839558#comment-16839558 ] Sahil Takiar commented on IMPALA-8549: -- CC: [~tarmstr...@cloudera.com] I think you might be more familiar with this area than I am, so just wondering if I am missing something here. > Add support for scanning DEFLATE text files > --- > > Key: IMPALA-8549 > URL: https://issues.apache.org/jira/browse/IMPALA-8549 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Reporter: Sahil Takiar >Priority: Minor > > Several Hadoop tools (e.g. Hive, MapReduce, etc.) support reading and writing > text files stored using zlib / deflate (results in files such as > {{00_0.deflate}}). Impala currently does not support reading {{.deflate}} > files and returns errors such as: {{ERROR: Scanner plugin 'DEFLATE' is not > one of the enabled plugins: 'LZO'}}. > Moreover, the default compression codec in Hadoop is zlib / deflate (see > {{o.a.h.io.compress.DefaultCodec}}). So when writing to a text table in Hive, > if users set {{hive.exec.compress.output}} to true, then {{.deflate}} files > will be written by default. > Impala does support zlib / deflate with other file formats though: Avro, > RCFiles, SequenceFiles (see > https://impala.apache.org/docs/build/html/topics/impala_file_formats.html). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-8549) Add support for scanning DEFLATE text files
[ https://issues.apache.org/jira/browse/IMPALA-8549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar updated IMPALA-8549: - Priority: Minor (was: Major) > Add support for scanning DEFLATE text files > --- > > Key: IMPALA-8549 > URL: https://issues.apache.org/jira/browse/IMPALA-8549 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Reporter: Sahil Takiar >Priority: Minor > > Several Hadoop tools (e.g. Hive, MapReduce, etc.) support reading and writing > text files stored using zlib / deflate (results in files such as > {{00_0.deflate}}). Impala currently does not support reading {{.deflate}} > files and returns errors such as: {{ERROR: Scanner plugin 'DEFLATE' is not > one of the enabled plugins: 'LZO'}}. > Moreover, the default compression codec in Hadoop is zlib / deflate (see > {{o.a.h.io.compress.DefaultCodec}}). So when writing to a text table in Hive, > if users set {{hive.exec.compress.output}} to true, then {{.deflate}} files > will be written by default. > Impala does support zlib / deflate with other file formats though: Avro, > RCFiles, SequenceFiles (see > https://impala.apache.org/docs/build/html/topics/impala_file_formats.html). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-8528) Refactor authorization code from AnalysisContext to AuthorizationChecker
[ https://issues.apache.org/jira/browse/IMPALA-8528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fredy Wijaya resolved IMPALA-8528. -- Resolution: Fixed Fix Version/s: Impala 3.3.0 > Refactor authorization code from AnalysisContext to AuthorizationChecker > > > Key: IMPALA-8528 > URL: https://issues.apache.org/jira/browse/IMPALA-8528 > Project: IMPALA > Issue Type: Sub-task > Components: Frontend >Reporter: Fredy Wijaya >Assignee: Fredy Wijaya >Priority: Critical > Fix For: Impala 3.3.0 > > > Currently the authorization code is scattered in few places, such as > AnalysisContext and AuthorizationChecker. This makes it difficult to add > things such as doing pre and post authorization check for audit logging, etc. > We need to consolidate the authorization code into a single place and perhaps > make AuthorizationChecker as an interface and create a > BaseAuthorizationChecker that contains many useful authorization methods. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-8549) Add support for scanning DEFLATE text files
Sahil Takiar created IMPALA-8549: Summary: Add support for scanning DEFLATE text files Key: IMPALA-8549 URL: https://issues.apache.org/jira/browse/IMPALA-8549 Project: IMPALA Issue Type: Improvement Components: Backend Reporter: Sahil Takiar Several Hadoop tools (e.g. Hive, MapReduce, etc.) support reading and writing text files stored using zlib / deflate (results in files such as {{00_0.deflate}}). Impala currently does not support reading {{.deflate}} files and returns errors such as: {{ERROR: Scanner plugin 'DEFLATE' is not one of the enabled plugins: 'LZO'}}. Moreover, the default compression codec in Hadoop is zlib / deflate (see {{o.a.h.io.compress.DefaultCodec}}). So when writing to a text table in Hive, if users set {{hive.exec.compress.output}} to true, then {{.deflate}} files will be written by default. Impala does support zlib / deflate with other file formats though: Avro, RCFiles, SequenceFiles (see https://impala.apache.org/docs/build/html/topics/impala_file_formats.html). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-8548) Include Documentation About Ordinal Substitution
David Mollitor created IMPALA-8548: -- Summary: Include Documentation About Ordinal Substitution Key: IMPALA-8548 URL: https://issues.apache.org/jira/browse/IMPALA-8548 Project: IMPALA Issue Type: Documentation Components: Docs Affects Versions: Impala 3.0, Impala 2.0 Reporter: David Mollitor Update Impala docs to include information on the 'ordinal substitution' feature. [https://github.com/apache/impala/blob/master/docs/shared/impala_common.xml#L1104] -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-7107) [DOCS] Review docs for storage formats impala cannot insert into
[ https://issues.apache.org/jira/browse/IMPALA-7107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839526#comment-16839526 ] Sahil Takiar commented on IMPALA-7107: -- [~arodoni_cloudera] I suggest we add it back in for now. You can't actually query {{.deflate}} text files in Impala. > [DOCS] Review docs for storage formats impala cannot insert into > > > Key: IMPALA-7107 > URL: https://issues.apache.org/jira/browse/IMPALA-7107 > Project: IMPALA > Issue Type: Bug > Components: Docs >Affects Versions: Impala 2.12.0 >Reporter: Balazs Jeszenszky >Assignee: Alex Rodoni >Priority: Minor > Fix For: Impala 3.2.0 > > > There are several points to clear up or improve across these pages: > * I'd refer to the Hive documentation on how to set compression codecs > instead of documenting Hive's behaviour for file formats Impala cannot write > * Add 'Ingesting file formats Impala can't write' section to 'How Impala > Works with Hadoop File Formats' page, link that central location from > wherever applicable. Unify the recommendation on data loading (usage of LOAD > DATA or hive or manual copy). > * add a compatibility matrix for compressions and file formats, clear up > compatibility on 'How Impala Works with Hadoop File Formats' (the page is > inconsistent even within itself, e.g. bzip2). > * Remove references to Impala versions <2.0 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-8548) Include Documentation About Ordinal Substitution
[ https://issues.apache.org/jira/browse/IMPALA-8548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Mollitor updated IMPALA-8548: --- Priority: Minor (was: Major) > Include Documentation About Ordinal Substitution > - > > Key: IMPALA-8548 > URL: https://issues.apache.org/jira/browse/IMPALA-8548 > Project: IMPALA > Issue Type: Documentation > Components: Docs >Affects Versions: Impala 2.0, Impala 3.0 >Reporter: David Mollitor >Priority: Minor > > Update Impala docs to include information on the 'ordinal substitution' > feature. > > [https://github.com/apache/impala/blob/master/docs/shared/impala_common.xml#L1104] -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-7107) [DOCS] Review docs for storage formats impala cannot insert into
[ https://issues.apache.org/jira/browse/IMPALA-7107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839489#comment-16839489 ] Balazs Jeszenszky commented on IMPALA-7107: --- That's right. Sorry, I misunderstood your original comment. Disregard. > [DOCS] Review docs for storage formats impala cannot insert into > > > Key: IMPALA-7107 > URL: https://issues.apache.org/jira/browse/IMPALA-7107 > Project: IMPALA > Issue Type: Bug > Components: Docs >Affects Versions: Impala 2.12.0 >Reporter: Balazs Jeszenszky >Assignee: Alex Rodoni >Priority: Minor > Fix For: Impala 3.2.0 > > > There are several points to clear up or improve across these pages: > * I'd refer to the Hive documentation on how to set compression codecs > instead of documenting Hive's behaviour for file formats Impala cannot write > * Add 'Ingesting file formats Impala can't write' section to 'How Impala > Works with Hadoop File Formats' page, link that central location from > wherever applicable. Unify the recommendation on data loading (usage of LOAD > DATA or hive or manual copy). > * add a compatibility matrix for compressions and file formats, clear up > compatibility on 'How Impala Works with Hadoop File Formats' (the page is > inconsistent even within itself, e.g. bzip2). > * Remove references to Impala versions <2.0 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-8490) Impala Doc: the file handle cache now supports S3
[ https://issues.apache.org/jira/browse/IMPALA-8490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839475#comment-16839475 ] Sahil Takiar commented on IMPALA-8490: -- [~arodoni_cloudera] IMPALA-8428 has been merged now, so this should be unblocked. > Impala Doc: the file handle cache now supports S3 > - > > Key: IMPALA-8490 > URL: https://issues.apache.org/jira/browse/IMPALA-8490 > Project: IMPALA > Issue Type: Sub-task > Components: Docs >Reporter: Sahil Takiar >Assignee: Alex Rodoni >Priority: Major > Labels: future_release_doc, in_33 > > https://impala.apache.org/docs/build/html/topics/impala_scalability.html > states: > {quote} > Because this feature only involves HDFS data files, it does not apply to > non-HDFS tables, such as Kudu or HBase tables, or tables that store their > data on cloud services such as S3 or ADLS. > {quote} > This section should be updated because the file handle cache now supports S3 > files. > We should add a section to the docs similar to what we added when support for > remote HDFS files was added to the file handle cache: > {quote} > In Impala 3.2 and higher, file handle caching also applies to remote HDFS > file handles. This is controlled by the cache_remote_file_handles flag for an > impalad. It is recommended that you use the default value of true as this > caching prevents your NameNode from overloading when your cluster has many > remote HDFS reads. > {quote} > Like {{cache_remote_file_handles}}, the flag {{cache_s3_file_handles}} has > been added as an impalad startup option (the flag is enabled by default). > Unlike HDFS, S3 has no NameNode; the benefit is that it eliminates a > call to {{getFileStatus()}} on the target S3 file. So "prevents your NameNode > from overloading when your cluster has many remote HDFS reads" should be > changed to something like "avoids an unnecessary call to > S3AFileSystem#getFileStatus() which reduces the number of API calls made to > S3."
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Resolved] (IMPALA-8428) Add support for caching file handles on s3
[ https://issues.apache.org/jira/browse/IMPALA-8428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar resolved IMPALA-8428. -- Resolution: Fixed Fix Version/s: Impala 3.3.0 > Add support for caching file handles on s3 > -- > > Key: IMPALA-8428 > URL: https://issues.apache.org/jira/browse/IMPALA-8428 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Affects Versions: Impala 3.3.0 >Reporter: Joe McDonnell >Assignee: Sahil Takiar >Priority: Critical > Fix For: Impala 3.3.0 > > > The file handle cache is currently disabled for S3, as the S3 connector > needed to implement proper unbuffer support. Now that > https://issues.apache.org/jira/browse/HADOOP-14747 is fixed, Impala should > provide an option to cache S3 file handles. > This is particularly important for data caching, as accessing the data cache > happens after obtaining a file handle. If getting a file handle is slow, the > caching will be less effective. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-2658) Extend the NDV function to accept a precision
[ https://issues.apache.org/jira/browse/IMPALA-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839440#comment-16839440 ] Peter Ebert commented on IMPALA-2658: - Based on the research papers [http://algo.inria.fr/flajolet/Publications/FlMa85.pdf] and [https://ai.google/research/pubs/pub40671]. It has been some time since I read them but I do not recall seeing precisions outside of that range in the research. Above 18 the register size becomes quite large, and below 4 the accuracy was very low. > Extend the NDV function to accept a precision > - > > Key: IMPALA-2658 > URL: https://issues.apache.org/jira/browse/IMPALA-2658 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Affects Versions: Impala 2.2.4 >Reporter: Peter Ebert >Priority: Minor > Labels: ramp-up > Attachments: Comparison of HLL Memory usage, Query Duration and > Accuracy.jpg > > > The HyperLogLog algorithm used by NDV defaults to a precision of 10. Being able > to set this precision would have two benefits: > # Lower precision sizes can speed up the performance, as a precision of 9 has > 1/2 the number of registers as 10 (exponential) and may be just as accurate > depending on expected cardinality. > # Higher precision can help with very large cardinalities (100 million to > billion range) and will typically provide more accurate data. Those who are > presenting estimates to end users will likely be willing to trade some > performance cost for more accuracy, while still outperforming the naive > approach by a large margin. > Propose adding the overloaded function NDV(expression, int precision) > with an accepted range between 4 and 18 inclusive. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
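The proposed 4-18 range lines up with the standard HyperLogLog analysis: precision p gives m = 2^p registers and a typical relative standard error of about 1.04/sqrt(m). A quick sketch of those numbers (the textbook formula from the Flajolet papers, not measurements of Impala's NDV):

```python
import math

# HyperLogLog: precision p -> m = 2**p registers, expected relative
# standard error ~ 1.04 / sqrt(m).
for p in (4, 9, 10, 18):
    m = 2 ** p                    # register count doubles with each +1 in p
    stderr = 1.04 / math.sqrt(m)  # expected relative standard error
    print(f"p={p:2d}  registers={m:6d}  stderr~{stderr:.2%}")
```

This makes both trade-offs in the proposal concrete: p=9 halves the register count of the default p=10 for a modest accuracy loss (~4.6% vs ~3.25% error), while p=18 pushes the error down to roughly 0.2% at the cost of 256x the registers.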
[jira] [Updated] (IMPALA-8547) get_json_object fails to get value for numeric key
[ https://issues.apache.org/jira/browse/IMPALA-8547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Zimichev updated IMPALA-8547: Labels: built-in-function (was: ) > get_json_object fails to get value for numeric key > -- > > Key: IMPALA-8547 > URL: https://issues.apache.org/jira/browse/IMPALA-8547 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.1.0 >Reporter: Eugene Zimichev >Priority: Minor > Labels: built-in-function > > {code:java} > select get_json_object('{"1": 5}', '$.1'); > {code} > returns error: > > {code:java} > "Expected key at position 2" > {code} > > I guess it's caused by using function FindEndOfIdentifier that expects first > symbol of key to be a letter. > Hive version of get_json_object works fine in this case. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-8547) get_json_object fails to get value for numeric key
[ https://issues.apache.org/jira/browse/IMPALA-8547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Zimichev updated IMPALA-8547: Component/s: Backend > get_json_object fails to get value for numeric key > -- > > Key: IMPALA-8547 > URL: https://issues.apache.org/jira/browse/IMPALA-8547 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.1.0 >Reporter: Eugene Zimichev >Priority: Minor > > {code:java} > select get_json_object('{"1": 5}', '$.1'); > {code} > returns error: > > {code:java} > "Expected key at position 2" > {code} > > I guess it's caused by using function FindEndOfIdentifier that expects first > symbol of key to be a letter. > Hive version of get_json_object works fine in this case. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-8547) get_json_object fails to get value for numeric key
Eugene Zimichev created IMPALA-8547: --- Summary: get_json_object fails to get value for numeric key Key: IMPALA-8547 URL: https://issues.apache.org/jira/browse/IMPALA-8547 Project: IMPALA Issue Type: Bug Affects Versions: Impala 3.1.0 Reporter: Eugene Zimichev {code:java} select get_json_object('{"1": 5}', '$.1'); {code} returns error: {code:java} "Expected key at position 2" {code} I guess it's caused by using function FindEndOfIdentifier that expects first symbol of key to be a letter. Hive version of get_json_object works fine in this case. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
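The failure mode described above, a key parser that assumes identifiers start with a letter, can be contrasted with a tolerant lookup. This toy Python re-implementation (hypothetical, covering only simple dotted paths; it is not Impala's C++ FindEndOfIdentifier) treats every path segment as a string key, so purely numeric keys like {{"1"}} work the way they do in Hive:

```python
import json

def get_json_object(doc: str, path: str):
    # Supports only simple '$.key1.key2' paths; every dotted segment is
    # looked up as a string key, so numeric keys such as '1' also work.
    if not path.startswith("$."):
        raise ValueError("only '$.'-prefixed paths are supported")
    obj = json.loads(doc)
    for key in path[2:].split("."):
        obj = obj[key]
    return obj

print(get_json_object('{"1": 5}', '$.1'))  # the case from the bug report
```

In JSON, object keys are always strings regardless of their characters, so restricting path segments to letter-initial identifiers rejects documents that are perfectly valid.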
[jira] [Created] (IMPALA-8546) Collect logs from docker containers in tests
Tim Armstrong created IMPALA-8546: - Summary: Collect logs from docker containers in tests Key: IMPALA-8546 URL: https://issues.apache.org/jira/browse/IMPALA-8546 Project: IMPALA Issue Type: Sub-task Components: Infrastructure Affects Versions: Impala 3.3.0 Reporter: Tim Armstrong Assignee: Tim Armstrong We should collect the logs from the cluster processes into the logs/ subdirectory for debugging purposes. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IMPALA-8515) Test impala-shell distribution instead of special dev environment version
[ https://issues.apache.org/jira/browse/IMPALA-8515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong resolved IMPALA-8515. --- Resolution: Fixed Fix Version/s: Impala 3.3.0 > Test impala-shell distribution instead of special dev environment version > - > > Key: IMPALA-8515 > URL: https://issues.apache.org/jira/browse/IMPALA-8515 > Project: IMPALA > Issue Type: Sub-task > Components: Infrastructure >Reporter: Tim Armstrong >Assignee: Tim Armstrong >Priority: Major > Fix For: Impala 3.3.0 > > > Impala shell tests use bin/impala-shell.sh, which uses impala-python and > various dev-environment specific infrastructure to run impala-shell. We also > build a shell tarball, which is meant to be a self-contained version of the > shell with all dependencies. > In principle it's better to test the build artifacts rather than the > development environment. Therefore for full builds, where we build the > tarball, we should test the contents of the tarball including the bundled > libraries. > For remote cluster tests, we can continue to use the dev environment (since > we don't necessarily build the shell tarball there). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (IMPALA-8515) Test impala-shell distribution instead of special dev environment version
[ https://issues.apache.org/jira/browse/IMPALA-8515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839115#comment-16839115 ] ASF subversion and git services commented on IMPALA-8515: - Commit b55d905322db017a11b5424da9c26c8d43aebb4c in impala's branch refs/heads/master from Tim Armstrong [ https://gitbox.apache.org/repos/asf?p=impala.git;h=b55d905 ] IMPALA-8515: port shell tests to use shell build shell/make_shell_tarball.sh builds a tarball with all the shell dependencies bundled. We should test the contents of that tarball in the shell tests instead of using infra/python/env and the libraries bundled there. This tarball is one of the default targets (e.g. run by buildall.sh) so this should not affect any typical development workflows. Note that this means the shell tests now requires the shell tarball to be built locally, which doesn't necessarily happen for remote cluster tests, so we preserve the old behaviour in that case. Testing: Ran core tests on CentOS 6 and CentOS 7. Change-Id: I581363639b279a9c2ff1fd982bdb140260b24baa Reviewed-on: http://gerrit.cloudera.org:8080/13267 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > Test impala-shell distribution instead of special dev environment version > - > > Key: IMPALA-8515 > URL: https://issues.apache.org/jira/browse/IMPALA-8515 > Project: IMPALA > Issue Type: Sub-task > Components: Infrastructure >Reporter: Tim Armstrong >Assignee: Tim Armstrong >Priority: Major > > Impala shell tests use bin/impala-shell.sh, which uses impala-python and > various dev-environment specific infrastructure to run impala-shell. We also > build a shell tarball, which is meant to be a self-contained version of the > shell with all dependencies. > In principle it's better to test the build artifacts rather than the > development environment. Therefore for full builds, where we build the > tarball, we should test the contents of the tarball including the bundled > libraries. 
> For remote cluster tests, we can continue to use the dev environment (since > we don't necessarily build the shell tarball there). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org