[jira] [Work started] (IMPALA-4356) Automatically codegen expressions with any root Expr node
[ https://issues.apache.org/jira/browse/IMPALA-4356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on IMPALA-4356 started by Tim Armstrong.

> Automatically codegen expressions with any root Expr node
>
> Key: IMPALA-4356
> URL: https://issues.apache.org/jira/browse/IMPALA-4356
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Affects Versions: Impala 2.8.0
> Reporter: Tim Armstrong
> Assignee: Tim Armstrong
> Priority: Major
> Labels: codegen
>
> Currently Impala only automatically codegens expression subtrees with
> ScalarFnCall at the root. This is the expression type used to implement many
> but not all expressions (including most builtin operators). One example of an
> expression that is not automatically codegened is "case" statements.
> The crux of this is to move ScalarFnCall::scalar_fn_wrapper_ into ScalarExpr
> (and probably rename it). There are some consequential changes required to
> make this work. Instead of each ScalarExpr subclass overriding Get*Val(), I
> think Get*Val() should be a non-virtual method of ScalarExpr that either
> calls the codegen'd function or calls into Get*ValInterpreted(), which would
> be the new virtual function.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
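The dispatch proposed in IMPALA-4356 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Impala's actual code: `IntVal` is a simplified stand-in for the real value type, a `std::function` stands in for the codegen'd function pointer, and `CompiledFn`, `SetCompiledFn`, `Literal`, and `IfExpr` are hypothetical names invented for the example.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Simplified stand-in for a nullable integer value (assumption, not Impala's IntVal).
struct IntVal {
  bool is_null;
  int val;
};

class ScalarExpr {
 public:
  // Stand-in for a pointer to a JIT-compiled function (assumption).
  using CompiledFn = std::function<IntVal()>;

  virtual ~ScalarExpr() = default;

  // Non-virtual entry point: use the compiled function if one was installed,
  // otherwise fall back to the interpreted (virtual) path.
  IntVal GetIntVal() const {
    if (compiled_fn_) return compiled_fn_();
    return GetIntValInterpreted();
  }

  // Hypothetical hook through which codegen would install its result.
  void SetCompiledFn(CompiledFn fn) { compiled_fn_ = std::move(fn); }

 protected:
  // The only function subclasses override.
  virtual IntVal GetIntValInterpreted() const = 0;

 private:
  CompiledFn compiled_fn_;
};

// Leaf expression that returns a constant.
class Literal : public ScalarExpr {
 public:
  explicit Literal(int v) : v_(v) {}

 protected:
  IntVal GetIntValInterpreted() const override { return IntVal{false, v_}; }

 private:
  int v_;
};

// A CASE-like expression: the root is not a function call, yet it goes
// through the same non-virtual entry point as every other expression.
class IfExpr : public ScalarExpr {
 public:
  IfExpr(std::unique_ptr<ScalarExpr> cond, std::unique_ptr<ScalarExpr> then_e,
         std::unique_ptr<ScalarExpr> else_e)
      : cond_(std::move(cond)), then_(std::move(then_e)), else_(std::move(else_e)) {
    assert(cond_ && then_ && else_);
  }

 protected:
  IntVal GetIntValInterpreted() const override {
    IntVal c = cond_->GetIntVal();
    bool take_then = !c.is_null && c.val != 0;
    return take_then ? then_->GetIntVal() : else_->GetIntVal();
  }

 private:
  std::unique_ptr<ScalarExpr> cond_, then_, else_;
};
```

Because children are evaluated through `GetIntVal()` rather than the virtual function, any subtree that has a compiled function installed is used automatically, whatever the type of the node above it.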
[jira] [Created] (IMPALA-8390) test_cancel_insert and test_cancel_sort broken
Thomas Tauber-Marshall created IMPALA-8390:

Summary: test_cancel_insert and test_cancel_sort broken
Key: IMPALA-8390
URL: https://issues.apache.org/jira/browse/IMPALA-8390
Project: IMPALA
Issue Type: Bug
Reporter: Thomas Tauber-Marshall
Assignee: Thomas Tauber-Marshall

The tests test_cancel_insert and test_cancel_sort in test_cancellation.py are both broken because they specify a test dimension, 'action', which was renamed as part of IMPALA-7205.

More generally, test_cancellation.py has a large number of test dimensions that blow up into a huge test matrix, and we should probably think through which combinations of tests actually give us the coverage we want.
[jira] [Resolved] (IMPALA-8377) Recent toolchain bump breaks Ubuntu 14.04 builds
[ https://issues.apache.org/jira/browse/IMPALA-8377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Tauber-Marshall resolved IMPALA-8377.

Resolution: Fixed
Fix Version/s: Impala 3.3.0

> Recent toolchain bump breaks Ubuntu 14.04 builds
>
> Key: IMPALA-8377
> URL: https://issues.apache.org/jira/browse/IMPALA-8377
> Project: IMPALA
> Issue Type: Bug
> Components: Infrastructure
> Affects Versions: Impala 3.3.0
> Reporter: Lars Volker
> Assignee: Thomas Tauber-Marshall
> Priority: Critical
> Labels: broken-build
> Fix For: Impala 3.3.0
>
> Commit 25559dd4 in this change broke the build on Ubuntu 14.04:
> https://gerrit.cloudera.org/#/c/12824/
> All daemons and any backend tests immediately segfault during startup with
> this stack:
> {noformat}
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0 0x in ?? ()
> (gdb) where
> #0 0x in ?? ()
> #1 0x7ff0abed9a80 in pthread_once () from /lib/x86_64-linux-gnu/libpthread.so.0
> #2 0x04a93375 in llvm::ManagedStaticBase::RegisterManagedStatic(void* (*)(), void (*)(void*)) const ()
> #3 0x04a7ac76 in llvm::ManagedStatic<(anonymous namespace)::CommandLineParser, llvm::object_creator<(anonymous namespace)::CommandLineParser>, llvm::object_deleter<(anonymous namespace)::CommandLineParser> >::operator*() [clone .constprop.407] ()
> #4 0x04a843a6 in llvm::cl::Option::addArgument() ()
> #5 0x01b26f27 in _GLOBAL__sub_I_SyntaxHighlighting.cpp ()
> #6 0x04dac9bd in __libc_csu_init ()
> #7 0x7ff0abb24ed5 in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
> #8 0x01b59c97 in _start ()
> {noformat}
> Setting {{IMPALA_KUDU_VERSION}} back to {{5211897}} in impala-config.sh makes
> the daemons start again, as does setting {{KUDU_IS_SUPPORTED=false}}.
> However, only the former fixes the be-tests.
> One outcome of this might be "Won't Fix" and we deprecate support for Ubuntu
> 14.04. If that seems favorable, we should briefly discuss it on dev@.
[jira] [Commented] (IMPALA-8377) Recent toolchain bump breaks Ubuntu 14.04 builds
[ https://issues.apache.org/jira/browse/IMPALA-8377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810363#comment-16810363 ]

ASF subversion and git services commented on IMPALA-8377:

Commit d31ac7b9cc06d2cffe13eb4b53b9070db00f02cb in impala's branch refs/heads/master from Thomas Tauber-Marshall
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=d31ac7b ]

IMPALA-8377: bump toolchain version to 107-acaeac961d

This fixes an issue with the previous toolchain version where the Kudu client was broken and caused all binaries to crash on startup due to an issue with linked libstdc++.

It also fixes an issue where fastbinary.so wasn't being properly included with Thrift.

Testing:
- Built successfully on redhat6/7, ubuntu16/18, sles12, debian8
- Built and ran a full core test run with both USE_CDH_KUDU=true/false

Change-Id: I4ac25aa230b9d2559cd4eb6166ab985b18ef7e2a
Reviewed-on: http://gerrit.cloudera.org:8080/12928
Reviewed-by: Thomas Marshall
Tested-by: Impala Public Jenkins

> Recent toolchain bump breaks Ubuntu 14.04 builds
[jira] [Work stopped] (IMPALA-5973) Provide query plan in JSON format
[ https://issues.apache.org/jira/browse/IMPALA-5973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-5973 stopped by Tim Armstrong. - > Provide query plan in JSON format > - > > Key: IMPALA-5973 > URL: https://issues.apache.org/jira/browse/IMPALA-5973 > Project: IMPALA > Issue Type: New Feature > Components: Frontend >Affects Versions: Impala 2.10.0 >Reporter: Alexander Behm >Priority: Major > Labels: planner > > Today there is only a text representation of the query plan, but it would be > useful to have a JSON version for portability and machine consumption. > To control whether EXPLAIN should produce a text or JSON output we could > augment the EXPLAIN syntax or we could introduce a query option. It's worth > discussing which one makes more sense. > To avoid maintaining two code paths for explain (TEXT and JSON), I recommend > that internally our code should always generate the JSON plan, and then have > a function that can convert the JSON plan to the conventional textual > representation. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Assigned] (IMPALA-5973) Provide query plan in JSON format
[ https://issues.apache.org/jira/browse/IMPALA-5973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong reassigned IMPALA-5973:

Assignee: (was: Pranay Singh)

> Provide query plan in JSON format
[jira] [Created] (IMPALA-8389) e2e custom cluster testsuite does not respect cluster_size when impala_log_dir present
radford nguyen created IMPALA-8389:

Summary: e2e custom cluster testsuite does not respect cluster_size when impala_log_dir present
Key: IMPALA-8389
URL: https://issues.apache.org/jira/browse/IMPALA-8389
Project: IMPALA
Issue Type: Bug
Components: Infrastructure
Affects Versions: Impala 3.2.0
Reporter: radford nguyen

h3. Brief
CustomClusterTestSuite always waits for 3 daemons on startup instead of {{cluster_size}} daemons when {{impala_log_dir}} is specified.

h3. Description
The {{@CustomClusterTestSuite.withArgs}} decorator allows a user to specify a custom cluster size for the test case being decorated. However, when this option is specified in conjunction with {{impala_log_dir}}, it will fail to wait for the correct number of daemons if any value other than {{DEFAULT_CLUSTER_SIZE}} is used.
The root cause is the difference in how the cluster is started with and without {{impala_log_dir}}:
[https://github.com/apache/impala/blob/3.2.0/tests/common/custom_cluster_test_suite.py#L147]

h3. To Reproduce:
* add {{cluster_size=5}} to decorator of test_grant_revoke in tests/authorization/test_ranger.py
* $ impala-py.test tests/authorization/test_ranger.py
* observe pass
* add {{impala_log_dir=whatev}} to decorator of test_grant_revoke
* $ impala-py.test tests/authorization/test_ranger.py
* observe fail during cluster startup:
** 2019-04-04 14:25:54,140 INFO MainThread: Waiting for num_known_live_backends=3. Current value: 5
[jira] [Work started] (IMPALA-8389) e2e custom cluster testsuite does not respect cluster_size when impala_log_dir present
[ https://issues.apache.org/jira/browse/IMPALA-8389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on IMPALA-8389 started by radford nguyen.

> e2e custom cluster testsuite does not respect cluster_size when impala_log_dir present
>
> Key: IMPALA-8389
> URL: https://issues.apache.org/jira/browse/IMPALA-8389
> Project: IMPALA
> Issue Type: Bug
> Components: Infrastructure
> Affects Versions: Impala 3.2.0
> Reporter: radford nguyen
> Assignee: radford nguyen
> Priority: Minor
> Original Estimate: 1h
> Remaining Estimate: 1h
[jira] [Assigned] (IMPALA-8389) e2e custom cluster testsuite does not respect cluster_size when impala_log_dir present
[ https://issues.apache.org/jira/browse/IMPALA-8389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

radford nguyen reassigned IMPALA-8389:

Assignee: radford nguyen

> e2e custom cluster testsuite does not respect cluster_size when impala_log_dir present
[jira] [Commented] (IMPALA-8322) S3 tests encounter "timed out waiting for receiver fragment instance"
[ https://issues.apache.org/jira/browse/IMPALA-8322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810312#comment-16810312 ] Joe McDonnell commented on IMPALA-8322: --- The cancellation test is running, with multiple cancels in progress. It is likely that ThreadResourceMgr::DestroyPool() is being called frequently. FYI: [~kwho] [~lv] [~twm378] > S3 tests encounter "timed out waiting for receiver fragment instance" > - > > Key: IMPALA-8322 > URL: https://issues.apache.org/jira/browse/IMPALA-8322 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.3.0 >Reporter: Joe McDonnell >Priority: Blocker > Labels: broken-build > Attachments: fb5b9729-2d7a-4590-ea365b87-d2ead75e.dmp_dumped, > run_tests_swimlane.json.gz > > > This has been seen multiple times when running s3 tests: > {noformat} > query_test/test_join_queries.py:57: in test_basic_joins > self.run_test_case('QueryTest/joins', new_vector) > common/impala_test_suite.py:472: in run_test_case > result = self.__execute_query(target_impalad_client, query, user=user) > common/impala_test_suite.py:699: in __execute_query > return impalad_client.execute(query, user=user) > common/impala_connection.py:174: in execute > return self.__beeswax_client.execute(sql_stmt, user=user) > beeswax/impala_beeswax.py:183: in execute > handle = self.__execute_query(query_string.strip(), user=user) > beeswax/impala_beeswax.py:360: in __execute_query > self.wait_for_finished(handle) > beeswax/impala_beeswax.py:381: in wait_for_finished > raise ImpalaBeeswaxException("Query aborted:" + error_log, None) > E ImpalaBeeswaxException: ImpalaBeeswaxException: > EQuery aborted:Sender 127.0.0.1 timed out waiting for receiver fragment > instance: 6c40d992bb87af2f:0ce96e5d0007, dest node: 4{noformat} > This is related to IMPALA-6818. 
On a bad run, there are various time outs in > the impalad logs: > {noformat} > I0316 10:47:16.359313 20175 krpc-data-stream-mgr.cc:354] Sender 127.0.0.1 > timed out waiting for receiver fragment instance: > ef4a5dc32a6565bd:a8720b850007, dest node: 5 > I0316 10:47:16.359345 20175 rpcz_store.cc:265] Call > impala.DataStreamService.TransmitData from 127.0.0.1:40030 (request call id > 14881) took 120182ms. Request Metrics: {} > I0316 10:47:16.359380 20175 krpc-data-stream-mgr.cc:354] Sender 127.0.0.1 > timed out waiting for receiver fragment instance: > d148d83e11a4603d:54dc35f70004, dest node: 3 > I0316 10:47:16.359395 20175 rpcz_store.cc:265] Call > impala.DataStreamService.TransmitData from 127.0.0.1:40030 (request call id > 14880) took 123097ms. Request Metrics: {} > ... various messages ... > I0316 10:47:56.364990 20154 kudu-util.h:108] Cancel() RPC failed: Timed out: > CancelQueryFInstances RPC to 127.0.0.1:27000 timed out after 10.000s (SENT) > ... various messages ... > W0316 10:48:15.056421 20150 rpcz_store.cc:251] Call > impala.ControlService.CancelQueryFInstances from 127.0.0.1:40912 (request > call id 202) took 48695ms (client timeout 1). > W0316 10:48:15.056473 20150 rpcz_store.cc:255] Trace: > 0316 10:47:26.361265 (+ 0us) impala-service-pool.cc:165] Inserting onto call > queue > 0316 10:47:26.361285 (+ 20us) impala-service-pool.cc:245] Handling call > 0316 10:48:15.056398 (+48695113us) inbound_call.cc:162] Queueing success > response > Metrics: {} > I0316 10:48:15.057087 20139 connection.cc:584] Got response to call id 202 > after client already timed out or cancelled{noformat} > So far, this has only happened on s3. The system load at the time is not > higher than normal. If anything it is lower than normal. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Closed] (IMPALA-8372) Impala Doc: Consistent uses of hyphens with global flags
[ https://issues.apache.org/jira/browse/IMPALA-8372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Rodoni closed IMPALA-8372. --- Resolution: Fixed Fix Version/s: Impala 3.3.0 > Impala Doc: Consistent uses of hyphens with global flags > > > Key: IMPALA-8372 > URL: https://issues.apache.org/jira/browse/IMPALA-8372 > Project: IMPALA > Issue Type: Bug > Components: Docs >Reporter: Alex Rodoni >Assignee: Alex Rodoni >Priority: Major > Fix For: Impala 3.3.0 > > > Standardize to use 2 non-breaking hyphens for global flags. > https://gerrit.cloudera.org/#/c/12908/ -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-6826) Add support for Ubuntu 18.04
[ https://issues.apache.org/jira/browse/IMPALA-6826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810241#comment-16810241 ] Laszlo Gaal commented on IMPALA-6826: - IMPALA-8380 tracks the need to upgrade the Postgres JDBC driver, which is required by HMS. This is needed to be able to run and test Impala on Ubuntu 18. > Add support for Ubuntu 18.04 > > > Key: IMPALA-6826 > URL: https://issues.apache.org/jira/browse/IMPALA-6826 > Project: IMPALA > Issue Type: Task > Components: Infrastructure >Affects Versions: Impala 3.0, Impala 2.12.0 > Environment: Ubuntu 18.04 >Reporter: Jim Apple >Assignee: Laszlo Gaal >Priority: Major > > We support Ubuntu 16.04 (and 14.04, in the 2.x line). > > I'm blocked on Ubuntu 18.04 support in > [https://github.com/cloudera/native-toolchain,] but the toolchain is not > technically a pre-requisite, though I believe it's the easiest way to get a > development environment up and running. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-3924) Impala support for Ubuntu 16.04
[ https://issues.apache.org/jira/browse/IMPALA-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810239#comment-16810239 ] Laszlo Gaal commented on IMPALA-3924: - IMPALA-8380 tracks the need to upgrade the Postgres JDBC driver for the minicluster, which is required to be able to run and test Impala on Ubuntu 18. > Impala support for Ubuntu 16.04 > --- > > Key: IMPALA-3924 > URL: https://issues.apache.org/jira/browse/IMPALA-3924 > Project: IMPALA > Issue Type: Improvement > Components: Infrastructure >Affects Versions: Impala 2.8.0 >Reporter: Thomas Tauber-Marshall >Assignee: Thomas Tauber-Marshall >Priority: Critical > Labels: build, toolchain > Fix For: Impala 2.7.0 > > > There are various compatibility issues related to compilation and the > toolchain that are preventing us from building and running Impala on Ubuntu16. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-7957) UNION ALL query returns incorrect results
[ https://issues.apache.org/jira/browse/IMPALA-7957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810221#comment-16810221 ]

Tim Armstrong commented on IMPALA-7957:

May be connected

> UNION ALL query returns incorrect results
>
> Key: IMPALA-7957
> URL: https://issues.apache.org/jira/browse/IMPALA-7957
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Affects Versions: Impala 2.12.0
> Reporter: Luis E Martinez-Poblete
> Assignee: Paul Rogers
> Priority: Blocker
> Labels: correctness
>
> Synopsis:
> UNION ALL query returns incorrect results
>
> Problem:
> Customer reported a UNION ALL query returning incorrect results. The UNION
> ALL query has 2 legs, but Impala is only returning information from one leg.
> Issue can be reproduced in the latest version of Impala. Below is the
> reproduction case:
> {noformat}
> create table mytest_t (c1 timestamp, c2 timestamp, c3 int, c4 int);
> insert into mytest_t values (now(), ADDDATE (now(),1), 1,1);
> insert into mytest_t values (now(), ADDDATE (now(),1), 2,2);
> insert into mytest_t values (now(), ADDDATE (now(),1), 3,3);
> SELECT t.c1
> FROM
> (SELECT c1, c2
> FROM mytest_t) t
> LEFT JOIN
> (SELECT c1, c2
> FROM mytest_t
> WHERE c2 = c1) t2 ON (t.c2 = t2.c2)
> UNION ALL
> VALUES (NULL)
> {noformat}
> The above query produces the following execution plan:
> {noformat}
> +------------------------------------------------------------------------------------+
> | Explain String                                                                     |
> +------------------------------------------------------------------------------------+
> | Max Per-Host Resource Reservation: Memory=34.02MB Threads=5                        |
> | Per-Host Resource Estimates: Memory=2.06GB                                         |
> | WARNING: The following tables are missing relevant table and/or column statistics. |
> | default.mytest_t                                                                   |
> |                                                                                    |
> | PLAN-ROOT SINK                                                                     |
> | |                                                                                  |
> | 06:EXCHANGE [UNPARTITIONED]                                                        |
> | |                                                                                  |
> | 00:UNION                                                                           |
> | |  constant-operands=1                                                             |
> | |                                                                                  |
> | 04:SELECT                                                                          |
> | |  predicates: default.mytest_t.c1 = default.mytest_t.c2                           |
> | |                                                                                  |
> | 03:HASH JOIN [LEFT OUTER JOIN, BROADCAST]                                          |
> | |  hash predicates: c2 = c2                                                        |
> | |                                                                                  |
> | |--05:EXCHANGE [BROADCAST]                                                         |
> | |  |                                                                               |
> | |  02:SCAN HDFS [default.mytest_t]                                                 |
> | |     partitions=1/1 files=3 size=192B                                             |
> | |     predicates: c2 = c1                                                          |
> | |                                                                                  |
> | 01:SCAN HDFS [default.mytest_t]                                                    |
> |    partitions=1/1 files=3 size=192B                                                |
> +------------------------------------------------------------------------------------+
> {noformat}
> The issue is in operator 4:
> {noformat}
> | 04:SELECT                                                |
> | |  predicates: default.mytest_t.c1 = default.mytest_t.c2 |
> {noformat}
> It's definitely a bug with predicate placement - that c1 = c2 predicate
> shouldn't be evaluated outside the right branch of the LEFT JOIN.
> Thanks,
> Luis Martinez.
[jira] [Commented] (IMPALA-8388) Misleading error message when rejecting incoming RPCs b/c of memory pressure
[ https://issues.apache.org/jira/browse/IMPALA-8388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810206#comment-16810206 ]

Michael Ho commented on IMPALA-8388:

[~lv], the message you added doesn't match the quoted code you posted. I filed IMPALA-8256 to fix that so that the actual queue size and memory consumption is now printed. Is your complaint that we could improve further on IMPALA-8256 ?

> Misleading error message when rejecting incoming RPCs b/c of memory pressure
>
> Key: IMPALA-8388
> URL: https://issues.apache.org/jira/browse/IMPALA-8388
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 2.12.0, Impala 3.1.0, Impala 3.2.0, Impala 3.3.0
> Reporter: Lars Volker
> Priority: Major
> Labels: observability, supportability
[jira] [Updated] (IMPALA-8388) Misleading error message when rejecting incoming RPCs b/c of memory pressure
[ https://issues.apache.org/jira/browse/IMPALA-8388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Volker updated IMPALA-8388:

Description:
When running out of memory we reject incoming RPCs (expected). However, our error message assumes that the queue is full and prints it as INT_MAX (the queue size):
{code}
void ImpalaServicePool::RejectTooBusy(kudu::rpc::InboundCall* c) {
  string err_msg =
      Substitute("$0 request on $1 from $2 dropped due to backpressure. "
                 "The service queue contains $3 items out of a maximum of $4; "
                 "memory consumption is $5.",
                 c->remote_method().method_name(),
                 service_->service_name(),
                 c->remote_address().ToString(),
                 service_queue_.estimated_queue_length(),
                 service_queue_.max_size(), // <-- HERE
                 PrettyPrinter::Print(service_mem_tracker_->consumption(), TUnit::BYTES));
{code}
The error currently looks like this:
{noformat}
I0404 11:35:43.276937 54321 impala-service-pool.cc:126] EndDataStream request on impala.DataStreamService from 1.2.3.4:56789 dropped due to backpressure. The service queue is full; it has 2147483647 items. Contents of service queue:
{noformat}

was:
When running out of memory we reject incoming RPCs (expected). However, our error message assumes that the queue is full and prints it as INT_MAX (the queue size):
{code}
void ImpalaServicePool::RejectTooBusy(kudu::rpc::InboundCall* c) {
  string err_msg =
      Substitute("$0 request on $1 from $2 dropped due to backpressure. "
                 "The service queue contains $3 items out of a maximum of $4; "
                 "memory consumption is $5.",
                 c->remote_method().method_name(),
                 service_->service_name(),
                 c->remote_address().ToString(),
                 service_queue_.estimated_queue_length(),
                 service_queue_.max_size(), // <-- HERE
                 PrettyPrinter::Print(service_mem_tracker_->consumption(), TUnit::BYTES));
{code}

> Misleading error message when rejecting incoming RPCs b/c of memory pressure
[jira] [Created] (IMPALA-8388) Misleading error message when rejecting incoming RPCs b/c of memory pressure
Lars Volker created IMPALA-8388:

Summary: Misleading error message when rejecting incoming RPCs b/c of memory pressure
Key: IMPALA-8388
URL: https://issues.apache.org/jira/browse/IMPALA-8388
Project: IMPALA
Issue Type: Bug
Components: Backend
Affects Versions: Impala 3.2.0, Impala 3.1.0, Impala 2.12.0, Impala 3.3.0
Reporter: Lars Volker

When running out of memory we reject incoming RPCs (expected). However, our error message assumes that the queue is full and prints it as INT_MAX (the queue size):
{code}
void ImpalaServicePool::RejectTooBusy(kudu::rpc::InboundCall* c) {
  string err_msg =
      Substitute("$0 request on $1 from $2 dropped due to backpressure. "
                 "The service queue contains $3 items out of a maximum of $4; "
                 "memory consumption is $5.",
                 c->remote_method().method_name(),
                 service_->service_name(),
                 c->remote_address().ToString(),
                 service_queue_.estimated_queue_length(),
                 service_queue_.max_size(), // <-- HERE
                 PrettyPrinter::Print(service_mem_tracker_->consumption(), TUnit::BYTES));
{code}
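One way to make such a rejection message less misleading is to state the reason for the rejection explicitly and to avoid printing an INT_MAX bound as if it were a real limit. The sketch below only illustrates that idea and is not Impala's implementation (a comment on this issue notes that IMPALA-8256 already changed the message). `RejectReason` and `BuildRejectMessage` are hypothetical names, and treating INT_MAX as "no fixed maximum" is an assumption based on the value 2147483647 in the sample log line quoted in this thread.

```cpp
#include <cstdint>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical reason codes; the real service pool may not model rejection this way.
enum class RejectReason { kQueueFull, kMemoryLimit };

// Builds a rejection message that names the actual cause of the rejection.
std::string BuildRejectMessage(const std::string& method, const std::string& service,
                               const std::string& remote, RejectReason reason,
                               int64_t queue_len, int64_t queue_max,
                               int64_t mem_bytes) {
  std::ostringstream ss;
  ss << method << " request on " << service << " from " << remote
     << " dropped due to backpressure. ";
  if (reason == RejectReason::kMemoryLimit) {
    ss << "The service is over its memory limit";
  } else {
    ss << "The service queue is full";
  }
  ss << "; queue length is " << queue_len;
  // Assumption: a bound of INT_MAX means the queue is effectively unbounded,
  // so printing the raw number would suggest a limit that does not exist.
  if (queue_max == std::numeric_limits<int32_t>::max()) {
    ss << " (no fixed maximum)";
  } else {
    ss << " out of a maximum of " << queue_max;
  }
  ss << ", memory consumption is " << mem_bytes << " bytes.";
  return ss.str();
}
```

Passing the reason in from the call site keeps the formatting function free of guesses about why the call was rejected, which is the confusion the issue describes.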
[jira] [Created] (IMPALA-8387) Impala Doc: Network I/O throughput to Query Profile output
Alex Rodoni created IMPALA-8387:
-----------------------------------

             Summary: Impala Doc: Network I/O throughput to Query Profile output
                 Key: IMPALA-8387
                 URL: https://issues.apache.org/jira/browse/IMPALA-8387
             Project: IMPALA
          Issue Type: Sub-task
          Components: Docs
    Affects Versions: Impala 3.3.0
            Reporter: Alex Rodoni
            Assignee: Alex Rodoni
[jira] [Commented] (IMPALA-7204) Add support for GROUP BY ROLLUP
    [ https://issues.apache.org/jira/browse/IMPALA-7204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810112#comment-16810112 ]

Ruslan Dautkhanov commented on IMPALA-7204:
-------------------------------------------

cc [~grahn] - can you please let us know if you guys are planning to add support for this feature? Our Account team said you might be the right person to ask )

Thanks for any ideas!

> Add support for GROUP BY ROLLUP
> -------------------------------
>
>                 Key: IMPALA-7204
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7204
>             Project: IMPALA
>          Issue Type: New Feature
>          Components: Backend
>    Affects Versions: Impala 3.0, Impala 2.12.0
>            Reporter: Ruslan Dautkhanov
>            Priority: Major
>              Labels: GROUP_BY, sql
>
> Now suppose that we'd like to analyze our sales data, to study the amount of
> sales that is occurring for different products, in different states and
> regions. Using the ROLLUP feature of SQL 2003, we could issue the query:
> {code:sql}
> select region, state, product, sum(sales) total_sales
> from sales_history
> group by rollup (region, state, product)
> {code}
> Semantically, the above query is equivalent to
>
> {code:sql}
> select region, state, product, sum(sales) total_sales
> from sales_history
> group by region, state, product
> union
> select region, state, null, sum(sales) total_sales
> from sales_history
> group by region, state
> union
> select region, null, null, sum(sales) total_sales
> from sales_history
> group by region
> union
> select null, null, null, sum(sales) total_sales
> from sales_history
> {code}
> The query might produce results that looked something like:
> {noformat}
> REGION  STATE  PRODUCT  TOTAL_SALES
> ------  -----  -------  -----------
> null    null   null            6200
> EAST    MA     BOATS            100
> EAST    MA     CARS            1500
> EAST    MA     null            1600
> EAST    NY     BOATS            150
> EAST    NY     CARS            1000
> EAST    NY     null            1150
> EAST    null   null            2750
> WEST    CA     BOATS            750
> WEST    CA     CARS             500
> WEST    CA     null            1250
> WEST    AZ     BOATS           2000
> WEST    AZ     CARS             200
> WEST    AZ     null            2200
> WEST    null   null            3450
> {noformat}
> We have a lot of production queries that work around this missing Impala
> functionality by having three UNION ALLs. Physical execution plan shows
> Impala actually reads full fact table three times. So it could be a three
> times improvement (or more, depending on number of columns that are being
> rolled up).
> I can't find another SQL on Hadoop engine that doesn't support this feature.
> *Checked Spark, Hive, PIG, Flink and some other engines - they all do
> support this basic SQL feature*.
> Would be great to have a matching feature in Impala too.
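To illustrate why native ROLLUP support could avoid the repeated scans described in IMPALA-7204, here is a minimal single-pass sketch. It is not how Impala plans or executes aggregations; SaleRow, Key, MakeKey() and RollupTotals() are hypothetical names, and an empty string stands in for SQL NULL, which a real implementation would have to keep distinct from an actual empty value.

```cpp
#include <array>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// One input row of the sales_history table from the example query.
struct SaleRow {
  std::string region, state, product;
  int64_t sales;
};

// Grouping key; an empty string stands in for NULL in a rolled-up column.
using Key = std::array<std::string, 3>;

Key MakeKey(const std::string& region, const std::string& state,
    const std::string& product) {
  Key k = {region, state, product};
  return k;
}

// Computes GROUP BY ROLLUP(region, state, product) in one pass over the rows.
// Each row contributes to four grouping sets: (region, state, product),
// (region, state), (region) and the grand total ().
std::map<Key, int64_t> RollupTotals(const std::vector<SaleRow>& rows) {
  std::map<Key, int64_t> totals;
  for (const SaleRow& r : rows) {
    totals[MakeKey(r.region, r.state, r.product)] += r.sales;
    totals[MakeKey(r.region, r.state, "")] += r.sales;
    totals[MakeKey(r.region, "", "")] += r.sales;
    totals[MakeKey("", "", "")] += r.sales;
  }
  return totals;
}
```

Fed the eight detail rows from the sample output in the issue, this produces the same 15 result rows while reading the input only once, whereas the UNION-based workaround reads the table once per branch.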
[jira] [Commented] (IMPALA-8359) Coverage measurement is not working for Impala daemons
[ https://issues.apache.org/jira/browse/IMPALA-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810091#comment-16810091 ] ASF subversion and git services commented on IMPALA-8359: - Commit a0a20cdf9adcb899e4bb04e3f9278077dc2b52c0 in impala's branch refs/heads/master from Zoltan Borok-Nagy [ https://gitbox.apache.org/repos/asf?p=impala.git;h=a0a20cd ] IMPALA-8359: Fix coverage data generation for impalads impala::InitCommonRuntime() sets a signal handler for SIGTERM. It calls _exit(0) which causes normal program termination without cleaning up, i.e. no destructors are called etc. Gcov writes the coverage data in this cleanup phase, so calling _exit() prevents flushing coverage data. Now the '-codecoverage' flag also defines a macro named CODE_COVERAGE_ENABLED. If this macro is defined we explicitly call __gcov_flush() before calling _exit(). I tested manually. Change-Id: I9be1e1e73b6cfc3557077f763aee4dbfcc7a2d27 Reviewed-on: http://gerrit.cloudera.org:8080/12858 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > Coverage measurement is not working for Impala daemons > -- > > Key: IMPALA-8359 > URL: https://issues.apache.org/jira/browse/IMPALA-8359 > Project: IMPALA > Issue Type: Bug >Reporter: Zoltán Borók-Nagy >Assignee: Zoltán Borók-Nagy >Priority: Major > > Currently code coverage measurement only works for backend tests. > Impala daemons don't write .gcda files when they terminate because they set a > signal handler for SIGTERM. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
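The approach described in the IMPALA-8359 commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: FlushCoverageIfEnabled(), HandleSigTerm() and InstallSigTermHandler() are hypothetical names, and only the CODE_COVERAGE_ENABLED macro, __gcov_flush() and _exit(0) come from the message. Note that GCC 11 and later replaced __gcov_flush() with __gcov_dump() and __gcov_reset(), and that flushing from a signal handler is not guaranteed to be async-signal-safe.

```cpp
#include <signal.h>
#include <unistd.h>

#ifdef CODE_COVERAGE_ENABLED
// Provided by libgcov in coverage builds with older GCC versions.
extern "C" void __gcov_flush();
#endif

// Flushes coverage counters if the build enabled them; returns whether it did.
bool FlushCoverageIfEnabled() {
#ifdef CODE_COVERAGE_ENABLED
  __gcov_flush();
  return true;
#else
  return false;
#endif
}

// SIGTERM handler: _exit() skips atexit handlers and destructors, so any
// coverage data has to be written out explicitly before calling it.
extern "C" void HandleSigTerm(int /*signal*/) {
  FlushCoverageIfEnabled();
  _exit(0);
}

// Installs the handler; returns true on success.
bool InstallSigTermHandler() {
  struct sigaction action = {};
  action.sa_handler = &HandleSigTerm;
  sigemptyset(&action.sa_mask);
  return sigaction(SIGTERM, &action, nullptr) == 0;
}
```

Keeping the flush behind a macro means regular builds need no libgcov symbol and behave exactly as before.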
[jira] [Commented] (IMPALA-7251) Fix QueryMaintenance calls in Aggregators
[ https://issues.apache.org/jira/browse/IMPALA-7251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810087#comment-16810087 ] ASF subversion and git services commented on IMPALA-7251: - Commit fdd6db524c9c97f0baebfde0119fce19d62eaec3 in impala's branch refs/heads/2.x from Thomas Tauber-Marshall [ https://gitbox.apache.org/repos/asf?p=impala.git;h=fdd6db5 ] IMPALA-7251: Fix QueryMaintenance calls in Aggregators A recent change, IMPALA-110 (part 2), refactored PartitionedAggregationNode into several classes, including a new type 'Aggregator'. During this refactor, code that makes local allocations while evaluating exprs was moved from the ExecNode (now AggregationNode/StreamingAggregationNode) into the Aggregators, but code related to cleaning these allocations up (ie QueryMaintenance()) was not, resulting in some queries using an excessive amount of memory. This patch removes all calls to QueryMaintenance() from the exec nodes and moves them into the Aggregators. Testing: - Added new test cases with a mem limit that fails if the expr allocations aren't released in a timely manner. - Passed a full exhaustive run. Change-Id: I4dac2bb0a15cdd7315ee15608bae409c125c82f5 Reviewed-on: http://gerrit.cloudera.org:8080/10871 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > Fix QueryMaintenance calls in Aggregators > - > > Key: IMPALA-7251 > URL: https://issues.apache.org/jira/browse/IMPALA-7251 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 3.1.0 >Reporter: Thomas Tauber-Marshall >Assignee: Thomas Tauber-Marshall >Priority: Blocker > Fix For: Impala 3.1.0 > > > A recent change, IMPALA-110 (part 2), refactored PartitionedAggregationNode > into several classes, including GroupingAggregator. 
During this refactor, > code that makes local allocations while evaluating exprs was moved from the > ExecNode (now AggregationNode/StreamingAggregationNode) into the new type > Aggregator, but code related to cleaning these allocations up (ie > QueryMaintenance()) was not, resulting in some queries using an excessive > amount of memory. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
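The ownership idea behind the IMPALA-7251 fix can be shown with a toy model: whichever component makes the per-batch expression allocations is also the one that releases them. LocalAllocations and ToyAggregator below are hypothetical stand-ins, not Impala's classes, and the QueryMaintenance() body is an assumption about what "cleaning up" amounts to.

```cpp
#include <cstdint>
#include <vector>

// Toy stand-in for memory allocated while evaluating expressions.
class LocalAllocations {
 public:
  void Allocate(int64_t bytes) {
    bytes_ += bytes;
    if (bytes_ > peak_) peak_ = bytes_;
  }
  void FreeAll() { bytes_ = 0; }
  int64_t bytes() const { return bytes_; }
  int64_t peak() const { return peak_; }

 private:
  int64_t bytes_ = 0;
  int64_t peak_ = 0;
};

// Toy aggregator: it makes the allocations while processing a batch, so it
// also releases them once the batch is done.
class ToyAggregator {
 public:
  void AddBatch(const std::vector<int64_t>& row_alloc_sizes) {
    for (int64_t size : row_alloc_sizes) allocations_.Allocate(size);
    QueryMaintenance();
  }
  const LocalAllocations& allocations() const { return allocations_; }

 private:
  // Releases the per-batch expression allocations.
  void QueryMaintenance() { allocations_.FreeAll(); }

  LocalAllocations allocations_;
};
```

If the cleanup call stayed in a separate node that no longer sees these allocations, the outstanding bytes would grow with every batch, which matches the excessive memory use the issue describes.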
[jira] [Commented] (IMPALA-7006) Rebase KRPC onto Kudu upstream repository
[ https://issues.apache.org/jira/browse/IMPALA-7006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810074#comment-16810074 ] ASF subversion and git services commented on IMPALA-7006: - Commit 0c2d3c74d662d32eaf5c56cdeca067285ab1d300 in impala's branch refs/heads/2.x from Lars Volker [ https://gitbox.apache.org/repos/asf?p=impala.git;h=0c2d3c7 ] IMPALA-7006: Remove KRPC folders Change-Id: Ic677484c27ed18b105da0a6b0901df4eb9f248e6 Reviewed-on: http://gerrit.cloudera.org:8080/10756 Reviewed-by: Lars Volker Tested-by: Lars Volker > Rebase KRPC onto Kudu upstream repository > - > > Key: IMPALA-7006 > URL: https://issues.apache.org/jira/browse/IMPALA-7006 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Affects Versions: Impala 2.13.0, Impala 3.1.0 >Reporter: Lars Volker >Assignee: Lars Volker >Priority: Major > Labels: krpc > Fix For: Impala 3.1.0 > > > We should consider rebasing our KRPC code on top of the latest Kudu upstream > version. This will keep the two projects more in sync and will allow us to > make use of recent improvements, e.g. around thread stack collection, without > having to pick individual changes. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-5129) Use Kudu's Kinit code to avoid expensive fork
    [ https://issues.apache.org/jira/browse/IMPALA-5129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810082#comment-16810082 ]

ASF subversion and git services commented on IMPALA-5129:
---------------------------------------------------------

Commit b97e0cd555a53057a82dc9c0ad9e0cfe58f3ec66 in impala's branch refs/heads/2.x from Sailesh Mukil
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=b97e0cd ]

IMPALA-5129: Use Kudu's Kinit code to avoid expensive fork

NOTE: This commit is part of a set of changes for IMPALA-7006. It contains pieces of a previous commit that need to be cherry picked again after rebasing the code in be/src/kudu/{util,security,rpc}.

The original commit message is below:

Impala currently kinits by forking off a child process. This has proved to be expensive in many cases since the subprocess tries to reserve as much memory as Impala is currently using which can be quite a lot.

This patch adds a flag called 'use_kudu_kinit' that defaults to true. When it's true, it uses the Kudu security library's kinit code that programmatically uses the krb5 library to kinit. When it's false, we run our current path which kicks off the kinit-thread and forks off a kinit process periodically to reacquire tickets based on FLAGS_kerberos_reinit_interval.

Converted existing tests in thrift-server-test to run with and without kerberos. We now run this BE test with kerberos by using Kudu's MiniKdc utility. This introduces a new dependency on some kerberos binaries that are checked through FindKerberosPrograms.cmake. Note that this is only a test dependency and not a dependency for the impalad binaries and friends. Compilation will still succeed if the kerberos binaries for the MiniKdc are not found, however, the thrift-server-test will fail. We run with and without the 'use_kudu_kinit' flag.

TODO: Since the setting up and tearing down of our security code isn't idempotent, we can run only one test in a process with Kerberos now (IMPALA-6085).

Updated bin/bootstrap_system.sh to install new sasl-gssapi modules and the kerberos binaries required for the MiniKdc. Also fixed a bug that didn't transfer the environment into 'sudo' in bin/bootstrap_system.sh.

Testing: Verified with thrift-server-test and also manually on a live kerberized cluster.

Change-Id: Ie3c6e933c454e7adca69ef03e7d5c0c84b656895
Reviewed-on: http://gerrit.cloudera.org:8080/7938
Reviewed-by: Sailesh Mukil
Tested-by: Impala Public Jenkins
Reviewed-on: http://gerrit.cloudera.org:8080/10763
Reviewed-by: Lars Volker
Tested-by: Lars Volker

> Use Kudu's Kinit code to avoid expensive fork
> ---------------------------------------------
>
>                 Key: IMPALA-5129
>                 URL: https://issues.apache.org/jira/browse/IMPALA-5129
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Security
>            Reporter: Sailesh Mukil
>            Assignee: Sailesh Mukil
>            Priority: Major
>              Labels: security
>             Fix For: Impala 2.11.0
>
>
> Impala does a kinit by doing a RunShell() command which basically forks the
> entire process (potentially expensive) and execs the 'kinit' command.
> KuduRPC avoids the fork by calling into libkrb programmatically. Since we
> eventually will be pulling in KuduRPC to Impala, we can get rid of the fork
> and call into the appropriate KuduRPC code.
[jira] [Commented] (IMPALA-4669) Add Kudu's RPC, util and security libraries
    [ https://issues.apache.org/jira/browse/IMPALA-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810076#comment-16810076 ]

ASF subversion and git services commented on IMPALA-4669:
---------------------------------------------------------

Commit 5dbd48f226f1061567da1c381ee2491dab3ceaf4 in impala's branch refs/heads/2.x from Lars Volker
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=5dbd48f ]

IMPALA-4669: [KUTIL] Add kudu_util library to the build.

NOTE: This commit is part of a set of changes for IMPALA-7006. It contains pieces of a previous commit that need to be cherry picked again after rebasing the code in be/src/kudu/{util,security,rpc}.

The original commit message is below:

A few miscellaneous changes to allow kudu_util to compile with Impala.

Add kudu_version.cc to substitute for the version.cc file that is automatically built during the full Kudu build.

Set LZ4_DISABLE_DEPRECATE_WARNINGS to allow Kudu's compressor utility to use deprecated names for LZ4 methods.

Add NO_NVM_SUPPORT flag to Kudu build (plan to upstream this later) to disable building with nvm support, removing a library dependency.

Also remove imported FindOpenSSL.cmake in favour of the standard one provided by cmake itself.

Finally, a few changes to allow compilation on RHEL5:

* Only use sched_getcpu() if supported
* Only include magic.h if available
* Workaround for kernels that don't have SOCK_NONBLOCK
* Workaround for kernels that don't have O_CLOEXEC (ignore the flag)
* Provide non-working implementation of fallocate()
* Disable inclusion of linux/fiemap.h - although this exists on RHEL5, it does not compile due to other #includes in env_posix.cc. We disable the path this is used for, since Impala does not call that code.
* Use Kudu's implementation of pipe(2), preadv(2) and pwritev(2) where it doesn't exist.

In most cases these changes simply force kutil to revert to a different implementation that was already written for OSX support - this patch generalises the logic to provide the implementation whenever the required function doesn't exist.

This patch compiles on RHEL5.5 and 6.0, SLES11 and 12, Ubuntu 12.04 and 14.04 and Debian 7.0 and 8.0.

Change-Id: I451f02d3e4669e8a548b92fb1445cb2b322659a2
Reviewed-on: http://gerrit.cloudera.org:8080/5715
Tested-by: Impala Public Jenkins
Reviewed-by: Henry Robinson
Reviewed-on: http://gerrit.cloudera.org:8080/10758
Reviewed-by: Michael Ho
Tested-by: Lars Volker

> Add Kudu's RPC, util and security libraries
> -------------------------------------------
>
>                 Key: IMPALA-4669
>                 URL: https://issues.apache.org/jira/browse/IMPALA-4669
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Distributed Exec
>    Affects Versions: Impala 2.8.0
>            Reporter: Henry Robinson
>            Assignee: Henry Robinson
>            Priority: Major
>             Fix For: Impala 2.10.0
>
>
> To enable KRPC in Impala, we need to link against Kudu's {{rpc}},
> {{security}} and {{util}} libraries. The easiest way for now is to pull them
> into trunk.
> Doing this also requires upgrading our {{gutil}} version.
[jira] [Commented] (IMPALA-7006) Rebase KRPC onto Kudu upstream repository
[ https://issues.apache.org/jira/browse/IMPALA-7006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810078#comment-16810078 ] ASF subversion and git services commented on IMPALA-7006: - Commit 23a3ef7452ade42a426502e0fd3719f3836d6730 in impala's branch refs/heads/2.x from Lars Volker [ https://gitbox.apache.org/repos/asf?p=impala.git;h=23a3ef7 ] IMPALA-7006: [KSECURITY] Update security library integration This commit is part of a set of changes for IMPALA-7006. It started based on an original change (f51c4435), which integrated Kudu's security folder into our build. This change removes several compile time checks that are now either done in Kudu's own cmake files or that can be removed due to Impala deprecating support for older OS versions in the 3.x line. The removed checks are: HAVE_KRB5_GET_INIT_CREDS_OPT_SET_OUT_CCACHE: We now check for this in Kudu's code. HAVE_KRB5_IS_CONFIG_PRINCIPAL, HAVE_KRB5_GET_INIT_CREDS_OPT_SET_FAST_CCACHE_NAME: These checks are not needed anymore. All OS versions supported by Impala now have sufficiently recent versions of Kerberos. Change-Id: Ifab51d887f5e771ad62eeddc14b9c47f42c3130d Reviewed-on: http://gerrit.cloudera.org:8080/10759 Reviewed-by: Lars Volker Tested-by: Lars Volker > Rebase KRPC onto Kudu upstream repository > - > > Key: IMPALA-7006 > URL: https://issues.apache.org/jira/browse/IMPALA-7006 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Affects Versions: Impala 2.13.0, Impala 3.1.0 >Reporter: Lars Volker >Assignee: Lars Volker >Priority: Major > Labels: krpc > Fix For: Impala 3.1.0 > > > We should consider rebasing our KRPC code on top of the latest Kudu upstream > version. This will keep the two projects more in sync and will allow us to > make use of recent improvements, e.g. around thread stack collection, without > having to pick individual changes. 
[jira] [Commented] (IMPALA-7006) Rebase KRPC onto Kudu upstream repository
[ https://issues.apache.org/jira/browse/IMPALA-7006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810085#comment-16810085 ] ASF subversion and git services commented on IMPALA-7006: - Commit 315bc66bbac8715302d455d2d746981cebf74aec in impala's branch refs/heads/2.x from Joe McDonnell [ https://gitbox.apache.org/repos/asf?p=impala.git;h=315bc66 ] KUDU-2305: Limit sidecars to INT_MAX and fortify socket code NOTE: This commit is part of a set of changes for IMPALA-7006. It contains pieces of a previous commit that need to be cherry picked again after rebasing the code in be/src/kudu/{util,security,rpc}. The original commit message is below: Inspection of the code revealed some other local variables that could overflow with large messages. This patch takes two approaches to eliminate the issues. First, it limits the total size of the messages by limiting the total size of the sidecars to INT_MAX. The total size of the protobuf and header components of the message should be considerably smaller, so limiting the sidecars to INT_MAX eliminates messages that are larger than UINT_MAX. This also means that the sidecar offsets, which are unsigned 32-bit integers, are also safe. Given that FLAGS_rpc_max_message_size is limited to INT_MAX at startup, the receiver would reject any message this large anyway. This also helps with the networking codepath, as any given sidecar will have a size less than INT_MAX, so every Slice that interacts with Writev() is shorter than INT_MAX. Second, even with sidecars limited to INT_MAX, the headers and protobuf parts of the messages mean that certain messages could still exceed INT_MAX. This patch changes some of the sockets codepath to tolerate iovec's that reference more than INT_MAX bytes total. Specifically, it changes Writev()'s nwritten bytes to an int64_t for both TlsSocket and Socket. TlsSocket works because it is sending each Slice individually. 
The first change limited any given Slice to INT_MAX, so each individual Write() should not be impacted. For Socket, Writev() uses sendmsg(). It should do partial network sends to handle this case. Any Write() call specifies its size with a 32-bit integer, and that will not be impacted by this patch. Testing: - Modified TestRpcSidecarLimits() to verify that sidecars are limited to INT_MAX bytes. - Added a test mode to TestRpcSidecarLimits() where it overrides rpc_max_message_size and sends the maximal message. This verifies that the client send codepath can handle the maximal message. Reviewed-on: http://gerrit.cloudera.org:8080/9601 Reviewed-by: Todd Lipcon Tested-by: Todd Lipcon Changes from Kudu version: - Updated declaration of FLAGS_rpc_max_message_size in rpc-mgr.cc and added a warning not to set it larger than INT_MAX. Change-Id: Id23e518995f2bf2f6bf6b49d5f413f3eaa4e79d1 Reviewed-on: http://gerrit.cloudera.org:8080/9748 Reviewed-by: Michael Ho Tested-by: Impala Public Jenkins Reviewed-on: http://gerrit.cloudera.org:8080/10765 Reviewed-by: Lars Volker Tested-by: Lars Volker > Rebase KRPC onto Kudu upstream repository > - > > Key: IMPALA-7006 > URL: https://issues.apache.org/jira/browse/IMPALA-7006 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Affects Versions: Impala 2.13.0, Impala 3.1.0 >Reporter: Lars Volker >Assignee: Lars Volker >Priority: Major > Labels: krpc > Fix For: Impala 3.1.0 > > > We should consider rebasing our KRPC code on top of the latest Kudu upstream > version. This will keep the two projects more in sync and will allow us to > make use of recent improvements, e.g. around thread stack collection, without > having to pick individual changes. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
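The first approach in the KUDU-2305 message, capping the total sidecar size at INT_MAX, can be sketched as below. SidecarsWithinLimit() is a hypothetical helper, not the function used in Kudu's RPC code; the point is only that the running total is kept in a 64-bit integer so the check itself cannot overflow.

```cpp
#include <climits>
#include <cstdint>
#include <vector>

// Returns true if the sidecars may be attached to one message: no sidecar and
// no running total may exceed INT_MAX bytes. Sizes are summed in 64-bit
// arithmetic, and the total is checked after every addition, so it never
// grows beyond 2 * INT_MAX.
bool SidecarsWithinLimit(const std::vector<int64_t>& sidecar_sizes) {
  int64_t total = 0;
  for (int64_t size : sidecar_sizes) {
    if (size < 0 || size > INT_MAX) return false;
    total += size;
    if (total > INT_MAX) return false;
  }
  return true;
}
```

Because every accepted sidecar offset is then at most INT_MAX, it also fits in the unsigned 32-bit offsets the message mentions.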
[jira] [Commented] (IMPALA-7006) Rebase KRPC onto Kudu upstream repository
[ https://issues.apache.org/jira/browse/IMPALA-7006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810086#comment-16810086 ] ASF subversion and git services commented on IMPALA-7006: - Commit b65dbf8e40d7c8f77db05846d84497824d6bbd26 in impala's branch refs/heads/2.x from Lars Volker [ https://gitbox.apache.org/repos/asf?p=impala.git;h=b65dbf8 ] IMPALA-7006: Pick parts of recent Kudu gutil changes - Include some ASAN macros from gutil (Kudu commit c8724c61) - Pick parts of KUDU-2427 (Kudu commit b7cf3b2e) - Rename constants (Kudu commit e719b5ef) These changes will be subsumed by a proper rebase of GUTIL. Change-Id: Id2dc8c70425e3ac030427ebeb1ec18a44d14d5cb Reviewed-on: http://gerrit.cloudera.org:8080/10769 Tested-by: Impala Public Jenkins Reviewed-by: Lars Volker > Rebase KRPC onto Kudu upstream repository > - > > Key: IMPALA-7006 > URL: https://issues.apache.org/jira/browse/IMPALA-7006 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Affects Versions: Impala 2.13.0, Impala 3.1.0 >Reporter: Lars Volker >Assignee: Lars Volker >Priority: Major > Labels: krpc > Fix For: Impala 3.1.0 > > > We should consider rebasing our KRPC code on top of the latest Kudu upstream > version. This will keep the two projects more in sync and will allow us to > make use of recent improvements, e.g. around thread stack collection, without > having to pick individual changes. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-4669) Add Kudu's RPC, util and security libraries
[ https://issues.apache.org/jira/browse/IMPALA-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810080#comment-16810080 ] ASF subversion and git services commented on IMPALA-4669: - Commit d10b34354c0d1616ed2faf78a6659e9be4aacd66 in impala's branch refs/heads/2.x from Lars Volker [ https://gitbox.apache.org/repos/asf?p=impala.git;h=d10b343 ] IMPALA-4669: [KRPC] Add kudu_rpc library to build NOTE: This commit is part of a set of changes for IMPALA-7006. It contains pieces of a previous commit that need to be cherry picked again after rebasing the code in be/src/kudu/{util,security,rpc}. The original commit message is below: Import FindKRPC.cmake from Apache Kudu. Add some files to protoc-gen-krpc link to allow it to find symbols now defined within Impala (without linking all of Impala's libraries). Change-Id: I5693288db90f2e9673b8c88ca4378c3790cba957 Reviewed-on: http://gerrit.cloudera.org:8080/5719 Reviewed-by: Henry Robinson Tested-by: Impala Public Jenkins Reviewed-on: http://gerrit.cloudera.org:8080/10760 Reviewed-by: Lars Volker Tested-by: Lars Volker > Add Kudu's RPC, util and security libraries > --- > > Key: IMPALA-4669 > URL: https://issues.apache.org/jira/browse/IMPALA-4669 > Project: IMPALA > Issue Type: Sub-task > Components: Distributed Exec >Affects Versions: Impala 2.8.0 >Reporter: Henry Robinson >Assignee: Henry Robinson >Priority: Major > Fix For: Impala 2.10.0 > > > To enable KRPC in Impala, we need to link against Kudu's {{rpc}}, > {{security}} and {{util}} libraries. The easiest way for now is to pull them > into trunk. > Doing this also requires upgrading our {{gutil}} version. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-7288) Codegen crash in FinalizeModule()
[ https://issues.apache.org/jira/browse/IMPALA-7288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810089#comment-16810089 ] ASF subversion and git services commented on IMPALA-7288: - Commit a26392c6d11bd4b6367936c7a1188292027e2d2d in impala's branch refs/heads/2.x from Bikramjeet Vig [ https://gitbox.apache.org/repos/asf?p=impala.git;h=a26392c ] IMPALA-7288: Fix Codegen Crash in FinalizeModule() Currently codegen crashed during FinalizeModule() where it tries to clean up half-baked handcrafted functions. This happens only for the cases where the code generating the handcrafted IR calls eraseFromParent() on failure which also deletes the memory held by the function pointer and therefore causes a crash during clean up in FinalizeModule(). Testing: Added regression tests that verify that failure code paths in the previously offending methods don't crash Impala. Change-Id: I2f0b527909a9fb3090996bb7510e4d58350c21b0 Reviewed-on: http://gerrit.cloudera.org:8080/10933 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > Codegen crash in FinalizeModule() > - > > Key: IMPALA-7288 > URL: https://issues.apache.org/jira/browse/IMPALA-7288 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.12.0, Impala 3.1.0 >Reporter: Balazs Jeszenszky >Assignee: Bikramjeet Vig >Priority: Blocker > Fix For: Impala 3.1.0 > > > The following sequence crashes Impala 2.12 reliably: > {code} > CREATE TABLE test (c1 CHAR(6),c2 CHAR(6)); > select 1 from test t1, test t2 > where t1.c1 = FROM_TIMESTAMP(cast(t2.c2 as string), 'MMdd'); > {code} > hs_err_pid has: > {code} > # > # A fatal error has been detected by the Java Runtime Environment: > # > # SIGSEGV (0xb) at pc=0x03b36ce4, pid=28459, tid=0x7f2c49685700 > # > # JRE version: Java(TM) SE Runtime Environment (8.0_162-b12) (build > 1.8.0_162-b12) > # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.162-b12 mixed mode > linux-amd64 compressed oops) > # Problematic frame: > # C 
[impalad+0x3736ce4] llvm::Value::getContext() const+0x4 > {code} > Backtrace is: > {code} > #0 0x7f2cb217a5f7 in raise () from /lib64/libc.so.6 > #1 0x7f2cb217bce8 in abort () from /lib64/libc.so.6 > #2 0x7f2cb4de2f35 in os::abort(bool) () from > /usr/java/latest/jre/lib/amd64/server/libjvm.so > #3 0x7f2cb4f86f33 in VMError::report_and_die() () from > /usr/java/latest/jre/lib/amd64/server/libjvm.so > #4 0x7f2cb4de922f in JVM_handle_linux_signal () from > /usr/java/latest/jre/lib/amd64/server/libjvm.so > #5 0x7f2cb4ddf253 in signalHandler(int, siginfo*, void*) () from > /usr/java/latest/jre/lib/amd64/server/libjvm.so > #6 > #7 0x03b36ce4 in llvm::Value::getContext() const () > #8 0x03b36cff in llvm::Value::getValueName() const () > #9 0x03b36de9 in llvm::Value::getName() const () > #10 0x01ba6bb2 in impala::LlvmCodeGen::FinalizeModule (this=0x9b53980) > at > /usr/src/debug/impala-2.12.0-cdh5.15.0/be/src/codegen/llvm-codegen.cc:1076 > #11 0x018f5c0f in impala::FragmentInstanceState::Open (this=0xac0b400) > at > /usr/src/debug/impala-2.12.0-cdh5.15.0/be/src/runtime/fragment-instance-state.cc:255 > #12 0x018f3699 in impala::FragmentInstanceState::Exec (this=0xac0b400) > at > /usr/src/debug/impala-2.12.0-cdh5.15.0/be/src/runtime/fragment-instance-state.cc:80 > #13 0x019028c3 in impala::QueryState::ExecFInstance (this=0x9c6ad00, > fis=0xac0b400) > at > /usr/src/debug/impala-2.12.0-cdh5.15.0/be/src/runtime/query-state.cc:410 > #14 0x0190113c in impala::QueryStateoperator()(void) > const (__closure=0x7f2c49684be8) > at > /usr/src/debug/impala-2.12.0-cdh5.15.0/be/src/runtime/query-state.cc:350 > #15 0x019034dd in > boost::detail::function::void_function_obj_invoker0, > void>::invoke(boost::detail::function::function_buffer &) > (function_obj_ptr=...) 
> at > /usr/src/debug/impala-2.12.0-cdh5.15.0/toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:153 > {code} > Crash is at > https://github.com/cloudera/Impala/blob/cdh5-2.12.0_5.15.0/be/src/codegen/llvm-codegen.cc#L1070-L1079. > The repro steps seem to be quite specific. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-6826) Add support for Ubuntu 18.04
[ https://issues.apache.org/jira/browse/IMPALA-6826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810090#comment-16810090 ] ASF subversion and git services commented on IMPALA-6826: - Commit 5771c45a21450e49fd2d969a8218bbc62f5530b0 in impala's branch refs/heads/master from Laszlo Gaal [ https://gitbox.apache.org/repos/asf?p=impala.git;h=5771c45 ] IMPALA-6826: Extend bootstrap_system.sh to Ubuntu 18.04 Tweak bin/bootstrap_system.sh to automate the preparation of an Impala development environment on Ubuntu 18.04. The following changes were required: - extend the OS recognition logic to Ubuntu 18.04 - add 'ant' to the list of installed packages - request OpenJDK 8 as the default Java environment (Ubuntu 18.04 defaults to OpenJDK 11) These changes enable bootstrap_system.sh to set up an Impala development environment where Impala can be successfully built. Note that the patch does not attempt to pass the tests yet; this change prepares only the environment. Bugs specific to Ubuntu 18 will be fixed by follow-up commits. Tested in the following environments: - in a Docker container, using "docker/test-with-docker.py --base-image:ubuntu:18.04" - on an AWS EC2 m5.4xlarge instance Change-Id: Iad790f72ea6b62258aed2225eb7bdf79590c350f Reviewed-on: http://gerrit.cloudera.org:8080/12893 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > Add support for Ubuntu 18.04 > > > Key: IMPALA-6826 > URL: https://issues.apache.org/jira/browse/IMPALA-6826 > Project: IMPALA > Issue Type: Task > Components: Infrastructure >Affects Versions: Impala 3.0, Impala 2.12.0 > Environment: Ubuntu 18.04 >Reporter: Jim Apple >Assignee: Laszlo Gaal >Priority: Major > > We support Ubuntu 16.04 (and 14.04, in the 2.x line). > > I'm blocked on Ubuntu 18.04 support in > [https://github.com/cloudera/native-toolchain,] but the toolchain is not > technically a pre-requisite, though I believe it's the easiest way to get a > development environment up and running. 
[jira] [Commented] (IMPALA-7006) Rebase KRPC onto Kudu upstream repository
[ https://issues.apache.org/jira/browse/IMPALA-7006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810075#comment-16810075 ] ASF subversion and git services commented on IMPALA-7006: - Commit dfb9e16960f858e1dccd209e7b1f7e4be60bc6d4 in impala's branch refs/heads/2.x from Lars Volker [ https://gitbox.apache.org/repos/asf?p=impala.git;h=dfb9e16 ] IMPALA-7006: Add KRPC folders from kudu@334ecafd cp -a ~/checkout/kudu/src/kudu/{rpc,util,security} be/src/kudu/ Change-Id: I232db2b4ccf5df9aca87b21dea31bfb2735d1ab7 Reviewed-on: http://gerrit.cloudera.org:8080/10757 Reviewed-by: Lars Volker Tested-by: Lars Volker > Rebase KRPC onto Kudu upstream repository > - > > Key: IMPALA-7006 > URL: https://issues.apache.org/jira/browse/IMPALA-7006 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Affects Versions: Impala 2.13.0, Impala 3.1.0 >Reporter: Lars Volker >Assignee: Lars Volker >Priority: Major > Labels: krpc > Fix For: Impala 3.1.0 > > > We should consider rebasing our KRPC code on top of the latest Kudu upstream > version. This will keep the two projects more in sync and will allow us to > make use of recent improvements, e.g. around thread stack collection, without > having to pick individual changes. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-110) Add support for multiple distinct operators in the same query block
[ https://issues.apache.org/jira/browse/IMPALA-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810088#comment-16810088 ] ASF subversion and git services commented on IMPALA-110: Commit fdd6db524c9c97f0baebfde0119fce19d62eaec3 in impala's branch refs/heads/2.x from Thomas Tauber-Marshall [ https://gitbox.apache.org/repos/asf?p=impala.git;h=fdd6db5 ] IMPALA-7251: Fix QueryMaintenance calls in Aggregators A recent change, IMPALA-110 (part 2), refactored PartitionedAggregationNode into several classes, including a new type 'Aggregator'. During this refactor, code that makes local allocations while evaluating exprs was moved from the ExecNode (now AggregationNode/StreamingAggregationNode) into the Aggregators, but code related to cleaning these allocations up (ie QueryMaintenance()) was not, resulting in some queries using an excessive amount of memory. This patch removes all calls to QueryMaintenance() from the exec nodes and moves them into the Aggregators. Testing: - Added new test cases with a mem limit that fails if the expr allocations aren't released in a timely manner. - Passed a full exhaustive run. Change-Id: I4dac2bb0a15cdd7315ee15608bae409c125c82f5 Reviewed-on: http://gerrit.cloudera.org:8080/10871 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins > Add support for multiple distinct operators in the same query block > --- > > Key: IMPALA-110 > URL: https://issues.apache.org/jira/browse/IMPALA-110 > Project: IMPALA > Issue Type: New Feature > Components: Backend, Frontend >Affects Versions: Impala 0.5, Impala 1.4, Impala 2.0, Impala 2.2, Impala > 2.3.0 >Reporter: Greg Rahn >Assignee: Thomas Tauber-Marshall >Priority: Major > Labels: sql-language > Fix For: Impala 3.1.0 > > > Impala only allows a single (DISTINCT columns) expression in each query. 
> {color:red}Note: > If you do not need precise accuracy, you can produce an estimate of the > distinct values for a column by specifying NDV(column); a query can contain > multiple instances of NDV(column). To make Impala automatically rewrite > COUNT(DISTINCT) expressions to NDV(), enable the APPX_COUNT_DISTINCT query > option. > {color} > {code} > [impala:21000] > select count(distinct i_class_id) from item; > Query: select count(distinct i_class_id) from item > Query finished, fetching results ... > 16 > Returned 1 row(s) in 1.51s > {code} > {code} > [impala:21000] > select count(distinct i_class_id), count(distinct > i_brand_id) from item; > Query: select count(distinct i_class_id), count(distinct i_brand_id) from item > ERROR: com.cloudera.impala.common.AnalysisException: Analysis exception (in > select count(distinct i_class_id), count(distinct i_brand_id) from item) > at > com.cloudera.impala.analysis.AnalysisContext.analyze(AnalysisContext.java:133) > at > com.cloudera.impala.service.Frontend.createExecRequest(Frontend.java:221) > at > com.cloudera.impala.service.JniFrontend.createExecRequest(JniFrontend.java:89) > Caused by: com.cloudera.impala.common.AnalysisException: all DISTINCT > aggregate functions need to have the same set of parameters as COUNT(DISTINCT > i_class_id); deviating function: COUNT(DISTINCT i_brand_id) > at > com.cloudera.impala.analysis.AggregateInfo.createDistinctAggInfo(AggregateInfo.java:196) > at > com.cloudera.impala.analysis.AggregateInfo.create(AggregateInfo.java:143) > at > com.cloudera.impala.analysis.SelectStmt.createAggInfo(SelectStmt.java:466) > at > com.cloudera.impala.analysis.SelectStmt.analyzeAggregation(SelectStmt.java:347) > at com.cloudera.impala.analysis.SelectStmt.analyze(SelectStmt.java:155) > at > com.cloudera.impala.analysis.AnalysisContext.analyze(AnalysisContext.java:130) > ... 
2 more > {code} > Hive supports this: > {code} > $ hive -e "select count(distinct i_class_id), count(distinct i_brand_id) from > item;" > Logging initialized using configuration in > file:/etc/hive/conf.dist/hive-log4j.properties > Hive history file=/tmp/grahn/hive_job_log_grahn_201303052234_1625576708.txt > Total MapReduce jobs = 1 > Launching Job 1 out of 1 > Number of reduce tasks determined at compile time: 1 > In order to change the average load for a reducer (in bytes): > set hive.exec.reducers.bytes.per.reducer= > In order to limit the maximum number of reducers: > set hive.exec.reducers.max= > In order to set a constant number of reducers: > set mapred.reduce.tasks= > Starting Job = job_201302081514_0073, Tracking URL = > http://impala:50030/jobdetails.jsp?jobid=job_201302081514_0073 > Kill Command = /usr/lib/hadoop/bin/hadoop job > -Dmapred.job.tracker=m0525.mtv.cloudera.com:8021 -kill job_201302081514_0073 > Hadoop job
[jira] [Commented] (IMPALA-7006) Rebase KRPC onto Kudu upstream repository
[ https://issues.apache.org/jira/browse/IMPALA-7006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810083#comment-16810083 ] ASF subversion and git services commented on IMPALA-7006: - Commit b97e0cd555a53057a82dc9c0ad9e0cfe58f3ec66 in impala's branch refs/heads/2.x from Sailesh Mukil [ https://gitbox.apache.org/repos/asf?p=impala.git;h=b97e0cd ] IMPALA-5129: Use Kudu's Kinit code to avoid expensive fork NOTE: This commit is part of a set of changes for IMPALA-7006. It contains pieces of a previous commit that need to be cherry-picked again after rebasing the code in be/src/kudu/{util,security,rpc}. The original commit message is below: Impala currently kinits by forking off a child process. This has proved to be expensive in many cases since the subprocess tries to reserve as much memory as Impala is currently using, which can be quite a lot. This patch adds a flag called 'use_kudu_kinit' that defaults to true. When it's true, it uses the Kudu security library's kinit code that programmatically uses the krb5 library to kinit. When it's false, we run our current path, which kicks off the kinit-thread and forks off a kinit process periodically to reacquire tickets based on FLAGS_kerberos_reinit_interval. Converted existing tests in thrift-server-test to run with and without kerberos. We now run this BE test with kerberos by using Kudu's MiniKdc utility. This introduces a new dependency on some kerberos binaries that are checked through FindKerberosPrograms.cmake. Note that this is only a test dependency and not a dependency for the impalad binaries and friends. Compilation will still succeed if the kerberos binaries for the MiniKdc are not found; however, the thrift-server-test will fail. We run with and without the 'use_kudu_kinit' flag. TODO: Since the setting up and tearing down of our security code isn't idempotent, we can run only one test in a process with Kerberos now (IMPALA-6085).
Updated bin/bootstrap_system.sh to install new sasl-gssapi modules and the kerberos binaries required for the MiniKdc. Also fixed a bug that didn't transfer the environment into 'sudo' in bin/bootstrap_system.sh. Testing: Verified with thrift-server-test and also manually on a live kerberized cluster. Change-Id: Ie3c6e933c454e7adca69ef03e7d5c0c84b656895 Reviewed-on: http://gerrit.cloudera.org:8080/7938 Reviewed-by: Sailesh Mukil Tested-by: Impala Public Jenkins Reviewed-on: http://gerrit.cloudera.org:8080/10763 Reviewed-by: Lars Volker Tested-by: Lars Volker > Rebase KRPC onto Kudu upstream repository > - > > Key: IMPALA-7006 > URL: https://issues.apache.org/jira/browse/IMPALA-7006 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Affects Versions: Impala 2.13.0, Impala 3.1.0 >Reporter: Lars Volker >Assignee: Lars Volker >Priority: Major > Labels: krpc > Fix For: Impala 3.1.0 > > > We should consider rebasing our KRPC code on top of the latest Kudu upstream > version. This will keep the two projects more in sync and will allow us to > make use of recent improvements, e.g. around thread stack collection, without > having to pick individual changes. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-7006) Rebase KRPC onto Kudu upstream repository
[ https://issues.apache.org/jira/browse/IMPALA-7006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810077#comment-16810077 ] ASF subversion and git services commented on IMPALA-7006: - Commit 5dbd48f226f1061567da1c381ee2491dab3ceaf4 in impala's branch refs/heads/2.x from Lars Volker [ https://gitbox.apache.org/repos/asf?p=impala.git;h=5dbd48f ] IMPALA-4669: [KUTIL] Add kudu_util library to the build. NOTE: This commit is part of a set of changes for IMPALA-7006. It contains pieces of a previous commit that need to be cherry picked again after rebasing the code in be/src/kudu/{util,security,rpc}. The original commit message is below: A few miscellaneous changes to allow kudu_util to compile with Impala. Add kudu_version.cc to substitute for the version.cc file that is automatically built during the full Kudu build. Set LZ4_DISABLE_DEPRECATE_WARNINGS to allow Kudu's compressor utility to use deprecated names for LZ4 methods. Add NO_NVM_SUPPORT flag to Kudu build (plan to upstream this later) to disable building with nvm support, removing a library dependency. Also remove imported FindOpenSSL.cmake in favour of the standard one provided by cmake itself. Finally, a few changes to allow compilation on RHEL5: * Only use sched_getcpu() if supported * Only include magic.h if available * Workaround for kernels that don't have SOCK_NONBLOCK * Workaround for kernels that don't have O_CLOEXEC (ignore the flag) * Provide non-working implementation of fallocate() * Disable inclusion of linux/fiemap.h - although this exists on RHEL5, it does not compile due to other #includes in env_posix.cc. We disable the path this is used for, since Impala does not call that code. * Use Kudu's implementation of pipe(2), preadv(2) and pwritev(2) where it doesn't exist. 
In most cases these changes simply force kutil to revert to a different implementation that was already written for OSX support - this patch generalises the logic to provide the implementation whenever the required function doesn't exist. This patch compiles on RHEL5.5 and 6.0, SLES11 and 12, Ubuntu 12.04 and 14.04 and Debian 7.0 and 8.0. Change-Id: I451f02d3e4669e8a548b92fb1445cb2b322659a2 Reviewed-on: http://gerrit.cloudera.org:8080/5715 Tested-by: Impala Public Jenkins Reviewed-by: Henry Robinson Reviewed-on: http://gerrit.cloudera.org:8080/10758 Reviewed-by: Michael Ho Tested-by: Lars Volker > Rebase KRPC onto Kudu upstream repository > - > > Key: IMPALA-7006 > URL: https://issues.apache.org/jira/browse/IMPALA-7006 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Affects Versions: Impala 2.13.0, Impala 3.1.0 >Reporter: Lars Volker >Assignee: Lars Volker >Priority: Major > Labels: krpc > Fix For: Impala 3.1.0 > > > We should consider rebasing our KRPC code on top of the latest Kudu upstream > version. This will keep the two projects more in sync and will allow us to > make use of recent improvements, e.g. around thread stack collection, without > having to pick individual changes. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-7006) Rebase KRPC onto Kudu upstream repository
[ https://issues.apache.org/jira/browse/IMPALA-7006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810081#comment-16810081 ] ASF subversion and git services commented on IMPALA-7006: - Commit d10b34354c0d1616ed2faf78a6659e9be4aacd66 in impala's branch refs/heads/2.x from Lars Volker [ https://gitbox.apache.org/repos/asf?p=impala.git;h=d10b343 ] IMPALA-4669: [KRPC] Add kudu_rpc library to build NOTE: This commit is part of a set of changes for IMPALA-7006. It contains pieces of a previous commit that need to be cherry picked again after rebasing the code in be/src/kudu/{util,security,rpc}. The original commit message is below: Import FindKRPC.cmake from Apache Kudu. Add some files to protoc-gen-krpc link to allow it to find symbols now defined within Impala (without linking all of Impala's libraries). Change-Id: I5693288db90f2e9673b8c88ca4378c3790cba957 Reviewed-on: http://gerrit.cloudera.org:8080/5719 Reviewed-by: Henry Robinson Tested-by: Impala Public Jenkins Reviewed-on: http://gerrit.cloudera.org:8080/10760 Reviewed-by: Lars Volker Tested-by: Lars Volker > Rebase KRPC onto Kudu upstream repository > - > > Key: IMPALA-7006 > URL: https://issues.apache.org/jira/browse/IMPALA-7006 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Affects Versions: Impala 2.13.0, Impala 3.1.0 >Reporter: Lars Volker >Assignee: Lars Volker >Priority: Major > Labels: krpc > Fix For: Impala 3.1.0 > > > We should consider rebasing our KRPC code on top of the latest Kudu upstream > version. This will keep the two projects more in sync and will allow us to > make use of recent improvements, e.g. around thread stack collection, without > having to pick individual changes. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-6085) Make the setup and teardown of the security code idempotent
[ https://issues.apache.org/jira/browse/IMPALA-6085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810084#comment-16810084 ] ASF subversion and git services commented on IMPALA-6085: - Commit b97e0cd555a53057a82dc9c0ad9e0cfe58f3ec66 in impala's branch refs/heads/2.x from Sailesh Mukil [ https://gitbox.apache.org/repos/asf?p=impala.git;h=b97e0cd ] IMPALA-5129: Use Kudu's Kinit code to avoid expensive fork NOTE: This commit is part of a set of changes for IMPALA-7006. It contains pieces of a previous commit that need to be cherry-picked again after rebasing the code in be/src/kudu/{util,security,rpc}. The original commit message is below: Impala currently kinits by forking off a child process. This has proved to be expensive in many cases since the subprocess tries to reserve as much memory as Impala is currently using, which can be quite a lot. This patch adds a flag called 'use_kudu_kinit' that defaults to true. When it's true, it uses the Kudu security library's kinit code that programmatically uses the krb5 library to kinit. When it's false, we run our current path, which kicks off the kinit-thread and forks off a kinit process periodically to reacquire tickets based on FLAGS_kerberos_reinit_interval. Converted existing tests in thrift-server-test to run with and without kerberos. We now run this BE test with kerberos by using Kudu's MiniKdc utility. This introduces a new dependency on some kerberos binaries that are checked through FindKerberosPrograms.cmake. Note that this is only a test dependency and not a dependency for the impalad binaries and friends. Compilation will still succeed if the kerberos binaries for the MiniKdc are not found; however, the thrift-server-test will fail. We run with and without the 'use_kudu_kinit' flag. TODO: Since the setting up and tearing down of our security code isn't idempotent, we can run only one test in a process with Kerberos now (IMPALA-6085).
Updated bin/bootstrap_system.sh to install new sasl-gssapi modules and the kerberos binaries required for the MiniKdc. Also fixed a bug that didn't transfer the environment into 'sudo' in bin/bootstrap_system.sh. Testing: Verified with thrift-server-test and also manually on a live kerberized cluster. Change-Id: Ie3c6e933c454e7adca69ef03e7d5c0c84b656895 Reviewed-on: http://gerrit.cloudera.org:8080/7938 Reviewed-by: Sailesh Mukil Tested-by: Impala Public Jenkins Reviewed-on: http://gerrit.cloudera.org:8080/10763 Reviewed-by: Lars Volker Tested-by: Lars Volker > Make the setup and teardown of the security code idempotent > --- > > Key: IMPALA-6085 > URL: https://issues.apache.org/jira/browse/IMPALA-6085 > Project: IMPALA > Issue Type: Improvement > Components: Security >Affects Versions: Impala 2.10.0 >Reporter: Sailesh Mukil >Priority: Major > Labels: infrastructure, security, test > > Our security code assumes that it will only be called once in the lifetime of > a process. This is true, however, for tests, we would like to set it up and > tear it down multiple times to issue it different configurations and test it > within the same backend test process. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Work started] (IMPALA-8385) Refactor Sentry admin check
[ https://issues.apache.org/jira/browse/IMPALA-8385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-8385 started by Fredy Wijaya. > Refactor Sentry admin check > --- > > Key: IMPALA-8385 > URL: https://issues.apache.org/jira/browse/IMPALA-8385 > Project: IMPALA > Issue Type: Sub-task > Components: Catalog, Frontend >Reporter: Fredy Wijaya >Assignee: Fredy Wijaya >Priority: Major > > Currently Sentry admin check is hardcoded, for example: > https://github.com/apache/impala/blob/5670f96b828d57f9e36510bb9af02bcc31de775c/be/src/service/client-request-state.cc#L366 > This check needs to be moved out to SentryAuthorizationManager instead. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Assigned] (IMPALA-8385) Refactor Sentry admin check
[ https://issues.apache.org/jira/browse/IMPALA-8385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fredy Wijaya reassigned IMPALA-8385: Assignee: Fredy Wijaya > Refactor Sentry admin check > --- > > Key: IMPALA-8385 > URL: https://issues.apache.org/jira/browse/IMPALA-8385 > Project: IMPALA > Issue Type: Sub-task > Components: Catalog, Frontend >Reporter: Fredy Wijaya >Assignee: Fredy Wijaya >Priority: Major > > Currently Sentry admin check is hardcoded, for example: > https://github.com/apache/impala/blob/5670f96b828d57f9e36510bb9af02bcc31de775c/be/src/service/client-request-state.cc#L366 > This check needs to be moved out to SentryAuthorizationManager instead. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Work started] (IMPALA-8227) Support WITH GRANT OPTION with Ranger authorization provider
[ https://issues.apache.org/jira/browse/IMPALA-8227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-8227 started by Austin Nobis. > Support WITH GRANT OPTION with Ranger authorization provider > > > Key: IMPALA-8227 > URL: https://issues.apache.org/jira/browse/IMPALA-8227 > Project: IMPALA > Issue Type: Sub-task > Components: Catalog, Frontend >Reporter: Fredy Wijaya >Assignee: Austin Nobis >Priority: Major > > This ticket should investigate whether it's feasible to support WITH GRANT > OPTION (giving a grant/revoke privilege for non-admins) with Ranger. If it's > not feasible, Impala should throw an error when Impala is enabled with Ranger. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Comment Edited] (IMPALA-8386) Incorrect predicate in a left outer join query
[ https://issues.apache.org/jira/browse/IMPALA-8386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16809987#comment-16809987 ] Csaba Ringhofer edited comment on IMPALA-8386 at 4/4/19 4:09 PM: - I could slightly simplify the query, the aggregates + group by are actually not needed, a single field with two aliases is enough + the sub query for (select a_id from a) can be also replaced with "a": {code} select count(1) from ( select t2.a_id,t2.amount1,t2.amount2 from a left outer join ( select c.a_id, amount as amount1, amount as amount2 from b join c on b.b_id = c.b_id) t2 on a.a_id = t2.a_id ) t; -- returns 1 {code} was (Author: csringhofer): I could slightly simplify the query, the aggregates + group by are actually not needed, a single field with two aliases is enough: {code} select count(1) from ( select t2.a_id,t2.amount1,t2.amount2 from( select a_id from a) t1 left outer join ( select c.a_id, amount as amount1, amount as amount2 from b join c on b.b_id = c.b_id) t2 on t1.a_id = t2.a_id ) t; -- returns 1 {code} > Incorrect predicate in a left outer join query > -- > > Key: IMPALA-8386 > URL: https://issues.apache.org/jira/browse/IMPALA-8386 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Reporter: Quanlong Huang >Assignee: Quanlong Huang >Priority: Critical > Labels: correctness > > skyyws reported a bug [in the mailing > list|https://lists.apache.org/thread.html/0bdbbaa6bb35b552f050ae30587b7d75b78a72ec60007a8bc0a4a8a9@%3Cdev.impala.apache.org%3E] > on the following data set: > {code:java} > table A > +--+ > | a_id | > +--+ > | 1 | > | 2 | > +--+ > table B > +--++ > | b_id | amount | > +--++ > | 1 | 10 | > | 1 | 20 | > | 2 | NULL | > +--++ > table C > +--+--+ > | a_id | b_id | > +--+--+ > | 1 | 1 | > | 2 | 2 | > +--+--+{code} > The following query returns a wrong result "1": > {code:java} > select count(1) from ( > select t2.a_id,t2.amount1,t2.amount2 > from( select a_id from a) t1 > left outer join ( > select c.a_id,sum(amount) 
as amount1,sum(amount) as amount2 > from b join c on b.b_id = c.b_id group by c.a_id) t2 > on t1.a_id = t2.a_id > ) t; > {code} > Removing "t2.amount2" can get the right result "2": > {code:java} > select count(1) from ( > select t2.a_id,t2.amount1 > from( select a_id from a) t1 > left outer join ( > select c.a_id,sum(amount) as amount1,sum(amount) as amount2 > from b join c on b.b_id = c.b_id group by c.a_id) t2 > on t1.a_id = t2.a_id > ) t; > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-8386) Incorrect predicate in a left outer join query
[ https://issues.apache.org/jira/browse/IMPALA-8386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Csaba Ringhofer updated IMPALA-8386: Labels: correctness (was: ) > Incorrect predicate in a left outer join query > -- > > Key: IMPALA-8386 > URL: https://issues.apache.org/jira/browse/IMPALA-8386 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Reporter: Quanlong Huang >Assignee: Quanlong Huang >Priority: Critical > Labels: correctness > > skyyws reported a bug [in the mailing > list|https://lists.apache.org/thread.html/0bdbbaa6bb35b552f050ae30587b7d75b78a72ec60007a8bc0a4a8a9@%3Cdev.impala.apache.org%3E] > on the following data set: > {code:java} > table A > +--+ > | a_id | > +--+ > | 1 | > | 2 | > +--+ > table B > +--++ > | b_id | amount | > +--++ > | 1 | 10 | > | 1 | 20 | > | 2 | NULL | > +--++ > table C > +--+--+ > | a_id | b_id | > +--+--+ > | 1 | 1 | > | 2 | 2 | > +--+--+{code} > The following query returns a wrong result "1": > {code:java} > select count(1) from ( > select t2.a_id,t2.amount1,t2.amount2 > from( select a_id from a) t1 > left outer join ( > select c.a_id,sum(amount) as amount1,sum(amount) as amount2 > from b join c on b.b_id = c.b_id group by c.a_id) t2 > on t1.a_id = t2.a_id > ) t; > {code} > Removing "t2.amount2" can get the right result "2": > {code:java} > select count(1) from ( > select t2.a_id,t2.amount1 > from( select a_id from a) t1 > left outer join ( > select c.a_id,sum(amount) as amount1,sum(amount) as amount2 > from b join c on b.b_id = c.b_id group by c.a_id) t2 > on t1.a_id = t2.a_id > ) t; > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-8386) Incorrect predicate in a left outer join query
[ https://issues.apache.org/jira/browse/IMPALA-8386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Csaba Ringhofer updated IMPALA-8386: Component/s: Frontend > Incorrect predicate in a left outer join query > -- > > Key: IMPALA-8386 > URL: https://issues.apache.org/jira/browse/IMPALA-8386 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Reporter: Quanlong Huang >Assignee: Quanlong Huang >Priority: Critical > > skyyws reported a bug [in the mailing > list|https://lists.apache.org/thread.html/0bdbbaa6bb35b552f050ae30587b7d75b78a72ec60007a8bc0a4a8a9@%3Cdev.impala.apache.org%3E] > on the following data set: > {code:java} > table A > +--+ > | a_id | > +--+ > | 1 | > | 2 | > +--+ > table B > +--++ > | b_id | amount | > +--++ > | 1 | 10 | > | 1 | 20 | > | 2 | NULL | > +--++ > table C > +--+--+ > | a_id | b_id | > +--+--+ > | 1 | 1 | > | 2 | 2 | > +--+--+{code} > The following query returns a wrong result "1": > {code:java} > select count(1) from ( > select t2.a_id,t2.amount1,t2.amount2 > from( select a_id from a) t1 > left outer join ( > select c.a_id,sum(amount) as amount1,sum(amount) as amount2 > from b join c on b.b_id = c.b_id group by c.a_id) t2 > on t1.a_id = t2.a_id > ) t; > {code} > Removing "t2.amount2" can get the right result "2": > {code:java} > select count(1) from ( > select t2.a_id,t2.amount1 > from( select a_id from a) t1 > left outer join ( > select c.a_id,sum(amount) as amount1,sum(amount) as amount2 > from b join c on b.b_id = c.b_id group by c.a_id) t2 > on t1.a_id = t2.a_id > ) t; > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Comment Edited] (IMPALA-8386) Incorrect predicate in a left outer join query
[ https://issues.apache.org/jira/browse/IMPALA-8386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16809987#comment-16809987 ] Csaba Ringhofer edited comment on IMPALA-8386 at 4/4/19 3:38 PM: - I could slightly simplify the query, the aggregates + group by are actually not needed, a single field with two aliases is enough: {code} select count(1) from ( select t2.a_id,t2.amount1,t2.amount2 from( select a_id from a) t1 left outer join ( select c.a_id, amount as amount1, amount as amount2 from b join c on b.b_id = c.b_id) t2 on t1.a_id = t2.a_id ) t; -- returns 1 {code} was (Author: csringhofer): I could slightly simplify the query, the aggregates + group by are actually not needed, a single field with two aliases is enough: select count(1) from ( select t2.a_id,t2.amount1,t2.amount2 from( select a_id from a) t1 left outer join ( select c.a_id, amount as amount1, amount as amount2 from b join c on b.b_id = c.b_id) t2 on t1.a_id = t2.a_id ) t; -- returns 1 > Incorrect predicate in a left outer join query > -- > > Key: IMPALA-8386 > URL: https://issues.apache.org/jira/browse/IMPALA-8386 > Project: IMPALA > Issue Type: Bug >Reporter: Quanlong Huang >Assignee: Quanlong Huang >Priority: Critical > > skyyws reported a bug [in the mailing > list|https://lists.apache.org/thread.html/0bdbbaa6bb35b552f050ae30587b7d75b78a72ec60007a8bc0a4a8a9@%3Cdev.impala.apache.org%3E] > on the following data set: > {code:java} > table A > +--+ > | a_id | > +--+ > | 1 | > | 2 | > +--+ > table B > +--++ > | b_id | amount | > +--++ > | 1 | 10 | > | 1 | 20 | > | 2 | NULL | > +--++ > table C > +--+--+ > | a_id | b_id | > +--+--+ > | 1 | 1 | > | 2 | 2 | > +--+--+{code} > The following query returns a wrong result "1": > {code:java} > select count(1) from ( > select t2.a_id,t2.amount1,t2.amount2 > from( select a_id from a) t1 > left outer join ( > select c.a_id,sum(amount) as amount1,sum(amount) as amount2 > from b join c on b.b_id = c.b_id group by c.a_id) t2 > on t1.a_id = 
t2.a_id > ) t; > {code} > Removing "t2.amount2" can get the right result "2": > {code:java} > select count(1) from ( > select t2.a_id,t2.amount1 > from( select a_id from a) t1 > left outer join ( > select c.a_id,sum(amount) as amount1,sum(amount) as amount2 > from b join c on b.b_id = c.b_id group by c.a_id) t2 > on t1.a_id = t2.a_id > ) t; > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Assigned] (IMPALA-8386) Incorrect predicate in a left outer join query
[ https://issues.apache.org/jira/browse/IMPALA-8386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Quanlong Huang reassigned IMPALA-8386: -- Assignee: Quanlong Huang > Incorrect predicate in a left outer join query > -- > > Key: IMPALA-8386 > URL: https://issues.apache.org/jira/browse/IMPALA-8386 > Project: IMPALA > Issue Type: Bug >Reporter: Quanlong Huang >Assignee: Quanlong Huang >Priority: Critical > > skyyws reported a bug [in the mailing > list|https://lists.apache.org/thread.html/0bdbbaa6bb35b552f050ae30587b7d75b78a72ec60007a8bc0a4a8a9@%3Cdev.impala.apache.org%3E] > on the following data set: > {code:java} > table A > +--+ > | a_id | > +--+ > | 1 | > | 2 | > +--+ > table B > +--++ > | b_id | amount | > +--++ > | 1 | 10 | > | 1 | 20 | > | 2 | NULL | > +--++ > table C > +--+--+ > | a_id | b_id | > +--+--+ > | 1 | 1 | > | 2 | 2 | > +--+--+{code} > The following query returns a wrong result "1": > {code:java} > select count(1) from ( > select t2.a_id,t2.amount1,t2.amount2 > from( select a_id from a) t1 > left outer join ( > select c.a_id,sum(amount) as amount1,sum(amount) as amount2 > from b join c on b.b_id = c.b_id group by c.a_id) t2 > on t1.a_id = t2.a_id > ) t; > {code} > Removing "t2.amount2" can get the right result "2": > {code:java} > select count(1) from ( > select t2.a_id,t2.amount1 > from( select a_id from a) t1 > left outer join ( > select c.a_id,sum(amount) as amount1,sum(amount) as amount2 > from b join c on b.b_id = c.b_id group by c.a_id) t2 > on t1.a_id = t2.a_id > ) t; > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-8386) Incorrect predicate in a left outer join query
Quanlong Huang created IMPALA-8386:
--------------------------------------

             Summary: Incorrect predicate in a left outer join query
                 Key: IMPALA-8386
                 URL: https://issues.apache.org/jira/browse/IMPALA-8386
             Project: IMPALA
          Issue Type: Bug
            Reporter: Quanlong Huang

skyyws reported a bug [in the mailing list|https://lists.apache.org/thread.html/0bdbbaa6bb35b552f050ae30587b7d75b78a72ec60007a8bc0a4a8a9@%3Cdev.impala.apache.org%3E] on the following data set:
{code:java}
table A
+------+
| a_id |
+------+
| 1    |
| 2    |
+------+
table B
+------+--------+
| b_id | amount |
+------+--------+
| 1    | 10     |
| 1    | 20     |
| 2    | NULL   |
+------+--------+
table C
+------+------+
| a_id | b_id |
+------+------+
| 1    | 1    |
| 2    | 2    |
+------+------+
{code}
The following query returns a wrong result "1":
{code:java}
select count(1) from (
    select t2.a_id,t2.amount1,t2.amount2
    from( select a_id from a) t1
    left outer join (
        select c.a_id,sum(amount) as amount1,sum(amount) as amount2
        from b join c on b.b_id = c.b_id group by c.a_id) t2
    on t1.a_id = t2.a_id
) t;
{code}
Removing "t2.amount2" can get the right result "2":
{code:java}
select count(1) from (
    select t2.a_id,t2.amount1
    from( select a_id from a) t1
    left outer join (
        select c.a_id,sum(amount) as amount1,sum(amount) as amount2
        from b join c on b.b_id = c.b_id group by c.a_id) t2
    on t1.a_id = t2.a_id
) t;
{code}

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
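The expected answer in the report can be cross-checked outside Impala. The sketch below loads the same three tables into an in-memory SQLite database and runs both queries from the report. SQLite is only a stand-in reference engine here (an assumption of this example, not something the thread uses), and the helper names `build_dataset` and `count_rows` are made up for illustration. The left outer join keeps both rows of table A, so both queries should count 2 rows, which is the result the report calls correct.

```python
import sqlite3


def build_dataset():
    """Create the tables A, B and C from the bug report in memory."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE a (a_id INTEGER);
        CREATE TABLE b (b_id INTEGER, amount INTEGER);
        CREATE TABLE c (a_id INTEGER, b_id INTEGER);
        INSERT INTO a VALUES (1), (2);
        INSERT INTO b VALUES (1, 10), (1, 20), (2, NULL);
        INSERT INTO c VALUES (1, 1), (2, 2);
    """)
    return conn


# Query from the report that returns the wrong result "1" on Impala.
QUERY_WITH_AMOUNT2 = """
select count(1) from (
    select t2.a_id, t2.amount1, t2.amount2
    from (select a_id from a) t1
    left outer join (
        select c.a_id, sum(amount) as amount1, sum(amount) as amount2
        from b join c on b.b_id = c.b_id group by c.a_id) t2
    on t1.a_id = t2.a_id
) t
"""

# Same query without t2.amount2 in the select list.
QUERY_WITHOUT_AMOUNT2 = """
select count(1) from (
    select t2.a_id, t2.amount1
    from (select a_id from a) t1
    left outer join (
        select c.a_id, sum(amount) as amount1, sum(amount) as amount2
        from b join c on b.b_id = c.b_id group by c.a_id) t2
    on t1.a_id = t2.a_id
) t
"""


def count_rows(conn, query):
    """Run a single-value query and return that value."""
    return conn.execute(query).fetchone()[0]


if __name__ == "__main__":
    conn = build_dataset()
    print(count_rows(conn, QUERY_WITH_AMOUNT2))     # 2
    print(count_rows(conn, QUERY_WITHOUT_AMOUNT2))  # 2
```

Because selecting an extra column cannot change how many rows a left outer join produces, any difference between the two counts points at the planner rather than at the data.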
[jira] [Assigned] (IMPALA-8309) Use a more human-readable flag to switch to a different authorization provider
[ https://issues.apache.org/jira/browse/IMPALA-8309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Austin Nobis reassigned IMPALA-8309:

    Assignee: radford nguyen (was: Austin Nobis)

> Use a more human-readable flag to switch to a different authorization provider
>
> Key: IMPALA-8309
> URL: https://issues.apache.org/jira/browse/IMPALA-8309
> Project: IMPALA
> Issue Type: Sub-task
> Components: Backend
> Reporter: Fredy Wijaya
> Assignee: radford nguyen
> Priority: Minor
>
> We currently use the authorization_factory_class flag to switch to a
> different authorization provider, which is useful for any third party that
> provides its own implementation of an authorization provider. Since Sentry
> and Ranger are officially supported by Impala, we should have a flag, i.e.
> authorization_provider=[sentry|ranger], to easily switch between officially
> supported authorization providers.
> At the time of this writing, the existing {{authorization_factory_class}}
> flag is being retained but its default value removed. If present, it will
> take precedence over the {{authorization_provider}} flag being added.
[jira] [Work started] (IMPALA-8309) Use a more human-readable flag to switch to a different authorization provider
[ https://issues.apache.org/jira/browse/IMPALA-8309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on IMPALA-8309 started by Austin Nobis.

> Use a more human-readable flag to switch to a different authorization provider
>
> Key: IMPALA-8309
> URL: https://issues.apache.org/jira/browse/IMPALA-8309
> Project: IMPALA
> Issue Type: Sub-task
> Components: Backend
> Reporter: Fredy Wijaya
> Assignee: Austin Nobis
> Priority: Minor
>
> We currently use the authorization_factory_class flag to switch to a
> different authorization provider, which is useful for any third party that
> provides its own implementation of an authorization provider. Since Sentry
> and Ranger are officially supported by Impala, we should have a flag, i.e.
> authorization_provider=[sentry|ranger], to easily switch between officially
> supported authorization providers.
> At the time of this writing, the existing {{authorization_factory_class}}
> flag is being retained but its default value removed. If present, it will
> take precedence over the {{authorization_provider}} flag being added.
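The precedence rule described in the issue can be sketched as follows. This is an illustration only, not Impala's implementation: the function name `resolve_authorization`, the returned tuple shape, the treatment of an empty string as "unset", and the example class name are all assumptions. Only the two flag names and the sentry/ranger values come from the issue text.

```python
from typing import Optional, Tuple

# Values named in the issue: authorization_provider=[sentry|ranger].
SUPPORTED_PROVIDERS = ("sentry", "ranger")


def resolve_authorization(factory_class: Optional[str],
                          provider: Optional[str]) -> Optional[Tuple[str, str]]:
    """Pick which flag decides the authorization provider (hypothetical helper).

    Returns ("factory_class", name) when authorization_factory_class is set,
    ("provider", name) when only authorization_provider is set, or None when
    neither flag is set.
    """
    # Assumption: an empty value means the flag was left at its removed default.
    if factory_class:
        return ("factory_class", factory_class)
    if not provider:
        return None
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"unsupported authorization_provider: {provider!r}")
    return ("provider", provider)


# "com.example.CustomFactory" is a made-up third-party class name.
print(resolve_authorization("com.example.CustomFactory", "ranger"))
print(resolve_authorization(None, "ranger"))
```

Checking the factory class first is what makes it take precedence: a third-party deployment that still sets authorization_factory_class keeps its behaviour even if authorization_provider is also given.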
[jira] [Commented] (IMPALA-8381) Remove branch from ParquetPlainEncoder::Decode()
[ https://issues.apache.org/jira/browse/IMPALA-8381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16809886#comment-16809886 ]

Daniel Becker commented on IMPALA-8381:

Some measurements:

Running the following query in the database tpch_parquet:
{code:java}
set num_nodes=1;
select max(l_orderkey) from lineitem;{code}
we found the following results averaging the MaterializeTupleTime(*) values over 100 runs with and without the "if":

Without "if": 14.3464ms
With "if": 16.42624ms

This is a 14% improvement in MaterializeTupleTime in this query. The total query time was 0.11s, the ~2ms gain is a little less than 2%.

> Remove branch from ParquetPlainEncoder::Decode()
>
> Key: IMPALA-8381
> URL: https://issues.apache.org/jira/browse/IMPALA-8381
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Reporter: Csaba Ringhofer
> Priority: Minor
> Labels: newbie, parquet, performance, ramp-up
>
> Removing the "if" at
> https://github.com/apache/impala/blob/5670f96b828d57f9e36510bb9af02bcc31de775c/be/src/exec/parquet/parquet-common.h#L203
> can lead to a 1.5x speedup in plain decoding (type=int32, stride=16). For
> primitive types, the same check can be done for a whole batch, so the speedup
> can be gained for large batches without losing safety. The only Parquet type
> where this check is needed per element is BYTE_ARRAY (typically used for
> STRING columns), which already has a template specialization for
> ParquetPlainEncoder::Decode().
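The percentages in the comment can be recomputed from the two averages. The sketch below only redoes the arithmetic on the reported numbers; the variable names are illustrative, and which base the "14%" refers to is not stated in the comment, so both readings are shown.

```python
# Averages reported in the comment, in milliseconds.
without_if_ms = 14.3464
with_if_ms = 16.42624
total_query_ms = 110.0  # the reported total query time of 0.11s

saved_ms = with_if_ms - without_if_ms

# Reduction measured against the slower (with "if") time.
reduction_pct = 100 * saved_ms / with_if_ms
# Overhead of the "if" measured against the faster (without "if") time.
overhead_pct = 100 * saved_ms / without_if_ms
# Share of the whole query time that the saving represents.
share_of_query_pct = 100 * saved_ms / total_query_ms

print(f"saved: {saved_ms:.5f} ms")
print(f"reduction vs. with-if time: {reduction_pct:.1f}%")
print(f"overhead vs. without-if time: {overhead_pct:.1f}%")
print(f"share of total query time: {share_of_query_pct:.1f}%")
```

The saving is about 2.08ms. Measured against the with-"if" time that is roughly 12.7%, and measured against the without-"if" time it is roughly 14.5%, which appears to be the reading behind the quoted 14%. Against the 110ms total it is about 1.9%, consistent with "a little less than 2%".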
[jira] [Assigned] (IMPALA-8381) Remove branch from ParquetPlainEncoder::Decode()
[ https://issues.apache.org/jira/browse/IMPALA-8381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Becker reassigned IMPALA-8381:

    Assignee: Daniel Becker

> Remove branch from ParquetPlainEncoder::Decode()
>
> Key: IMPALA-8381
> URL: https://issues.apache.org/jira/browse/IMPALA-8381
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Reporter: Csaba Ringhofer
> Assignee: Daniel Becker
> Priority: Minor
> Labels: newbie, parquet, performance, ramp-up
>
> Removing the "if" at
> https://github.com/apache/impala/blob/5670f96b828d57f9e36510bb9af02bcc31de775c/be/src/exec/parquet/parquet-common.h#L203
> can lead to a 1.5x speedup in plain decoding (type=int32, stride=16). For
> primitive types, the same check can be done for a whole batch, so the speedup
> can be gained for large batches without losing safety. The only Parquet type
> where this check is needed per element is BYTE_ARRAY (typically used for
> STRING columns), which already has a template specialization for
> ParquetPlainEncoder::Decode().
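The idea of hoisting the bounds check from each value to the whole batch can be sketched as follows. This is a Python illustration of where the check moves, not the C++ code in parquet-common.h: the function names are hypothetical, only int32 is handled, and Python cannot show the speedup because `struct` performs its own bounds checks internally. It assumes fixed-width little-endian 4-byte values, as used by Parquet PLAIN encoding for INT32.

```python
import struct
from typing import List, Optional

INT32_SIZE = 4  # bytes per plain-encoded INT32 value


def decode_per_value(buf: bytes, num_values: int) -> Optional[List[int]]:
    """Decode with a bounds check before every value (the pattern with the "if")."""
    out = []
    offset = 0
    for _ in range(num_values):
        # Per-element check: is there room for one more value?
        if len(buf) - offset < INT32_SIZE:
            return None
        out.append(struct.unpack_from("<i", buf, offset)[0])
        offset += INT32_SIZE
    return out


def decode_batch(buf: bytes, num_values: int) -> Optional[List[int]]:
    """Decode with a single bounds check for the whole batch."""
    # For fixed-width types the total size is known up front, so one
    # comparison covers every value in the batch.
    if len(buf) < num_values * INT32_SIZE:
        return None
    return list(struct.unpack_from(f"<{num_values}i", buf, 0))


data = struct.pack("<4i", 1, -2, 3, 4)
print(decode_per_value(data, 4))
print(decode_batch(data, 4))
print(decode_batch(data[:10], 4))  # truncated buffer is rejected up front
```

The same shortcut does not apply to BYTE_ARRAY, because each value carries its own length and the total size cannot be known before reading the values one by one, which matches the issue's note that the per-element check is still needed there.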
[jira] [Comment Edited] (IMPALA-8381) Remove branch from ParquetPlainEncoder::Decode()
[ https://issues.apache.org/jira/browse/IMPALA-8381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16809886#comment-16809886 ]

Daniel Becker edited comment on IMPALA-8381 at 4/4/19 2:01 PM:

Some measurements:

Running the following query in the database tpch_parquet:
{code:java}
set num_nodes=1;
select max(l_orderkey) from lineitem;{code}
we found the following results averaging the MaterializeTupleTime values over 100 runs with and without the "if":

Without "if": 14.3464ms
With "if": 16.42624ms

This is a 14% improvement in MaterializeTupleTime in this query. The total query time was 0.11s, the ~2ms gain is a little less than 2%.

was (Author: daniel.becker):
Some measurements:

Running the following query in the database tpch_parquet:
{code:java}
set num_nodes=1;
select max(l_orderkey) from lineitem;{code}
we found the following results averaging the MaterializeTupleTime(*) values over 100 runs with and without the "if":

Without "if": 14.3464ms
With "if": 16.42624ms

This is a 14% improvement in MaterializeTupleTime in this query. The total query time was 0.11s, the ~2ms gain is a little less than 2%.

> Remove branch from ParquetPlainEncoder::Decode()
>
> Key: IMPALA-8381
> URL: https://issues.apache.org/jira/browse/IMPALA-8381
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Reporter: Csaba Ringhofer
> Priority: Minor
> Labels: newbie, parquet, performance, ramp-up
>
> Removing the "if" at
> https://github.com/apache/impala/blob/5670f96b828d57f9e36510bb9af02bcc31de775c/be/src/exec/parquet/parquet-common.h#L203
> can lead to a 1.5x speedup in plain decoding (type=int32, stride=16). For
> primitive types, the same check can be done for a whole batch, so the speedup
> can be gained for large batches without losing safety. The only Parquet type
> where this check is needed per element is BYTE_ARRAY (typically used for
> STRING columns), which already has a template specialization for
> ParquetPlainEncoder::Decode().