[jira] [Work started] (IMPALA-4356) Automatically codegen expressions with any root Expr node

2019-04-04 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-4356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-4356 started by Tim Armstrong.
-
> Automatically codegen expressions with any root Expr node
> -
>
> Key: IMPALA-4356
> URL: https://issues.apache.org/jira/browse/IMPALA-4356
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.8.0
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Major
>  Labels: codegen
>
> Currently Impala only automatically codegens expression subtrees with 
> ScalarFnCall at the root. This is the expression type used to implement many 
> but not all expressions (including most builtin operators). One example of an 
> expression that is not automatically codegened is the "case" expression.
> The crux of this is to move ScalarFnCall::scalar_fn_wrapper_ into ScalarExpr 
> (and probably rename it). There are some consequential changes required to 
> make this work. Instead of each ScalarExpr subclass overriding Get*Val(), I 
> think Get*Val() should be a non-virtual method of ScalarExpr that either 
> calls the codegen'd function or calls into Get*ValInterpreted(), which would 
> be the new virtual function.
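
The dispatch proposed above can be sketched roughly as follows. This is an
illustration in Python rather than Impala's C++, and the names get_val,
get_val_interpreted, set_codegend_fn and CaseExpr are hypothetical stand-ins
for Get*Val(), Get*ValInterpreted(), the moved scalar_fn_wrapper_ member and
a "case" expression; none of it is the actual Impala code.

```python
from typing import Callable, Optional


class ScalarExpr:
    """Base class: get_val() is the single entry point and is not overridden."""

    def __init__(self) -> None:
        # Hypothetical stand-in for the moved scalar_fn_wrapper_ member:
        # holds the compiled function once codegen has succeeded.
        self._codegend_fn: Optional[Callable[[dict], object]] = None

    def set_codegend_fn(self, fn: Callable[[dict], object]) -> None:
        self._codegend_fn = fn

    def get_val(self, row: dict) -> object:
        # Only chooses the code path; subclasses never override this.
        if self._codegend_fn is not None:
            return self._codegend_fn(row)
        return self.get_val_interpreted(row)

    def get_val_interpreted(self, row: dict) -> object:
        # The "new virtual function": every subclass must supply it.
        raise NotImplementedError


class CaseExpr(ScalarExpr):
    """Toy expression: CASE WHEN x > 0 THEN 'pos' ELSE 'non-pos' END."""

    def get_val_interpreted(self, row: dict) -> object:
        return "pos" if row["x"] > 0 else "non-pos"
```

With this shape, any root expression type gets the codegen'd path for free
once a compiled function is installed, and callers only ever see get_val().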



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-8390) test_cancel_insert and test_cancel_sort broken

2019-04-04 Thread Thomas Tauber-Marshall (JIRA)
Thomas Tauber-Marshall created IMPALA-8390:
--

 Summary: test_cancel_insert and test_cancel_sort broken
 Key: IMPALA-8390
 URL: https://issues.apache.org/jira/browse/IMPALA-8390
 Project: IMPALA
  Issue Type: Bug
Reporter: Thomas Tauber-Marshall
Assignee: Thomas Tauber-Marshall


The tests test_cancel_insert and test_cancel_sort in test_cancellation.py are 
both broken because they specify a test dimension, 'action', that was renamed as 
part of IMPALA-7205.

More generally, test_cancellation.py has a large number of test dimensions that 
blow up into a huge test matrix, and we should probably think through which 
combinations of tests actually give us the coverage we want.
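
To make the matrix growth concrete, here is a minimal sketch. The dimension
names and values are hypothetical and are not the ones used in
test_cancellation.py; the point is only that the number of combinations is
the product of the dimension sizes.

```python
from itertools import product
from math import prod

# Hypothetical dimensions; names and values are illustrative only.
dimensions = {
    "query": ["q1", "q2", "q3", "q4"],
    "cancel_delay": [0, 0.01, 0.1, 1, 4],
    "wait_action": [None, "sleep"],
    "fail_rpc_action": [None, "fail"],
    "buffer_pool_limit": [0, 64 * 1024 * 1024],
}


def matrix_size(dims):
    """Number of test cases produced by a full cross product."""
    return prod(len(values) for values in dims.values())


def full_matrix(dims):
    """Materialize every combination as a dict of dimension -> value."""
    names = list(dims)
    return [dict(zip(names, combo)) for combo in product(*dims.values())]
```

Five small dimensions already give 4 * 5 * 2 * 2 * 2 = 160 cases per test,
so adding one more two-valued dimension doubles the run time.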






[jira] [Resolved] (IMPALA-8377) Recent toolchain bump breaks Ubuntu 14.04 builds

2019-04-04 Thread Thomas Tauber-Marshall (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Tauber-Marshall resolved IMPALA-8377.

   Resolution: Fixed
Fix Version/s: Impala 3.3.0

> Recent toolchain bump breaks Ubuntu 14.04 builds
> 
>
> Key: IMPALA-8377
> URL: https://issues.apache.org/jira/browse/IMPALA-8377
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.3.0
>Reporter: Lars Volker
>Assignee: Thomas Tauber-Marshall
>Priority: Critical
>  Labels: broken-build
> Fix For: Impala 3.3.0
>
>
> Commit 25559dd4 in this change broke the build on Ubuntu 14.04: 
> https://gerrit.cloudera.org/#/c/12824/
> All daemons and any backend tests immediately segfault during startup with 
> this stack:
> {noformat}
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  0x in ?? ()
> (gdb) where
> #0  0x in ?? ()
> #1  0x7ff0abed9a80 in pthread_once () from /lib/x86_64-linux-gnu/libpthread.so.0
> #2  0x04a93375 in llvm::ManagedStaticBase::RegisterManagedStatic(void* (*)(), void (*)(void*)) const ()
> #3  0x04a7ac76 in llvm::ManagedStatic<(anonymous namespace)::CommandLineParser, llvm::object_creator<(anonymous namespace)::CommandLineParser>, llvm::object_deleter<(anonymous namespace)::CommandLineParser> >::operator*() [clone .constprop.407] ()
> #4  0x04a843a6 in llvm::cl::Option::addArgument() ()
> #5  0x01b26f27 in _GLOBAL__sub_I_SyntaxHighlighting.cpp ()
> #6  0x04dac9bd in __libc_csu_init ()
> #7  0x7ff0abb24ed5 in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
> #8  0x01b59c97 in _start ()
> {noformat}
> Setting {{IMPALA_KUDU_VERSION}} back to {{5211897}} in impala-config.sh makes 
> the daemons start again, as does setting {{KUDU_IS_SUPPORTED=false}}. 
> However, only the former fixes the be-tests.
> One outcome of this might be "Won't Fix" and we deprecate support for Ubuntu 
> 14.04. If that seems favorable, we should briefly discuss it on dev@.






[jira] [Commented] (IMPALA-8377) Recent toolchain bump breaks Ubuntu 14.04 builds

2019-04-04 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810363#comment-16810363
 ] 

ASF subversion and git services commented on IMPALA-8377:
-

Commit d31ac7b9cc06d2cffe13eb4b53b9070db00f02cb in impala's branch 
refs/heads/master from Thomas Tauber-Marshall
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=d31ac7b ]

IMPALA-8377: bump toolchain version to 107-acaeac961d

This fixes an issue with the previous toolchain version where the Kudu
client was broken and caused all binaries to crash on startup due to
an issue with linked libstdc++

It also fixes an issue where fastbinary.so wasn't being properly
included with Thrift.

Testing:
- Built successfully on redhat6/7, ubuntu16/18, sles12, debian8
- Built and ran a full core test run with both USE_CDH_KUDU=true/false

Change-Id: I4ac25aa230b9d2559cd4eb6166ab985b18ef7e2a
Reviewed-on: http://gerrit.cloudera.org:8080/12928
Reviewed-by: Thomas Marshall 
Tested-by: Impala Public Jenkins 


> Recent toolchain bump breaks Ubuntu 14.04 builds
> 
>
> Key: IMPALA-8377
> URL: https://issues.apache.org/jira/browse/IMPALA-8377
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.3.0
>Reporter: Lars Volker
>Assignee: Thomas Tauber-Marshall
>Priority: Critical
>  Labels: broken-build
>
> Commit 25559dd4 in this change broke the build on Ubuntu 14.04: 
> https://gerrit.cloudera.org/#/c/12824/
> All daemons and any backend tests immediately segfault during startup with 
> this stack:
> {noformat}
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  0x in ?? ()
> (gdb) where
> #0  0x in ?? ()
> #1  0x7ff0abed9a80 in pthread_once () from /lib/x86_64-linux-gnu/libpthread.so.0
> #2  0x04a93375 in llvm::ManagedStaticBase::RegisterManagedStatic(void* (*)(), void (*)(void*)) const ()
> #3  0x04a7ac76 in llvm::ManagedStatic<(anonymous namespace)::CommandLineParser, llvm::object_creator<(anonymous namespace)::CommandLineParser>, llvm::object_deleter<(anonymous namespace)::CommandLineParser> >::operator*() [clone .constprop.407] ()
> #4  0x04a843a6 in llvm::cl::Option::addArgument() ()
> #5  0x01b26f27 in _GLOBAL__sub_I_SyntaxHighlighting.cpp ()
> #6  0x04dac9bd in __libc_csu_init ()
> #7  0x7ff0abb24ed5 in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
> #8  0x01b59c97 in _start ()
> {noformat}
> Setting {{IMPALA_KUDU_VERSION}} back to {{5211897}} in impala-config.sh makes 
> the daemons start again, as does setting {{KUDU_IS_SUPPORTED=false}}. 
> However, only the former fixes the be-tests.
> One outcome of this might be "Won't Fix" and we deprecate support for Ubuntu 
> 14.04. If that seems favorable, we should briefly discuss it on dev@.






[jira] [Work stopped] (IMPALA-5973) Provide query plan in JSON format

2019-04-04 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-5973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-5973 stopped by Tim Armstrong.
-
> Provide query plan in JSON format
> -
>
> Key: IMPALA-5973
> URL: https://issues.apache.org/jira/browse/IMPALA-5973
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Frontend
>Affects Versions: Impala 2.10.0
>Reporter: Alexander Behm
>Priority: Major
>  Labels: planner
>
> Today there is only a text representation of the query plan, but it would be 
> useful to have a JSON version for portability and machine consumption.
> To control whether EXPLAIN should produce a text or JSON output we could 
> augment the EXPLAIN syntax or we could introduce a query option. It's worth 
> discussing which one makes more sense.
> To avoid maintaining two code paths for explain (TEXT and JSON), I recommend 
> that internally our code should always generate the JSON plan, and then have 
> a function that can convert the JSON plan to the conventional textual 
> representation.
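
The single-code-path idea recommended above could look roughly like this. It
is only a sketch: the plan schema (the "id", "label", "details" and
"children" keys), the explain() helper and the indentation style are
assumptions made for illustration, not Impala's actual plan format.

```python
import json

# Hypothetical plan schema: the "id", "label", "details" and "children"
# keys are assumptions for illustration, not Impala's actual format.
plan = {
    "id": 2,
    "label": "HASH JOIN [INNER JOIN, BROADCAST]",
    "details": ["hash predicates: a.id = b.id"],
    "children": [
        {"id": 0, "label": "SCAN HDFS [default.a]",
         "details": ["partitions=1/1"], "children": []},
        {"id": 1, "label": "SCAN HDFS [default.b]",
         "details": ["partitions=1/1"], "children": []},
    ],
}


def render_text(node, depth=0):
    """Convert one JSON plan node (and its subtree) to indented text lines."""
    pad = "  " * depth
    lines = [f"{pad}{node['id']:02d}:{node['label']}"]
    lines += [f"{pad}   {detail}" for detail in node.get("details", [])]
    for child in node.get("children", []):
        lines += render_text(child, depth + 1)
    return lines


def explain(plan_node, fmt="text"):
    """Always build the JSON plan first; the text form is derived from it."""
    plan_json = json.dumps(plan_node)
    if fmt == "json":
        return plan_json
    return "\n".join(render_text(json.loads(plan_json)))
```

Because the text output is produced only from the JSON, a change to the plan
contents needs to be made in one place, whichever way the output format ends
up being selected.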






[jira] [Assigned] (IMPALA-5973) Provide query plan in JSON format

2019-04-04 Thread Tim Armstrong (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-5973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reassigned IMPALA-5973:
-

Assignee: (was: Pranay Singh)

> Provide query plan in JSON format
> -
>
> Key: IMPALA-5973
> URL: https://issues.apache.org/jira/browse/IMPALA-5973
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Frontend
>Affects Versions: Impala 2.10.0
>Reporter: Alexander Behm
>Priority: Major
>  Labels: planner
>
> Today there is only a text representation of the query plan, but it would be 
> useful to have a JSON version for portability and machine consumption.
> To control whether EXPLAIN should produce a text or JSON output we could 
> augment the EXPLAIN syntax or we could introduce a query option. It's worth 
> discussing which one makes more sense.
> To avoid maintaining two code paths for explain (TEXT and JSON), I recommend 
> that internally our code should always generate the JSON plan, and then have 
> a function that can convert the JSON plan to the conventional textual 
> representation.






[jira] [Created] (IMPALA-8389) e2e custom cluster testsuite does not respect cluster_size when impala_log_dir present

2019-04-04 Thread radford nguyen (JIRA)
radford nguyen created IMPALA-8389:
--

 Summary: e2e custom cluster testsuite does not respect 
cluster_size when impala_log_dir present
 Key: IMPALA-8389
 URL: https://issues.apache.org/jira/browse/IMPALA-8389
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Affects Versions: Impala 3.2.0
Reporter: radford nguyen


h3. Brief

CustomClusterTestSuite always waits for 3 daemons on startup instead of 
{{cluster_size}} daemons when {{impala_log_dir}} is specified.
h3. Description

The {{@CustomClusterTestSuite.withArgs}} decorator allows a user to specify a 
custom cluster size for the test case being decorated.  However, when this 
option is specified in conjunction with {{impala_log_dir}}, it will fail to 
wait for the correct number of daemons if any value other than 
{{DEFAULT_CLUSTER_SIZE}} is used.

The root cause is the difference in how the cluster is started with and without 
{{impala_log_dir}}: 
[https://github.com/apache/impala/blob/3.2.0/tests/common/custom_cluster_test_suite.py#L147]
h3. To Reproduce:
 * add {{cluster_size=5}} to decorator of test_grant_revoke in 
tests/authorization/test_ranger.py
 * $ impala-py.test tests/authorization/test_ranger.py
 * observe pass
 * add {{impala_log_dir=whatev}} to decorator of test_grant_revoke
 * $ impala-py.test tests/authorization/test_ranger.py
 * observe fail during cluster startup:
 ** 2019-04-04 14:25:54,140 INFO     MainThread: Waiting for 
num_known_live_backends=3. Current value: 5
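
A toy model of the behaviour reported above, assuming
DEFAULT_CLUSTER_SIZE is 3 as the log line suggests. The two functions are
hypothetical and only model the symptom; they are not the code in
custom_cluster_test_suite.py.

```python
DEFAULT_CLUSTER_SIZE = 3  # assumed from "Waiting for num_known_live_backends=3"


def expected_backends_buggy(cluster_size=DEFAULT_CLUSTER_SIZE, impala_log_dir=None):
    """Models the report: the log-dir start path ignores cluster_size."""
    if impala_log_dir is not None:
        return DEFAULT_CLUSTER_SIZE
    return cluster_size


def expected_backends_fixed(cluster_size=DEFAULT_CLUSTER_SIZE, impala_log_dir=None):
    """Both start paths wait for the requested number of daemons."""
    return cluster_size
```

With cluster_size=5 and impala_log_dir set, the first function waits for 3
backends while 5 are running, which matches the failure in the log above.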

 






[jira] [Work started] (IMPALA-8389) e2e custom cluster testsuite does not respect cluster_size when impala_log_dir present

2019-04-04 Thread radford nguyen (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-8389 started by radford nguyen.
--
> e2e custom cluster testsuite does not respect cluster_size when 
> impala_log_dir present
> --
>
> Key: IMPALA-8389
> URL: https://issues.apache.org/jira/browse/IMPALA-8389
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.2.0
>Reporter: radford nguyen
>Assignee: radford nguyen
>Priority: Minor
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> h3. Brief
> CustomClusterTestSuite always waits for 3 daemons on startup instead of 
> {{cluster_size}} daemons when {{impala_log_dir}} is specified.
> h3. Description
> The {{@CustomClusterTestSuite.withArgs}} decorator allows a user to specify a 
> custom cluster size for the test case being decorated.  However, when this 
> option is specified in conjunction with {{impala_log_dir}}, it will fail to 
> wait for the correct number of daemons if any value other than 
> {{DEFAULT_CLUSTER_SIZE}} is used.
> The root cause is the difference in how the cluster is started with and 
> without {{impala_log_dir}}: 
> [https://github.com/apache/impala/blob/3.2.0/tests/common/custom_cluster_test_suite.py#L147]
> h3. To Reproduce:
>  * add {{cluster_size=5}} to decorator of test_grant_revoke in 
> tests/authorization/test_ranger.py
>  * $ impala-py.test tests/authorization/test_ranger.py
>  * observe pass
>  * add {{impala_log_dir=whatev}} to decorator of test_grant_revoke
>  * $ impala-py.test tests/authorization/test_ranger.py
>  * observe fail during cluster startup:
>  ** 2019-04-04 14:25:54,140 INFO     MainThread: Waiting for 
> num_known_live_backends=3. Current value: 5
>  






[jira] [Assigned] (IMPALA-8389) e2e custom cluster testsuite does not respect cluster_size when impala_log_dir present

2019-04-04 Thread radford nguyen (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

radford nguyen reassigned IMPALA-8389:
--

Assignee: radford nguyen

> e2e custom cluster testsuite does not respect cluster_size when 
> impala_log_dir present
> --
>
> Key: IMPALA-8389
> URL: https://issues.apache.org/jira/browse/IMPALA-8389
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 3.2.0
>Reporter: radford nguyen
>Assignee: radford nguyen
>Priority: Minor
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> h3. Brief
> CustomClusterTestSuite always waits for 3 daemons on startup instead of 
> {{cluster_size}} daemons when {{impala_log_dir}} is specified.
> h3. Description
> The {{@CustomClusterTestSuite.withArgs}} decorator allows a user to specify a 
> custom cluster size for the test case being decorated.  However, when this 
> option is specified in conjunction with {{impala_log_dir}}, it will fail to 
> wait for the correct number of daemons if any value other than 
> {{DEFAULT_CLUSTER_SIZE}} is used.
> The root cause is the difference in how the cluster is started with and 
> without {{impala_log_dir}}: 
> [https://github.com/apache/impala/blob/3.2.0/tests/common/custom_cluster_test_suite.py#L147]
> h3. To Reproduce:
>  * add {{cluster_size=5}} to decorator of test_grant_revoke in 
> tests/authorization/test_ranger.py
>  * $ impala-py.test tests/authorization/test_ranger.py
>  * observe pass
>  * add {{impala_log_dir=whatev}} to decorator of test_grant_revoke
>  * $ impala-py.test tests/authorization/test_ranger.py
>  * observe fail during cluster startup:
>  ** 2019-04-04 14:25:54,140 INFO     MainThread: Waiting for 
> num_known_live_backends=3. Current value: 5
>  






[jira] [Commented] (IMPALA-8322) S3 tests encounter "timed out waiting for receiver fragment instance"

2019-04-04 Thread Joe McDonnell (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810312#comment-16810312
 ] 

Joe McDonnell commented on IMPALA-8322:
---

The cancellation test is running, with multiple cancels in progress. It is 
likely that ThreadResourceMgr::DestroyPool() is being called frequently.

FYI: [~kwho] [~lv] [~twm378]

 

> S3 tests encounter "timed out waiting for receiver fragment instance"
> -
>
> Key: IMPALA-8322
> URL: https://issues.apache.org/jira/browse/IMPALA-8322
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.3.0
>Reporter: Joe McDonnell
>Priority: Blocker
>  Labels: broken-build
> Attachments: fb5b9729-2d7a-4590-ea365b87-d2ead75e.dmp_dumped, 
> run_tests_swimlane.json.gz
>
>
> This has been seen multiple times when running s3 tests:
> {noformat}
> query_test/test_join_queries.py:57: in test_basic_joins
> self.run_test_case('QueryTest/joins', new_vector)
> common/impala_test_suite.py:472: in run_test_case
> result = self.__execute_query(target_impalad_client, query, user=user)
> common/impala_test_suite.py:699: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:174: in execute
> return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:183: in execute
> handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:360: in __execute_query
> self.wait_for_finished(handle)
> beeswax/impala_beeswax.py:381: in wait_for_finished
> raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
> E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> E    Query aborted:Sender 127.0.0.1 timed out waiting for receiver fragment instance: 6c40d992bb87af2f:0ce96e5d0007, dest node: 4
> {noformat}
> This is related to IMPALA-6818. On a bad run, there are various time outs in 
> the impalad logs:
> {noformat}
> I0316 10:47:16.359313 20175 krpc-data-stream-mgr.cc:354] Sender 127.0.0.1 timed out waiting for receiver fragment instance: ef4a5dc32a6565bd:a8720b850007, dest node: 5
> I0316 10:47:16.359345 20175 rpcz_store.cc:265] Call impala.DataStreamService.TransmitData from 127.0.0.1:40030 (request call id 14881) took 120182ms. Request Metrics: {}
> I0316 10:47:16.359380 20175 krpc-data-stream-mgr.cc:354] Sender 127.0.0.1 timed out waiting for receiver fragment instance: d148d83e11a4603d:54dc35f70004, dest node: 3
> I0316 10:47:16.359395 20175 rpcz_store.cc:265] Call impala.DataStreamService.TransmitData from 127.0.0.1:40030 (request call id 14880) took 123097ms. Request Metrics: {}
> ... various messages ...
> I0316 10:47:56.364990 20154 kudu-util.h:108] Cancel() RPC failed: Timed out: CancelQueryFInstances RPC to 127.0.0.1:27000 timed out after 10.000s (SENT)
> ... various messages ...
> W0316 10:48:15.056421 20150 rpcz_store.cc:251] Call impala.ControlService.CancelQueryFInstances from 127.0.0.1:40912 (request call id 202) took 48695ms (client timeout 1).
> W0316 10:48:15.056473 20150 rpcz_store.cc:255] Trace:
> 0316 10:47:26.361265 (+ 0us) impala-service-pool.cc:165] Inserting onto call queue
> 0316 10:47:26.361285 (+ 20us) impala-service-pool.cc:245] Handling call
> 0316 10:48:15.056398 (+48695113us) inbound_call.cc:162] Queueing success response
> Metrics: {}
> I0316 10:48:15.057087 20139 connection.cc:584] Got response to call id 202 after client already timed out or cancelled
> {noformat}
> So far, this has only happened on s3. The system load at the time is not 
> higher than normal. If anything it is lower than normal. 






[jira] [Closed] (IMPALA-8372) Impala Doc: Consistent uses of hyphens with global flags

2019-04-04 Thread Alex Rodoni (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Rodoni closed IMPALA-8372.
---
   Resolution: Fixed
Fix Version/s: Impala 3.3.0

> Impala Doc: Consistent uses of hyphens with global flags
> 
>
> Key: IMPALA-8372
> URL: https://issues.apache.org/jira/browse/IMPALA-8372
> Project: IMPALA
>  Issue Type: Bug
>  Components: Docs
>Reporter: Alex Rodoni
>Assignee: Alex Rodoni
>Priority: Major
> Fix For: Impala 3.3.0
>
>
> Standardize to use 2 non-breaking hyphens for global flags. 
> https://gerrit.cloudera.org/#/c/12908/






[jira] [Commented] (IMPALA-6826) Add support for Ubuntu 18.04

2019-04-04 Thread Laszlo Gaal (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-6826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810241#comment-16810241
 ] 

Laszlo Gaal commented on IMPALA-6826:
-

IMPALA-8380 tracks the need to upgrade the Postgres JDBC driver, which is 
required by HMS. This is needed to be able to run and test Impala on Ubuntu 18.

> Add support for Ubuntu 18.04
> 
>
> Key: IMPALA-6826
> URL: https://issues.apache.org/jira/browse/IMPALA-6826
> Project: IMPALA
>  Issue Type: Task
>  Components: Infrastructure
>Affects Versions: Impala 3.0, Impala 2.12.0
> Environment: Ubuntu 18.04
>Reporter: Jim Apple
>Assignee: Laszlo Gaal
>Priority: Major
>
> We support Ubuntu 16.04 (and 14.04, in the 2.x line).
>  
> I'm blocked on Ubuntu 18.04 support in 
> https://github.com/cloudera/native-toolchain, but the toolchain is not 
> technically a prerequisite, though I believe it's the easiest way to get a 
> development environment up and running.






[jira] [Commented] (IMPALA-3924) Impala support for Ubuntu 16.04

2019-04-04 Thread Laszlo Gaal (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810239#comment-16810239
 ] 

Laszlo Gaal commented on IMPALA-3924:
-

IMPALA-8380 tracks the need to upgrade the Postgres JDBC driver for the 
minicluster, which is required to be able to run and test Impala on Ubuntu 18.

> Impala support for Ubuntu 16.04
> ---
>
> Key: IMPALA-3924
> URL: https://issues.apache.org/jira/browse/IMPALA-3924
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Affects Versions: Impala 2.8.0
>Reporter: Thomas Tauber-Marshall
>Assignee: Thomas Tauber-Marshall
>Priority: Critical
>  Labels: build, toolchain
> Fix For: Impala 2.7.0
>
>
> There are various compatibility issues related to compilation and the 
> toolchain that are preventing us from building and running Impala on Ubuntu 16.04.






[jira] [Commented] (IMPALA-7957) UNION ALL query returns incorrect results

2019-04-04 Thread Tim Armstrong (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810221#comment-16810221
 ] 

Tim Armstrong commented on IMPALA-7957:
---

May be connected

> UNION ALL query returns incorrect results
> -
>
> Key: IMPALA-7957
> URL: https://issues.apache.org/jira/browse/IMPALA-7957
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 2.12.0
>Reporter: Luis E Martinez-Poblete
>Assignee: Paul Rogers
>Priority: Blocker
>  Labels: correctness
>
> Synopsis:
> =
> UNION ALL query returns incorrect results
> Problem:
> 
> Customer reported a UNION ALL query returning incorrect results. The UNION 
> ALL query has 2 legs, but Impala is only returning information from one leg.
> Issue can be reproduced in the latest version of Impala. Below is the 
> reproduction case:
> {noformat}
> create table mytest_t (c1 timestamp, c2 timestamp, c3 int, c4 int);
> insert into mytest_t values (now(), ADDDATE (now(),1), 1,1);
> insert into mytest_t values (now(), ADDDATE (now(),1), 2,2);
> insert into mytest_t values (now(), ADDDATE (now(),1), 3,3);
> SELECT t.c1
> FROM
>  (SELECT c1, c2
>  FROM mytest_t) t
> LEFT JOIN
>  (SELECT c1, c2
>  FROM mytest_t
>  WHERE c2 = c1) t2 ON (t.c2 = t2.c2)
> UNION ALL
> VALUES (NULL)
> {noformat}
> The above query produces the following execution plan:
> {noformat}
> +------------------------------------------------------------------------------------+
> | Explain String                                                                     |
> +------------------------------------------------------------------------------------+
> | Max Per-Host Resource Reservation: Memory=34.02MB Threads=5                        |
> | Per-Host Resource Estimates: Memory=2.06GB                                         |
> | WARNING: The following tables are missing relevant table and/or column statistics. |
> | default.mytest_t                                                                   |
> |                                                                                    |
> | PLAN-ROOT SINK                                                                     |
> | |                                                                                  |
> | 06:EXCHANGE [UNPARTITIONED]                                                        |
> | |                                                                                  |
> | 00:UNION                                                                           |
> | |  constant-operands=1                                                             |
> | |                                                                                  |
> | 04:SELECT                                                                          |
> | |  predicates: default.mytest_t.c1 = default.mytest_t.c2                           |
> | |                                                                                  |
> | 03:HASH JOIN [LEFT OUTER JOIN, BROADCAST]                                          |
> | |  hash predicates: c2 = c2                                                        |
> | |                                                                                  |
> | |--05:EXCHANGE [BROADCAST]                                                         |
> | |  |                                                                               |
> | |  02:SCAN HDFS [default.mytest_t]                                                 |
> | |     partitions=1/1 files=3 size=192B                                             |
> | |     predicates: c2 = c1                                                          |
> | |                                                                                  |
> | 01:SCAN HDFS [default.mytest_t]                                                    |
> |    partitions=1/1 files=3 size=192B                                                |
> +------------------------------------------------------------------------------------+
> {noformat}
> The issue is in operator 4:
> {noformat}
> | 04:SELECT |
> | | predicates: default.mytest_t.c1 = default.mytest_t.c2 |
> {noformat}
> It's definitely a bug with predicate placement - that c1 = c2 predicate 
> shouldn't be evaluated outside the right branch of the LEFT JOIN.
> Thanks,
> Luis Martinez.
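
The effect of the misplaced predicate can be reproduced with a small
in-memory simulation of the reproduction case. This is a sketch, not
Impala's planner: the helper names are hypothetical, now() is replaced by a
fixed timestamp, and the post-join filter is assumed to bind to the
NULL-padded right side (binding it to the left side drops the same rows for
this data, since c2 is always one day after c1).

```python
from datetime import datetime, timedelta

now = datetime(2019, 4, 4, 12, 0, 0)  # fixed stand-in for now()
mytest_t = [
    {"c1": now, "c2": now + timedelta(days=1), "c3": i, "c4": i}
    for i in (1, 2, 3)
]


def left_join(left, right, key):
    """LEFT OUTER JOIN on one column; unmatched left rows get a None right side."""
    out = []
    for l in left:
        matches = [r for r in right if r[key] == l[key]]
        out.extend((l, r) for r in matches) if matches else out.append((l, None))
    return out


def correct_plan(table):
    t2 = [r for r in table if r["c2"] == r["c1"]]  # predicate stays in right branch
    joined = left_join(table, t2, "c2")
    return [l["c1"] for l, _ in joined] + [None]   # UNION ALL VALUES (NULL)


def misplaced_plan(table):
    t2 = [r for r in table if r["c2"] == r["c1"]]
    joined = left_join(table, t2, "c2")
    # Operator 04:SELECT re-applies c1 = c2 above the join; a NULL-padded
    # right side never satisfies it, so every joined row is dropped.
    kept = [(l, r) for l, r in joined if r is not None and r["c1"] == r["c2"]]
    return [l["c1"] for l, _ in kept] + [None]
```

The correct plan returns the three c1 values plus the NULL row, while the
misplaced filter leaves only the VALUES (NULL) leg, which matches the
"information from one leg" symptom.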






[jira] [Commented] (IMPALA-8388) Misleading error message when rejecting incoming RPCs b/c of memory pressure

2019-04-04 Thread Michael Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810206#comment-16810206
 ] 

Michael Ho commented on IMPALA-8388:


[~lv], the message you added doesn't match the quoted code you posted. I filed 
IMPALA-8256 to fix that so that the actual queue size and memory consumption are 
now printed. Is your complaint that we could improve further on IMPALA-8256?

> Misleading error message when rejecting incoming RPCs b/c of memory pressure
> 
>
> Key: IMPALA-8388
> URL: https://issues.apache.org/jira/browse/IMPALA-8388
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.12.0, Impala 3.1.0, Impala 3.2.0, Impala 3.3.0
>Reporter: Lars Volker
>Priority: Major
>  Labels: observability, supportability
>
> When running out of memory we reject incoming RPCs (expected). However, our 
> error message assumes that the queue is full and prints it as INT_MAX (the 
> queue size):
> {code}
> void ImpalaServicePool::RejectTooBusy(kudu::rpc::InboundCall* c) {
>   string err_msg =
>       Substitute("$0 request on $1 from $2 dropped due to backpressure. "
>                  "The service queue contains $3 items out of a maximum of $4; "
>                  "memory consumption is $5.",
>                  c->remote_method().method_name(),
>                  service_->service_name(),
>                  c->remote_address().ToString(),
>                  service_queue_.estimated_queue_length(),
>                  service_queue_.max_size(), // <-- HERE
>                  PrettyPrinter::Print(service_mem_tracker_->consumption(), TUnit::BYTES));
> {code}
> The error currently looks like this:
> {noformat}
> I0404 11:35:43.276937 54321 impala-service-pool.cc:126] EndDataStream request 
> on impala.DataStreamService from 1.2.3.4:56789 dropped due to backpressure. 
> The service queue is full; it has 2147483647 items. Contents of service queue:
> {noformat}
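
One way the message could name the actual cause is sketched below. It is
written in Python for brevity rather than C++, and the function name, its
parameters and the mem_limit value are hypothetical; it is not the
ImpalaServicePool code or the IMPALA-8256 change.

```python
INT_MAX = 2147483647  # the queue size printed in the log above


def reject_message(method, service, remote, queue_len, queue_max, mem_used, mem_limit):
    """Build a rejection message that states which limit was actually hit."""
    if queue_len >= queue_max:
        reason = f"the service queue is full ({queue_len} of {queue_max} items)"
    else:
        # mem_limit is an assumed parameter; the quoted code only has consumption.
        reason = (
            f"memory consumption is {mem_used} of {mem_limit} bytes; "
            f"the service queue holds {queue_len} items"
        )
    return f"{method} request on {service} from {remote} dropped due to backpressure: {reason}."
```

For a rejection caused by memory pressure, the message then reports the
memory figures and the real queue length instead of claiming a full queue
of INT_MAX items.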






[jira] [Updated] (IMPALA-8388) Misleading error message when rejecting incoming RPCs b/c of memory pressure

2019-04-04 Thread Lars Volker (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Volker updated IMPALA-8388:

Description: 
When running out of memory we reject incoming RPCs (expected). However, our 
error message assumes that the queue is full and prints it as INT_MAX (the 
queue size):

{code}
void ImpalaServicePool::RejectTooBusy(kudu::rpc::InboundCall* c) {
  string err_msg =
      Substitute("$0 request on $1 from $2 dropped due to backpressure. "
                 "The service queue contains $3 items out of a maximum of $4; "
                 "memory consumption is $5.",
                 c->remote_method().method_name(),
                 service_->service_name(),
                 c->remote_address().ToString(),
                 service_queue_.estimated_queue_length(),
                 service_queue_.max_size(), // <-- HERE
                 PrettyPrinter::Print(service_mem_tracker_->consumption(), TUnit::BYTES));
{code}

The error currently looks like this:
{noformat}
I0404 11:35:43.276937 54321 impala-service-pool.cc:126] EndDataStream request 
on impala.DataStreamService from 1.2.3.4:56789 dropped due to backpressure. The 
service queue is full; it has 2147483647 items. Contents of service queue:
{noformat}

  was:
When running out of memory we reject incoming RPCs (expected). However, our 
error message assumes that the queue is full and prints it as INT_MAX (the 
queue size):

{code}
void ImpalaServicePool::RejectTooBusy(kudu::rpc::InboundCall* c) {
  string err_msg =
      Substitute("$0 request on $1 from $2 dropped due to backpressure. "
                 "The service queue contains $3 items out of a maximum of $4; "
                 "memory consumption is $5.",
                 c->remote_method().method_name(),
                 service_->service_name(),
                 c->remote_address().ToString(),
                 service_queue_.estimated_queue_length(),
                 service_queue_.max_size(), // <-- HERE
                 PrettyPrinter::Print(service_mem_tracker_->consumption(), TUnit::BYTES));
{code}


> Misleading error message when rejecting incoming RPCs b/c of memory pressure
> 
>
> Key: IMPALA-8388
> URL: https://issues.apache.org/jira/browse/IMPALA-8388
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.12.0, Impala 3.1.0, Impala 3.2.0, Impala 3.3.0
>Reporter: Lars Volker
>Priority: Major
>  Labels: observability, supportability
>
> When running out of memory we reject incoming RPCs (expected). However, our 
> error message assumes that the queue is full and prints it as INT_MAX (the 
> queue size):
> {code}
> void ImpalaServicePool::RejectTooBusy(kudu::rpc::InboundCall* c) {
>   string err_msg =
>   Substitute("$0 request on $1 from $2 dropped due to backpressure. "
>  "The service queue contains $3 items out of a maximum of $4; "
>  "memory consumption is $5.",
>  c->remote_method().method_name(),
>  service_->service_name(),
>  c->remote_address().ToString(),
>  service_queue_.estimated_queue_length(),
>  service_queue_.max_size(), // <-- HERE
>  PrettyPrinter::Print(service_mem_tracker_->consumption(), 
> TUnit::BYTES));
> {code}
> The error currently looks like this:
> {noformat}
> I0404 11:35:43.276937 54321 impala-service-pool.cc:126] EndDataStream request 
> on impala.DataStreamService from 1.2.3.4:56789 dropped due to backpressure. 
> The service queue is full; it has 2147483647 items. Contents of service queue:
> {noformat}






[jira] [Created] (IMPALA-8388) Misleading error message when rejecting incoming RPCs b/c of memory pressure

2019-04-04 Thread Lars Volker (JIRA)
Lars Volker created IMPALA-8388:
---

 Summary: Misleading error message when rejecting incoming RPCs b/c 
of memory pressure
 Key: IMPALA-8388
 URL: https://issues.apache.org/jira/browse/IMPALA-8388
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 3.2.0, Impala 3.1.0, Impala 2.12.0, Impala 3.3.0
Reporter: Lars Volker


When running out of memory we reject incoming RPCs (expected). However, our 
error message assumes that the queue is full and prints it as INT_MAX (the 
queue size):

{code}
void ImpalaServicePool::RejectTooBusy(kudu::rpc::InboundCall* c) {
  string err_msg =
  Substitute("$0 request on $1 from $2 dropped due to backpressure. "
 "The service queue contains $3 items out of a maximum of $4; "
 "memory consumption is $5.",
 c->remote_method().method_name(),
 service_->service_name(),
 c->remote_address().ToString(),
 service_queue_.estimated_queue_length(),
 service_queue_.max_size(), // <-- HERE
 PrettyPrinter::Print(service_mem_tracker_->consumption(), 
TUnit::BYTES));
{code}
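
One possible way to make the message reflect the actual rejection reason is
sketched below. This is only an illustration under assumptions: RejectReason
and BuildRejectMessage are hypothetical names that do not exist in Impala, and
the sketch uses only the C++ standard library rather than Substitute() or
PrettyPrinter.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical: why the incoming RPC was rejected.
enum class RejectReason { kQueueFull, kMemoryLimitExceeded };

// Hypothetical helper: builds a rejection message that only talks about the
// queue limit when the queue is actually the reason for the rejection.
std::string BuildRejectMessage(RejectReason reason, const std::string& method,
                               const std::string& service,
                               const std::string& remote, int64_t queue_len,
                               int64_t queue_max, int64_t mem_bytes) {
  assert(queue_len >= 0 && mem_bytes >= 0);
  std::ostringstream msg;
  msg << method << " request on " << service << " from " << remote
      << " dropped due to backpressure. ";
  if (reason == RejectReason::kQueueFull) {
    msg << "The service queue is full; it contains " << queue_len
        << " items out of a maximum of " << queue_max << ".";
  } else {
    // Do not print the queue maximum here: it is not what was exceeded.
    msg << "The service is over its memory limit; memory consumption is "
        << mem_bytes << " bytes (the queue contains " << queue_len
        << " items).";
  }
  return msg.str();
}
```

With a reason passed in explicitly, a rejection caused by memory pressure no
longer reports the queue maximum as if it were the current queue length.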






[jira] [Created] (IMPALA-8387) Impala Doc: Network I/O throughput to Query Profile output

2019-04-04 Thread Alex Rodoni (JIRA)
Alex Rodoni created IMPALA-8387:
---

 Summary: Impala Doc: Network I/O throughput to Query Profile output
 Key: IMPALA-8387
 URL: https://issues.apache.org/jira/browse/IMPALA-8387
 Project: IMPALA
  Issue Type: Sub-task
  Components: Docs
Affects Versions: Impala 3.3.0
Reporter: Alex Rodoni
Assignee: Alex Rodoni









[jira] [Commented] (IMPALA-7204) Add support for GROUP BY ROLLUP

2019-04-04 Thread Ruslan Dautkhanov (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810112#comment-16810112
 ] 

Ruslan Dautkhanov commented on IMPALA-7204:
---

cc [~grahn] - can you please let us know if you guys are planning to add 
support for this feature?

Our Account team said you might be the right person to ask )

Thanks for any ideas!


> Add support for GROUP BY ROLLUP
> ---
>
> Key: IMPALA-7204
> URL: https://issues.apache.org/jira/browse/IMPALA-7204
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Ruslan Dautkhanov
>Priority: Major
>  Labels: GROUP_BY, sql
>
> Now suppose that we'd like to analyze our sales data, to study the amount of 
> sales that is occurring for different products, in different states and 
> regions. Using the ROLLUP feature of SQL 2003, we could issue the query:
> {code:sql}
> select region, state, product, sum(sales) total_sales
> from sales_history 
> group by rollup (region, state, product)
> {code}
> Semantically, the above query is equivalent to
>  
> {code:sql}
> select region, state, product, sum(sales) total_sales
> from sales_history 
> group by region, state, product
> union
> select region, state, null, sum(sales) total_sales
> from sales_history 
> group by region, state
> union
> select region, null, null, sum(sales) total_sales
> from sales_history 
> group by region
> union
> select null, null, null, sum(sales) total_sales
> from sales_history
>  
> {code}
> The query might produce results that looked something like:
> {noformat}
> REGION  STATE  PRODUCT  TOTAL_SALES
> ------  -----  -------  -----------
> null    null   null     6200
> EAST    MA     BOATS    100
> EAST    MA     CARS     1500
> EAST    MA     null     1600
> EAST    NY     BOATS    150
> EAST    NY     CARS     1000
> EAST    NY     null     1150
> EAST    null   null     2750
> WEST    CA     BOATS    750
> WEST    CA     CARS     500
> WEST    CA     null     1250
> WEST    AZ     BOATS    2000
> WEST    AZ     CARS     200
> WEST    AZ     null     2200
> WEST    null   null     3450
> {noformat}
> We have many production queries that work around this missing Impala 
> functionality by combining three UNION ALLs. The physical execution plan 
> shows that Impala actually reads the full fact table three times, so native 
> ROLLUP support could yield a threefold improvement (or more, depending on 
> the number of columns being rolled up).
> I can't find another SQL-on-Hadoop engine that doesn't support this feature. 
>  *Checked Spark, Hive, PIG, Flink and some other engines - they all 
> support this basic SQL feature*.
> It would be great to have a matching feature in Impala too.
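
The equivalence described above (one grouping per prefix of the ROLLUP column
list, down to the empty grand-total grouping) can be sketched as follows.
This is a minimal illustration, not Impala code; RollupGroupingSets is a
hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: expands ROLLUP(c1, ..., cn) into its n + 1 grouping
// sets, from the full column list down to the empty (grand total) set.
std::vector<std::vector<std::string>> RollupGroupingSets(
    const std::vector<std::string>& columns) {
  std::vector<std::vector<std::string>> sets;
  // n takes the values columns.size(), columns.size() - 1, ..., 0.
  for (std::size_t n = columns.size() + 1; n-- > 0;) {
    sets.emplace_back(columns.begin(), columns.begin() + n);
  }
  assert(sets.size() == columns.size() + 1);
  return sets;
}
```

For (region, state, product) this yields four grouping sets, matching the four
SELECT branches of the equivalent UNION query in the description.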






[jira] [Commented] (IMPALA-8359) Coverage measurement is not working for Impala daemons

2019-04-04 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810091#comment-16810091
 ] 

ASF subversion and git services commented on IMPALA-8359:
-

Commit a0a20cdf9adcb899e4bb04e3f9278077dc2b52c0 in impala's branch 
refs/heads/master from Zoltan Borok-Nagy
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=a0a20cd ]

IMPALA-8359: Fix coverage data generation for impalads

impala::InitCommonRuntime() sets a signal handler for SIGTERM.
It calls _exit(0) which causes normal program termination without
cleaning up, i.e. no destructors are called etc.

Gcov writes the coverage data in this cleanup phase, so calling
_exit() prevents flushing coverage data.

Now the '-codecoverage' flag also defines a macro named
CODE_COVERAGE_ENABLED. If this macro is defined we explicitly
call __gcov_flush() before calling _exit().
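
The approach described above can be sketched roughly as follows. This is an
assumption-laden illustration, not the actual patch: HandleSigterm,
InstallSigtermHandler and RunChildAndSignal are hypothetical names, and
__gcov_flush() is only provided by the gcov runtime of older GCC releases
when building with coverage enabled (it is also not async-signal-safe, which
the sketch ignores).

```cpp
#include <cassert>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifdef CODE_COVERAGE_ENABLED
// Provided by the gcov runtime when compiling with coverage instrumentation.
extern "C" void __gcov_flush(void);
#endif

// Hypothetical SIGTERM handler: flush coverage counters, then exit without
// running destructors or atexit handlers.
extern "C" void HandleSigterm(int /*signum*/) {
#ifdef CODE_COVERAGE_ENABLED
  __gcov_flush();
#endif
  _exit(0);
}

void InstallSigtermHandler() {
  struct sigaction action = {};
  action.sa_handler = HandleSigterm;
  sigemptyset(&action.sa_mask);
  int rc = sigaction(SIGTERM, &action, nullptr);
  assert(rc == 0);
  (void)rc;
}

// Forks a child that installs the handler and signals itself; returns the
// child's exit status, or -1 if it could not be determined.
int RunChildAndSignal() {
  pid_t pid = fork();
  if (pid < 0) return -1;
  if (pid == 0) {
    InstallSigtermHandler();
    raise(SIGTERM);
    _exit(1);  // Only reached if the handler did not run.
  }
  int status = 0;
  if (waitpid(pid, &status, 0) < 0) return -1;
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Guarding the call with the macro keeps non-coverage builds free of any
dependency on the gcov runtime.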

I tested manually.

Change-Id: I9be1e1e73b6cfc3557077f763aee4dbfcc7a2d27
Reviewed-on: http://gerrit.cloudera.org:8080/12858
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Coverage measurement is not working for Impala daemons
> --
>
> Key: IMPALA-8359
> URL: https://issues.apache.org/jira/browse/IMPALA-8359
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Zoltán Borók-Nagy
>Assignee: Zoltán Borók-Nagy
>Priority: Major
>
> Currently code coverage measurement only works for backend tests.
> Impala daemons don't write .gcda files when they terminate because they set a 
> signal handler for SIGTERM.






[jira] [Commented] (IMPALA-7251) Fix QueryMaintenance calls in Aggregators

2019-04-04 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810087#comment-16810087
 ] 

ASF subversion and git services commented on IMPALA-7251:
-

Commit fdd6db524c9c97f0baebfde0119fce19d62eaec3 in impala's branch 
refs/heads/2.x from Thomas Tauber-Marshall
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=fdd6db5 ]

IMPALA-7251: Fix QueryMaintenance calls in Aggregators

A recent change, IMPALA-110 (part 2), refactored
PartitionedAggregationNode into several classes, including a new type
'Aggregator'. During this refactor, code that makes local allocations
while evaluating exprs was moved from the ExecNode (now
AggregationNode/StreamingAggregationNode) into the Aggregators, but
code related to cleaning these allocations up (ie QueryMaintenance())
was not, resulting in some queries using an excessive amount of
memory.

This patch removes all calls to QueryMaintenance() from the exec nodes
and moves them into the Aggregators.

Testing:
- Added new test cases with a mem limit that fails if the expr
  allocations aren't released in a timely manner.
- Passed a full exhaustive run.

Change-Id: I4dac2bb0a15cdd7315ee15608bae409c125c82f5
Reviewed-on: http://gerrit.cloudera.org:8080/10871
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Fix QueryMaintenance calls in Aggregators
> -
>
> Key: IMPALA-7251
> URL: https://issues.apache.org/jira/browse/IMPALA-7251
> Project: IMPALA
>  Issue Type: Bug
>Affects Versions: Impala 3.1.0
>Reporter: Thomas Tauber-Marshall
>Assignee: Thomas Tauber-Marshall
>Priority: Blocker
> Fix For: Impala 3.1.0
>
>
> A recent change, IMPALA-110 (part 2), refactored PartitionedAggregationNode 
> into several classes, including GroupingAggregator. During this refactor, 
> code that makes local allocations while evaluating exprs was moved from the 
> ExecNode (now AggregationNode/StreamingAggregationNode) into the new type 
> Aggregator, but code related to cleaning these allocations up (ie 
> QueryMaintenance()) was not, resulting in some queries using an excessive 
> amount of memory.






[jira] [Commented] (IMPALA-7006) Rebase KRPC onto Kudu upstream repository

2019-04-04 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810074#comment-16810074
 ] 

ASF subversion and git services commented on IMPALA-7006:
-

Commit 0c2d3c74d662d32eaf5c56cdeca067285ab1d300 in impala's branch 
refs/heads/2.x from Lars Volker
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=0c2d3c7 ]

IMPALA-7006: Remove KRPC folders

Change-Id: Ic677484c27ed18b105da0a6b0901df4eb9f248e6
Reviewed-on: http://gerrit.cloudera.org:8080/10756
Reviewed-by: Lars Volker 
Tested-by: Lars Volker 


> Rebase KRPC onto Kudu upstream repository
> -
>
> Key: IMPALA-7006
> URL: https://issues.apache.org/jira/browse/IMPALA-7006
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.13.0, Impala 3.1.0
>Reporter: Lars Volker
>Assignee: Lars Volker
>Priority: Major
>  Labels: krpc
> Fix For: Impala 3.1.0
>
>
> We should consider rebasing our KRPC code on top of the latest Kudu upstream 
> version. This will keep the two projects more in sync and will allow us to 
> make use of recent improvements, e.g. around thread stack collection, without 
> having to pick individual changes.






[jira] [Commented] (IMPALA-5129) Use Kudu's Kinit code to avoid expensive fork

2019-04-04 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-5129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810082#comment-16810082
 ] 

ASF subversion and git services commented on IMPALA-5129:
-

Commit b97e0cd555a53057a82dc9c0ad9e0cfe58f3ec66 in impala's branch 
refs/heads/2.x from Sailesh Mukil
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=b97e0cd ]

IMPALA-5129: Use Kudu's Kinit code to avoid expensive fork

NOTE: This commit is part of a set of changes for IMPALA-7006. It
contains pieces of a previous commit that need to be cherry picked
again after rebasing the code in be/src/kudu/{util,security,rpc}.

The original commit message is below:

Impala currently kinits by forking off a child process. This
has proved to be expensive in many cases since the subprocess
tries to reserve as much memory as Impala is currently using
which can be quite a lot.

This patch adds a flag called 'use_kudu_kinit' that defaults to
true. When it's true, Impala uses the Kudu security library's kinit code,
which calls the krb5 library programmatically to kinit.
When it's false, we run our current path which kicks off the
kinit-thread and forks off a kinit process periodically to reacquire
tickets based on FLAGS_kerberos_reinit_interval.

Converted existing tests in thrift-server-test to run with and
without kerberos. We now run this BE test with kerberos by using
Kudu's MiniKdc utility. This introduces a new dependency on some
kerberos binaries that are checked through FindKerberosPrograms.cmake.
Note that this is only a test dependency and not a dependency for
the impalad binaries and friends. Compilation will still succeed if
the kerberos binaries for the MiniKdc are not found, however, the
thrift-server-test will fail. We run with and without the
'use_kudu_kinit' flag.

TODO: Since setting up and tearing down our security code
isn't idempotent, we can currently run only one test per process with
Kerberos (IMPALA-6085).

Updated bin/bootstrap_system.sh to install new sasl-gssapi
modules and the kerberos binaries required for the MiniKdc.
Also fixed a bug that didn't transfer the environment into 'sudo'
in bin/bootstrap_system.sh.

Testing: Verified with thrift-server-test and also manually on a
live kerberized cluster.

Change-Id: Ie3c6e933c454e7adca69ef03e7d5c0c84b656895
Reviewed-on: http://gerrit.cloudera.org:8080/7938
Reviewed-by: Sailesh Mukil 
Tested-by: Impala Public Jenkins
Reviewed-on: http://gerrit.cloudera.org:8080/10763
Reviewed-by: Lars Volker 
Tested-by: Lars Volker 


> Use Kudu's Kinit code to avoid expensive fork
> -
>
> Key: IMPALA-5129
> URL: https://issues.apache.org/jira/browse/IMPALA-5129
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Security
>Reporter: Sailesh Mukil
>Assignee: Sailesh Mukil
>Priority: Major
>  Labels: security
> Fix For: Impala 2.11.0
>
>
> Impala does a kinit by doing a RunShell() command which basically forks the 
> entire process (potentially expensive) and execs the 'kinit' command.
> KuduRPC avoids the fork by calling into libkrb programmatically. Since we 
> eventually will be pulling in KuduRPC to Impala, we can get rid of the fork 
> and call into the appropriate KuduRPC code.






[jira] [Commented] (IMPALA-4669) Add Kudu's RPC, util and security libraries

2019-04-04 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810076#comment-16810076
 ] 

ASF subversion and git services commented on IMPALA-4669:
-

Commit 5dbd48f226f1061567da1c381ee2491dab3ceaf4 in impala's branch 
refs/heads/2.x from Lars Volker
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=5dbd48f ]

IMPALA-4669: [KUTIL] Add kudu_util library to the build.

NOTE: This commit is part of a set of changes for IMPALA-7006. It
contains pieces of a previous commit that need to be cherry picked
again after rebasing the code in be/src/kudu/{util,security,rpc}.

The original commit message is below:

A few miscellaneous changes to allow kudu_util to compile with Impala.

Add kudu_version.cc to substitute for the version.cc file that is
automatically built during the full Kudu build.

Set LZ4_DISABLE_DEPRECATE_WARNINGS to allow Kudu's compressor utility to
use deprecated names for LZ4 methods.

Add NO_NVM_SUPPORT flag to Kudu build (plan to upstream this later) to
disable building with nvm support, removing a library dependency.

Also remove imported FindOpenSSL.cmake in favour of the standard one provided
by cmake itself.

Finally, a few changes to allow compilation on RHEL5:

* Only use sched_getcpu() if supported
* Only include magic.h if available
* Workaround for kernels that don't have SOCK_NONBLOCK
* Workaround for kernels that don't have O_CLOEXEC (ignore the flag)
* Provide non-working implementation of fallocate()
* Disable inclusion of linux/fiemap.h - although this exists on RHEL5,
  it does not compile due to other #includes in env_posix.cc. We disable
  the path this is used for, since Impala does not call that code.
* Use Kudu's implementation of pipe(2), preadv(2) and pwritev(2) where
  it doesn't exist.

In most cases these changes simply force kutil to revert to a different
implementation that was already written for OSX support - this patch
generalises the logic to provide the implementation whenever the
required function doesn't exist.
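
As an illustration of the fallback pattern described above, the SOCK_NONBLOCK
workaround could look roughly like the sketch below. It is not the code from
the patch; CreateNonBlockingSocket and IsNonBlocking are hypothetical names,
and only standard POSIX calls are used.

```cpp
#include <cassert>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: creates a non-blocking stream socket, falling back to
// fcntl() when the platform does not define SOCK_NONBLOCK.
int CreateNonBlockingSocket(int family) {
#ifdef SOCK_NONBLOCK
  return socket(family, SOCK_STREAM | SOCK_NONBLOCK, 0);
#else
  int fd = socket(family, SOCK_STREAM, 0);
  if (fd < 0) return -1;
  int flags = fcntl(fd, F_GETFL, 0);
  if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
    close(fd);
    return -1;
  }
  return fd;
#endif
}

// Returns true if O_NONBLOCK is set on the given descriptor.
bool IsNonBlocking(int fd) {
  assert(fd >= 0);
  int flags = fcntl(fd, F_GETFL, 0);
  return flags >= 0 && (flags & O_NONBLOCK) != 0;
}
```

Both branches give the caller the same result, so the rest of the code does
not need to know which implementation was selected at compile time.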

This patch compiles on RHEL5.5 and 6.0, SLES11 and 12, Ubuntu 12.04 and
14.04 and Debian 7.0 and 8.0.

Change-Id: I451f02d3e4669e8a548b92fb1445cb2b322659a2
Reviewed-on: http://gerrit.cloudera.org:8080/5715
Tested-by: Impala Public Jenkins
Reviewed-by: Henry Robinson 
Reviewed-on: http://gerrit.cloudera.org:8080/10758
Reviewed-by: Michael Ho 
Tested-by: Lars Volker 


> Add Kudu's RPC, util and security libraries
> ---
>
> Key: IMPALA-4669
> URL: https://issues.apache.org/jira/browse/IMPALA-4669
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Distributed Exec
>Affects Versions: Impala 2.8.0
>Reporter: Henry Robinson
>Assignee: Henry Robinson
>Priority: Major
> Fix For: Impala 2.10.0
>
>
> To enable KRPC in Impala, we need to link against Kudu's {{rpc}}, 
> {{security}} and {{util}} libraries. The easiest way for now is to pull them 
> into trunk. 
> Doing this also requires upgrading our {{gutil}} version.






[jira] [Commented] (IMPALA-7006) Rebase KRPC onto Kudu upstream repository

2019-04-04 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810078#comment-16810078
 ] 

ASF subversion and git services commented on IMPALA-7006:
-

Commit 23a3ef7452ade42a426502e0fd3719f3836d6730 in impala's branch 
refs/heads/2.x from Lars Volker
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=23a3ef7 ]

IMPALA-7006: [KSECURITY] Update security library integration

This commit is part of a set of changes for IMPALA-7006. It started
based on an original change (f51c4435), which integrated Kudu's security
folder into our build.

This change removes several compile time checks that are now either done
in Kudu's own cmake files or that can be removed due to Impala
deprecating support for older OS versions in the 3.x line.

The removed checks are:

HAVE_KRB5_GET_INIT_CREDS_OPT_SET_OUT_CCACHE:
We now check for this in Kudu's code.

HAVE_KRB5_IS_CONFIG_PRINCIPAL,
HAVE_KRB5_GET_INIT_CREDS_OPT_SET_FAST_CCACHE_NAME:
These checks are not needed anymore. All OS versions supported by
Impala now have sufficiently recent versions of Kerberos.

Change-Id: Ifab51d887f5e771ad62eeddc14b9c47f42c3130d
Reviewed-on: http://gerrit.cloudera.org:8080/10759
Reviewed-by: Lars Volker 
Tested-by: Lars Volker 


> Rebase KRPC onto Kudu upstream repository
> -
>
> Key: IMPALA-7006
> URL: https://issues.apache.org/jira/browse/IMPALA-7006
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.13.0, Impala 3.1.0
>Reporter: Lars Volker
>Assignee: Lars Volker
>Priority: Major
>  Labels: krpc
> Fix For: Impala 3.1.0
>
>
> We should consider rebasing our KRPC code on top of the latest Kudu upstream 
> version. This will keep the two projects more in sync and will allow us to 
> make use of recent improvements, e.g. around thread stack collection, without 
> having to pick individual changes.






[jira] [Commented] (IMPALA-7006) Rebase KRPC onto Kudu upstream repository

2019-04-04 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810085#comment-16810085
 ] 

ASF subversion and git services commented on IMPALA-7006:
-

Commit 315bc66bbac8715302d455d2d746981cebf74aec in impala's branch 
refs/heads/2.x from Joe McDonnell
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=315bc66 ]

KUDU-2305: Limit sidecars to INT_MAX and fortify socket code

NOTE: This commit is part of a set of changes for IMPALA-7006. It
contains pieces of a previous commit that need to be cherry picked
again after rebasing the code in be/src/kudu/{util,security,rpc}.

The original commit message is below:

Inspection of the code revealed some other local variables
that could overflow with large messages. This patch takes
two approaches to eliminate the issues.

First, it limits the total size of the messages by limiting
the total size of the sidecars to INT_MAX. The total size
of the protobuf and header components of the message
should be considerably smaller, so limiting the sidecars
to INT_MAX eliminates messages that are larger than UINT_MAX.
This also means that the sidecar offsets, which are unsigned
32-bit integers, are also safe. Given that
FLAGS_rpc_max_message_size is limited to INT_MAX at startup,
the receiver would reject any message this large anyway.
This also helps with the networking codepath, as any given
sidecar will have a size less than INT_MAX, so every Slice
that interacts with Writev() is shorter than INT_MAX.

Second, even with sidecars limited to INT_MAX, the headers
and protobuf parts of the messages mean that certain messages
could still exceed INT_MAX. This patch changes some of the sockets
codepath to tolerate iovec's that reference more than INT_MAX
bytes total. Specifically, it changes Writev()'s nwritten bytes
to an int64_t for both TlsSocket and Socket. TlsSocket works
because it is sending each Slice individually. The first change
limited any given Slice to INT_MAX, so each individual Write()
should not be impacted. For Socket, Writev() uses sendmsg(). It
should do partial network sends to handle this case. Any Write()
call specifies its size with a 32-bit integer, and that will
not be impacted by this patch.
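
The first approach (capping the total sidecar size at INT_MAX while
accumulating in a 64-bit counter) can be sketched as below. This is a
simplified illustration rather than the Kudu implementation;
SidecarsWithinLimit is a hypothetical name.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>
#include <vector>

// Hypothetical check: returns true if every sidecar and the running total of
// all sidecars fit within INT_MAX bytes. The total is accumulated in an
// int64_t so that the check itself cannot overflow.
bool SidecarsWithinLimit(const std::vector<int64_t>& sidecar_sizes) {
  int64_t total = 0;
  for (int64_t size : sidecar_sizes) {
    if (size < 0 || size > INT_MAX) return false;
    total += size;  // At most 2 * INT_MAX here, well within int64_t range.
    if (total > INT_MAX) return false;
  }
  assert(total <= INT_MAX);
  return true;
}
```

Because each sidecar is bounded by the same limit as the total, any single
slice handed to the socket layer is also at most INT_MAX bytes long.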

Testing:
 - Modified TestRpcSidecarLimits() to verify that sidecars are
   limited to INT_MAX bytes.
 - Added a test mode to TestRpcSidecarLimits() where it
   overrides rpc_max_message_size and sends the maximal
   message. This verifies that the client send codepath
   can handle the maximal message.

Reviewed-on: http://gerrit.cloudera.org:8080/9601
Reviewed-by: Todd Lipcon 
Tested-by: Todd Lipcon 

Changes from Kudu version:
 - Updated declaration of FLAGS_rpc_max_message_size
   in rpc-mgr.cc and added a warning not to set it
   larger than INT_MAX.

Change-Id: Id23e518995f2bf2f6bf6b49d5f413f3eaa4e79d1
Reviewed-on: http://gerrit.cloudera.org:8080/9748
Reviewed-by: Michael Ho 
Tested-by: Impala Public Jenkins
Reviewed-on: http://gerrit.cloudera.org:8080/10765
Reviewed-by: Lars Volker 
Tested-by: Lars Volker 


> Rebase KRPC onto Kudu upstream repository
> -
>
> Key: IMPALA-7006
> URL: https://issues.apache.org/jira/browse/IMPALA-7006
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.13.0, Impala 3.1.0
>Reporter: Lars Volker
>Assignee: Lars Volker
>Priority: Major
>  Labels: krpc
> Fix For: Impala 3.1.0
>
>
> We should consider rebasing our KRPC code on top of the latest Kudu upstream 
> version. This will keep the two projects more in sync and will allow us to 
> make use of recent improvements, e.g. around thread stack collection, without 
> having to pick individual changes.






[jira] [Commented] (IMPALA-7006) Rebase KRPC onto Kudu upstream repository

2019-04-04 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810086#comment-16810086
 ] 

ASF subversion and git services commented on IMPALA-7006:
-

Commit b65dbf8e40d7c8f77db05846d84497824d6bbd26 in impala's branch 
refs/heads/2.x from Lars Volker
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=b65dbf8 ]

IMPALA-7006: Pick parts of recent Kudu gutil changes

- Include some ASAN macros from gutil (Kudu commit c8724c61)
- Pick parts of KUDU-2427 (Kudu commit b7cf3b2e)
- Rename constants (Kudu commit e719b5ef)

These changes will be subsumed by a proper rebase of GUTIL.

Change-Id: Id2dc8c70425e3ac030427ebeb1ec18a44d14d5cb
Reviewed-on: http://gerrit.cloudera.org:8080/10769
Tested-by: Impala Public Jenkins 
Reviewed-by: Lars Volker 


> Rebase KRPC onto Kudu upstream repository
> -
>
> Key: IMPALA-7006
> URL: https://issues.apache.org/jira/browse/IMPALA-7006
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.13.0, Impala 3.1.0
>Reporter: Lars Volker
>Assignee: Lars Volker
>Priority: Major
>  Labels: krpc
> Fix For: Impala 3.1.0
>
>
> We should consider rebasing our KRPC code on top of the latest Kudu upstream 
> version. This will keep the two projects more in sync and will allow us to 
> make use of recent improvements, e.g. around thread stack collection, without 
> having to pick individual changes.






[jira] [Commented] (IMPALA-4669) Add Kudu's RPC, util and security libraries

2019-04-04 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810080#comment-16810080
 ] 

ASF subversion and git services commented on IMPALA-4669:
-

Commit d10b34354c0d1616ed2faf78a6659e9be4aacd66 in impala's branch 
refs/heads/2.x from Lars Volker
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=d10b343 ]

IMPALA-4669: [KRPC] Add kudu_rpc library to build

NOTE: This commit is part of a set of changes for IMPALA-7006. It
contains pieces of a previous commit that need to be cherry picked
again after rebasing the code in be/src/kudu/{util,security,rpc}.

The original commit message is below:

Import FindKRPC.cmake from Apache Kudu.

Add some files to protoc-gen-krpc link to allow it to find symbols now
defined within Impala (without linking all of Impala's libraries).

Change-Id: I5693288db90f2e9673b8c88ca4378c3790cba957
Reviewed-on: http://gerrit.cloudera.org:8080/5719
Reviewed-by: Henry Robinson 
Tested-by: Impala Public Jenkins
Reviewed-on: http://gerrit.cloudera.org:8080/10760
Reviewed-by: Lars Volker 
Tested-by: Lars Volker 


> Add Kudu's RPC, util and security libraries
> ---
>
> Key: IMPALA-4669
> URL: https://issues.apache.org/jira/browse/IMPALA-4669
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Distributed Exec
>Affects Versions: Impala 2.8.0
>Reporter: Henry Robinson
>Assignee: Henry Robinson
>Priority: Major
> Fix For: Impala 2.10.0
>
>
> To enable KRPC in Impala, we need to link against Kudu's {{rpc}}, 
> {{security}} and {{util}} libraries. The easiest way for now is to pull them 
> into trunk. 
> Doing this also requires upgrading our {{gutil}} version.






[jira] [Commented] (IMPALA-7288) Codegen crash in FinalizeModule()

2019-04-04 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810089#comment-16810089
 ] 

ASF subversion and git services commented on IMPALA-7288:
-

Commit a26392c6d11bd4b6367936c7a1188292027e2d2d in impala's branch 
refs/heads/2.x from Bikramjeet Vig
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=a26392c ]

IMPALA-7288: Fix Codegen Crash in FinalizeModule()

Currently, codegen crashes during FinalizeModule() when it tries to
clean up half-baked handcrafted functions. This happens only in
cases where the code generating the handcrafted IR calls
eraseFromParent() on failure, which also deletes the memory held by the
function pointer and therefore causes a crash during cleanup in
FinalizeModule().

Testing:
Added regression tests that verify that failure code paths in
the previously offending methods don't crash Impala.

Change-Id: I2f0b527909a9fb3090996bb7510e4d58350c21b0
Reviewed-on: http://gerrit.cloudera.org:8080/10933
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Codegen crash in FinalizeModule()
> -
>
> Key: IMPALA-7288
> URL: https://issues.apache.org/jira/browse/IMPALA-7288
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 2.12.0, Impala 3.1.0
>Reporter: Balazs Jeszenszky
>Assignee: Bikramjeet Vig
>Priority: Blocker
> Fix For: Impala 3.1.0
>
>
> The following sequence crashes Impala 2.12 reliably:
> {code}
> CREATE TABLE test (c1 CHAR(6),c2 CHAR(6));
> select 1 from test t1, test t2
> where t1.c1 = FROM_TIMESTAMP(cast(t2.c2 as string), 'MMdd');
> {code}
> hs_err_pid has:
> {code}
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x03b36ce4, pid=28459, tid=0x7f2c49685700
> #
> # JRE version: Java(TM) SE Runtime Environment (8.0_162-b12) (build 
> 1.8.0_162-b12)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.162-b12 mixed mode 
> linux-amd64 compressed oops)
> # Problematic frame:
> # C  [impalad+0x3736ce4]  llvm::Value::getContext() const+0x4
> {code}
> Backtrace is:
> {code}
> #0  0x7f2cb217a5f7 in raise () from /lib64/libc.so.6
> #1  0x7f2cb217bce8 in abort () from /lib64/libc.so.6
> #2  0x7f2cb4de2f35 in os::abort(bool) () from 
> /usr/java/latest/jre/lib/amd64/server/libjvm.so
> #3  0x7f2cb4f86f33 in VMError::report_and_die() () from 
> /usr/java/latest/jre/lib/amd64/server/libjvm.so
> #4  0x7f2cb4de922f in JVM_handle_linux_signal () from 
> /usr/java/latest/jre/lib/amd64/server/libjvm.so
> #5  0x7f2cb4ddf253 in signalHandler(int, siginfo*, void*) () from 
> /usr/java/latest/jre/lib/amd64/server/libjvm.so
> #6  <signal handler called>
> #7  0x03b36ce4 in llvm::Value::getContext() const ()
> #8  0x03b36cff in llvm::Value::getValueName() const ()
> #9  0x03b36de9 in llvm::Value::getName() const ()
> #10 0x01ba6bb2 in impala::LlvmCodeGen::FinalizeModule (this=0x9b53980)
> at 
> /usr/src/debug/impala-2.12.0-cdh5.15.0/be/src/codegen/llvm-codegen.cc:1076
> #11 0x018f5c0f in impala::FragmentInstanceState::Open (this=0xac0b400)
> at 
> /usr/src/debug/impala-2.12.0-cdh5.15.0/be/src/runtime/fragment-instance-state.cc:255
> #12 0x018f3699 in impala::FragmentInstanceState::Exec (this=0xac0b400)
> at 
> /usr/src/debug/impala-2.12.0-cdh5.15.0/be/src/runtime/fragment-instance-state.cc:80
> #13 0x019028c3 in impala::QueryState::ExecFInstance (this=0x9c6ad00, 
> fis=0xac0b400)
> at 
> /usr/src/debug/impala-2.12.0-cdh5.15.0/be/src/runtime/query-state.cc:410
> #14 0x0190113c in impala::QueryStateoperator()(void) 
> const (__closure=0x7f2c49684be8)
> at 
> /usr/src/debug/impala-2.12.0-cdh5.15.0/be/src/runtime/query-state.cc:350
> #15 0x019034dd in 
> boost::detail::function::void_function_obj_invoker0,
>  void>::invoke(boost::detail::function::function_buffer &) 
> (function_obj_ptr=...)
> at 
> /usr/src/debug/impala-2.12.0-cdh5.15.0/toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:153
> {code}
> Crash is at 
> https://github.com/cloudera/Impala/blob/cdh5-2.12.0_5.15.0/be/src/codegen/llvm-codegen.cc#L1070-L1079.
> The repro steps seem to be quite specific.






[jira] [Commented] (IMPALA-6826) Add support for Ubuntu 18.04

2019-04-04 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-6826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810090#comment-16810090
 ] 

ASF subversion and git services commented on IMPALA-6826:
-

Commit 5771c45a21450e49fd2d969a8218bbc62f5530b0 in impala's branch 
refs/heads/master from Laszlo Gaal
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=5771c45 ]

IMPALA-6826: Extend bootstrap_system.sh to Ubuntu 18.04

Tweak bin/bootstrap_system.sh to automate the preparation of
an Impala development environment on Ubuntu 18.04.
The following changes were required:
- extend the OS recognition logic to Ubuntu 18.04
- add 'ant' to the list of installed packages
- request OpenJDK 8 as the default Java environment (Ubuntu 18.04
  defaults to OpenJDK 11)
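
As a rough illustration of the changes listed above, the sketch below recognizes the distribution from os-release style text and returns the packages to request. It is written in Python rather than shell, and the function names, the package table, and the use of os-release fields are assumptions made for illustration; the real logic lives in bin/bootstrap_system.sh and may detect the OS differently.

```python
def parse_os_release(text):
    """Parse KEY=value lines (as found in /etc/os-release) into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip().strip('"')
    return info


# Hypothetical table for this sketch: extra packages per (distro, release).
# Ubuntu 18.04 defaults to OpenJDK 11, so OpenJDK 8 is requested explicitly.
EXTRA_PACKAGES = {
    ("ubuntu", "18.04"): ["ant", "openjdk-8-jdk"],
}


def packages_for(os_release_text):
    """Return the extra packages for a recognized OS, or raise ValueError."""
    info = parse_os_release(os_release_text)
    key = (info.get("ID"), info.get("VERSION_ID"))
    if key not in EXTRA_PACKAGES:
        raise ValueError("unrecognized OS: %s %s" % key)
    return list(EXTRA_PACKAGES[key])


print(packages_for('ID=ubuntu\nVERSION_ID="18.04"\n'))
```

Failing loudly on an unrecognized release, rather than guessing, matches the intent of extending the recognition logic one release at a time.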

These changes enable bootstrap_system.sh to set up an Impala development
environment where Impala can be successfully built.

Note that the patch does not attempt to pass the tests yet; this change
prepares only the environment. Bugs specific to Ubuntu 18 will be fixed
by follow-up commits.

Tested in the following environments:
- in a Docker container, using
"docker/test-with-docker.py --base-image:ubuntu:18.04"
- on an AWS EC2 m5.4xlarge instance

Change-Id: Iad790f72ea6b62258aed2225eb7bdf79590c350f
Reviewed-on: http://gerrit.cloudera.org:8080/12893
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Add support for Ubuntu 18.04
> 
>
> Key: IMPALA-6826
> URL: https://issues.apache.org/jira/browse/IMPALA-6826
> Project: IMPALA
>  Issue Type: Task
>  Components: Infrastructure
>Affects Versions: Impala 3.0, Impala 2.12.0
> Environment: Ubuntu 18.04
>Reporter: Jim Apple
>Assignee: Laszlo Gaal
>Priority: Major
>
> We support Ubuntu 16.04 (and 14.04, in the 2.x line).
>  
> I'm blocked on Ubuntu 18.04 support in 
> https://github.com/cloudera/native-toolchain, but the toolchain is not 
> technically a pre-requisite, though I believe it's the easiest way to get a 
> development environment up and running.






[jira] [Commented] (IMPALA-7006) Rebase KRPC onto Kudu upstream repository

2019-04-04 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810075#comment-16810075
 ] 

ASF subversion and git services commented on IMPALA-7006:
-

Commit dfb9e16960f858e1dccd209e7b1f7e4be60bc6d4 in impala's branch 
refs/heads/2.x from Lars Volker
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=dfb9e16 ]

IMPALA-7006: Add KRPC folders from kudu@334ecafd

cp -a ~/checkout/kudu/src/kudu/{rpc,util,security} be/src/kudu/

Change-Id: I232db2b4ccf5df9aca87b21dea31bfb2735d1ab7
Reviewed-on: http://gerrit.cloudera.org:8080/10757
Reviewed-by: Lars Volker 
Tested-by: Lars Volker 


> Rebase KRPC onto Kudu upstream repository
> -
>
> Key: IMPALA-7006
> URL: https://issues.apache.org/jira/browse/IMPALA-7006
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.13.0, Impala 3.1.0
>Reporter: Lars Volker
>Assignee: Lars Volker
>Priority: Major
>  Labels: krpc
> Fix For: Impala 3.1.0
>
>
> We should consider rebasing our KRPC code on top of the latest Kudu upstream 
> version. This will keep the two projects more in sync and will allow us to 
> make use of recent improvements, e.g. around thread stack collection, without 
> having to pick individual changes.






[jira] [Commented] (IMPALA-110) Add support for multiple distinct operators in the same query block

2019-04-04 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810088#comment-16810088
 ] 

ASF subversion and git services commented on IMPALA-110:


Commit fdd6db524c9c97f0baebfde0119fce19d62eaec3 in impala's branch 
refs/heads/2.x from Thomas Tauber-Marshall
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=fdd6db5 ]

IMPALA-7251: Fix QueryMaintenance calls in Aggregators

A recent change, IMPALA-110 (part 2), refactored
PartitionedAggregationNode into several classes, including a new type
'Aggregator'. During this refactor, code that makes local allocations
while evaluating exprs was moved from the ExecNode (now
AggregationNode/StreamingAggregationNode) into the Aggregators, but
code related to cleaning these allocations up (ie QueryMaintenance())
was not, resulting in some queries using an excessive amount of
memory.

This patch removes all calls to QueryMaintenance() from the exec nodes
and moves them into the Aggregators.
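
The ownership problem described above can be pictured with a toy model: whichever object makes the per-batch allocations should also be the one that releases them. The sketch below is a Python illustration only; ExprResultsPool, Aggregator, add_batch and query_maintenance are hypothetical names chosen to echo the commit message, not Impala's C++ classes or signatures.

```python
class ExprResultsPool:
    """Toy stand-in for memory allocated while evaluating expressions."""

    def __init__(self):
        self.live_bytes = 0
        self.peak_bytes = 0

    def allocate(self, nbytes):
        self.live_bytes += nbytes
        self.peak_bytes = max(self.peak_bytes, self.live_bytes)

    def free_all(self):
        self.live_bytes = 0


class Aggregator:
    """Owns the allocations, so it also owns the periodic cleanup."""

    def __init__(self):
        self.pool = ExprResultsPool()
        self.total_len = 0

    def add_batch(self, batch):
        for value in batch:
            # Local allocation made while "evaluating" an expression.
            self.pool.allocate(len(value))
            self.total_len += len(value)
        # Cleanup happens here, next to the code that allocated.
        self.query_maintenance()

    def query_maintenance(self):
        self.pool.free_all()


agg = Aggregator()
for _ in range(100):
    agg.add_batch(["x" * 8] * 10)
print(agg.pool.peak_bytes, agg.pool.live_bytes)
```

If the query_maintenance() call were left in a different object that no longer sees these allocations, live_bytes would grow with every batch, which is the kind of excessive memory use the commit message describes.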

Testing:
- Added new test cases with a mem limit that fails if the expr
  allocations aren't released in a timely manner.
- Passed a full exhaustive run.

Change-Id: I4dac2bb0a15cdd7315ee15608bae409c125c82f5
Reviewed-on: http://gerrit.cloudera.org:8080/10871
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Add support for multiple distinct operators in the same query block
> ---
>
> Key: IMPALA-110
> URL: https://issues.apache.org/jira/browse/IMPALA-110
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend, Frontend
>Affects Versions: Impala 0.5, Impala 1.4, Impala 2.0, Impala 2.2, Impala 
> 2.3.0
>Reporter: Greg Rahn
>Assignee: Thomas Tauber-Marshall
>Priority: Major
>  Labels: sql-language
> Fix For: Impala 3.1.0
>
>
> Impala only allows a single (DISTINCT columns) expression in each query.
> {color:red}Note:
> If you do not need precise accuracy, you can produce an estimate of the 
> distinct values for a column by specifying NDV(column); a query can contain 
> multiple instances of NDV(column). To make Impala automatically rewrite 
> COUNT(DISTINCT) expressions to NDV(), enable the APPX_COUNT_DISTINCT query 
> option.
> {color}
> {code}
> [impala:21000] > select count(distinct i_class_id) from item;
> Query: select count(distinct i_class_id) from item
> Query finished, fetching results ...
> 16
> Returned 1 row(s) in 1.51s
> {code}
> {code}
> [impala:21000] > select count(distinct i_class_id), count(distinct 
> i_brand_id) from item;
> Query: select count(distinct i_class_id), count(distinct i_brand_id) from item
> ERROR: com.cloudera.impala.common.AnalysisException: Analysis exception (in 
> select count(distinct i_class_id), count(distinct i_brand_id) from item)
>   at 
> com.cloudera.impala.analysis.AnalysisContext.analyze(AnalysisContext.java:133)
>   at 
> com.cloudera.impala.service.Frontend.createExecRequest(Frontend.java:221)
>   at 
> com.cloudera.impala.service.JniFrontend.createExecRequest(JniFrontend.java:89)
> Caused by: com.cloudera.impala.common.AnalysisException: all DISTINCT 
> aggregate functions need to have the same set of parameters as COUNT(DISTINCT 
> i_class_id); deviating function: COUNT(DISTINCT i_brand_id)
>   at 
> com.cloudera.impala.analysis.AggregateInfo.createDistinctAggInfo(AggregateInfo.java:196)
>   at 
> com.cloudera.impala.analysis.AggregateInfo.create(AggregateInfo.java:143)
>   at 
> com.cloudera.impala.analysis.SelectStmt.createAggInfo(SelectStmt.java:466)
>   at 
> com.cloudera.impala.analysis.SelectStmt.analyzeAggregation(SelectStmt.java:347)
>   at com.cloudera.impala.analysis.SelectStmt.analyze(SelectStmt.java:155)
>   at 
> com.cloudera.impala.analysis.AnalysisContext.analyze(AnalysisContext.java:130)
>   ... 2 more
> {code}
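
For illustration, the requested behaviour and one possible rewrite can be compared in an engine that already accepts several DISTINCT aggregates. The sketch below uses SQLite through Python's sqlite3 module as a stand-in (it is not Impala), with a small made-up item table; the rewrite computes each COUNT(DISTINCT ...) in its own subquery and cross joins the two single-row results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table item (i_class_id integer, i_brand_id integer)")
# Made-up rows: 2 distinct class ids, 4 distinct brand ids.
conn.executemany(
    "insert into item values (?, ?)",
    [(1, 10), (1, 11), (2, 12), (2, 13)],
)

# The form this issue asks Impala to accept.
multi = conn.execute(
    "select count(distinct i_class_id), count(distinct i_brand_id) from item"
).fetchone()

# Rewrite with one DISTINCT aggregate per query block.
rewritten = conn.execute(
    """
    select c.n, b.n
    from (select count(distinct i_class_id) as n from item) c
    cross join (select count(distinct i_brand_id) as n from item) b
    """
).fetchone()

print(multi, rewritten)
```

Both forms return the same pair of counts here; the rewrite only applies cleanly when there is no GROUP BY, since each subquery yields exactly one row.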
> Hive supports this:
> {code}
> $ hive -e "select count(distinct i_class_id), count(distinct i_brand_id) from 
> item;"
> Logging initialized using configuration in 
> file:/etc/hive/conf.dist/hive-log4j.properties
> Hive history file=/tmp/grahn/hive_job_log_grahn_201303052234_1625576708.txt
> Total MapReduce jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks determined at compile time: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=
> In order to set a constant number of reducers:
>   set mapred.reduce.tasks=
> Starting Job = job_201302081514_0073, Tracking URL = 
> http://impala:50030/jobdetails.jsp?jobid=job_201302081514_0073
> Kill Command = /usr/lib/hadoop/bin/hadoop job  
> -Dmapred.job.tracker=m0525.mtv.cloudera.com:8021 -kill job_201302081514_0073
> Hadoop job 

[jira] [Commented] (IMPALA-7006) Rebase KRPC onto Kudu upstream repository

2019-04-04 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810083#comment-16810083
 ] 

ASF subversion and git services commented on IMPALA-7006:
-

Commit b97e0cd555a53057a82dc9c0ad9e0cfe58f3ec66 in impala's branch 
refs/heads/2.x from Sailesh Mukil
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=b97e0cd ]

IMPALA-5129: Use Kudu's Kinit code to avoid expensive fork

NOTE: This commit is part of a set of changes for IMPALA-7006. It
contains pieces of a previous commit that need to be cherry picked
again after rebasing the code in be/src/kudu/{util,security,rpc}.

The original commit message is below:

Impala currently kinits by forking off a child process. This
has proved to be expensive in many cases since the subprocess
tries to reserve as much memory as Impala is currently using
which can be quite a lot.

This patch adds a flag called 'use_kudu_kinit' that defaults to
true. When it's true, it uses the Kudu security library's kinit code
that programmatically uses the krb5 library to kinit.
When it's false, we run our current path which kicks off the
kinit-thread and forks off a kinit process periodically to reacquire
tickets based on FLAGS_kerberos_reinit_interval.

Converted existing tests in thrift-server-test to run with and
without kerberos. We now run this BE test with kerberos by using
Kudu's MiniKdc utility. This introduces a new dependency on some
kerberos binaries that are checked through FindKerberosPrograms.cmake.
Note that this is only a test dependency and not a dependency for
the impalad binaries and friends. Compilation will still succeed if
the kerberos binaries for the MiniKdc are not found; however, the
thrift-server-test will fail. We run with and without the
'use_kudu_kinit' flag.

TODO: Since the setting up and tearing down of our security code
isn't idempotent, we can run only one test in a process with
Kerberos now (IMPALA-6085).

Updated bin/bootstrap_system.sh to install new sasl-gssapi
modules and the kerberos binaries required for the MiniKdc.
Also fixed a bug that didn't transfer the environment into 'sudo'
in bin/bootstrap_system.sh.

Testing: Verified with thrift-server-test and also manually on a
live kerberized cluster.

Change-Id: Ie3c6e933c454e7adca69ef03e7d5c0c84b656895
Reviewed-on: http://gerrit.cloudera.org:8080/7938
Reviewed-by: Sailesh Mukil 
Tested-by: Impala Public Jenkins
Reviewed-on: http://gerrit.cloudera.org:8080/10763
Reviewed-by: Lars Volker 
Tested-by: Lars Volker 


> Rebase KRPC onto Kudu upstream repository
> -
>
> Key: IMPALA-7006
> URL: https://issues.apache.org/jira/browse/IMPALA-7006
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.13.0, Impala 3.1.0
>Reporter: Lars Volker
>Assignee: Lars Volker
>Priority: Major
>  Labels: krpc
> Fix For: Impala 3.1.0
>
>
> We should consider rebasing our KRPC code on top of the latest Kudu upstream 
> version. This will keep the two projects more in sync and will allow us to 
> make use of recent improvements, e.g. around thread stack collection, without 
> having to pick individual changes.






[jira] [Commented] (IMPALA-7006) Rebase KRPC onto Kudu upstream repository

2019-04-04 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810077#comment-16810077
 ] 

ASF subversion and git services commented on IMPALA-7006:
-

Commit 5dbd48f226f1061567da1c381ee2491dab3ceaf4 in impala's branch 
refs/heads/2.x from Lars Volker
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=5dbd48f ]

IMPALA-4669: [KUTIL] Add kudu_util library to the build.

NOTE: This commit is part of a set of changes for IMPALA-7006. It
contains pieces of a previous commit that need to be cherry picked
again after rebasing the code in be/src/kudu/{util,security,rpc}.

The original commit message is below:

A few miscellaneous changes to allow kudu_util to compile with Impala.

Add kudu_version.cc to substitute for the version.cc file that is
automatically built during the full Kudu build.

Set LZ4_DISABLE_DEPRECATE_WARNINGS to allow Kudu's compressor utility to
use deprecated names for LZ4 methods.

Add NO_NVM_SUPPORT flag to Kudu build (plan to upstream this later) to
disable building with nvm support, removing a library dependency.

Also remove imported FindOpenSSL.cmake in favour of the standard one provided
by cmake itself.

Finally, a few changes to allow compilation on RHEL5:

* Only use sched_getcpu() if supported
* Only include magic.h if available
* Workaround for kernels that don't have SOCK_NONBLOCK
* Workaround for kernels that don't have O_CLOEXEC (ignore the flag)
* Provide non-working implementation of fallocate()
* Disable inclusion of linux/fiemap.h - although this exists on RHEL5,
  it does not compile due to other #includes in env_posix.cc. We disable
  the path this is used for, since Impala does not call that code.
* Use Kudu's implementation of pipe(2), preadv(2) and pwritev(2) where
  it doesn't exist.

In most cases these changes simply force kutil to revert to a different
implementation that was already written for OSX support - this patch
generalises the logic to provide the implementation whenever the
required function doesn't exist.

This patch compiles on RHEL5.5 and 6.0, SLES11 and 12, Ubuntu 12.04 and
14.04 and Debian 7.0 and 8.0.

Change-Id: I451f02d3e4669e8a548b92fb1445cb2b322659a2
Reviewed-on: http://gerrit.cloudera.org:8080/5715
Tested-by: Impala Public Jenkins
Reviewed-by: Henry Robinson 
Reviewed-on: http://gerrit.cloudera.org:8080/10758
Reviewed-by: Michael Ho 
Tested-by: Lars Volker 


> Rebase KRPC onto Kudu upstream repository
> -
>
> Key: IMPALA-7006
> URL: https://issues.apache.org/jira/browse/IMPALA-7006
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.13.0, Impala 3.1.0
>Reporter: Lars Volker
>Assignee: Lars Volker
>Priority: Major
>  Labels: krpc
> Fix For: Impala 3.1.0
>
>
> We should consider rebasing our KRPC code on top of the latest Kudu upstream 
> version. This will keep the two projects more in sync and will allow us to 
> make use of recent improvements, e.g. around thread stack collection, without 
> having to pick individual changes.






[jira] [Commented] (IMPALA-7006) Rebase KRPC onto Kudu upstream repository

2019-04-04 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-7006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810081#comment-16810081
 ] 

ASF subversion and git services commented on IMPALA-7006:
-

Commit d10b34354c0d1616ed2faf78a6659e9be4aacd66 in impala's branch 
refs/heads/2.x from Lars Volker
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=d10b343 ]

IMPALA-4669: [KRPC] Add kudu_rpc library to build

NOTE: This commit is part of a set of changes for IMPALA-7006. It
contains pieces of a previous commit that need to be cherry picked
again after rebasing the code in be/src/kudu/{util,security,rpc}.

The original commit message is below:

Import FindKRPC.cmake from Apache Kudu.

Add some files to protoc-gen-krpc link to allow it to find symbols now
defined within Impala (without linking all of Impala's libraries).

Change-Id: I5693288db90f2e9673b8c88ca4378c3790cba957
Reviewed-on: http://gerrit.cloudera.org:8080/5719
Reviewed-by: Henry Robinson 
Tested-by: Impala Public Jenkins
Reviewed-on: http://gerrit.cloudera.org:8080/10760
Reviewed-by: Lars Volker 
Tested-by: Lars Volker 


> Rebase KRPC onto Kudu upstream repository
> -
>
> Key: IMPALA-7006
> URL: https://issues.apache.org/jira/browse/IMPALA-7006
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 2.13.0, Impala 3.1.0
>Reporter: Lars Volker
>Assignee: Lars Volker
>Priority: Major
>  Labels: krpc
> Fix For: Impala 3.1.0
>
>
> We should consider rebasing our KRPC code on top of the latest Kudu upstream 
> version. This will keep the two projects more in sync and will allow us to 
> make use of recent improvements, e.g. around thread stack collection, without 
> having to pick individual changes.






[jira] [Commented] (IMPALA-6085) Make the setup and teardown of the security code idempotent

2019-04-04 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-6085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810084#comment-16810084
 ] 

ASF subversion and git services commented on IMPALA-6085:
-

Commit b97e0cd555a53057a82dc9c0ad9e0cfe58f3ec66 in impala's branch 
refs/heads/2.x from Sailesh Mukil
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=b97e0cd ]

IMPALA-5129: Use Kudu's Kinit code to avoid expensive fork

NOTE: This commit is part of a set of changes for IMPALA-7006. It
contains pieces of a previous commit that need to be cherry picked
again after rebasing the code in be/src/kudu/{util,security,rpc}.

The original commit message is below:

Impala currently kinits by forking off a child process. This
has proved to be expensive in many cases since the subprocess
tries to reserve as much memory as Impala is currently using
which can be quite a lot.

This patch adds a flag called 'use_kudu_kinit' that defaults to
true. When it's true, it uses the Kudu security library's kinit code
that programmatically uses the krb5 library to kinit.
When it's false, we run our current path which kicks off the
kinit-thread and forks off a kinit process periodically to reacquire
tickets based on FLAGS_kerberos_reinit_interval.

Converted existing tests in thrift-server-test to run with and
without kerberos. We now run this BE test with kerberos by using
Kudu's MiniKdc utility. This introduces a new dependency on some
kerberos binaries that are checked through FindKerberosPrograms.cmake.
Note that this is only a test dependency and not a dependency for
the impalad binaries and friends. Compilation will still succeed if
the kerberos binaries for the MiniKdc are not found; however, the
thrift-server-test will fail. We run with and without the
'use_kudu_kinit' flag.

TODO: Since the setting up and tearing down of our security code
isn't idempotent, we can run only one test in a process with
Kerberos now (IMPALA-6085).

Updated bin/bootstrap_system.sh to install new sasl-gssapi
modules and the kerberos binaries required for the MiniKdc.
Also fixed a bug that didn't transfer the environment into 'sudo'
in bin/bootstrap_system.sh.

Testing: Verified with thrift-server-test and also manually on a
live kerberized cluster.

Change-Id: Ie3c6e933c454e7adca69ef03e7d5c0c84b656895
Reviewed-on: http://gerrit.cloudera.org:8080/7938
Reviewed-by: Sailesh Mukil 
Tested-by: Impala Public Jenkins
Reviewed-on: http://gerrit.cloudera.org:8080/10763
Reviewed-by: Lars Volker 
Tested-by: Lars Volker 


> Make the setup and teardown of the security code idempotent
> ---
>
> Key: IMPALA-6085
> URL: https://issues.apache.org/jira/browse/IMPALA-6085
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Security
>Affects Versions: Impala 2.10.0
>Reporter: Sailesh Mukil
>Priority: Major
>  Labels: infrastructure, security, test
>
> Our security code assumes that it will only be called once in the lifetime of 
> a process. This is true; for tests, however, we would like to set it up and 
> tear it down multiple times with different configurations and test it 
> within the same backend test process.
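
One way to picture the requested property is a context whose setup and teardown can be called any number of times, in any order, within one process. The sketch below is a Python illustration under that assumption; SecurityContext and its methods are hypothetical and do not correspond to Impala's security code.

```python
class SecurityContext:
    """Hypothetical stand-in for process-wide security state."""

    def __init__(self):
        self._config = None

    @property
    def initialized(self):
        return self._config is not None

    def setup(self, config):
        """Idempotent: repeating the same config is a no-op."""
        if self._config == config:
            return
        if self._config is not None:
            self.teardown()  # replace an earlier configuration cleanly
        self._config = dict(config)

    def teardown(self):
        """Idempotent: safe to call when nothing is set up."""
        self._config = None


ctx = SecurityContext()
ctx.setup({"kerberos": True})
ctx.setup({"kerberos": True})   # repeated setup, no effect
ctx.teardown()
ctx.teardown()                  # repeated teardown, no effect
ctx.setup({"kerberos": False})  # a second configuration in the same process
print(ctx.initialized)
```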






[jira] [Work started] (IMPALA-8385) Refactor Sentry admin check

2019-04-04 Thread Fredy Wijaya (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-8385 started by Fredy Wijaya.

> Refactor Sentry admin check
> ---
>
> Key: IMPALA-8385
> URL: https://issues.apache.org/jira/browse/IMPALA-8385
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Catalog, Frontend
>Reporter: Fredy Wijaya
>Assignee: Fredy Wijaya
>Priority: Major
>
> Currently the Sentry admin check is hardcoded, for example: 
> https://github.com/apache/impala/blob/5670f96b828d57f9e36510bb9af02bcc31de775c/be/src/service/client-request-state.cc#L366
> This check needs to be moved out to SentryAuthorizationManager instead.






[jira] [Assigned] (IMPALA-8385) Refactor Sentry admin check

2019-04-04 Thread Fredy Wijaya (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fredy Wijaya reassigned IMPALA-8385:


Assignee: Fredy Wijaya

> Refactor Sentry admin check
> ---
>
> Key: IMPALA-8385
> URL: https://issues.apache.org/jira/browse/IMPALA-8385
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Catalog, Frontend
>Reporter: Fredy Wijaya
>Assignee: Fredy Wijaya
>Priority: Major
>
> Currently the Sentry admin check is hardcoded, for example: 
> https://github.com/apache/impala/blob/5670f96b828d57f9e36510bb9af02bcc31de775c/be/src/service/client-request-state.cc#L366
> This check needs to be moved out to SentryAuthorizationManager instead.






[jira] [Work started] (IMPALA-8227) Support WITH GRANT OPTION with Ranger authorization provider

2019-04-04 Thread Austin Nobis (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-8227 started by Austin Nobis.

> Support WITH GRANT OPTION with Ranger authorization provider
> 
>
> Key: IMPALA-8227
> URL: https://issues.apache.org/jira/browse/IMPALA-8227
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Catalog, Frontend
>Reporter: Fredy Wijaya
>Assignee: Austin Nobis
>Priority: Major
>
> This ticket should investigate whether it's feasible to support WITH GRANT 
> OPTION (giving a grant/revoke privilege for non-admins) with Ranger. If it's 
> not feasible, Impala should throw an error when WITH GRANT OPTION is used with Ranger enabled.






[jira] [Comment Edited] (IMPALA-8386) Incorrect predicate in a left outer join query

2019-04-04 Thread Csaba Ringhofer (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16809987#comment-16809987
 ] 

Csaba Ringhofer edited comment on IMPALA-8386 at 4/4/19 4:09 PM:
-

I could slightly simplify the query: the aggregates and GROUP BY are not actually 
needed, and a single field with two aliases is enough. The subquery (select 
a_id from a) can also be replaced with "a":
{code}
select count(1) from (
select t2.a_id,t2.amount1,t2.amount2
from a
left outer join (
select c.a_id, amount as amount1, amount as amount2
from b join c  on b.b_id = c.b_id) t2
on a.a_id = t2.a_id
) t;
-- returns 1
{code}


was (Author: csringhofer):
I could slightly simplify the query: the aggregates and GROUP BY are not actually 
needed, and a single field with two aliases is enough:
{code}
select count(1) from (
select t2.a_id,t2.amount1,t2.amount2
from( select a_id from a) t1
left outer join (
select c.a_id, amount as amount1, amount as amount2
from b join c  on b.b_id = c.b_id) t2
on t1.a_id = t2.a_id
) t;
-- returns 1
{code}

> Incorrect predicate in a left outer join query
> --
>
> Key: IMPALA-8386
> URL: https://issues.apache.org/jira/browse/IMPALA-8386
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Quanlong Huang
>Assignee: Quanlong Huang
>Priority: Critical
>  Labels: correctness
>
> skyyws  reported a bug [in the mailing 
> list|https://lists.apache.org/thread.html/0bdbbaa6bb35b552f050ae30587b7d75b78a72ec60007a8bc0a4a8a9@%3Cdev.impala.apache.org%3E]
>  on the following data set:
> {code:java}
> table A
> +------+
> | a_id |
> +------+
> | 1    |
> | 2    |
> +------+
> table B
> +------+--------+
> | b_id | amount |
> +------+--------+
> | 1    | 10     |
> | 1    | 20     |
> | 2    | NULL   |
> +------+--------+
> table C
> +------+------+
> | a_id | b_id |
> +------+------+
> | 1    | 1    |
> | 2    | 2    |
> +------+------+{code}
> The following query returns a wrong result "1":
> {code:java}
> select count(1) from (
> select t2.a_id,t2.amount1,t2.amount2
> from( select a_id from a) t1
> left outer join (
> select c.a_id,sum(amount) as amount1,sum(amount) as amount2
> from b join c  on b.b_id = c.b_id group by c.a_id) t2
> on t1.a_id = t2.a_id
> ) t;
> {code}
> Removing "t2.amount2" gives the correct result "2":
> {code:java}
> select count(1) from (
> select t2.a_id,t2.amount1
> from( select a_id from a) t1
> left outer join (
> select c.a_id,sum(amount) as amount1,sum(amount) as amount2
> from b join c  on b.b_id = c.b_id group by c.a_id) t2
> on t1.a_id = t2.a_id
> ) t;
> {code}
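
For reference, the data set and queries above can be checked against another SQL engine. The sketch below loads the three tables into an in-memory SQLite database through Python's sqlite3 module and runs both the reported query and the simplified variant from the comment. SQLite is only a stand-in here, not Impala, and the column types are assumed to be integers. Under standard left outer join semantics the reported query counts 2 rows, and the simplified one counts 3, because a_id 1 matches two joined rows once the GROUP BY is removed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table a (a_id integer);
    create table b (b_id integer, amount integer);
    create table c (a_id integer, b_id integer);
    insert into a values (1), (2);
    insert into b values (1, 10), (1, 20), (2, NULL);
    insert into c values (1, 1), (2, 2);
    """
)

REPORTED = """
select count(1) from (
  select t2.a_id, t2.amount1, t2.amount2
  from (select a_id from a) t1
  left outer join (
    select c.a_id, sum(amount) as amount1, sum(amount) as amount2
    from b join c on b.b_id = c.b_id group by c.a_id) t2
  on t1.a_id = t2.a_id
) t
"""

SIMPLIFIED = """
select count(1) from (
  select t2.a_id, t2.amount1, t2.amount2
  from a
  left outer join (
    select c.a_id, amount as amount1, amount as amount2
    from b join c on b.b_id = c.b_id) t2
  on a.a_id = t2.a_id
) t
"""

reported_count = conn.execute(REPORTED).fetchone()[0]
simplified_count = conn.execute(SIMPLIFIED).fetchone()[0]
print(reported_count, simplified_count)  # 2 3
```

A check like this only shows what the expected answer is; it says nothing about where in Impala's planning the extra filtering is introduced.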






[jira] [Updated] (IMPALA-8386) Incorrect predicate in a left outer join query

2019-04-04 Thread Csaba Ringhofer (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Csaba Ringhofer updated IMPALA-8386:

Labels: correctness  (was: )

> Incorrect predicate in a left outer join query
> --
>
> Key: IMPALA-8386
> URL: https://issues.apache.org/jira/browse/IMPALA-8386
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Quanlong Huang
>Assignee: Quanlong Huang
>Priority: Critical
>  Labels: correctness
>
> skyyws  reported a bug [in the mailing 
> list|https://lists.apache.org/thread.html/0bdbbaa6bb35b552f050ae30587b7d75b78a72ec60007a8bc0a4a8a9@%3Cdev.impala.apache.org%3E]
>  on the following data set:
> {code:java}
> table A
> +------+
> | a_id |
> +------+
> | 1    |
> | 2    |
> +------+
> table B
> +------+--------+
> | b_id | amount |
> +------+--------+
> | 1    | 10     |
> | 1    | 20     |
> | 2    | NULL   |
> +------+--------+
> table C
> +------+------+
> | a_id | b_id |
> +------+------+
> | 1    | 1    |
> | 2    | 2    |
> +------+------+{code}
> The following query returns a wrong result "1":
> {code:java}
> select count(1) from (
> select t2.a_id,t2.amount1,t2.amount2
> from( select a_id from a) t1
> left outer join (
> select c.a_id,sum(amount) as amount1,sum(amount) as amount2
> from b join c  on b.b_id = c.b_id group by c.a_id) t2
> on t1.a_id = t2.a_id
> ) t;
> {code}
> Removing "t2.amount2" gives the correct result "2":
> {code:java}
> select count(1) from (
> select t2.a_id,t2.amount1
> from( select a_id from a) t1
> left outer join (
> select c.a_id,sum(amount) as amount1,sum(amount) as amount2
> from b join c  on b.b_id = c.b_id group by c.a_id) t2
> on t1.a_id = t2.a_id
> ) t;
> {code}






[jira] [Updated] (IMPALA-8386) Incorrect predicate in a left outer join query

2019-04-04 Thread Csaba Ringhofer (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Csaba Ringhofer updated IMPALA-8386:

Component/s: Frontend

> Incorrect predicate in a left outer join query
> --
>
> Key: IMPALA-8386
> URL: https://issues.apache.org/jira/browse/IMPALA-8386
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Quanlong Huang
>Assignee: Quanlong Huang
>Priority: Critical
>
> skyyws  reported a bug [in the mailing 
> list|https://lists.apache.org/thread.html/0bdbbaa6bb35b552f050ae30587b7d75b78a72ec60007a8bc0a4a8a9@%3Cdev.impala.apache.org%3E]
>  on the following data set:
> {code:java}
> table A
> +------+
> | a_id |
> +------+
> | 1    |
> | 2    |
> +------+
> table B
> +------+--------+
> | b_id | amount |
> +------+--------+
> | 1    | 10     |
> | 1    | 20     |
> | 2    | NULL   |
> +------+--------+
> table C
> +------+------+
> | a_id | b_id |
> +------+------+
> | 1    | 1    |
> | 2    | 2    |
> +------+------+{code}
> The following query returns a wrong result "1":
> {code:java}
> select count(1) from (
> select t2.a_id,t2.amount1,t2.amount2
> from( select a_id from a) t1
> left outer join (
> select c.a_id,sum(amount) as amount1,sum(amount) as amount2
> from b join c  on b.b_id = c.b_id group by c.a_id) t2
> on t1.a_id = t2.a_id
> ) t;
> {code}
> Removing "t2.amount2" gives the correct result "2":
> {code:java}
> select count(1) from (
> select t2.a_id,t2.amount1
> from( select a_id from a) t1
> left outer join (
> select c.a_id,sum(amount) as amount1,sum(amount) as amount2
> from b join c  on b.b_id = c.b_id group by c.a_id) t2
> on t1.a_id = t2.a_id
> ) t;
> {code}






[jira] [Comment Edited] (IMPALA-8386) Incorrect predicate in a left outer join query

2019-04-04 Thread Csaba Ringhofer (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16809987#comment-16809987
 ] 

Csaba Ringhofer edited comment on IMPALA-8386 at 4/4/19 3:38 PM:
-

I could slightly simplify the query. The aggregates and the group by are not 
actually needed; a single field with two aliases is enough:
{code}
select count(1) from (
select t2.a_id,t2.amount1,t2.amount2
from( select a_id from a) t1
left outer join (
select c.a_id, amount as amount1, amount as amount2
from b join c  on b.b_id = c.b_id) t2
on t1.a_id = t2.a_id
) t;
-- returns 1
{code}


was (Author: csringhofer):
I could slightly simplify the query, the aggregates + group by are actually not 
needed, a single field with two aliases is enough:

select count(1) from (
select t2.a_id,t2.amount1,t2.amount2
from( select a_id from a) t1
left outer join (
select c.a_id, amount as amount1, amount as amount2
from b join c  on b.b_id = c.b_id) t2
on t1.a_id = t2.a_id
) t;
-- returns 1

> Incorrect predicate in a left outer join query
> --
>
> Key: IMPALA-8386
> URL: https://issues.apache.org/jira/browse/IMPALA-8386
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Quanlong Huang
>Assignee: Quanlong Huang
>Priority: Critical






[jira] [Assigned] (IMPALA-8386) Incorrect predicate in a left outer join query

2019-04-04 Thread Quanlong Huang (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang reassigned IMPALA-8386:
--

Assignee: Quanlong Huang

> Incorrect predicate in a left outer join query
> --
>
> Key: IMPALA-8386
> URL: https://issues.apache.org/jira/browse/IMPALA-8386
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Quanlong Huang
>Assignee: Quanlong Huang
>Priority: Critical






[jira] [Created] (IMPALA-8386) Incorrect predicate in a left outer join query

2019-04-04 Thread Quanlong Huang (JIRA)
Quanlong Huang created IMPALA-8386:
--

 Summary: Incorrect predicate in a left outer join query
 Key: IMPALA-8386
 URL: https://issues.apache.org/jira/browse/IMPALA-8386
 Project: IMPALA
  Issue Type: Bug
Reporter: Quanlong Huang


skyyws  reported a bug [in the mailing 
list|https://lists.apache.org/thread.html/0bdbbaa6bb35b552f050ae30587b7d75b78a72ec60007a8bc0a4a8a9@%3Cdev.impala.apache.org%3E]
 on the following data set:
{code:java}
table A
+------+
| a_id |
+------+
| 1    |
| 2    |
+------+
table B
+------+--------+
| b_id | amount |
+------+--------+
| 1    | 10     |
| 1    | 20     |
| 2    | NULL   |
+------+--------+
table C
+------+------+
| a_id | b_id |
+------+------+
| 1    | 1    |
| 2    | 2    |
+------+------+
{code}
The following query returns the wrong result "1":
{code:java}
select count(1) from (
select t2.a_id,t2.amount1,t2.amount2
from( select a_id from a) t1
left outer join (
select c.a_id,sum(amount) as amount1,sum(amount) as amount2
from b join c  on b.b_id = c.b_id group by c.a_id) t2
on t1.a_id = t2.a_id
) t;
{code}
Removing "t2.amount2" from the select list gives the correct result "2":
{code:java}
select count(1) from (
select t2.a_id,t2.amount1
from( select a_id from a) t1
left outer join (
select c.a_id,sum(amount) as amount1,sum(amount) as amount2
from b join c  on b.b_id = c.b_id group by c.a_id) t2
on t1.a_id = t2.a_id
) t;
{code}






[jira] [Assigned] (IMPALA-8309) Use a more human-readable flag to switch to a different authorization provider

2019-04-04 Thread Austin Nobis (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Austin Nobis reassigned IMPALA-8309:


Assignee: radford nguyen  (was: Austin Nobis)

> Use a more human-readable flag to switch to a different authorization provider
> --
>
> Key: IMPALA-8309
> URL: https://issues.apache.org/jira/browse/IMPALA-8309
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Reporter: Fredy Wijaya
>Assignee: radford nguyen
>Priority: Minor
>
> We currently use the authorization_factory_class flag to switch to a different 
> authorization provider, which lets any third party supply its own 
> authorization provider implementation. Since Sentry and Ranger are 
> officially supported by Impala, we should have a flag, i.e. 
> authorization_provider=[sentry|ranger], to easily switch between the officially 
> supported authorization providers.
> At the time of this writing, the existing {{authorization_factory_class}} 
> flag is being retained but its default value removed. If present, it will 
> take precedence over the {{authorization_provider}} flag being added.
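
The precedence rule described above can be sketched roughly as follows. This is not Impala's implementation: the function name resolve_authorization and its return shape are invented for illustration, and the behaviour when neither flag is set (returning None) is an assumption, since the description does not say what the default of authorization_provider is.

```python
# Officially supported providers named in the issue description.
SUPPORTED_PROVIDERS = ("sentry", "ranger")


def resolve_authorization(authorization_provider=None,
                          authorization_factory_class=None):
    """Decide which flag wins, following the rule in the description.

    Returns a (kind, value) tuple, or None if neither flag is set
    (ASSUMPTION: the description does not specify this case).
    """
    # authorization_factory_class no longer has a default value; if the
    # user sets it, it takes precedence over authorization_provider.
    if authorization_factory_class:
        return ("factory_class", authorization_factory_class)

    if authorization_provider is None:
        return None

    if authorization_provider not in SUPPORTED_PROVIDERS:
        raise ValueError(
            "unsupported authorization_provider: %r" % authorization_provider)
    return ("provider", authorization_provider)
```

For example, resolve_authorization("ranger", "com.example.MyFactory") picks the factory class (a made-up class name), while resolve_authorization("ranger") picks the Ranger provider.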






[jira] [Work started] (IMPALA-8309) Use a more human-readable flag to switch to a different authorization provider

2019-04-04 Thread Austin Nobis (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-8309 started by Austin Nobis.

> Use a more human-readable flag to switch to a different authorization provider
> --
>
> Key: IMPALA-8309
> URL: https://issues.apache.org/jira/browse/IMPALA-8309
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Reporter: Fredy Wijaya
>Assignee: Austin Nobis
>Priority: Minor






[jira] [Commented] (IMPALA-8381) Remove branch from ParquetPlainEncoder::Decode()

2019-04-04 Thread Daniel Becker (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16809886#comment-16809886
 ] 

Daniel Becker commented on IMPALA-8381:
---

Some measurements:

Running the following query in the database tpch_parquet:
{code:java}
set num_nodes=1; select max(l_orderkey) from lineitem;{code}
we found the following results averaging the MaterializeTupleTime(*) values 
over 100 runs with and without the "if":

Without "if": 14.3464ms

With "if": 16.42624ms

This is a 14% improvement in MaterializeTupleTime in this query.

The total query time was 0.11s, so the ~2ms gain is a little less than 2% of it.
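
The arithmetic behind these percentages can be checked with a few lines. The function name and the returned keys are just for illustration; the inputs are the averages quoted above. Note that the gain can be expressed relative to either baseline, which gives slightly different figures.

```python
def summarize(without_if_ms, with_if_ms, total_query_ms):
    """Express the measured gain against each possible baseline."""
    saved_ms = with_if_ms - without_if_ms
    return {
        # Absolute time saved per query in MaterializeTupleTime.
        "saved_ms": saved_ms,
        # How much longer the version with the "if" takes, relative to
        # the version without it.
        "with_if_overhead": saved_ms / without_if_ms,
        # How much MaterializeTupleTime shrinks, relative to the version
        # with the "if".
        "reduction": saved_ms / with_if_ms,
        # Share of the total query time that the gain represents.
        "share_of_query": saved_ms / total_query_ms,
    }


# Averages from the comment: 14.3464 ms without, 16.42624 ms with,
# total query time 0.11 s = 110 ms.
stats = summarize(14.3464, 16.42624, 110.0)
```

With these inputs the saving is about 2.08 ms: the version with the "if" is roughly 14.5% slower, or equivalently removing it cuts MaterializeTupleTime by roughly 12.7%, and the saving is about 1.9% of the total query time.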

> Remove branch from ParquetPlainEncoder::Decode()
> 
>
> Key: IMPALA-8381
> URL: https://issues.apache.org/jira/browse/IMPALA-8381
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Csaba Ringhofer
>Priority: Minor
>  Labels: newbie, parquet, performance, ramp-up
>
> Removing the "if" at
> https://github.com/apache/impala/blob/5670f96b828d57f9e36510bb9af02bcc31de775c/be/src/exec/parquet/parquet-common.h#L203
> can lead to a 1.5x speedup in plain decoding (type=int32, stride=16). For 
> primitive types, the same check can be done once for a whole batch, so the 
> speedup can be gained for large batches without losing safety. The only Parquet 
> type where this check is needed per element is BYTE_ARRAY (typically used for 
> STRING columns), which already has a template specialization of 
> ParquetPlainEncoder::Decode().
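
The idea of doing the check once per batch instead of once per value can be sketched as below. This is Python rather than the actual C++ template, and it assumes the "if" in question is a bounds check on the remaining input buffer, which the description does not spell out. The function names are hypothetical and the sketch says nothing about the real speedup; it only shows that both variants reject the same truncated input and decode the same values.

```python
import struct

INT32_SIZE = 4  # plain-encoded INT32 values are 4 bytes, little-endian


def decode_checked_per_value(buf, num_values):
    """Decode num_values int32s, checking the buffer bounds for every value."""
    out = []
    offset = 0
    for _ in range(num_values):
        # Per-element branch: this is the kind of check the issue wants
        # to move out of the inner loop.
        if len(buf) - offset < INT32_SIZE:
            raise ValueError("buffer too short")
        out.append(struct.unpack_from("<i", buf, offset)[0])
        offset += INT32_SIZE
    return out


def decode_checked_per_batch(buf, num_values):
    """Decode num_values int32s with a single up-front bounds check."""
    # Fixed-size values: one comparison covers the whole batch.
    if len(buf) < num_values * INT32_SIZE:
        raise ValueError("buffer too short")
    return list(struct.unpack_from("<%di" % num_values, buf, 0))
```

This only works because every value has the same known size. For BYTE_ARRAY each value carries its own length, so the check has to stay per element, as the description notes.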






[jira] [Assigned] (IMPALA-8381) Remove branch from ParquetPlainEncoder::Decode()

2019-04-04 Thread Daniel Becker (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Becker reassigned IMPALA-8381:
-

Assignee: Daniel Becker

> Remove branch from ParquetPlainEncoder::Decode()
> 
>
> Key: IMPALA-8381
> URL: https://issues.apache.org/jira/browse/IMPALA-8381
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Csaba Ringhofer
>Assignee: Daniel Becker
>Priority: Minor
>  Labels: newbie, parquet, performance, ramp-up






[jira] [Comment Edited] (IMPALA-8381) Remove branch from ParquetPlainEncoder::Decode()

2019-04-04 Thread Daniel Becker (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-8381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16809886#comment-16809886
 ] 

Daniel Becker edited comment on IMPALA-8381 at 4/4/19 2:01 PM:
---

Some measurements:

Running the following query in the database tpch_parquet:
{code:java}
set num_nodes=1; select max(l_orderkey) from lineitem;{code}
we found the following results averaging the MaterializeTupleTime values over 
100 runs with and without the "if":

Without "if": 14.3464ms

With "if": 16.42624ms

This is a 14% improvement in MaterializeTupleTime in this query.

The total query time was 0.11s, so the ~2ms gain is a little less than 2% of it.


was (Author: daniel.becker):
Some measurements:

Running the following query in the database tpch_parquet:
{code:java}
set num_nodes=1; select max(l_orderkey) from lineitem;{code}
we found the following results averaging the MaterializeTupleTime(*) values 
over 100 runs with and without the "if":

Without "if": 14.3464ms

With "if": 16.42624ms

This is a 14% improvement in MaterializeTupleTime in this query.

The total query time was 0.11s, the ~2ms gain is a little less than 2%.

> Remove branch from ParquetPlainEncoder::Decode()
> 
>
> Key: IMPALA-8381
> URL: https://issues.apache.org/jira/browse/IMPALA-8381
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Csaba Ringhofer
>Priority: Minor
>  Labels: newbie, parquet, performance, ramp-up


