[jira] [Created] (IMPALA-10144) Add a statement of platforms that Impala runs on

2020-09-03 Thread huangtianhua (Jira)
huangtianhua created IMPALA-10144:
-

 Summary: Add a statement of platforms that Impala runs on
 Key: IMPALA-10144
 URL: https://issues.apache.org/jira/browse/IMPALA-10144
 Project: IMPALA
  Issue Type: Sub-task
Reporter: huangtianhua


Impala can now build and run all tests successfully on arm64, so it would be 
good to add a statement that Impala runs on the arm64 platform.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (IMPALA-10144) Add a statement of platforms that Impala runs on

2020-09-03 Thread huangtianhua (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

huangtianhua reassigned IMPALA-10144:
-

Assignee: huangtianhua

> Add a statement of platforms that Impala runs on
> 
>
> Key: IMPALA-10144
> URL: https://issues.apache.org/jira/browse/IMPALA-10144
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: huangtianhua
>Assignee: huangtianhua
>Priority: Major
>
> Impala can now build and run all tests successfully on arm64, so it would be 
> good to add a statement that Impala runs on the arm64 platform.




-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-10133) Implement ds_hll_stringify()

2020-09-03 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17190441#comment-17190441
 ] 

ASF subversion and git services commented on IMPALA-10133:
--

Commit 99e5f5a8859c58641973bc84058eeb15502da96c in impala's branch 
refs/heads/master from Adam Tamas
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=99e5f5a ]

IMPALA-10133:Implement ds_hll_stringify function.

This function receives a string that is a serialized Apache DataSketches
HLL sketch and returns its stringified format.

The stringified format should look like the following and contain this data:

select ds_hll_stringify(ds_hll_sketch(float_col)) from
functional_parquet.alltypestiny;
+--------------------------------------------+
| ds_hll_stringify(ds_hll_sketch(float_col)) |
+--------------------------------------------+
| ### HLL sketch summary:                    |
|   Log Config K   : 12                      |
|   Hll Target     : HLL_4                   |
|   Current Mode   : LIST                    |
|   LB             : 2                       |
|   Estimate       : 2                       |
|   UB             : 2.0001                  |
|   OutOfOrder flag: false                   |
|   Coupon count   : 2                       |
| ### End HLL sketch summary                 |
|                                            |
+--------------------------------------------+

Change-Id: I85dbf20b5114dd75c300eef0accabe90eac240a0
Reviewed-on: http://gerrit.cloudera.org:8080/16382
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
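
For readers who want to consume this output programmatically, here is a minimal sketch of turning the summary text into a dictionary. It assumes only the "label : value" layout shown in the commit message above; the helper name parse_hll_summary and the SAMPLE constant are hypothetical and are not part of Impala or DataSketches.

```python
def parse_hll_summary(text):
    """Parse the 'label : value' lines of a stringified HLL sketch into a dict."""
    fields = {}
    for line in text.splitlines():
        # Drop table borders and padding if the text was copied from shell output.
        line = line.strip().strip("|").strip()
        if not line or line.startswith("###"):
            continue  # skip blank lines and the begin/end summary markers
        label, sep, value = line.partition(":")
        if sep:
            fields[label.strip()] = value.strip()
    return fields


# Sample text copied from the commit message above.
SAMPLE = """\
### HLL sketch summary:
  Log Config K   : 12
  Hll Target     : HLL_4
  Current Mode   : LIST
  LB             : 2
  Estimate       : 2
  UB             : 2.0001
  OutOfOrder flag: false
  Coupon count   : 2
### End HLL sketch summary
"""

summary = parse_hll_summary(SAMPLE)
print(summary["Estimate"], summary["Current Mode"])
```

All values are kept as strings, since the summary mixes integers, floats, booleans and enum names.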


> Implement ds_hll_stringify()
> 
>
> Key: IMPALA-10133
> URL: https://issues.apache.org/jira/browse/IMPALA-10133
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Adam Tamas
>Assignee: Adam Tamas
>Priority: Major
>
> This function receives a string that is a serialized Apache DataSketches
> HLL sketch and returns its stringified format.
> The stringified format should look like the following and contain this data:
> {code:java}
> (select ds_hll_stringify(ds_hll_sketch(i)) from t;)
> +------------------------------------+
> | _c0                                |
> +------------------------------------+
> | ### HLL SKETCH SUMMARY: 
>   Log Config K   : 12
>   Hll Target     : HLL_4
>   Current Mode   : LIST
>   Memory         : true
>   LB             : 1.0
>   Estimate       : 1.0
>   UB             : 1.49929250618
>   OutOfOrder Flag: false
>   Coupon Count   : 1
>  |
> +------------------------------------+
> {code}






[jira] [Commented] (IMPALA-10119) TestImpalaShellInteractive.test_history_does_not_duplicate_on_interrupt

2020-09-03 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17190442#comment-17190442
 ] 

ASF subversion and git services commented on IMPALA-10119:
--

Commit 2359a1be9dc491f6c35fe3415265d4a29d6bc939 in impala's branch 
refs/heads/master from Tamas Mate
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=2359a1b ]

IMPALA-10119: Fix impala-shell history duplication test

The flaky test was
TestImpalaShellInteractive.test_history_does_not_duplicate_on_interrupt

The test failed with a timeout error when the interrupt signal arrived
after the next test query had already started. The impala-shell output was
^C instead of the expected query result.

This change adds an additional blocking expect call to wait for the
interrupt signal to arrive before sending in the next query.

Change-Id: I242eb47cc8093c4566de206f46b75b3feab1183c
Reviewed-on: http://gerrit.cloudera.org:8080/16391
Tested-by: Impala Public Jenkins 
Reviewed-by: Tim Armstrong 
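
The ordering described in the commit message can be sketched roughly as follows. This is not the actual patch: the pattern the real test waits for is not shown in this thread, so the "^C" echo used here is an assumption based on the failure output, and FakeChild is a hypothetical stand-in so that the sketch runs without a live impala-shell. With pexpect, `child` would be a pexpect.spawn object, which provides the same sendintr, expect and sendline methods.

```python
import re

# Pattern taken from the failing test line quoted in the Jira description.
FETCHED_RE = r"Fetched 1 row\(s\) in [0-9]+\.?[0-9]*s"


def interrupt_then_query(child, query="select 2;"):
    """Send an interrupt, block until the shell shows it, then run a query."""
    child.sendintr()
    # Assumption: the shell prints "^C" once the interrupt has been handled.
    # Blocking here keeps the interrupt from landing after the next query.
    child.expect(r"\^C")
    child.sendline(query)
    child.expect(FETCHED_RE)


class FakeChild:
    """Hypothetical stand-in that records calls instead of driving a shell."""

    def __init__(self):
        self.calls = []

    def sendintr(self):
        self.calls.append(("sendintr",))

    def sendline(self, line):
        self.calls.append(("sendline", line))

    def expect(self, pattern):
        self.calls.append(("expect", pattern))


fake = FakeChild()
interrupt_then_query(fake)
print([call[0] for call in fake.calls])
```

The point of the extra expect is purely ordering: without it, sendline can race with the interrupt, which matches the "select 2;" followed by "^C" seen in the buffer of the failed run.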


> TestImpalaShellInteractive.test_history_does_not_duplicate_on_interrupt
> ---
>
> Key: IMPALA-10119
> URL: https://issues.apache.org/jira/browse/IMPALA-10119
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 4.0
>Reporter: Tim Armstrong
>Assignee: Tamas Mate
>Priority: Critical
>  Labels: flaky
> Fix For: Impala 4.0
>
>
> This test was flaky.
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/3069/testReport/junit/shell.test_shell_interactive/TestImpalaShellInteractive/test_history_does_not_duplicate_on_interrupt_table_format_and_file_extensiontextfile_txt_protocol__hs2_/
> {noformat}
> shell.test_shell_interactive.TestImpalaShellInteractive.test_history_does_not_duplicate_on_interrupt[table_format_and_file_extension:
>  ('textfile', '.txt') | protocol: hs2] (from pytest)
> Failing for the past 1 build (Since Failed#3069 )
> Took 36 sec.
> Error Message
> shell/test_shell_interactive.py:532: in 
> test_history_does_not_duplicate_on_interrupt child_proc.expect("Fetched 1 
> row\(s\) in [0-9]+\.?[0-9]*s") 
> ../infra/python/env-gcc7.5.0/lib/python2.7/site-packages/pexpect/__init__.py:1451:
>  in expect timeout, searchwindowsize) 
> ../infra/python/env-gcc7.5.0/lib/python2.7/site-packages/pexpect/__init__.py:1466:
>  in expect_list timeout, searchwindowsize) 
> ../infra/python/env-gcc7.5.0/lib/python2.7/site-packages/pexpect/__init__.py:1568:
>  in expect_loop raise TIMEOUT(str(err) + '\n' + str(self)) E   TIMEOUT: 
> Timeout exceeded. EE   version: 
> 3.3 E   command: 
> /home/ubuntu/Impala/shell/build/impala-shell-4.0.0-SNAPSHOT/impala-shell E   
> args: 
> ['/home/ubuntu/Impala/shell/build/impala-shell-4.0.0-SNAPSHOT/impala-shell', 
> '--protocol=hs2', '-ilocalhost:21050'] E   searcher:  object at 0x7f2d51b99a10> E   buffer (last 100 chars): ' default> select 
> 2;\r\n^C\r\n[localhost:21050] default> ' E   before (last 100 chars): ' 
> default> select 2;\r\n^C\r\n[localhost:21050] default> ' E   after:  'pexpect.TIMEOUT'> E   match: None E   match_index: None E   exitstatus: None 
> E   flag_eof: False E   pid: 12993 E   child_fd: 24 E   closed: False E   
> timeout: 30 E   delimiter:  E   logfile: None E   
> logfile_read: None E   logfile_send: None E   maxread: 2000 E   ignorecase: 
> False E   searchwindowsize: None E   delaybeforesend: 0.05 E   
> delayafterclose: 0.1 E   delayafterterminate: 0.1
> Stacktrace
> shell/test_shell_interactive.py:532: in 
> test_history_does_not_duplicate_on_interrupt
> child_proc.expect("Fetched 1 row\(s\) in [0-9]+\.?[0-9]*s")
> ../infra/python/env-gcc7.5.0/lib/python2.7/site-packages/pexpect/__init__.py:1451:
>  in expect
> timeout, searchwindowsize)
> ../infra/python/env-gcc7.5.0/lib/python2.7/site-packages/pexpect/__init__.py:1466:
>  in expect_list
> timeout, searchwindowsize)
> ../infra/python/env-gcc7.5.0/lib/python2.7/site-packages/pexpect/__init__.py:1568:
>  in expect_loop
> raise TIMEOUT(str(err) + '\n' + str(self))
> E   TIMEOUT: Timeout exceeded.
> E   
> E   version: 3.3
> E   command: 
> /home/ubuntu/Impala/shell/build/impala-shell-4.0.0-SNAPSHOT/impala-shell
> E   args: 
> ['/home/ubuntu/Impala/shell/build/impala-shell-4.0.0-SNAPSHOT/impala-shell', 
> '--protocol=hs2', '-ilocalhost:21050']
> E   searcher: 
> E   buffer (last 100 chars): ' default> select 2;\r\n^C\r\n[localhost:21050] 
> default> '
> E   before (last 100 chars): ' default> select 2;\r\n^C\r\n[localhost:21050] 
> default> '
> E   after: 
> E   match: None
> E   match_index: None
> E   exitstatus: None
> E   flag_eof: False
> E   pid: 12993
> E   child_fd: 24
> E   closed: False
> E   timeout: 30
> E   delimiter: 
> E   logfile: None
> E   

[jira] [Resolved] (IMPALA-10010) Allow unathenticated access to some webui endpoints

2020-09-03 Thread Thomas Tauber-Marshall (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Tauber-Marshall resolved IMPALA-10010.
-
Fix Version/s: Impala 4.0
   Resolution: Fixed

> Allow unathenticated access to some webui endpoints
> ---
>
> Key: IMPALA-10010
> URL: https://issues.apache.org/jira/browse/IMPALA-10010
> Project: IMPALA
>  Issue Type: Task
>  Components: Clients
>Reporter: Thomas Tauber-Marshall
>Assignee: Thomas Tauber-Marshall
>Priority: Major
> Fix For: Impala 4.0
>
>
> Currently, when security is turned on for the webui, e.g. with 
> --webserver_require_ldap or --webserver_require_spnego, authentication is 
> applied to all webui endpoints.
> However, some endpoints expose only low-sensitivity info, e.g. /healthz, and 
> are scraped by other systems that may be difficult to provision with 
> credentials, e.g. a Kubernetes health check or Prometheus monitoring. It 
> would be useful to provide a way to allow unauthenticated access to those 
> endpoints.
> One option would be to run another instance of the webserver on another port. 
> This instance could be unsecured and expose only a few low-sensitivity 
> endpoints. This would allow for a configuration where Impala runs in a 
> private network and the main webserver port is exposed externally, e.g. 
> through an nginx gateway, while the port for the second webserver is kept 
> available only to internal systems.
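
To illustrate the kind of credential-less scraper the description has in mind, here is a minimal sketch of a health probe that sends a plain GET with no authentication. The function name endpoint_is_healthy and its parameters are hypothetical; only the /healthz path comes from the issue text, and how the endpoint ends up exposed (second webserver, port, flags) is not specified in this thread.

```python
import urllib.error
import urllib.request


def endpoint_is_healthy(base_url, path="/healthz", timeout=2.0):
    """Return True if an unauthenticated GET on base_url + path answers 200."""
    # Design choice for this sketch: ignore any proxy settings from the
    # environment, since an internal health check should hit the host directly.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(base_url + path, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and error replies such as 401.
        return False
```

A probe like this would be called as endpoint_is_healthy("http://impalad.internal:PORT"), with the host and port being placeholders; it returns False rather than raising when the endpoint still demands credentials.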





[jira] [Created] (IMPALA-10143) TestAcid.test_full_acid_original_files

2020-09-03 Thread Tamas Mate (Jira)
Tamas Mate created IMPALA-10143:
---

 Summary: TestAcid.test_full_acid_original_files
 Key: IMPALA-10143
 URL: https://issues.apache.org/jira/browse/IMPALA-10143
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 4.0
Reporter: Tamas Mate
Assignee: Zoltán Borók-Nagy
 Attachments: 
https_^^jenkins.impala.io^job^ubuntu-16.04-dockerised-tests^3077^.log

This test seems to be flaky.
 
[https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/3077/testReport/junit/query_test.test_acid/TestAcid/test_full_acid_original_files_protocol__beeswax___exec_optionbatch_size___0___num_nodes___0___disable_codegen_rows_threshold___5000___disable_codegen___False___abort_on_error___1___exec_single_node_rows_threshold___0table_format__text_none_/]
{code:java}
 query_test/test_acid.py:153: in test_full_acid_original_files
self.run_test_case('QueryTest/full-acid-original-file', vector, 
unique_database)
common/impala_test_suite.py:693: in run_test_case
self.__verify_results_and_errors(vector, test_section, result, use_db)
common/impala_test_suite.py:529: in __verify_results_and_errors
replace_filenames_with_placeholder)
common/test_result_verifier.py:456: in verify_raw_results
VERIFIER_MAP[verifier](expected, actual)
common/test_result_verifier.py:278: in verify_query_result_is_equal
assert expected_results == actual_results
E   assert Comparing QueryTestResults (expected vs actual):
E 0,0,0 != 0,19,0
E 0,1,1 != 0,20,1
E 0,2,2 != 0,21,2
E 0,3,3 != 0,22,3
E 0,4,4 != 0,23,4
{code}

The test was added in IMPALA-9515.
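
One observation that may help whoever picks this up: the mismatching rows differ only in the second column, and always by the same amount. The short sketch below just does that arithmetic on the five lines from the report above. It makes no claim about what the columns mean or about the root cause; the helper name column_deltas is hypothetical.

```python
# Mismatch lines copied from the assertion output above.
MISMATCHES = """\
0,0,0 != 0,19,0
0,1,1 != 0,20,1
0,2,2 != 0,21,2
0,3,3 != 0,22,3
0,4,4 != 0,23,4
"""


def column_deltas(report):
    """Return, per row, the actual-minus-expected difference of each column."""
    deltas = []
    for line in report.splitlines():
        expected, actual = (side.strip().split(",") for side in line.split("!="))
        deltas.append(tuple(int(a) - int(e) for e, a in zip(expected, actual)))
    return deltas


print(set(column_deltas(MISMATCHES)))  # {(0, 19, 0)}
```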






[jira] [Resolved] (IMPALA-10119) TestImpalaShellInteractive.test_history_does_not_duplicate_on_interrupt

2020-09-03 Thread Tamas Mate (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tamas Mate resolved IMPALA-10119.
-
Fix Version/s: Impala 4.0
   Resolution: Fixed

> TestImpalaShellInteractive.test_history_does_not_duplicate_on_interrupt
> ---
>
> Key: IMPALA-10119
> URL: https://issues.apache.org/jira/browse/IMPALA-10119
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 4.0
>Reporter: Tim Armstrong
>Assignee: Tamas Mate
>Priority: Critical
>  Labels: flaky
> Fix For: Impala 4.0
>
>
> This test was flaky.
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/3069/testReport/junit/shell.test_shell_interactive/TestImpalaShellInteractive/test_history_does_not_duplicate_on_interrupt_table_format_and_file_extensiontextfile_txt_protocol__hs2_/
> {noformat}
> shell.test_shell_interactive.TestImpalaShellInteractive.test_history_does_not_duplicate_on_interrupt[table_format_and_file_extension:
>  ('textfile', '.txt') | protocol: hs2] (from pytest)
> Failing for the past 1 build (Since Failed#3069 )
> Took 36 sec.
> Error Message
> shell/test_shell_interactive.py:532: in 
> test_history_does_not_duplicate_on_interrupt child_proc.expect("Fetched 1 
> row\(s\) in [0-9]+\.?[0-9]*s") 
> ../infra/python/env-gcc7.5.0/lib/python2.7/site-packages/pexpect/__init__.py:1451:
>  in expect timeout, searchwindowsize) 
> ../infra/python/env-gcc7.5.0/lib/python2.7/site-packages/pexpect/__init__.py:1466:
>  in expect_list timeout, searchwindowsize) 
> ../infra/python/env-gcc7.5.0/lib/python2.7/site-packages/pexpect/__init__.py:1568:
>  in expect_loop raise TIMEOUT(str(err) + '\n' + str(self)) E   TIMEOUT: 
> Timeout exceeded. EE   version: 
> 3.3 E   command: 
> /home/ubuntu/Impala/shell/build/impala-shell-4.0.0-SNAPSHOT/impala-shell E   
> args: 
> ['/home/ubuntu/Impala/shell/build/impala-shell-4.0.0-SNAPSHOT/impala-shell', 
> '--protocol=hs2', '-ilocalhost:21050'] E   searcher:  object at 0x7f2d51b99a10> E   buffer (last 100 chars): ' default> select 
> 2;\r\n^C\r\n[localhost:21050] default> ' E   before (last 100 chars): ' 
> default> select 2;\r\n^C\r\n[localhost:21050] default> ' E   after:  'pexpect.TIMEOUT'> E   match: None E   match_index: None E   exitstatus: None 
> E   flag_eof: False E   pid: 12993 E   child_fd: 24 E   closed: False E   
> timeout: 30 E   delimiter:  E   logfile: None E   
> logfile_read: None E   logfile_send: None E   maxread: 2000 E   ignorecase: 
> False E   searchwindowsize: None E   delaybeforesend: 0.05 E   
> delayafterclose: 0.1 E   delayafterterminate: 0.1
> Stacktrace
> shell/test_shell_interactive.py:532: in 
> test_history_does_not_duplicate_on_interrupt
> child_proc.expect("Fetched 1 row\(s\) in [0-9]+\.?[0-9]*s")
> ../infra/python/env-gcc7.5.0/lib/python2.7/site-packages/pexpect/__init__.py:1451:
>  in expect
> timeout, searchwindowsize)
> ../infra/python/env-gcc7.5.0/lib/python2.7/site-packages/pexpect/__init__.py:1466:
>  in expect_list
> timeout, searchwindowsize)
> ../infra/python/env-gcc7.5.0/lib/python2.7/site-packages/pexpect/__init__.py:1568:
>  in expect_loop
> raise TIMEOUT(str(err) + '\n' + str(self))
> E   TIMEOUT: Timeout exceeded.
> E   
> E   version: 3.3
> E   command: 
> /home/ubuntu/Impala/shell/build/impala-shell-4.0.0-SNAPSHOT/impala-shell
> E   args: 
> ['/home/ubuntu/Impala/shell/build/impala-shell-4.0.0-SNAPSHOT/impala-shell', 
> '--protocol=hs2', '-ilocalhost:21050']
> E   searcher: 
> E   buffer (last 100 chars): ' default> select 2;\r\n^C\r\n[localhost:21050] 
> default> '
> E   before (last 100 chars): ' default> select 2;\r\n^C\r\n[localhost:21050] 
> default> '
> E   after: 
> E   match: None
> E   match_index: None
> E   exitstatus: None
> E   flag_eof: False
> E   pid: 12993
> E   child_fd: 24
> E   closed: False
> E   timeout: 30
> E   delimiter: 
> E   logfile: None
> E   logfile_read: None
> E   logfile_send: None
> E   maxread: 2000
> E   ignorecase: False
> E   searchwindowsize: None
> E   delaybeforesend: 0.05
> E   delayafterclose: 0.1
> E   delayafterterminate: 0.1
> {noformat}
> The test was added in IMPALA-9398.
> Looks like it's stuck in this line of code waiting for the fetched message:
> {code}
> child_proc.sendintr()
> child_proc.sendline("select 2;")
> child_proc.expect("Fetched 1 row\(s\) in [0-9]+\.?[0-9]*s")
> child_proc.sendline("quit;")
> child_proc.wait()
> {code}
> [~tmate] can you have a look? It wasn't immediately obvious to me what 
> happened, although maybe the interrupt was handled after the select 2 command 
> - I see ^C after select 2 in the output?





[jira] [Comment Edited] (IMPALA-10142) Add RPC sender tracing

2020-09-03 Thread Sahil Takiar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17190302#comment-17190302
 ] 

Sahil Takiar edited comment on IMPALA-10142 at 9/3/20, 5:03 PM:


Actually, this would only really be useful if the RPC response includes some 
trace information as well; otherwise it is hard to capture the time actually 
spent on the network. Currently, the {{TransmitDataResponsePB}} just includes 
the {{receiver_latency_ns}}. Adding that to the trace would be useful; other 
things, such as the timestamp when the RPC was received by the receiver, time 
in queue, etc., would be useful as well.

The timestamp of when the RPC was received by the sender would be particularly 
useful in debugging RPCs where the network is slow.
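
As a rough illustration of why the response needs to carry more than one number, the sketch below subtracts the receiver-reported latency from the total measured on the sender. It assumes that receiver_latency_ns covers all the time the RPC spent inside the receiver, which this thread does not confirm; the function name and the sample figures are hypothetical.

```python
def estimate_network_ns(sender_total_ns, receiver_latency_ns):
    """Estimate time spent outside the receiver: sender total minus receiver latency."""
    if receiver_latency_ns > sender_total_ns:
        raise ValueError("receiver latency exceeds the sender-side total")
    return sender_total_ns - receiver_latency_ns


# Hypothetical figures: 120 ms end to end, 95 ms of it reported by the receiver.
print(estimate_network_ns(sender_total_ns=120_000_000, receiver_latency_ns=95_000_000))
```

The remainder still lumps together serialization, sender-side queueing and both network directions, which is why receiver-side timestamps in the response would make the breakdown more precise.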


was (Author: stakiar):
Actually, this would only really be useful if the RPC response includes some 
trace information as well. Currently, the {{TransmitDataResponsePB}} just 
includes the {{receiver_latency_ns}}. Adding that to the trace would be useful; 
other things, such as the timestamp when the RPC was received by the receiver, 
time in queue, etc., would be useful as well.

> Add RPC sender tracing
> --
>
> Key: IMPALA-10142
> URL: https://issues.apache.org/jira/browse/IMPALA-10142
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Sahil Takiar
>Priority: Major
>
> We currently have RPC tracing on the receiver side, but not on the sender 
> side. For slow RPCs, the logs print out the total amount of time spent 
> sending the RPC + the network time. Adding tracing will basically make this 
> more granular. It will help determine where exactly in the stack the time was 
> spent when sending RPCs.
> Combined with the trace logs in the receiver, it should be much easier to 
> determine the timeline of a given slow RPC.






[jira] [Commented] (IMPALA-10142) Add RPC sender tracing

2020-09-03 Thread Sahil Takiar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17190302#comment-17190302
 ] 

Sahil Takiar commented on IMPALA-10142:
---

Actually, this would only really be useful if the RPC response includes some 
trace information as well. Currently, the {{TransmitDataResponsePB}} just 
includes the {{receiver_latency_ns}}. Adding that to the trace would be useful; 
other things, such as the timestamp when the RPC was received by the receiver, 
time in queue, etc., would be useful as well.

> Add RPC sender tracing
> --
>
> Key: IMPALA-10142
> URL: https://issues.apache.org/jira/browse/IMPALA-10142
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Sahil Takiar
>Priority: Major
>
> We currently have RPC tracing on the receiver side, but not on the sender 
> side. For slow RPCs, the logs print out the total amount of time spent 
> sending the RPC + the network time. Adding tracing will basically make this 
> more granular. It will help determine where exactly in the stack the time was 
> spent when sending RPCs.
> Combined with the trace logs in the receiver, it should be much easier to 
> determine the timeline of a given slow RPC.






[jira] [Created] (IMPALA-10142) Add RPC sender tracing

2020-09-03 Thread Sahil Takiar (Jira)
Sahil Takiar created IMPALA-10142:
-

 Summary: Add RPC sender tracing
 Key: IMPALA-10142
 URL: https://issues.apache.org/jira/browse/IMPALA-10142
 Project: IMPALA
  Issue Type: Improvement
Reporter: Sahil Takiar


We currently have RPC tracing on the receiver side, but not on the sender 
side. For slow RPCs, the logs print out the total amount of time spent sending 
the RPC + the network time. Adding tracing will basically make this more 
granular. It will help determine where exactly in the stack the time was spent 
when sending RPCs.

Combined with the trace logs in the receiver, it should be much easier to 
determine the timeline of a given slow RPC.
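
To make "more granular" concrete, here is a minimal sketch of what a sender-side trace could record: a list of labelled events with the time elapsed since the RPC started. The class SenderTrace, its methods and the stage labels are all hypothetical and are not Impala or KRPC APIs; the clock is injectable only so the example is deterministic.

```python
import time


class SenderTrace:
    """Hypothetical per-RPC trace: labelled events with elapsed nanoseconds."""

    def __init__(self, clock=time.monotonic_ns):
        self._clock = clock
        self._start = clock()
        self.events = []

    def message(self, label):
        """Record `label` together with the time elapsed since the trace began."""
        self.events.append((self._clock() - self._start, label))

    def dump(self):
        """Render the timeline, one event per line."""
        return "\n".join("%12d ns  %s" % (t, label) for t, label in self.events)


# Deterministic fake clock so the example output does not depend on timing.
ticks = iter([0, 1_000, 5_000, 40_000])
trace = SenderTrace(clock=lambda: next(ticks))
trace.message("serialized row batch")   # hypothetical stage names
trace.message("handed to transport")
trace.message("response received")
print(trace.dump())
```

Lining up such a timeline with the receiver's trace would show which stage dominated a slow RPC, instead of a single total.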





[jira] [Commented] (IMPALA-10139) Slow RPC logs can be misleading

2020-09-03 Thread Sahil Takiar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17190296#comment-17190296
 ] 

Sahil Takiar commented on IMPALA-10139:
---

One other thought is that with result spooling enabled, the back-pressure 
mechanism won't be such a big issue anymore because results will all get 
spooled, regardless of whether clients fetch results slowly or not.

> Slow RPC logs can be misleading
> ---
>
> Key: IMPALA-10139
> URL: https://issues.apache.org/jira/browse/IMPALA-10139
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Sahil Takiar
>Priority: Major
>
> The slow RPC logs added in IMPALA-9128 are based on the total time taken to 
> successfully complete an RPC. The issue is that there are many reasons why an 
> RPC might take a long time to complete. An RPC is considered complete only 
> when the receiver has processed that RPC. 
> The problem is that, due to the client-driven back-pressure mechanism, it is 
> entirely possible that the receiver does not process an incoming RPC 
> because {{KrpcDataStreamRecvr::SenderQueue::GetBatch}} just hasn't been 
> called yet (it is called indirectly by {{ExchangeNode::GetNext}}).
> This can lead to a flood of slow RPC logs, even though the RPCs might not 
> actually be slow themselves. What is worse is that, because of the 
> back-pressure mechanism, slowness from the client (e.g. Hue users) will 
> propagate across all nodes involved in the query.






[jira] [Commented] (IMPALA-10139) Slow RPC logs can be misleading

2020-09-03 Thread Sahil Takiar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-10139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17190295#comment-17190295
 ] 

Sahil Takiar commented on IMPALA-10139:
---

I think there is a similar issue with the TRACE logs. In the example TRACE 
above, the RPC spent the majority of its time in the deferred state, e.g. 
because there were not enough resources to process it. Again, this just means 
that the back-pressure mechanism was kicking in, not necessarily that the 
network was slow.
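
To make the distinction concrete, here is a minimal sketch of how a slow-RPC check could separate time spent in the deferred state from the rest of the RPC's lifetime. This is only an illustration of the idea, not what Impala does: `SlowRpcSketch`, `classify`, the `Verdict` names, and the threshold are all hypothetical.

```java
// Hypothetical classification of a slow RPC; names and threshold are illustrative.
final class SlowRpcSketch {
  enum Verdict { OK, SLOW_DUE_TO_BACK_PRESSURE, SLOW_OTHER }

  // totalMs: time from sending the RPC until the receiver finished processing it.
  // deferredMs: portion of totalMs the RPC spent waiting in the deferred state.
  static Verdict classify(long totalMs, long deferredMs, long thresholdMs) {
    if (deferredMs < 0 || deferredMs > totalMs) {
      throw new IllegalArgumentException("deferredMs must be within [0, totalMs]");
    }
    if (totalMs < thresholdMs) return Verdict.OK;
    // Only the non-deferred part says anything about the network or the
    // receiver's processing speed.
    long nonDeferredMs = totalMs - deferredMs;
    return nonDeferredMs >= thresholdMs
        ? Verdict.SLOW_OTHER
        : Verdict.SLOW_DUE_TO_BACK_PRESSURE;
  }

  public static void main(String[] args) {
    // An RPC that took 5s in total but sat deferred for 4.9s of it.
    System.out.println(classify(5000, 4900, 2000));
    // An RPC that took 5s with almost no deferral.
    System.out.println(classify(5000, 50, 2000));
  }
}
```

Under this kind of split, the first RPC in `main` would be attributed to back-pressure rather than reported as a slow network transfer.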







[jira] [Resolved] (IMPALA-10106) Update DataSketches to version 2.1.0

2020-09-03 Thread Adam Tamas (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Tamas resolved IMPALA-10106.
-
Resolution: Done

> Update DataSketches to version 2.1.0
> 
>
> Key: IMPALA-10106
> URL: https://issues.apache.org/jira/browse/IMPALA-10106
> Project: IMPALA
>  Issue Type: Improvement
>Reporter: Adam Tamas
>Assignee: Adam Tamas
>Priority: Minor
>
> Update the external DataSketches files for HLL/KLL to version 2.1.0








[jira] [Updated] (IMPALA-10087) IMPALA-6050 causes alluxio not to be supported

2020-09-03 Thread abeltian (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

abeltian updated IMPALA-10087:
--
Labels: feature  (was: )

>  IMPALA-6050 causes alluxio not to be supported
> ---
>
> Key: IMPALA-10087
> URL: https://issues.apache.org/jira/browse/IMPALA-10087
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.3.0, Impala 3.4.0
>Reporter: abeltian
>Assignee: abeltian
>Priority: Major
>  Labels: feature
> Fix For: Impala 4.0
>
> Attachments: image-2020-08-18-20-09-31-714.png
>
>
> IMPALA-6050 causes alluxio not to be supported.
> {code:java}
> if (sampledFiles != null) {
>   numPartitionsPerFs_ = sampledFiles.keySet().stream().collect(
>       Collectors.groupingBy(SampledPartitionMetadata::getPartitionFsType,
>           Collectors.counting()));
> } else {
>   numPartitionsPerFs_.putAll(partitions_.stream().collect(
>       Collectors.groupingBy(FeFsPartition::getFsType,
>           Collectors.counting())));
> }{code}
> {code:java}
> public enum FsType {
>   ADLS,
>   HDFS,
>   LOCAL,
>   S3,
>   OZONE;
>   private static final Map<String, FsType> SCHEME_TO_FS_MAPPING =
>       ImmutableMap.<String, FsType>builder()
>           .put("abfs", ADLS)
>           .put("abfss", ADLS)
>           .put("adl", ADLS)
>           .put("file", LOCAL)
>           .put("hdfs", HDFS)
>           .put("s3a", S3)
>           .put("o3fs", OZONE)
>           .build();{code}
> {code:java}
> 10:58:47.793 AM INFO cc:288 10:58:47.793 AM INFO cc:288 
> 0543f5853d94c127:7a4e8369] java.lang.NullPointerException: element cannot be mapped to a null key
>  at java.util.Objects.requireNonNull(Objects.java:228)
>  at java.util.stream.Collectors.lambda$groupingBy$45(Collectors.java:907)
>  at java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169)
>  at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
>  at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
>  at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
>  at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
>  at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>  at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
>  at org.apache.impala.planner.HdfsScanNode.computeScanRangeLocations(HdfsScanNode.java:822)
>  at org.apache.impala.planner.HdfsScanNode.init(HdfsScanNode.java:381)
> {code}
>  
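
The stack trace above is consistent with `Collectors.groupingBy` rejecting a null classification key, which would happen if a partition's scheme (for example `alluxio`) has no entry in `SCHEME_TO_FS_MAPPING`. The following is a minimal, self-contained sketch of that failure mode, not Impala's code: `FsTypeSketch`, `fromScheme`, and both `countPerFs...` methods are hypothetical names, a plain `HashMap` stands in for the Guava map, and skipping unmapped schemes is only one possible guard (the actual fix in Impala may differ).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

// Hypothetical stand-in for the enum quoted above; not Impala's actual class.
final class FsTypeSketch {
  enum FsType { ADLS, HDFS, LOCAL, S3, OZONE }

  private static final Map<String, FsType> SCHEME_TO_FS_MAPPING = new HashMap<>();
  static {
    SCHEME_TO_FS_MAPPING.put("abfs", FsType.ADLS);
    SCHEME_TO_FS_MAPPING.put("abfss", FsType.ADLS);
    SCHEME_TO_FS_MAPPING.put("adl", FsType.ADLS);
    SCHEME_TO_FS_MAPPING.put("file", FsType.LOCAL);
    SCHEME_TO_FS_MAPPING.put("hdfs", FsType.HDFS);
    SCHEME_TO_FS_MAPPING.put("s3a", FsType.S3);
    SCHEME_TO_FS_MAPPING.put("o3fs", FsType.OZONE);
  }

  // Returns null for schemes without a mapping, e.g. "alluxio".
  static FsType fromScheme(String scheme) {
    return SCHEME_TO_FS_MAPPING.get(scheme);
  }

  // Throws NullPointerException if any scheme maps to null, because
  // Collectors.groupingBy requires a non-null classification key.
  static Map<FsType, Long> countPerFsUnsafe(List<String> schemes) {
    return schemes.stream().collect(
        Collectors.groupingBy(FsTypeSketch::fromScheme, Collectors.counting()));
  }

  // One possible guard (an assumption): drop unmapped schemes before grouping.
  static Map<FsType, Long> countPerFsSkippingUnknown(List<String> schemes) {
    return schemes.stream()
        .map(FsTypeSketch::fromScheme)
        .filter(Objects::nonNull)
        .collect(Collectors.groupingBy(t -> t, Collectors.counting()));
  }

  public static void main(String[] args) {
    List<String> schemes = Arrays.asList("hdfs", "hdfs", "alluxio");
    try {
      countPerFsUnsafe(schemes);
    } catch (NullPointerException e) {
      System.out.println("NullPointerException: " + e.getMessage());
    }
    System.out.println(countPerFsSkippingUnknown(schemes));
  }
}
```

Another option would be to add the missing scheme to the mapping itself, so that such partitions are counted rather than skipped.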






[jira] [Resolved] (IMPALA-10087) IMPALA-6050 causes alluxio not to be supported

2020-09-03 Thread abeltian (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-10087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

abeltian resolved IMPALA-10087.
---
Fix Version/s: Impala 4.0
   Resolution: Fixed






