[jira] [Commented] (IMPALA-2751) quote in WITH block's comment breaks shell
[ https://issues.apache.org/jira/browse/IMPALA-2751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494695#comment-16494695 ]

Fredy Wijaya commented on IMPALA-2751:
--------------------------------------
I ran the whole set of shell tests on Python 2.6 and the fix works. I'll submit a new patch after the revert.

> quote in WITH block's comment breaks shell
> ------------------------------------------
>
>                 Key: IMPALA-2751
>                 URL: https://issues.apache.org/jira/browse/IMPALA-2751
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Clients
>    Affects Versions: Impala 2.2
>         Environment: CDH5.4.8
>            Reporter: Marcell Szabo
>            Assignee: Fredy Wijaya
>            Priority: Minor
>              Labels: impala-shell, shell, usability
>             Fix For: Impala 2.13.0, Impala 3.1.0
>
> Steps to reproduce:
> $ cat > test.sql
> with a as (
> select 'a'
> -- shouldn't matter
> )
> select * from a;
> $ impala-shell -f test.sql
> /usr/bin/impala-shell: line 32: warning: setlocale: LC_CTYPE: cannot change locale (UTF-8): No such file or directory
> /usr/bin/impala-shell: line 32: warning: setlocale: LC_CTYPE: cannot change locale (UTF-8): No such file or directory
> Starting Impala Shell without Kerberos authentication
> Connected to host:21000
> Server version: impalad version 2.2.0-cdh5 RELEASE (build 1d0b017e2441dd8950924743d839f14b3995e259)
> Traceback (most recent call last):
>   File "/usr/lib/impala-shell/impala_shell.py", line 1006, in
>     execute_queries_non_interactive_mode(options)
>   File "/usr/lib/impala-shell/impala_shell.py", line 922, in execute_queries_non_interactive_mode
>     if shell.onecmd(query) is CmdStatus.ERROR:
>   File "/usr/lib64/python2.6/cmd.py", line 219, in onecmd
>     return func(arg)
>   File "/usr/lib/impala-shell/impala_shell.py", line 762, in do_with
>     tokens = list(lexer)
>   File "/usr/lib64/python2.6/shlex.py", line 269, in next
>     token = self.get_token()
>   File "/usr/lib64/python2.6/shlex.py", line 96, in get_token
>     raw = self.read_token()
>   File "/usr/lib64/python2.6/shlex.py", line 172, in read_token
>     raise ValueError, "No closing quotation"
> ValueError: No closing quotation
> Also, when copy-pasting the query interactively, the line never closes.
> Strangely, the issue only seems to occur in the presence of the WITH block.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
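The root cause is visible with shlex alone: the apostrophe in the SQL comment opens a shell-style quote that is never closed. A minimal sketch (Python 3 here; the report above is from Python 2.6, but shlex behaves the same way for this input):

```python
import shlex

# The WITH block's comment contains an apostrophe ("shouldn't"), which
# shlex treats as the start of a quoted string that never terminates.
query = "with a as ( select 'a' -- shouldn't matter ) select * from a"
lexer = shlex.shlex(query, posix=True)
try:
    tokens = list(lexer)
except ValueError as e:
    print(e)  # No closing quotation
```

shlex knows nothing about SQL's `--` line comments, so it tokenizes straight through them, which is why a quote character inside a comment can break the parse.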
[jira] [Created] (IMPALA-7094) Fallback to an unoptimized query if ExprRewriter fails
Tianyi Wang created IMPALA-7094:
-----------------------------------

             Summary: Fallback to an unoptimized query if ExprRewriter fails
                 Key: IMPALA-7094
                 URL: https://issues.apache.org/jira/browse/IMPALA-7094
             Project: IMPALA
          Issue Type: Improvement
          Components: Frontend
    Affects Versions: Impala 3.1.0
            Reporter: Tianyi Wang

The ExprRewriter in Impala is growing more complex and sometimes fails. Currently the user has to disable the rewrite to work around the failure until the bug is fixed. To provide a smoother user experience if there are ExprRewriter bugs in the future, Impala should automatically fall back to the unrewritten query and print a warning.
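The proposed behavior amounts to a guarded rewrite pass: try the rewrite, and on failure run the original statement with a warning. A sketch under assumed names (analyze_with_fallback and the rewrite callback are illustrations, not Impala's actual frontend API):

```python
def analyze_with_fallback(stmt, rewrite, warnings):
    """Try the expression rewrite; on any rewriter bug, fall back to the
    unrewritten statement and record a warning instead of failing the query."""
    try:
        return rewrite(stmt)
    except Exception as e:  # a rewriter bug should not fail the query
        warnings.append("Expression rewrite failed (%s); running the "
                        "unoptimized query instead." % e)
        return stmt

def broken_rewrite(stmt):
    # Stand-in for an ExprRewriter hitting a bug.
    raise RuntimeError("rewriter bug")

warnings = []
print(analyze_with_fallback("select 1 + 1", broken_rewrite, warnings))
print(warnings)
```

The trade-off is that a fallback can mask rewriter regressions, which is why the sketch surfaces a warning rather than failing silently.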
[jira] [Commented] (IMPALA-2751) quote in WITH block's comment breaks shell
[ https://issues.apache.org/jira/browse/IMPALA-2751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494610#comment-16494610 ]

Fredy Wijaya commented on IMPALA-2751:
--------------------------------------
Ah, the infamous unicode issue in Python 2.6. I don't have access to the build machines. Can you add .encode('utf-8') to the sqlparse.format result and run the tests again? I built Python 2.6.9 manually from source, added .encode('utf-8'), and the query tokenizes correctly:
{noformat}
>>> query = "with y as (values(7)) insert into test_kudu_dml_reporting_256dcf63.dml_test (id) select * from y"
>>> formatted_query = sqlparse.format(query.lstrip(), strip_comments=True).encode('utf-8')
>>> lexer = shlex.shlex(formatted_query, posix=True)
>>> print(list(lexer))
['with', 'y', 'as', '(', 'values', '(', '7', ')', ')', 'insert', 'into', 'test_kudu_dml_reporting_256dcf63', '.', 'dml_test', '(', 'id', ')', 'select', '*', 'from', 'y']
{noformat}
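On Python 3 the same pipeline — strip SQL comments first, then tokenize — works without the encode step, which was only needed because Python 2.6's shlex mishandles unicode input. A sketch that uses a naive regex in place of sqlparse.format(..., strip_comments=True) (sqlparse handles far more cases, e.g. `--` inside string literals, so the regex is only an illustration):

```python
import re
import shlex

def strip_sql_comments(query):
    # Naive stand-in for sqlparse.format(query, strip_comments=True):
    # drop "-- ..." line comments. (Incorrect for '--' inside strings.)
    return re.sub(r"--[^\n]*", "", query)

query = ("with y as (values(7)) -- a comment with a quote: don't\n"
         "insert into dml_test (id) select * from y")
tokens = list(shlex.shlex(strip_sql_comments(query), posix=True))
print(tokens)
```

Feeding the unstripped query straight to shlex would raise `ValueError: No closing quotation` because of the apostrophe in the comment; stripping first yields a clean token list.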
[jira] [Commented] (IMPALA-6953) Improve encapsulation within DiskIoMgr
[ https://issues.apache.org/jira/browse/IMPALA-6953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494549#comment-16494549 ]

ASF subversion and git services commented on IMPALA-6953:
---------------------------------------------------------
Commit 564687265247d957fb8cda26bcf86fcf6a80f87a in impala's branch refs/heads/2.x from [~tarmstr...@cloudera.com]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=5646872 ]

IMPALA-6953: part 1: clean up DiskIoMgr

There should be no behavioural changes as a result of this refactoring.

Make DiskQueue an encapsulated class. Remove friend classes where possible, either by using public methods or moving code between classes. Move methods into protected sections in some cases.

Split GetNextRequestRange() into two methods that operate on DiskQueue and RequestContext state, respectively. The methods belong to the respective classes.

Reduce transitive #include dependencies to hopefully help with build time.

Testing: Ran core tests.

Change-Id: I50b448834b832a0ee0dc5d85541691cd2f308e12
Reviewed-on: http://gerrit.cloudera.org:8080/10538
Reviewed-by: Thomas Marshall
Tested-by: Thomas Marshall

> Improve encapsulation within DiskIoMgr
> --------------------------------------
>
>                 Key: IMPALA-6953
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6953
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>            Reporter: Tim Armstrong
>            Assignee: Tim Armstrong
>            Priority: Major
>
> While DiskIoMgr is still fresh in my mind, I should do some refactoring to improve the encapsulation within io::. Currently lots of classes are friends with each other and some code is not in the most appropriate class.
[jira] [Commented] (IMPALA-7093) Tables briefly appear to not exist after INVALIDATE METADATA or catalog restart
[ https://issues.apache.org/jira/browse/IMPALA-7093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494542#comment-16494542 ]

Todd Lipcon commented on IMPALA-7093:
-------------------------------------
Testing on Impala 2.10, I haven't been able to reproduce this, so it appears this might be a regression, though I'll keep trying to reproduce it there.

> Tables briefly appear to not exist after INVALIDATE METADATA or catalog restart
> -------------------------------------------------------------------------------
>
>                 Key: IMPALA-7093
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7093
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>    Affects Versions: Impala 2.12.0, Impala 2.13.0
>            Reporter: Todd Lipcon
>            Priority: Major
>
> I'm doing some stress testing of Impala 2.13 (a recent snapshot build) and hit the following sequence:
> {code}
> {"query": "SHOW TABLES in consistency_test", "type": "call", "id": 3}
> {"type": "response", "id": 3, "results": [["t1"]]}
> {"query": "INVALIDATE METADATA", "type": "call", "id": 7}
> {"type": "response", "id": 7}
> {"query": "DESCRIBE consistency_test.t1", "type": "call", "id": 9}
> {"type": "response", "id": 9, "error": "AnalysisException: Could not resolve path: 'consistency_test.t1'\n"}
> {code}
> i.e. 'SHOW TABLES' shows that a table exists, but shortly after an INVALIDATE METADATA, an attempt to describe the table indicates that it does not exist. This is a single-threaded test case against a single impalad.
> I also saw similar behavior where issuing queries to an impalad shortly after a catalogd restart could transiently show tables as not existing that in fact exist.
[jira] [Created] (IMPALA-7093) Tables briefly appear to not exist after INVALIDATE METADATA or catalog restart
Todd Lipcon created IMPALA-7093:
-----------------------------------

             Summary: Tables briefly appear to not exist after INVALIDATE METADATA or catalog restart
                 Key: IMPALA-7093
                 URL: https://issues.apache.org/jira/browse/IMPALA-7093
             Project: IMPALA
          Issue Type: Bug
          Components: Catalog
    Affects Versions: Impala 2.12.0, Impala 2.13.0
            Reporter: Todd Lipcon

I'm doing some stress testing of Impala 2.13 (a recent snapshot build) and hit the following sequence:
{code}
{"query": "SHOW TABLES in consistency_test", "type": "call", "id": 3}
{"type": "response", "id": 3, "results": [["t1"]]}
{"query": "INVALIDATE METADATA", "type": "call", "id": 7}
{"type": "response", "id": 7}
{"query": "DESCRIBE consistency_test.t1", "type": "call", "id": 9}
{"type": "response", "id": 9, "error": "AnalysisException: Could not resolve path: 'consistency_test.t1'\n"}
{code}
i.e. 'SHOW TABLES' shows that a table exists, but shortly after an INVALIDATE METADATA, an attempt to describe the table indicates that it does not exist. This is a single-threaded test case against a single impalad.

I also saw similar behavior where issuing queries to an impalad shortly after a catalogd restart could transiently show tables as not existing that in fact exist.
[jira] [Commented] (IMPALA-7073) Failed test: query_test.test_scanners.TestScannerReservation.test_scanners
[ https://issues.apache.org/jira/browse/IMPALA-7073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494532#comment-16494532 ] ASF subversion and git services commented on IMPALA-7073: - Commit e9bd917a218b5e1717fced983f70f64850c6e02f in impala's branch refs/heads/master from [~tarmstr...@cloudera.com] [ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=e9bd917 ] IMPALA-7073: skip TestScannerReservation on non-miniclusters The test is (sort of) tuned for miniclusters and is very targeted to testing a specific code path, rather than general functional correctness, so we don't really need coverage on all filesystems. Change-Id: I7952f780cff80c08a6cbef898bf7b95c9bba5f6a Reviewed-on: http://gerrit.cloudera.org:8080/10533 Reviewed-by: Thomas Marshall Tested-by: Impala Public Jenkins > Failed test: query_test.test_scanners.TestScannerReservation.test_scanners > -- > > Key: IMPALA-7073 > URL: https://issues.apache.org/jira/browse/IMPALA-7073 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.0 >Reporter: Dimitris Tsirogiannis >Assignee: Tim Armstrong >Priority: Blocker > Labels: broken-build, test-failure > > Possibly flaky test: > {code:java} > Stacktrace > query_test/test_scanners.py:1064: in test_scanners > self.run_test_case('QueryTest/scanner-reservation', vector) > common/impala_test_suite.py:451: in run_test_case > verify_runtime_profile(test_section['RUNTIME_PROFILE'], > result.runtime_profile) > common/test_result_verifier.py:590: in verify_runtime_profile > actual)) > E AssertionError: Did not find matches for lines in runtime profile: > E EXPECTED LINES: > E row_regex:.*InitialRangeActualReservation.*Avg: 4.00 MB.*{code} > {noformat} > 11:27:13 -- executing against localhost:21000 > 11:27:13 set debug_action="-1:OPEN:SET_DENY_RESERVATION_PROBABILITY@1.0"; > 11:27:13 > 11:27:13 -- executing against localhost:21000 > 11:27:13 > 11:27:13 select count(*) > 11:27:13 from tpch.customer; > 
11:27:13 > 11:27:13 -- executing against localhost:21000 > 11:27:13 SET DEBUG_ACTION=""; > 11:27:13 > 11:27:13 -- executing against localhost:21000 > 11:27:13 select min(l_comment) > 11:27:13 from tpch_parquet.lineitem; > {noformat} > {noformat} > 11:27:13 E - SpilledPartitions: 0 (0) > 11:27:13 E HDFS_SCAN_NODE (id=0):(Total: 2s295ms, non-child: > 2s295ms, % non-child: 100.00%) > 11:27:13 E Hdfs split stats (:<# splits>/ lengths>): -1:8/193.99 MB > 11:27:13 E ExecOption: PARQUET Codegen Enabled, Codegen enabled: > 8 out of 8 > 11:27:13 E Hdfs Read Thread Concurrency Bucket: 0:80% 1:20% 2:0% > 3:0% 4:0% 5:0% 6:0% > 11:27:13 E File Formats: PARQUET/NONE:5 PARQUET/SNAPPY:3 > 11:27:13 E BytesRead(500.000ms): 0, 21.31 MB, 21.60 MB, 47.94 MB, > 56.23 MB, 74.46 MB > 11:27:13 E - FooterProcessingTime: (Avg: 3.624ms ; Min: > 999.979us ; Max: 9.999ms ; Number of samples: 8) > 11:27:13 E - InitialRangeActualReservation: (Avg: 21.50 MB > (22544384) ; Min: 4.00 MB (4194304) ; Max: 24.00 MB (25165824) ; Number of > samples: 8) > 11:27:13 E - InitialRangeIdealReservation: (Avg: 128.00 KB > (131072) ; Min: 128.00 KB (131072) ; Max: 128.00 KB (131072) ; Number of > samples: 8) > 11:27:13 E - ParquetRowGroupActualReservation: (Avg: 24.00 MB > (25165824) ; Min: 24.00 MB (25165824) ; Max: 24.00 MB (25165824) ; Number of > samples: 3) > 11:27:13 E - ParquetRowGroupIdealReservation: (Avg: 24.00 MB > (25165824) ; Min: 24.00 MB (25165824) ; Max: 24.00 MB (25165824) ; Number of > samples: 3) > 11:27:13 E - AverageHdfsReadThreadConcurrency: 0.20 > 11:27:13 E - AverageScannerThreadConcurrency: 1.00 > 11:27:13 E - BytesRead: 74.55 MB (78175787) > 11:27:13 E - BytesReadDataNodeCache: 0 > 11:27:13 E - BytesReadLocal: 0 > 11:27:13 E - BytesReadRemoteUnexpected: 0 > 11:27:13 E - BytesReadShortCircuit: 0 > 11:27:13 E - CachedFileHandlesHitCount: 0 (0) > 11:27:13 E - CachedFileHandlesMissCount: 11 (11) > 11:27:13 E - CollectionItemsRead: 0 (0) > 11:27:13 E - DecompressionTime: 345.992ms > 11:27:13 
E - MaxCompressedTextFileLength: 0 > 11:27:13 E - NumColumns: 1 (1) > 11:27:13 E - NumDictFilteredRowGroups: 0 (0) > 11:27:13 E - NumDisksAccessed: 2 (2) > 11:27:13 E - NumRowGroups: 3 (3) > 11:27:13 E - NumScannerThreadReservationsDenied: 0 (0) > 11:27:13 E - NumScannerThreadsStarted: 1 (1) > 11:27:13 E
[jira] [Reopened] (IMPALA-2751) quote in WITH block's comment breaks shell
[ https://issues.apache.org/jira/browse/IMPALA-2751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Tauber-Marshall reopened IMPALA-2751: I put out a patch to revert this: https://gerrit.cloudera.org/#/c/10537/ This patch is causing test_kudu_dml_reporting to fail deterministically on the build machines. The failure doesn't repro for me locally, so I'm guessing its a python version issue, but I can repro it by logging into a build machine: {noformat} $ impala-python --version Python 2.6.6 $ impala-python >>> query="with y as (values(7)) insert into >>> test_kudu_dml_reporting_256dcf63.dml_test (id) select * from y" >>> lexer=shlex.shlex(query.lstrip(), posix=True) >>> print(list(lexer)) # The old way of parsing works ['with', 'y', 'as', '(', 'values', '(', '7', ')', ')', 'insert', 'into', 'test_kudu_dml_reporting_256dcf63', '.', 'dml_test', '(', 'id', ')', 'select', '*', 'from', 'y'] >>> lexer=shlex.shlex(sqlparse.format(query.lstrip(), strip_comments=True), >>> posix=True) >>> print(list(lexer)) # The new addition is causing weird parsing errors ['w', '\x00', '\x00', '\x00', 'i', '\x00', '\x00', '\x00', 't', '\x00', '\x00', '\x00', 'h', '\x00', '\x00', '\x00', '\x00', '\x00', '\x00', 'y', '\x00', '\x00', '\x00', '\x00', '\x00', '\x00', 'a', '\x00', '\x00', '\x00', 's', '\x00', '\x00', '\x00', '\x00', '\x00', '\x00', '(', '\x00', '\x00', '\x00', 'v', '\x00', '\x00', '\x00', 'a', '\x00', '\x00', '\x00', 'l', '\x00', '\x00', '\x00', 'u', '\x00', '\x00', '\x00', 'e', '\x00', '\x00', '\x00', 's', '\x00', '\x00', '\x00', '(', '\x00', '\x00', '\x00', '7', '\x00', '\x00', '\x00', ')', '\x00', '\x00', '\x00', ')', '\x00', '\x00', '\x00', '\x00', '\x00', '\x00', 'i', '\x00', '\x00', '\x00', 'n', '\x00', '\x00', '\x00', 's', '\x00', '\x00', '\x00', 'e', '\x00', '\x00', '\x00', 'r', '\x00', '\x00', '\x00', 't', '\x00', '\x00', '\x00', '\x00', '\x00', '\x00', 'i', '\x00', '\x00', '\x00', 'n', '\x00', '\x00', '\x00', 't', '\x00', '\x00', '\x00', 'o', 
'\x00', '\x00', '\x00', '\x00', '\x00', '\x00', 't', '\x00', '\x00', '\x00', 'e', '\x00', '\x00', '\x00', 's', '\x00', '\x00', '\x00', 't', '\x00', '\x00', '\x00', '_', '\x00', '\x00', '\x00', 'k', '\x00', '\x00', '\x00', 'u', '\x00', '\x00', '\x00', 'd', '\x00', '\x00', '\x00', 'u', '\x00', '\x00', '\x00', '_', '\x00', '\x00', '\x00', 'd', '\x00', '\x00', '\x00', 'm', '\x00', '\x00', '\x00', 'l', '\x00', '\x00', '\x00', '_', '\x00', '\x00', '\x00', 'r', '\x00', '\x00', '\x00', 'e', '\x00', '\x00', '\x00', 'p', '\x00', '\x00', '\x00', 'o', '\x00', '\x00', '\x00', 'r', '\x00', '\x00', '\x00', 't', '\x00', '\x00', '\x00', 'i', '\x00', '\x00', '\x00', 'n', '\x00', '\x00', '\x00', 'g', '\x00', '\x00', '\x00', '_', '\x00', '\x00', '\x00', '2', '\x00', '\x00', '\x00', '5', '\x00', '\x00', '\x00', '6', '\x00', '\x00', '\x00', 'd', '\x00', '\x00', '\x00', 'c', '\x00', '\x00', '\x00', 'f', '\x00', '\x00', '\x00', '6', '\x00', '\x00', '\x00', '3', '\x00', '\x00', '\x00', '.', '\x00', '\x00', '\x00', 'd', '\x00', '\x00', '\x00', 'm', '\x00', '\x00', '\x00', 'l', '\x00', '\x00', '\x00', '_', '\x00', '\x00', '\x00', 't', '\x00', '\x00', '\x00', 'e', '\x00', '\x00', '\x00', 's', '\x00', '\x00', '\x00', 't', '\x00', '\x00', '\x00', '\x00', '\x00', '\x00', '(', '\x00', '\x00', '\x00', 'i', '\x00', '\x00', '\x00', 'd', '\x00', '\x00', '\x00', ')', '\x00', '\x00', '\x00', '\x00', '\x00', '\x00', 's', '\x00', '\x00', '\x00', 'e', '\x00', '\x00', '\x00', 'l', '\x00', '\x00', '\x00', 'e', '\x00', '\x00', '\x00', 'c', '\x00', '\x00', '\x00', 't', '\x00', '\x00', '\x00', '\x00', '\x00', '\x00', '*', '\x00', '\x00', '\x00', '\x00', '\x00', '\x00', 'f', '\x00', '\x00', '\x00', 'r', '\x00', '\x00', '\x00', 'o', '\x00', '\x00', '\x00', 'm', '\x00', '\x00', '\x00', '\x00', '\x00', '\x00', 'y', '\x00', '\x00', '\x00'] {noformat} > quote in WITH block's comment breaks shell > -- > > Key: IMPALA-2751 > URL: https://issues.apache.org/jira/browse/IMPALA-2751 > Project: IMPALA > Issue Type: Bug > 
Components: Clients >Affects Versions: Impala 2.2 > Environment: CDH5.4.8 >Reporter: Marcell Szabo >Assignee: Fredy Wijaya >Priority: Minor > Labels: impala-shell, shell, usability > Fix For: Impala 2.13.0, Impala 3.1.0 > > > Steps to reproduce: > $ cat > test.sql > with a as ( > select 'a' > -- shouldn't matter > ) > select * from a; > $ impala-shell -f test.sql > /usr/bin/impala-shell: line 32: warning: setlocale: LC_CTYPE: cannot change > locale (UTF-8): No such file or directory > /usr/bin/impala-shell: line 32: warning: setlocale: LC_CTYPE: cannot change > locale (UTF-8): No such file or directory > Starting Impala Shell without Kerberos authentication > Connected to host:21000 > Server version: impalad versi
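The interleaved NULs in the output above are the signature of a wide, 4-bytes-per-code-point string being consumed byte by byte by shlex. The pattern can be reproduced on Python 3 by widening a string to UTF-32-LE and handing the result to shlex character-wise (an illustration of the symptom, not of Python 2.6's internals):

```python
import shlex

# Re-encode "with y" so every code point occupies 4 bytes, then let shlex
# tokenize the result one (now mostly NUL) character at a time, as the
# Python 2.6 build machine appeared to do with sqlparse's unicode output.
wide = "with y".encode("utf-32-le").decode("latin-1")
tokens = list(shlex.shlex(wide, posix=True))
print(tokens[:8])  # ['w', '\x00', '\x00', '\x00', 'i', '\x00', '\x00', '\x00']
```

Each letter becomes its own token followed by three NUL tokens, matching the broken output in the comment above and explaining why .encode('utf-8') on the sqlparse result restored normal tokenization.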
[jira] [Created] (IMPALA-7092) Re-enable EC tests broken by HDFS-13539
Tianyi Wang created IMPALA-7092:
-----------------------------------

             Summary: Re-enable EC tests broken by HDFS-13539
                 Key: IMPALA-7092
                 URL: https://issues.apache.org/jira/browse/IMPALA-7092
             Project: IMPALA
          Issue Type: Sub-task
          Components: Frontend, Infrastructure
    Affects Versions: Impala 3.1.0
            Reporter: Tianyi Wang
            Assignee: Tianyi Wang

With HDFS-13539 and HDFS-13540 fixed, we should be able to re-enable some tests and diagnose the causes of the remaining failed tests without much noise.
[jira] [Commented] (IMPALA-6086) Use of permanent function should require SELECT privilege on DB
[ https://issues.apache.org/jira/browse/IMPALA-6086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494461#comment-16494461 ]

Zoram Thanga commented on IMPALA-6086:
--------------------------------------
The issue seems to be that the function call "trim('abcd')" gets re-written/constant-folded into a string literal. We don't require any privileges for literals, hence the regression(?). I haven't actually tested this on an older release, but IMPALA-4574 and IMPALA-4586 touch on related issues. [~tarmstr...@cloudera.com], does this issue look familiar?

> Use of permanent function should require SELECT privilege on DB
> ---------------------------------------------------------------
>
>                 Key: IMPALA-6086
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6086
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog, Security
>    Affects Versions: Impala 2.9.0
>            Reporter: Zoram Thanga
>            Assignee: Zoram Thanga
>            Priority: Minor
>
> A user that has no privilege on a database should not be able to execute any permanent functions in that database. This is currently possible and should be fixed, so that the user must have SELECT privilege to execute permanent functions.
[jira] [Commented] (IMPALA-7070) Failed test: query_test.test_nested_types.TestParquetArrayEncodings.test_thrift_array_of_arrays on S3
[ https://issues.apache.org/jira/browse/IMPALA-7070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494413#comment-16494413 ]

Lars Volker commented on IMPALA-7070:
-------------------------------------
I tried to figure out where this error came from. The string "Input/output error" could come from a PathIOException or a subclass thereof. All of the subclasses add more detail to the error message, so I think it was a plain PathIOException. The error seems to come from [fs/shell/CommandWithDestination.java:526|https://github.com/apache/hadoop/blob/dd7916d3cd5d880d0b257d229f43f10feff04c93/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CommandWithDestination.java#L526]:
{code}
if (!rename(src.path, target.path)) {
  // too bad we don't know why it failed
  PathIOException e = new PathIOException(src.toString());
  e.setOperation("rename");
  e.setTargetPath(target.toString());
  throw e;
}
{code}
This in turn ends up calling {{rename()}} in [fs/s3a/S3AFileSystem.java:690|https://github.com/apache/hadoop/blob/dd7916d3cd5d880d0b257d229f43f10feff04c93/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java?utf8=%E2%9C%93#L690], which makes a copy (L806) and then deletes the original (L808). This looks like a generic HDFS / S3 error to me and could be related to IMPALA-6910. Since we keep seeing this, I will reach out to the HDFS folks and ask how to debug it.
> Failed test: > query_test.test_nested_types.TestParquetArrayEncodings.test_thrift_array_of_arrays > on S3 > - > > Key: IMPALA-7070 > URL: https://issues.apache.org/jira/browse/IMPALA-7070 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.0 >Reporter: Dimitris Tsirogiannis >Assignee: Lars Volker >Priority: Blocker > Labels: broken-build, test-failure > > > {code:java} > Error Message > query_test/test_nested_types.py:406: in test_thrift_array_of_arrays "col1 > array>") query_test/test_nested_types.py:579: in > _create_test_table check_call(["hadoop", "fs", "-put", local_path, > location], shell=False) /usr/lib64/python2.6/subprocess.py:505: in check_call > raise CalledProcessError(retcode, cmd) E CalledProcessError: Command > '['hadoop', 'fs', '-put', > '/data/jenkins/workspace/impala-asf-2.x-core-s3/repos/Impala/testdata/parquet_nested_types_encodings/bad-thrift.parquet', > > 's3a://impala-cdh5-s3-test/test-warehouse/test_thrift_array_of_arrays_11da5fde.db/ThriftArrayOfArrays']' > returned non-zero exit status 1 > Stacktrace > query_test/test_nested_types.py:406: in test_thrift_array_of_arrays > "col1 array>") > query_test/test_nested_types.py:579: in _create_test_table > check_call(["hadoop", "fs", "-put", local_path, location], shell=False) > /usr/lib64/python2.6/subprocess.py:505: in check_call > raise CalledProcessError(retcode, cmd) > E CalledProcessError: Command '['hadoop', 'fs', '-put', > '/data/jenkins/workspace/impala-asf-2.x-core-s3/repos/Impala/testdata/parquet_nested_types_encodings/bad-thrift.parquet', > > 's3a://impala-cdh5-s3-test/test-warehouse/test_thrift_array_of_arrays_11da5fde.db/ThriftArrayOfArrays']' > returned non-zero exit status 1 > Standard Error > SET sync_ddl=False; > -- executing against localhost:21000 > DROP DATABASE IF EXISTS `test_thrift_array_of_arrays_11da5fde` CASCADE; > SET sync_ddl=False; > -- executing against localhost:21000 > CREATE DATABASE `test_thrift_array_of_arrays_11da5fde`; > 
MainThread: Created database "test_thrift_array_of_arrays_11da5fde" for test > ID > "query_test/test_nested_types.py::TestParquetArrayEncodings::()::test_thrift_array_of_arrays[exec_option: > {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, > 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, > 'exec_single_node_rows_threshold': 0} | table_format: parquet/none]" > -- executing against localhost:21000 > create table test_thrift_array_of_arrays_11da5fde.ThriftArrayOfArrays (col1 > array>) stored as parquet location > 's3a://impala-cdh5-s3-test/test-warehouse/test_thrift_array_of_arrays_11da5fde.db/ThriftArrayOfArrays'; > 18/05/20 18:31:03 WARN impl.MetricsConfig: Cannot locate configuration: tried > hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties > 18/05/20 18:31:03 INFO impl.MetricsSystemImpl: Scheduled snapshot period at > 10 second(s). > 18/05/20 18:31:03 INFO impl.MetricsSystemImpl: s3a-file-system metrics system > started > 18/05/20 18:31:06 INFO Configuration.deprecation: > fs.s3a.server-side-encryption-key is deprecated. Instead, use > fs.s3a.server-side-encryption.key > put: rename > `s3a://impala-cdh
[jira] [Commented] (IMPALA-6086) Use of permanent function should require SELECT privilege on DB
[ https://issues.apache.org/jira/browse/IMPALA-6086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494379#comment-16494379 ]

Zoram Thanga commented on IMPALA-6086:
--------------------------------------
Finally had some cycles to dig into this. It looks like expression rewriting may be the culprit, as we seem to be losing the privilege requirements (privilegeReqs) on re-analysis after the rewrite. Here's a sample interaction on a Sentry-enabled Impala:
{quote}
[localhost:21000] default> show tables;
Query: show tables
ERROR: AuthorizationException: User 'zoram' does not have privileges to access: default.*.*
[localhost:21000] default> select trim('abcd');
Query: select trim('abcd')
Query submitted at: 2018-05-29 15:22:47 (Coordinator: http://zoram-desktop:25000)
Query progress can be monitored at: http://zoram-desktop:25000/query_plan?query_id=3f48cb729a94afd4:6692d423
+--------------+
| trim('abcd') |
+--------------+
| abcd         |
+--------------+
Fetched 1 row(s) in 4.91s
[localhost:21000] default> set ENABLE_EXPR_REWRITES = FALSE;
ENABLE_EXPR_REWRITES set to FALSE
[localhost:21000] default> select trim('abcd');
Query: select trim('abcd')
Query submitted at: 2018-05-29 15:23:07 (Coordinator: http://zoram-desktop:25000)
ERROR: AuthorizationException: Cannot modify system database.
[localhost:21000] default>
{quote}
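The interaction above is consistent with constant folding: trim('abcd') is evaluated at analysis time and replaced by the literal 'abcd', and re-analysis of the rewritten statement registers no privilege requirement because literals need none. A toy illustration of that kind of rewrite (this is not Impala's ExprRewriter — expressions are modeled as simple tuples for the sketch):

```python
def fold_constants(expr):
    """Replace a function call whose arguments are all literals with the
    computed literal. After this rewrite, a later analysis pass sees only
    a literal and records no privilege requirement for the function."""
    builtins = {"trim": lambda s: s.strip()}  # toy builtin table
    if expr[0] == "call" and all(arg[0] == "lit" for arg in expr[2]):
        fn = builtins[expr[1]]
        return ("lit", fn(*[arg[1] for arg in expr[2]]))
    return expr  # non-constant arguments: leave the call in place

print(fold_constants(("call", "trim", [("lit", "  abcd  ")])))  # ('lit', 'abcd')
```

The sketch shows why disabling the rewrite (ENABLE_EXPR_REWRITES = FALSE) restores the authorization error: the call node survives analysis and its privilege check fires.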
[jira] [Updated] (IMPALA-6843) Responses to prioritizedLoad() requests should be returned directly and not via the statestore
[ https://issues.apache.org/jira/browse/IMPALA-6843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated IMPALA-6843:
--------------------------------
    Summary: Responses to prioritizedLoad() requests should be returned directly and not via the statestore  (was: Responses to prioritizedLoad() requests are returned directly and not via the statestore)

> Responses to prioritizedLoad() requests should be returned directly and not via the statestore
> ----------------------------------------------------------------------------------------------
>
>                 Key: IMPALA-6843
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6843
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Catalog
>    Affects Versions: Impala 2.11.0
>            Reporter: Dimitris Tsirogiannis
>            Assignee: Dimitris Tsirogiannis
>            Priority: Major
>              Labels: catalog, frontend, latency, perfomance
>
> Currently, when a statement (e.g. SELECT) needs to access some unloaded tables, it issues a prioritizedLoad() request to the catalog. The catalog loads the table metadata but does not respond directly to the coordinator that issued the request. Instead, the metadata for the newly loaded tables is broadcast via the statestore. The problem with this approach is that the latency of the response may vary significantly and may depend on the latencies of other unrelated metadata operations (e.g. REFRESH) that happen to be in the same topic update.
> The response to a prioritizedLoad() request should come directly to the issuing coordinator. Other coordinators will receive the metadata of the newly loaded table via the statestore.
[jira] [Commented] (IMPALA-7089) test_kudu_dml_reporting failing
[ https://issues.apache.org/jira/browse/IMPALA-7089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494325#comment-16494325 ] ASF subversion and git services commented on IMPALA-7089: - Commit e660149670e7d2d18b74a6eb3bc06cb929887ca1 in impala's branch refs/heads/master from [~twmarshall] [ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=e660149 ] IMPALA-7089: xfail test_kudu_dml_reporting test_kudu_dml_reporting has been causing a large number of build failures. Temporarily disable it while we figure out what's going on. Also improve output of test_kudu_dml_reporting on failure. Change-Id: I222e4c86a50f2450201fbad8b937e8fcf4fac31d Reviewed-on: http://gerrit.cloudera.org:8080/10527 Reviewed-by: Joe McDonnell Tested-by: Impala Public Jenkins > test_kudu_dml_reporting failing > --- > > Key: IMPALA-7089 > URL: https://issues.apache.org/jira/browse/IMPALA-7089 > Project: IMPALA > Issue Type: Bug >Reporter: Thomas Tauber-Marshall >Assignee: Thomas Tauber-Marshall >Priority: Blocker > Labels: broken-build > > See in numerous builds: > {noformat} > 00:07:23 ___ TestImpalaShell.test_kudu_dml_reporting > > 00:07:23 [gw1] linux2 -- Python 2.6.6 > /data/jenkins/workspace/impala-asf-master-core/repos/Impala/bin/../infra/python/env/bin/python > 00:07:23 shell/test_shell_commandline.py:601: in test_kudu_dml_reporting > 00:07:23 "with y as (values(7)) insert into %s.dml_test (id) select * > from y" % db, 1, 0) > 00:07:23 shell/test_shell_commandline.py:580: in _validate_dml_stmt > 00:07:23 assert expected_output in results.stderr > 00:07:23 E assert 'Modified 1 row(s), 0 row error(s)' in 'Starting Impala > Shell without Kerberos authentication\nConnected to localhost:21000\nServer > version: impalad version > ...tos-6-4-0895.vpc.cloudera.com:25000/query_plan?query_id=d94f04135c4d25f9:ec1089e8\nFetched > 0 row(s) in 0.12s\n' > 00:07:23 E+ where 'Starting Impala Shell without Kerberos > authentication\nConnected to 
localhost:21000\nServer version: impalad version > ...tos-6-4-0895.vpc.cloudera.com:25000/query_plan?query_id=d94f04135c4d25f9:ec1089e8\nFetched > 0 row(s) in 0.12s\n' = 0x7193b10>.stderr > 00:07:23 Captured stderr setup > - > 00:07:23 SET sync_ddl=False; > 00:07:23 -- executing against localhost:21000 > 00:07:23 DROP DATABASE IF EXISTS `test_kudu_dml_reporting_256dcf63` CASCADE; > 00:07:23 > 00:07:23 SET sync_ddl=False; > 00:07:23 -- executing against localhost:21000 > 00:07:23 CREATE DATABASE `test_kudu_dml_reporting_256dcf63`; > 00:07:23 > 00:07:23 MainThread: Created database "test_kudu_dml_reporting_256dcf63" for > test ID > "shell/test_shell_commandline.py::TestImpalaShell::()::test_kudu_dml_reporting" > 00:07:23 = 1 failed, 1932 passed, 63 skipped, 45 xfailed, 1 xpassed in > 6985.36 seconds == > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
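The "xfail" in the commit above refers to pytest's expected-failure marker, which keeps a known-flaky test visible without breaking the build. As a self-contained illustration of those semantics (this is a toy re-implementation, not Impala's test code or pytest itself; the test body is a stand-in):

```python
import functools

def xfail(reason):
    """Toy stand-in for pytest.mark.xfail: an AssertionError inside the
    test becomes an 'expected failure' outcome instead of a build break."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                func(*args, **kwargs)
            except AssertionError:
                return "xfail: " + reason
            return "xpass: " + reason
        return wrapper
    return decorate

@xfail("IMPALA-7089: DML reporting is flaky")
def test_kudu_dml_reporting():
    # Stand-in for the real assertion on the shell's stderr output.
    assert "Modified 1 row(s)" in "Fetched 0 row(s) in 0.12s"

print(test_kudu_dml_reporting())  # xfail: IMPALA-7089: DML reporting is flaky
```

The real marker additionally reports "xpassed" when the test unexpectedly succeeds, which is why the run summary above counts both xfailed and xpassed tests.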
[jira] [Commented] (IMPALA-7088) Parallel data load breaks load-data.py if loading data on a real cluster
[ https://issues.apache.org/jira/browse/IMPALA-7088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494309#comment-16494309 ] ASF subversion and git services commented on IMPALA-7088: - Commit 573550ca2f781ff5cb781a6c6dcdfcbfc25edf04 in impala's branch refs/heads/master from [~joemcdonnell] [ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=573550c ] IMPALA-7088: Fix uninitialized variable in cluster dataload bin/load-data.py uses a unique directory for local Hive execution to avoid a race condition when executing multiple Hive commands at once. This unique directory is not needed when loading on a real cluster. However, the code to remove the unique directory at the end does not handle this correctly. This skips the code to remove the unique directory when it is uninitialized. Change-Id: I5581a45460dc341842d77eaa09647e50f35be6c7 Reviewed-on: http://gerrit.cloudera.org:8080/10526 Reviewed-by: Joe McDonnell Tested-by: Impala Public Jenkins > Parallel data load breaks load-data.py if loading data on a real cluster > > > Key: IMPALA-7088 > URL: https://issues.apache.org/jira/browse/IMPALA-7088 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.0 >Reporter: David Knupp >Assignee: Joe McDonnell >Priority: Blocker > > {{Impala/bin/load-data.py}} is most commonly used to load test data onto a > simulated standalone cluster running on the local host. However, with the > correct inputs, it can also be used to load data onto an actual cluster > running on remote hosts. 
> A recent enhancement in the load-data.py script to parallelize parts of the > data loading process -- https://github.com/apache/impala/commit/d481cd48 -- > has introduced a regression in the latter use case: > From {{$IMPALA_HOME/logs/data_loading/data-load-functional-exhaustive.log}}: > {noformat} > Created table functional_hbase.widetable_1000_cols > Took 0.7121 seconds > 09:48:01 Beginning execution of hive SQL: > /home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/logs/data_loading/sql/functional/load-functional-query-exhaustive-hive-generated-text-none-none.sql > Traceback (most recent call last): > File > "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", > line 494, in > if __name__ == "__main__": main() > File > "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", > line 468, in main > hive_exec_query_files_parallel(thread_pool, hive_load_text_files) > File > "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", > line 299, in hive_exec_query_files_parallel > exec_query_files_parallel(thread_pool, query_files, 'hive') > File > "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", > line 290, in exec_query_files_parallel > for result in thread_pool.imap_unordered(execution_function, query_files): > File "/usr/lib/python2.7/multiprocessing/pool.py", line 659, in next > raise value > TypeError: coercing to Unicode: need string or buffer, NoneType found > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
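The crash above ("coercing to Unicode: need string or buffer, NoneType found") is the classic symptom of passing an uninitialized (None) path to a filesystem call. The fix described in the commit boils down to guarding the cleanup; a minimal sketch, with function and variable names assumed for illustration (not load-data.py's actual code):

```python
import os
import shutil
import tempfile

def run_hive_load(use_local_hive):
    # The unique scratch directory is only needed for local Hive execution
    # (to avoid races between parallel Hive commands); on a real cluster
    # it stays None.
    scratch_dir = tempfile.mkdtemp(prefix="hive-load-") if use_local_hive else None
    try:
        pass  # ... execute the Hive SQL files here ...
    finally:
        # Guard the cleanup: rmtree(None) raises TypeError ("coercing to
        # Unicode" on Python 2), which is the failure quoted above.
        if scratch_dir is not None:
            shutil.rmtree(scratch_dir, ignore_errors=True)
    return scratch_dir

print(run_hive_load(False))  # None: no directory created, no cleanup attempted
```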
[jira] [Resolved] (IMPALA-6712) TestRuntimeRowFilters.test_row_filters fails on a 2.x ASAN build
[ https://issues.apache.org/jira/browse/IMPALA-6712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tianyi Wang resolved IMPALA-6712. - Resolution: Cannot Reproduce Since the only known path forward is still simply increasing the timeout, it may be unnecessary to do so given that the failure hasn't recurred recently. I will close this for now. Please reopen if it happens again. > TestRuntimeRowFilters.test_row_filters fails on a 2.x ASAN build > > > Key: IMPALA-6712 > URL: https://issues.apache.org/jira/browse/IMPALA-6712 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.12.0 >Reporter: Taras Bobrovytsky >Assignee: Tianyi Wang >Priority: Critical > > It looks like the query profile does not contain what we are looking for: > {noformat} > E AssertionError: Did not find matches for lines in runtime profile: > E EXPECTED LINES: > E row_regex: .*Rows processed: 16.38K.*{noformat} > > This happened several times. The latest failure was on this commit: > 7336839dbb2d609005362fdff174a822462f05fb
[jira] [Closed] (IMPALA-6776) Failed to assign hbase regions to servers
[ https://issues.apache.org/jira/browse/IMPALA-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vuk Ercegovac closed IMPALA-6776. - Resolution: Cannot Reproduce Fix Version/s: Impala 3.1.0 Impala 2.13.0 > Failed to assign hbase regions to servers > - > > Key: IMPALA-6776 > URL: https://issues.apache.org/jira/browse/IMPALA-6776 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.0 >Reporter: Tianyi Wang >Assignee: Vuk Ercegovac >Priority: Blocker > Labels: broken-build > Fix For: Impala 2.13.0, Impala 3.1.0 > > > After switching to hadoop 3 components, split-hbase.sh failed in > HBaseTestDataRegionAssigment: > {noformat} > 20:40:27 Splitting HBase (logging to > /data/jenkins/workspace/impala-asf-master-exhaustive-release/repos/Impala/logs/data_loading/create-hbase.log)... > > 20:41:51 FAILED (Took: 1 min 24 sec) > 20:41:51 > '/data/jenkins/workspace/impala-asf-master-exhaustive-release/repos/Impala/testdata/bin/split-hbase.sh' > failed. Tail of log: > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,3,1522381286429.7b13fefeda7afac230e22150deab5266. > 3 -> localhost:16202, expecting localhost,16202,1522374373018 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,5,1522381287511.7a243a822c5c4844a2a3d0f67a541961. > 5 -> localhost:16202, expecting localhost,16202,1522374373018 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,7,1522381288718.80d6e4a799ad114a146dc3cb41e18e93. > 7 -> localhost:16203, expecting localhost,16203,1522374374705 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,9,1522381288718.d705a2ea635916f4bb510ca60764080a. 
> 9 -> localhost:16203, expecting localhost,16203,1522374374705 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,,1522381282868.a99b569f5417ea9e2561eb5566c31be0. > -> localhost:16203, expecting localhost,16201,1522374371810 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,1,1522381285023.5fb566ba94e5fbb8aeca39f3da0a6362. > 1 -> localhost:16201, expecting localhost,16201,1522374371810 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,3,1522381286429.7b13fefeda7afac230e22150deab5266. > 3 -> localhost:16202, expecting localhost,16202,1522374373018 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,5,1522381287511.7a243a822c5c4844a2a3d0f67a541961. > 5 -> localhost:16202, expecting localhost,16202,1522374373018 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,7,1522381288718.80d6e4a799ad114a146dc3cb41e18e93. > 7 -> localhost:16203, expecting localhost,16203,1522374374705 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,9,1522381288718.d705a2ea635916f4bb510ca60764080a. > 9 -> localhost:16203, expecting localhost,16203,1522374374705 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,,1522381282868.a99b569f5417ea9e2561eb5566c31be0. > -> localhost:16203, expecting localhost,16201,1522374371810 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,1,1522381285023.5fb566ba94e5fbb8aeca39f3da0a6362. 
> 1 -> localhost:16201, expecting localhost,16201,1522374371810 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,3,1522381286429.7b13fefeda7afac230e22150deab5266. > 3 -> localhost:16202, expecting localhost,16202,1522374373018 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,5,1522381287511.7a243a822c5c4844a2a3d0f67a541961. > 5 -> localhost:16202, expecting localhost,16202,1522374373018 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,7,1522381288718.80d6e4a799ad114a146dc3cb41e18e93. > 7 -> localhost:16203, expecting localhost,16203,1522374374705 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,9,1522381288718.d705a2ea635916f4bb510ca60764080a. > 9 -> localhost:16203, expecting localhost,16203,1522374374705 > 20:41:51 18/03/29 20:41:51 INFO datagenerator.HBaseTestDataRegionAssigment: > function
[jira] [Resolved] (IMPALA-6933) test_kudu.TestCreateExternalTable on S3 failing with "AlreadyExistsException: Database already exists"
[ https://issues.apache.org/jira/browse/IMPALA-6933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vuk Ercegovac resolved IMPALA-6933. --- Resolution: Fixed Fix Version/s: Impala 3.1.0 Impala 2.13.0 > test_kudu.TestCreateExternalTable on S3 failing with "AlreadyExistsException: > Database already exists" > -- > > Key: IMPALA-6933 > URL: https://issues.apache.org/jira/browse/IMPALA-6933 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.0 >Reporter: David Knupp >Assignee: Vuk Ercegovac >Priority: Critical > Labels: kudu, test-infra > Fix For: Impala 2.13.0, Impala 3.1.0 > > > Error Message > {noformat} > test setup failure > {noformat} > Stacktrace > {noformat} > conftest.py:347: in conn > with __unique_conn(db_name=db_name, timeout=timeout) as conn: > /usr/lib64/python2.6/contextlib.py:16: in __enter__ > return self.gen.next() > conftest.py:380: in __unique_conn > cur.execute("CREATE DATABASE %s" % db_name) > ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:302: in > execute > configuration=configuration) > ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:343: in > execute_async > self._execute_async(op) > ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:362: in > _execute_async > operation_fn() > ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:340: in > op > async=True) > ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:1027: > in execute > return self._operation('ExecuteStatement', req) > ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:957: in > _operation > resp = self._rpc(kind, request) > ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:925: in > _rpc > err_if_rpc_not_ok(response) > ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:704: in > err_if_rpc_not_ok > raise HiveServer2Error(resp.status.errorMessage) > E HiveServer2Error: 
ImpalaRuntimeException: Error making 'createDatabase' > RPC to Hive Metastore: > E CAUSED BY: AlreadyExistsException: Database f0mraw already exists > {noformat} > Tests affected: > * query_test.test_kudu.TestCreateExternalTable.test_unsupported_binary_col > * query_test.test_kudu.TestCreateExternalTable.test_drop_external_table > * query_test.test_kudu.TestCreateExternalTable.test_explicit_name > * query_test.test_kudu.TestCreateExternalTable.test_explicit_name_preference > * query_test.test_kudu.TestCreateExternalTable.test_explicit_name_doesnt_exist > * > query_test.test_kudu.TestCreateExternalTable.test_explicit_name_doesnt_exist_but_implicit_does > * query_test.test_kudu.TestCreateExternalTable.test_table_without_partitioning > * query_test.test_kudu.TestCreateExternalTable.test_column_name_case > * query_test.test_kudu.TestCreateExternalTable.test_conflicting_column_name -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-6933) test_kudu.TestCreateExternalTable on S3 failing with "AlreadyExistsException: Database already exists"
[ https://issues.apache.org/jira/browse/IMPALA-6933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494182#comment-16494182 ] ASF subversion and git services commented on IMPALA-6933: - Commit 4653637b9e2ee573f3ad7a76da8941a0a4870bd8 in impala's branch refs/heads/master from [~vercego] [ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=4653637 ] IMPALA-6933: Avoids db name collisions for Kudu tests Kudu tests generate temporary db names in a way so that its unlikely, yet possible to collide. A recent test failure indicates such a collision came up. The fix changes the way that the name is generated so that it includes the classes name for which the db name is generated. This db name will make it easier to see which test created it and the name will not collide with other names generated by other tests. Testing: - ran the updated test locally Change-Id: I7c2f8a35fec90ae0dabe80237d83954668b47f6e Reviewed-on: http://gerrit.cloudera.org:8080/10513 Reviewed-by: Michael Brown Tested-by: Impala Public Jenkins > test_kudu.TestCreateExternalTable on S3 failing with "AlreadyExistsException: > Database already exists" > -- > > Key: IMPALA-6933 > URL: https://issues.apache.org/jira/browse/IMPALA-6933 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.0 >Reporter: David Knupp >Assignee: Vuk Ercegovac >Priority: Critical > Labels: kudu, test-infra > > Error Message > {noformat} > test setup failure > {noformat} > Stacktrace > {noformat} > conftest.py:347: in conn > with __unique_conn(db_name=db_name, timeout=timeout) as conn: > /usr/lib64/python2.6/contextlib.py:16: in __enter__ > return self.gen.next() > conftest.py:380: in __unique_conn > cur.execute("CREATE DATABASE %s" % db_name) > ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:302: in > execute > configuration=configuration) > ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:343: in > 
execute_async > self._execute_async(op) > ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:362: in > _execute_async > operation_fn() > ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:340: in > op > async=True) > ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:1027: > in execute > return self._operation('ExecuteStatement', req) > ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:957: in > _operation > resp = self._rpc(kind, request) > ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:925: in > _rpc > err_if_rpc_not_ok(response) > ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:704: in > err_if_rpc_not_ok > raise HiveServer2Error(resp.status.errorMessage) > E HiveServer2Error: ImpalaRuntimeException: Error making 'createDatabase' > RPC to Hive Metastore: > E CAUSED BY: AlreadyExistsException: Database f0mraw already exists > {noformat} > Tests affected: > * query_test.test_kudu.TestCreateExternalTable.test_unsupported_binary_col > * query_test.test_kudu.TestCreateExternalTable.test_drop_external_table > * query_test.test_kudu.TestCreateExternalTable.test_explicit_name > * query_test.test_kudu.TestCreateExternalTable.test_explicit_name_preference > * query_test.test_kudu.TestCreateExternalTable.test_explicit_name_doesnt_exist > * > query_test.test_kudu.TestCreateExternalTable.test_explicit_name_doesnt_exist_but_implicit_does > * query_test.test_kudu.TestCreateExternalTable.test_table_without_partitioning > * query_test.test_kudu.TestCreateExternalTable.test_column_name_case > * query_test.test_kudu.TestCreateExternalTable.test_conflicting_column_name -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
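The fix described in the commit, generating the temporary db name from the test class name plus a random suffix, can be sketched as follows (helper name and suffix length are assumptions, not the actual conftest.py code):

```python
import random
import string

def unique_db_name(cls_name, suffix_len=8):
    """Collision-resistant test db naming: embed the test class name so
    that random suffixes generated by different tests can never collide
    with each other, and a leftover db is attributable to its test."""
    alphabet = string.ascii_lowercase + string.digits
    suffix = "".join(random.choice(alphabet) for _ in range(suffix_len))
    return "%s_%s" % (cls_name.lower(), suffix)

print(unique_db_name("TestCreateExternalTable"))
# e.g. testcreateexternaltable_4k2c9x1q
```

Compare the failing name above ("f0mraw"): a short, purely random name carries no owner information and gives two concurrent tests a real chance of drawing the same string.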
[jira] [Work started] (IMPALA-7090) EqualityDisjunctsToInRule should respect the limit on the number of children in an expr
[ https://issues.apache.org/jira/browse/IMPALA-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-7090 started by Tianyi Wang. --- > EqualityDisjunctsToInRule should respect the limit on the number of children > in an expr > --- > > Key: IMPALA-7090 > URL: https://issues.apache.org/jira/browse/IMPALA-7090 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 3.0, Impala 2.12.0 >Reporter: Tianyi Wang >Assignee: Tianyi Wang >Priority: Critical > > Currently, EqualityDisjunctsToInRule, introduced in IMPALA-5280, may create an > expr with an unbounded number of children and fail a query; this should be > avoided. The easy solution is to not apply the rewrite when the number of > children is large.
[jira] [Commented] (IMPALA-6990) TestClientSsl.test_tls_v12 failing due to Python SSL error
[ https://issues.apache.org/jira/browse/IMPALA-6990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494139#comment-16494139 ] Philip Zeyliger commented on IMPALA-6990: - Is this user-visible? Let's say that a user had impala-shell working on RH6 or RH7 before. Does it still work? Does it work when using the same {{ssl-minimum-version}} and {{ssl-cipher-list}} flags? I think this test is saying that these flags don't work for the Python shipped in RH7. I suspect they didn't work before either: did they somehow work before? Surely before the Thrift change, we were using the same RH image? Once we've figured this out, I think the easier thing to do is to disable the test when using a too-old version of Python. We already have a "skip if legacy SSL" flag on the test; this is just one more skip if. We still want to run the test for Ubuntu16 or whatever. I think we can assume that the Python running the test and the python running impala-shell are the same for our purposes. Is there a weaker test that we'd want to add? > TestClientSsl.test_tls_v12 failing due to Python SSL error > -- > > Key: IMPALA-6990 > URL: https://issues.apache.org/jira/browse/IMPALA-6990 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 3.0 >Reporter: Sailesh Mukil >Assignee: Sailesh Mukil >Priority: Blocker > Labels: broken-build, flaky > > We've seen quite a few jobs fail with the following error: > *_ssl.c:504: EOF occurred in violation of protocol* > {code:java} > custom_cluster/test_client_ssl.py:128: in test_tls_v12 > self._validate_positive_cases("%s/server-cert.pem" % self.CERT_DIR) > custom_cluster/test_client_ssl.py:181: in _validate_positive_cases > result = run_impala_shell_cmd(shell_options) > shell/util.py:97: in run_impala_shell_cmd > result.stderr) > E AssertionError: Cmd --ssl -q 'select 1 + 2' was expected to succeed: > Starting Impala Shell without Kerberos authentication > E SSL is enabled. 
Impala server certificates will NOT be verified (set > --ca_cert to change) > E > /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:80: > DeprecationWarning: 3th positional argument is deprecated. Use keyward > argument insteand. > E DeprecationWarning) > E > /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:80: > DeprecationWarning: 4th positional argument is deprecated. Use keyward > argument insteand. > E DeprecationWarning) > E > /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:80: > DeprecationWarning: 5th positional argument is deprecated. Use keyward > argument insteand. > E DeprecationWarning) > E > /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:216: > DeprecationWarning: validate is deprecated. Use cert_reqs=ssl.CERT_NONE > instead > E DeprecationWarning) > E No handlers could be found for logger "thrift.transport.TSSLSocket" > E Error connecting: TTransportException, Could not connect to > localhost:21000: [Errno 8] _ssl.c:504: EOF occurred in violation of protocol > E Not connected to Impala, could not execute queries. > {code} > We need to investigate why this is happening and fix it. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-6990) TestClientSsl.test_tls_v12 failing due to Python SSL error
[ https://issues.apache.org/jira/browse/IMPALA-6990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494124#comment-16494124 ] Sailesh Mukil commented on IMPALA-6990: --- Spent some more time looking at this and found that 'requests' wasn't the culprit. When we upgraded to thrift-0.9.3, the TSSLSocket.py logic changed quite a bit. Our RHEL7 machines come equipped with Python 2.7.5. Looking at these comments, that means that we'll be unable to create a 'SSLContext' but able to explicitly specify ciphers: https://github.com/apache/thrift/blob/master/lib/py/src/transport/TSSLSocket.py#L37-L41 {code:java} # SSLContext is not available for Python < 2.7.9 _has_ssl_context = sys.hexversion >= 0x020709F0 # ciphers argument is not available for Python < 2.7.0 _has_ciphers = sys.hexversion >= 0x020700F0 {code} If we cannot create a 'SSLContext', then we cannot use TLSv1.2 and have to use TLSv1: https://github.com/apache/thrift/blob/master/lib/py/src/transport/TSSLSocket.py#L48-L49 {code:java} # For python >= 2.7.9, use latest TLS that both client and server # supports. # SSL 2.0 and 3.0 are disabled via ssl.OP_NO_SSLv2 and ssl.OP_NO_SSLv3. # For python < 2.7.9, use TLS 1.0 since TLSv1_X nor OP_NO_SSLvX is # unavailable. _default_protocol = ssl.PROTOCOL_SSLv23 if _has_ssl_context else \ ssl.PROTOCOL_TLSv1 {code} Our custom cluster test forces the server to use TLSv1.2 and also forces a specific cipher: https://github.com/apache/impala/blob/master/tests/custom_cluster/test_client_ssl.py#L118-L119 So this combination of configurations causes a failure in RHEL7 because we only allow a specific cipher which works with TLSv1.2, but the client cannot use TLSv1.2 due to the Python version as mentioned above. On systems lower than RHEL7, the machines come equipped with Python 2.6.6, which does not force the use of specific ciphers, so we get away without a failure. 
To fix this, we either need to change the Python version on RHEL 7 to be >= Python 2.7.9, or reduce the 'test_client_ssl' limitation to run TLSv1. The second option is the quickest, although not ideal, but it should at least unblock our builds while we can upgrade the AMIs for RHEL7. > TestClientSsl.test_tls_v12 failing due to Python SSL error > -- > > Key: IMPALA-6990 > URL: https://issues.apache.org/jira/browse/IMPALA-6990 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 3.0 >Reporter: Sailesh Mukil >Assignee: Sailesh Mukil >Priority: Blocker > Labels: broken-build, flaky > > We've seen quite a few jobs fail with the following error: > *_ssl.c:504: EOF occurred in violation of protocol* > {code:java} > custom_cluster/test_client_ssl.py:128: in test_tls_v12 > self._validate_positive_cases("%s/server-cert.pem" % self.CERT_DIR) > custom_cluster/test_client_ssl.py:181: in _validate_positive_cases > result = run_impala_shell_cmd(shell_options) > shell/util.py:97: in run_impala_shell_cmd > result.stderr) > E AssertionError: Cmd --ssl -q 'select 1 + 2' was expected to succeed: > Starting Impala Shell without Kerberos authentication > E SSL is enabled. Impala server certificates will NOT be verified (set > --ca_cert to change) > E > /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:80: > DeprecationWarning: 3th positional argument is deprecated. Use keyward > argument insteand. > E DeprecationWarning) > E > /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:80: > DeprecationWarning: 4th positional argument is deprecated. Use keyward > argument insteand. 
> E DeprecationWarning) > E > /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:80: > DeprecationWarning: 5th positional argument is deprecated. Use keyward > argument insteand. > E DeprecationWarning) > E > /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:216: > DeprecationWarning: validate is deprecated. Use cert_reqs=ssl.CERT_NONE > instead > E DeprecationWarning) > E No handlers could be found for logger "thrift.transport.TSSLSocket" > E Error connecting: TTransportException, Could not connect to > localhost:21000: [Errno 8] _ssl.c:504: EOF occurred in violation of protocol > E Not connected to Impala, could not execute queries. > {code} > We need to investigate why this is happening and fix it. -- This message w
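The interpreter-version gates quoted from thrift's TSSLSocket.py, together with the "one more skip if" suggested earlier in the thread, combine into a small pattern; a sketch (the helper name is hypothetical, only the hexversion checks are taken from the quoted thrift source):

```python
import sys

# Mirrors the thrift TSSLSocket checks quoted above: SSLContext (and with
# it TLS negotiation up to 1.2) needs Python >= 2.7.9; the ciphers
# argument needs Python >= 2.7.0.
HAS_SSL_CONTEXT = sys.hexversion >= 0x020709F0
HAS_CIPHERS = sys.hexversion >= 0x020700F0

def skip_reason_for_tls12():
    """Hypothetical helper a test like test_tls_v12 could consult for
    one more 'skip if': on pre-2.7.9 Python the client can only offer
    TLSv1, so forcing a TLSv1.2-only cipher on the server must fail."""
    if not HAS_SSL_CONTEXT:
        return ("Python %s cannot create an SSLContext; client falls "
                "back to TLSv1" % sys.version.split()[0])
    return None

print(skip_reason_for_tls12())  # None on Python >= 2.7.9
```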
[jira] [Created] (IMPALA-7091) Occasional errors with failure.test_failpoints.TestFailpoints.test_failpoints in hbase tests
Philip Zeyliger created IMPALA-7091: --- Summary: Occasional errors with failure.test_failpoints.TestFailpoints.test_failpoints in hbase tests Key: IMPALA-7091 URL: https://issues.apache.org/jira/browse/IMPALA-7091 Project: IMPALA Issue Type: Task Components: Frontend Reporter: Philip Zeyliger Assignee: Philip Zeyliger When running the following test with "test-with-docker", I sometimes (but not always) see it fail. {code:java} failure.test_failpoints.TestFailpoints.test_failpoints[table_format: hbase/none | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, 'exec_single_node_rows_threshold': 0} | mt_dop: 4 | location: OPEN | action: MEM_LIMIT_EXCEEDED | query: select * from alltypessmall union all select * from alltypessmall]{code} The error I see is an NPE, and, correlating some logs, I think it's this: {code} 26420:I0524 14:30:14.696190 12271 jni-util.cc:230] java.lang.NullPointerException 26421- at org.apache.impala.catalog.HBaseTable.getRegionSize(HBaseTable.java:652) 26422- at org.apache.impala.catalog.HBaseTable.getEstimatedRowStatsForRegion(HBaseTable.java:520) 26423- at org.apache.impala.catalog.HBaseTable.getEstimatedRowStats(HBaseTable.java:605) 26424- at org.apache.impala.planner.HBaseScanNode.computeStats(HBaseScanNode.java:203) 26425- at org.apache.impala.planner.HBaseScanNode.init(HBaseScanNode.java:127) 26426- at org.apache.impala.planner.SingleNodePlanner.createScanNode(SingleNodePlanner.java:1344) 26427- at org.apache.impala.planner.SingleNodePlanner.createTableRefNode(SingleNodePlanner.java:1514) 26428- at org.apache.impala.planner.SingleNodePlanner.createTableRefsPlan(SingleNodePlanner.java:776) 26429- at org.apache.impala.planner.SingleNodePlanner.createSelectPlan(SingleNodePlanner.java:614) 26430- at org.apache.impala.planner.SingleNodePlanner.createQueryPlan(SingleNodePlanner.java:257) 26431- at 
org.apache.impala.planner.SingleNodePlanner.createUnionPlan(SingleNodePlanner.java:1563) 26432- at org.apache.impala.planner.SingleNodePlanner.createUnionPlan(SingleNodePlanner.java:1630) 26433- at org.apache.impala.planner.SingleNodePlanner.createQueryPlan(SingleNodePlanner.java:275) 26434- at org.apache.impala.planner.SingleNodePlanner.createSingleNodePlan(SingleNodePlanner.java:147) 26435- at org.apache.impala.planner.Planner.createPlan(Planner.java:101) 26436- at org.apache.impala.planner.Planner.createParallelPlans(Planner.java:230) 26437- at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:938) 26438- at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1062) 26439- at org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:156) 26440:I0524 14:30:14.796514 12271 status.cc:125] NullPointerException: null 26441-@ 0x1891839 impala::Status::Status() {code} The test-with-docker stuff starts HBase at run time independently of data load in a way that our other tests don't, and I suspect HBase simply hasn't loaded the tables. I have a change forthcoming to address this. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-7090) EqualityDisjunctsToInRule should respect the limit on the number of children in an expr
Tianyi Wang created IMPALA-7090: --- Summary: EqualityDisjunctsToInRule should respect the limit on the number of children in an expr Key: IMPALA-7090 URL: https://issues.apache.org/jira/browse/IMPALA-7090 Project: IMPALA Issue Type: Bug Affects Versions: Impala 2.12.0, Impala 3.0 Reporter: Tianyi Wang Assignee: Tianyi Wang Currently, EqualityDisjunctsToInRule, introduced in IMPALA-5280, may create an expr with an unbounded number of children and fail a query; this should be avoided. The easy solution is to not apply the rewrite when the number of children is large.
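The proposed guard, skip the OR-to-IN rewrite once the resulting expression would exceed the child limit, can be modeled in a few lines. This is an illustrative Python model only; the real rule is Java in the planner, and the cap value here is an assumption:

```python
CHILD_LIMIT = 10000  # assumed cap; the real limit lives in the frontend

def disjuncts_to_in(column, values, limit=CHILD_LIMIT):
    """Model of EqualityDisjunctsToInRule with the proposed guard:
    rewrite c = v1 OR c = v2 OR ... into c IN (v1, v2, ...) only while
    the rewritten expr stays under the child limit; otherwise leave the
    original OR chain alone (returned here as None for 'no rewrite')."""
    # The +1 accounts for the column reference itself being a child
    # of the IN predicate alongside the value list.
    if len(values) + 1 > limit:
        return None
    return "%s IN (%s)" % (column, ", ".join(str(v) for v in values))

print(disjuncts_to_in("id", [1, 2, 3]))            # id IN (1, 2, 3)
print(disjuncts_to_in("id", range(20), limit=10))  # None: too many children
```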
[jira] [Commented] (IMPALA-7067) sleep(100000) command from test_shell_commandline.py can hang around and cause test_metrics_are_zero to fail
[ https://issues.apache.org/jira/browse/IMPALA-7067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494093#comment-16494093 ] ASF subversion and git services commented on IMPALA-7067: - Commit 0e7b075923cbecce4db2fd2e4fa3edf63afef06f in impala's branch refs/heads/2.x from [~tarmstr...@cloudera.com] [ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=0e7b075 ] IMPALA-7067: deflake test_cancellation Tweak the query so that it still runs for a long time but can cancel the fragment quicker instead of being stuck in a long sleep() call. Change-Id: I0c90d4f5c277f7b0d5561637944b454f7a44c76e Reviewed-on: http://gerrit.cloudera.org:8080/10499 Reviewed-by: Tim Armstrong Tested-by: Tim Armstrong > sleep(100000) command from test_shell_commandline.py can hang around and > cause test_metrics_are_zero to fail > > > Key: IMPALA-7067 > URL: https://issues.apache.org/jira/browse/IMPALA-7067 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.1.0 >Reporter: Tim Armstrong >Assignee: Tim Armstrong >Priority: Critical > Labels: flaky > Fix For: Impala 2.13.0, Impala 3.1.0 > > > {noformat} > 03:25:47 [gw6] PASSED > shell/test_shell_commandline.py::TestImpalaShell::test_cancellation > ...
> 03:27:01 verifiers/test_verify_metrics.py:34: in test_metrics_are_zero > 03:27:01 verifier.verify_metrics_are_zero() > 03:27:01 verifiers/metric_verifier.py:47: in verify_metrics_are_zero > 03:27:01 self.wait_for_metric(metric, 0, timeout) > 03:27:01 verifiers/metric_verifier.py:62: in wait_for_metric > 03:27:01 self.impalad_service.wait_for_metric_value(metric_name, > expected_value, timeout) > 03:27:01 common/impala_service.py:135: in wait_for_metric_value > 03:27:01 json.dumps(self.read_debug_webpage('rpcz?json'))) > 03:27:01 E AssertionError: Metric value impala-server.mem-pool.total-bytes > did not reach value 0 in 60s > {noformat} > I used the json dump from memz and the logs to trace it back to the > sleep(10) query hanging around -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-7058) RC and Seq fuzz tests cause crash
[ https://issues.apache.org/jira/browse/IMPALA-7058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494090#comment-16494090 ] ASF subversion and git services commented on IMPALA-7058: - Commit 47606806a478ea003d6487d375bf683682c16298 in impala's branch refs/heads/2.x from [~tarmstr...@cloudera.com] [ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=4760680 ] IMPALA-7058: disable fuzz test for RC and Seq There appear to still be some rare crashes. Let's disable the test until we can sort those out Change-Id: I10eb184ab2f27ca9b2d286630ceb37b71affcc27 Reviewed-on: http://gerrit.cloudera.org:8080/10485 Reviewed-by: Alex Behm Tested-by: Impala Public Jenkins > RC and Seq fuzz tests cause crash > - > > Key: IMPALA-7058 > URL: https://issues.apache.org/jira/browse/IMPALA-7058 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.0 >Reporter: Dimitris Tsirogiannis >Assignee: Tim Armstrong >Priority: Blocker > Labels: broken-build, crash > Fix For: Impala 2.13.0, Impala 3.1.0 > > > The backtrace is here: > {code:java} > #7 0x02d89a84 in > impala::DelimitedTextParser::ParseFieldLocations (this=0xcf539a0, > max_tuples=1, remaining_len=-102, byte_buffer_ptr=0x7fc6b764dad0, > row_end_locations=0x7fc6b764dac0, field_locations=0x10034000, > num_tuples=0x7fc6b764dacc, num_fields=0x7fc6b764dac8, > next_column_start=0x7fc6b764dad8) at > /data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/repos/Impala/be/src/exec/delimited-text-parser.cc:205 > #8 0x01fdb641 in impala::HdfsSequenceScanner::ProcessRange > (this=0x15515f80, row_batch=0xcf54800) at > /data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/repos/Impala/be/src/exec/hdfs-sequence-scanner.cc:352 > #9 0x02d7a20e in impala::BaseSequenceScanner::GetNextInternal > (this=0x15515f80, row_batch=0xcf54800) at > /data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/repos/Impala/be/src/exec/base-sequence-scanner.cc:181 > #10 0x01fb1ff0 in 
impala::HdfsScanner::ProcessSplit (this=0x15515f80) > at > /data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/repos/Impala/be/src/exec/hdfs-scanner.cc:134 > #11 0x01f89258 in impala::HdfsScanNode::ProcessSplit > (this=0x2a4a8800, filter_ctxs=..., expr_results_pool=0x7fc6b764e4b0, > scan_range=0x13f5f8700, scanner_thread_reservation=0x7fc6b764e428) at > /data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/repos/Impala/be/src/exec/hdfs-scan-node.cc:453 > #12 0x01f885f9 in impala::HdfsScanNode::ScannerThread > (this=0x2a4a8800, first_thread=false, scanner_thread_reservation=32768) at > /data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/repos/Impala/be/src/exec/hdfs-scan-node.cc:360 > #13 0x01f87a6c in impala::HdfsScanNodeoperator()(void) > const (__closure=0x7fc6b764ebe8) at > /data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/repos/Impala/be/src/exec/hdfs-scan-node.cc:292 > #14 0x01f89ac8 in > boost::detail::function::void_function_obj_invoker0, > void>::invoke(boost::detail::function::function_buffer &) > (function_obj_ptr=...) 
at > /data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/Impala-Toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:153 > #15 0x01bf0b28 in boost::function0::operator() > (this=0x7fc6b764ebe0) at > /data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/Impala-Toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:767 > #16 0x01edc57f in impala::Thread::SuperviseThread(std::string const&, > std::string const&, boost::function, impala::ThreadDebugInfo const*, > impala::Promise*) (name=..., category=..., functor=..., > parent_thread_info=0x7fc6b9e53890, thread_started=0x7fc6b9e527c0) at > /data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/repos/Impala/be/src/util/thread.cc:356 > #17 0x01ee471b in boost::_bi::list5, > boost::_bi::value, boost::_bi::value >, > boost::_bi::value, > boost::_bi::value*> >::operator() const&, std::string const&, boost::function, impala::ThreadDebugInfo > const*, impala::Promise*), boost::_bi::list0>(boost::_bi::type, > void (*&)(std::string const&, std::string const&, boost::function, > impala::ThreadDebugInfo const*, impala::Promise*), boost::_bi::list0&, > int) (this=0x2a370fc0, f=@0x2a370fb8: 0x1edc218 > boost::function, impala::ThreadDebugInfo const*, > impala::Promise*)>, a=...) at > /data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/Impala-Toolchain/boost-1.57.0-p3/include/boost/bind/bind.hpp:525 > #18 0x01ee463f in boost::_bi::bind_t const&, std::string const&, boost::function, impala::ThreadDebugInfo > const*, impala::Promise*), > boos
[jira] [Commented] (IMPALA-5662) Log all information relevant to admission control decision making
[ https://issues.apache.org/jira/browse/IMPALA-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494087#comment-16494087 ] ASF subversion and git services commented on IMPALA-5662: - Commit 466188b3970595e2e04d7ecf6a5141a7d3012909 in impala's branch refs/heads/2.x from [~bikram.sngh91] [ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=466188b ] IMPALA-3134: Support different proc mem limits among impalads for admission control checks Currently the admission controller assumes that all backends have the same process mem limit as the impalad it is running on. With this patch the proc mem limit for each impalad is available to the admission controller and it uses it for making correct admission decisions. It currently works under the assumption that the per-process memory limit does not change dynamically. Testing: Added an e2e test. IMPALA-5662: Log the queuing reason for a query The queuing reason is now logged both while queuing for the first time and while trying to dequeue.
Change-Id: Idb72eee790cc17466bbfa82e30f369a65f2b060e Reviewed-on: http://gerrit.cloudera.org:8080/10396 Reviewed-by: Bikramjeet Vig Tested-by: Impala Public Jenkins > Log all information relevant to admission control decision making > - > > Key: IMPALA-5662 > URL: https://issues.apache.org/jira/browse/IMPALA-5662 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Reporter: Balazs Jeszenszky >Assignee: Bikramjeet Vig >Priority: Major > Labels: admission-control, observability, resource-management, > supportability > Fix For: Impala 3.1.0 > > > Currently, when making a decision whether to admit a query or not, the log > has the following format: > {code:java} > I0705 14:43:04.031771 7388 admission-controller.cc:442] Stats: > agg_num_running=1, agg_num_queued=0, agg_mem_reserved=486.74 MB, > local_host(local_mem_admitted=0, num_admitted_running=0, num_queued=0, > backend_mem_reserved=56.07 MB) > {code} > Since it's also possible to queue queries due to one node not being able to > reserve the required memory, we should also log the max(backend_mem_reserved) > across all nodes.
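The per-backend check the patch describes can be sketched as follows (a hypothetical Python model, not Impala's C++ admission controller; the field names are invented):

```python
def can_admit(per_backend_mem_estimate, backends):
    """Admit a query only if every backend can fit it under *its own*
    proc mem limit, instead of assuming the coordinator's limit applies
    cluster-wide. Returns (admitted, queuing_reason)."""
    for b in backends:
        if b["mem_admitted"] + per_backend_mem_estimate > b["proc_mem_limit"]:
            # This reason string is what would be logged when the query
            # is queued, and again on each dequeue attempt.
            reason = ("not enough memory on %s: admitted=%d needed=%d "
                      "proc mem limit=%d" %
                      (b["host"], b["mem_admitted"],
                       per_backend_mem_estimate, b["proc_mem_limit"]))
            return False, reason
    return True, None

# A heterogeneous cluster: the coordinator has plenty of headroom, but a
# smaller backend forces the query to queue.
backends = [
    {"host": "coord", "mem_admitted": 0, "proc_mem_limit": 64 * 2**30},
    {"host": "small", "mem_admitted": 30 * 2**30, "proc_mem_limit": 32 * 2**30},
]
admitted, why = can_admit(4 * 2**30, backends)
```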
[jira] [Commented] (IMPALA-7055) test_avro_writer failing on upstream Jenkins (Expected exception: "Writing to table format AVRO is not supported")
[ https://issues.apache.org/jira/browse/IMPALA-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494092#comment-16494092 ] ASF subversion and git services commented on IMPALA-7055: - Commit 1ba8581ceeac4f3c8dbf2b56139dec420de6e967 in impala's branch refs/heads/2.x from [~tarmstr...@cloudera.com] [ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=1ba8581 ] IMPALA-7055: fix race with DML errors Error statuses could be lost because backend_exec_complete_barrier_ went to 0 before the query was transitioned to an error state. Reordering the UpdateExecState() and backend_exec_complete_barrier_ calls prevents this race. Change-Id: Idafd0b342e77a065be7cc28fa8c8a9df445622c2 Reviewed-on: http://gerrit.cloudera.org:8080/10491 Reviewed-by: Tim Armstrong Tested-by: Impala Public Jenkins > test_avro_writer failing on upstream Jenkins (Expected exception: "Writing to > table format AVRO is not supported") > -- > > Key: IMPALA-7055 > URL: https://issues.apache.org/jira/browse/IMPALA-7055 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.0 >Reporter: David Knupp >Assignee: Tim Armstrong >Priority: Blocker > Labels: correctness, flaky > Fix For: Impala 2.13.0, Impala 3.1.0 > > > This failure occurred while verifying https://gerrit.cloudera.org/c/10455/, > but it is not related to that patch. The failing build is > https://jenkins.impala.io/job/gerrit-verify-dryrun/2511/ > (https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/2232/) > Test appears to be (from > [avro-writer.test|https://github.com/apache/impala/blob/master/testdata/workloads/functional-query/queries/QueryTest/avro-writer.test]): > {noformat} > QUERY > SET ALLOW_UNSUPPORTED_FORMATS=0; > insert into __avro_write select 1, "b", 2.2; > CATCH > Writing to table format AVRO is not supported. 
Use query option > ALLOW_UNSUPPORTED_FORMATS > {noformat} > Error output: > {noformat} > 01:50:18 ] FAIL > query_test/test_compressed_formats.py::TestTableWriters::()::test_avro_writer[exec_option: > {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000, > 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, > 'exec_single_node_rows_threshold': 0} | table_format: text/none] > 01:50:18 ] === FAILURES > === > 01:50:18 ] TestTableWriters.test_avro_writer[exec_option: {'batch_size': 0, > 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000, 'disable_codegen': > False, 'abort_on_error': 1, 'debug_action': None, > 'exec_single_node_rows_threshold': 0} | table_format: text/none] > 01:50:18 ] [gw9] linux2 -- Python 2.7.12 > /home/ubuntu/Impala/bin/../infra/python/env/bin/python > 01:50:18 ] query_test/test_compressed_formats.py:189: in test_avro_writer > 01:50:18 ] self.run_test_case('QueryTest/avro-writer', vector) > 01:50:18 ] common/impala_test_suite.py:420: in run_test_case > 01:50:18 ] assert False, "Expected exception: %s" % expected_str > 01:50:18 ] E AssertionError: Expected exception: Writing to table format > AVRO is not supported. 
Use query option ALLOW_UNSUPPORTED_FORMATS > 01:50:18 ] Captured stderr setup > - > 01:50:18 ] -- connecting to: localhost:21000 > 01:50:18 ] - Captured stderr call > - > 01:50:18 ] -- executing against localhost:21000 > 01:50:18 ] use functional; > 01:50:18 ] > 01:50:18 ] SET batch_size=0; > 01:50:18 ] SET num_nodes=0; > 01:50:18 ] SET disable_codegen_rows_threshold=5000; > 01:50:18 ] SET disable_codegen=False; > 01:50:18 ] SET abort_on_error=1; > 01:50:18 ] SET exec_single_node_rows_threshold=0; > 01:50:18 ] -- executing against localhost:21000 > 01:50:18 ] drop table if exists __avro_write; > 01:50:18 ] > 01:50:18 ] -- executing against localhost:21000 > 01:50:18 ] SET COMPRESSION_CODEC=NONE; > 01:50:18 ] > 01:50:18 ] -- executing against localhost:21000 > 01:50:18 ] > 01:50:18 ] create table __avro_write (i int, s string, d double) > 01:50:18 ] stored as AVRO > 01:50:18 ] TBLPROPERTIES ('avro.schema.literal'='{ > 01:50:18 ] "name": "my_record", > 01:50:18 ] "type": "record", > 01:50:18 ] "fields": [ > 01:50:18 ] {"name":"i", "type":["int", "null"]}, > 01:50:18 ] {"name":"s", "type":["string", "null"]}, > 01:50:18 ] {"name":"d", "type":["double", "null"]}]}'); > 01:50:18 ] > 01:50:18 ] -- executing against localhost:21000 > 01:50:18 ] SET COMPRESSION_CODEC=""; > 01:50:18 ] > 01:50:18 ] --
[jira] [Commented] (IMPALA-7039) Frontend HBase tests cannot tolerate HBase running on a different port
[ https://issues.apache.org/jira/browse/IMPALA-7039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494085#comment-16494085 ] ASF subversion and git services commented on IMPALA-7039: - Commit b07bb2729df4aa92d68626f88afa7cd09733ec23 in impala's branch refs/heads/2.x from [~tarasbob] [ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=b07bb27 ] IMPALA-7039: Ignore the port in HBase planner tests Before this patch, we used to check the HBase port in the HBase planner tests. This caused a failure when HBase was running on a different port than expected. We fix the problem in this patch by not checking the HBase port. Testing: ran the FE tests and they passed. Change-Id: I8eb7628061b2ebaf84323b37424925e9a64f70a0 Reviewed-on: http://gerrit.cloudera.org:8080/10459 Reviewed-by: Tim Armstrong Tested-by: Impala Public Jenkins > Frontend HBase tests cannot tolerate HBase running on a different port > -- > > Key: IMPALA-7039 > URL: https://issues.apache.org/jira/browse/IMPALA-7039 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 3.1.0 >Reporter: Joe McDonnell >Assignee: Taras Bobrovytsky >Priority: Blocker > Labels: broken-build > Fix For: Impala 3.1.0 > > > When HBase doesn't get the same ports as usual, > org.apache.impala.planner.PlannerTest.testHbase and > org.apache.impala.planner.PlannerTest.testJoins fail with the following > errors: > {noformat} > section SCANRANGELOCATIONS of query: > select * from functional_hbase.alltypessmall > where id < 5 > Actual does not match expected result: > HBASE KEYRANGE port=16020 :3 > ^ > HBASE KEYRANGE port=16022 3:7 > HBASE KEYRANGE port=16023 7: > NODE 0: > Expected: > HBASE KEYRANGE port=16201 :3 > HBASE KEYRANGE port=16202 3:7 > HBASE KEYRANGE port=16203 7: > NODE 0: > section SCANRANGELOCATIONS of query: > select * from functional_hbase.stringids > where id = '5' > and tinyint_col = 5 > Actual does not match expected result: > HBASE KEYRANGE 
port=16022 5:5\0 > ^ > NODE 0: > Expected: > HBASE KEYRANGE port=16202 5:5\0 > NODE 0: > section SCANRANGELOCATIONS of query: > select * from functional_hbase.stringids > where id > '5' > and tinyint_col = 5 > Actual does not match expected result: > HBASE KEYRANGE port=16022 5\0:7 > ^ > HBASE KEYRANGE port=16023 7: > NODE 0: > Expected: > HBASE KEYRANGE port=16202 5\0:7 > HBASE KEYRANGE port=16203 7: > NODE 0: > section SCANRANGELOCATIONS of query: > select * from functional_hbase.stringids > where id >= '5' > and tinyint_col = 5 > Actual does not match expected result: > HBASE KEYRANGE port=16022 5:7 > ^^^ > HBASE KEYRANGE port=16023 7: > NODE 0: > Expected: > HBASE KEYRANGE port=16202 5:7 > HBASE KEYRANGE port=16203 7: > NODE 0: > section SCANRANGELOCATIONS of query: > select * from functional_hbase.stringids > where id < '5' > and tinyint_col = 5 > Actual does not match expected result: > HBASE KEYRANGE port=16020 :3 > ^ > HBASE KEYRANGE port=16022 3:5 > NODE 0: > Expected: > HBASE KEYRANGE port=16201 :3 > HBASE KEYRANGE port=16202 3:5 > NODE 0: > section SCANRANGELOCATIONS of query: > select * from functional_hbase.stringids > where id > '4' and id < '5' > and tinyint_col = 5 > Actual does not match expected result: > HBASE KEYRANGE port=16022 4\0:5 > ^ > NODE 0: > Expected: > HBASE KEYRANGE port=16202 4\0:5 > NODE 0: > section SCANRANGELOCATIONS of query: > select * from functional_hbase.stringids > where id >= '4' and id < '5' > and tinyint_col = 5 > Actual does not match expected result: > HBASE KEYRANGE port=16022 4:5 > ^^^ > NODE 0: > Expected: > HBASE KEYRANGE port=16202 4:5 > NODE 0: > section SCANRANGELOCATIONS of query: > select * from functional_hbase.stringids > where id > '4' and id <= '5' > and tinyint_col = 5 > Actual does not match expected result: > HBASE KEYRANGE port=16022 4\0:5\0 > ^^^ > NODE 0: > Expected: > HBASE KEYRANGE port=16202 4\0:5\0 > NODE 0: > section SCANRANGELOCATIONS of query: > select * from functional_hbase.stringids > where 
id >= '4' and id <= '5' > and tinyint_col = 5 > Actual does not match expected result: > HBASE KEYRANGE port=16022 4:5\0 > ^ > NODE 0: > Expected: > HBASE KEYRANGE port=16202 4:5\0 > NODE 0: > section SCANRANGELOCATIONS of query: > select * from functional_hbase.stringids > where string_col = '4' and tinyint_col = 5 and id >= '4' and id <= '5' > Actual does not match expected re
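In the spirit of the fix (which simply stops checking the port), such planner tests can be made tolerant of shifting ports by normalizing them before comparing; a hypothetical sketch:

```python
import re

def mask_hbase_ports(plan_text):
    """Replace concrete port numbers in SCANRANGELOCATIONS output with a
    wildcard, so actual and expected plans compare equal regardless of
    which ports the HBase region servers happened to bind."""
    return re.sub(r"port=\d+", "port=*", plan_text)

# Different ports, same key ranges: the masked plans match.
actual = mask_hbase_ports("HBASE KEYRANGE port=16022 5:7")
expected = mask_hbase_ports("HBASE KEYRANGE port=16202 5:7")
assert actual == expected
```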
[jira] [Commented] (IMPALA-5642) [DOCS] Impala restrictions on using Hive UDFs
[ https://issues.apache.org/jira/browse/IMPALA-5642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494096#comment-16494096 ] ASF subversion and git services commented on IMPALA-5642: - Commit 0b9334a564dd7dd8d4a08e78876aba3fb0852e4d in impala's branch refs/heads/master from [~arodoni_cloudera] [ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=0b9334a ] IMPALA-5642: [DOCS] An additional restriction for Hive/Java UDFs Change-Id: I79f5fcbb570fda48f9ac03f6c3760366aa1859d2 Reviewed-on: http://gerrit.cloudera.org:8080/10520 Reviewed-by: Bharath Vissapragada Tested-by: Impala Public Jenkins > [DOCS] Impala restrictions on using Hive UDFs > -- > > Key: IMPALA-5642 > URL: https://issues.apache.org/jira/browse/IMPALA-5642 > Project: IMPALA > Issue Type: Improvement > Components: Docs >Reporter: bharath v >Assignee: Alex Rodoni >Priority: Minor > > Along with the already stated restrictions on which Hive (Java) UDFs Impala > accepts, we need to add the following. > {noformat} > Impala requires that the Hive/Java UDFs must extend > 'org.apache.hadoop.hive.ql.exec.UDF' class > {noformat}
[jira] [Commented] (IMPALA-7079) test_scanners.TestParquet.test_multiple_blocks fails in the erasure coding job
[ https://issues.apache.org/jira/browse/IMPALA-7079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494094#comment-16494094 ] ASF subversion and git services commented on IMPALA-7079: - Commit 56a740c07a6d80921e86fee769033fab5ad1ccf3 in impala's branch refs/heads/master from [~tarasbob] [ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=56a740c ] IMPALA-7079: Disable the multiple blocks test in erasure coding build The test is currently failing in erasure coding build, so disable it to make the build pass. Change-Id: I00af0914d907b8dcff69f687f71239e76b6ff335 Reviewed-on: http://gerrit.cloudera.org:8080/10521 Reviewed-by: Tianyi Wang Tested-by: Impala Public Jenkins > test_scanners.TestParquet.test_multiple_blocks fails in the erasure coding job > -- > > Key: IMPALA-7079 > URL: https://issues.apache.org/jira/browse/IMPALA-7079 > Project: IMPALA > Issue Type: Task >Affects Versions: Impala 3.1.0 >Reporter: Taras Bobrovytsky >Assignee: Taras Bobrovytsky >Priority: Major > > Several tests failed in TestParquet.test_multiple_blocks in a nightly erasure > coding run. 
> {code} > TestParquet.test_multiple_blocks[exec_option: {'batch_size': 0, 'num_nodes': > 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, > 'abort_on_error': 1, 'debug_action': None, 'exec_single_node_rows_threshold': > 0} | table_format: parquet/none] > [gw0] linux2 -- Python 2.6.6 > /data/jenkins/workspace/impala-asf-master-core-erasure-coding/repos/Impala/bin/../infra/python/env/bin/python > query_test/test_scanners.py:550: in test_multiple_blocks > self._multiple_blocks_helper(table_name, 2, ranges_per_node=1) > query_test/test_scanners.py:598: in _multiple_blocks_helper > assert len(num_row_groups_list) == 4 > E assert 2 == 4 > E+ where 2 = len(['200', '200']) > {code}
[jira] [Commented] (IMPALA-5502) "*DBC Connector for Impala" is without context
[ https://issues.apache.org/jira/browse/IMPALA-5502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494095#comment-16494095 ] ASF subversion and git services commented on IMPALA-5502: - Commit 6ee48b9a11709ecb6eb8554500f6245eb42f1f8b in impala's branch refs/heads/master from [~arodoni_cloudera] [ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=6ee48b9 ] IMPALA-5502: [DOCS] Removed JDBC and ODBC connectors without a context Removed the section on Complex types and JDBC/ODBC connectors described without a context. Change-Id: I329dc497f9dd9cbf446d96e68c55cfe290b9fc58 Reviewed-on: http://gerrit.cloudera.org:8080/10522 Reviewed-by: Jim Apple Tested-by: Impala Public Jenkins > "*DBC Connector for Impala" is without context > -- > > Key: IMPALA-5502 > URL: https://issues.apache.org/jira/browse/IMPALA-5502 > Project: IMPALA > Issue Type: Task > Components: Docs >Affects Versions: Impala 2.9.0 >Reporter: Jim Apple >Assignee: Alex Rodoni >Priority: Minor > > http://impala.incubator.apache.org/docs/build/html/topics/impala_jdbc.html > says to use the Hive JDBC driver, but then says "The Impala complex types > (STRUCT, ARRAY, or MAP) are available in Impala 2.3 and higher. To use these > types with JDBC requires version 2.5.28 or higher of the JDBC Connector for > Impala. To use these types with ODBC requires version 2.5.30 or higher of the > ODBC Connector for Impala." > These connectors could be described or explained above, or this line could be > removed.
[jira] [Commented] (IMPALA-6813) Hedged reads metrics broken when scanning non-HDFS based table
[ https://issues.apache.org/jira/browse/IMPALA-6813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494088#comment-16494088 ] ASF subversion and git services commented on IMPALA-6813: - Commit a3efde84a5e0ef17357d24c3e69aa3f255eb4865 in impala's branch refs/heads/2.x from [~sailesh] [ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=a3efde8 ] IMPALA-6813: Hedged reads metrics broken when scanning non-HDFS based table We realized that the libHDFS API call hdfsGetHedgedReadMetrics() crashes when the 'fs' argument passed to it is not a HDFS filesystem. There is an open bug for it on the HDFS side: HDFS-13417 However, it looks like we won't be getting a fix for it in the short term, so our only option at this point is to skip it. Testing: Made sure that enabling preads and scanning from S3 doesn't cause a crash. Also, added a custom cluster test to exercise the pread code path. We are unable to verify hedged reads in a minicluster, but we can at least exercise the code path to make sure that nothing breaks. 
Change-Id: I48fe80dfd9a1ed68a8f2b7038e5f42b5a3df3baa Reviewed-on: http://gerrit.cloudera.org:8080/9966 Reviewed-by: Sailesh Mukil Tested-by: Impala Public Jenkins > Hedged reads metrics broken when scanning non-HDFS based table > -- > > Key: IMPALA-6813 > URL: https://issues.apache.org/jira/browse/IMPALA-6813 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.12.0 >Reporter: Mostafa Mokhtar >Assignee: Sailesh Mukil >Priority: Blocker > Fix For: Impala 2.13.0, Impala 3.1.0 > > > When preads are enabled ADLS scans can fail updating the Hedged reads metrics > {code} > (gdb) bt > #0 0x003346c32625 in raise () from /lib64/libc.so.6 > #1 0x003346c33e05 in abort () from /lib64/libc.so.6 > #2 0x7f185be140b5 in os::abort(bool) () >from /usr/java/jdk1.8.0_121-cloudera/jre/lib/amd64/server/libjvm.so > #3 0x7f185bfb6443 in VMError::report_and_die() () >from /usr/java/jdk1.8.0_121-cloudera/jre/lib/amd64/server/libjvm.so > #4 0x7f185be195bf in JVM_handle_linux_signal () >from /usr/java/jdk1.8.0_121-cloudera/jre/lib/amd64/server/libjvm.so > #5 0x7f185be0fb03 in signalHandler(int, siginfo*, void*) () >from /usr/java/jdk1.8.0_121-cloudera/jre/lib/amd64/server/libjvm.so > #6 > #7 0x7f185bbc1a7b in jni_invoke_nonstatic(JNIEnv_*, JavaValue*, > _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*) () >from /usr/java/jdk1.8.0_121-cloudera/jre/lib/amd64/server/libjvm.so > #8 0x7f185bbc7e81 in jni_CallObjectMethodV () >from /usr/java/jdk1.8.0_121-cloudera/jre/lib/amd64/server/libjvm.so > #9 0x0212e2b7 in invokeMethod () > #10 0x02131297 in hdfsGetHedgedReadMetrics () > #11 0x011601c0 in impala::io::ScanRange::Close() () > #12 0x01158a95 in > impala::io::DiskIoMgr::HandleReadFinished(impala::io::DiskIoMgr::DiskQueue*, > impala::io::RequestContext*, std::unique_ptr std::default_delete >) () > #13 0x01158e1c in > impala::io::DiskIoMgr::ReadRange(impala::io::DiskIoMgr::DiskQueue*, > impala:---Type to continue, or q to quit--- > 
:io::RequestContext*, impala::io::ScanRange*) () > #14 0x01159052 in > impala::io::DiskIoMgr::WorkLoop(impala::io::DiskIoMgr::DiskQueue*) () > #15 0x00d5fcaf in > impala::Thread::SuperviseThread(std::basic_string std::char_traits, std::allocator > const&, > std::basic_string, std::allocator > > const&, boost::function, impala::ThreadDebugInfo const*, > impala::Promise*) () > #16 0x00d604aa in boost::detail::thread_data void (*)(std::basic_string, std::allocator > > const&, std::basic_string, > std::allocator > const&, boost::function, > impala::ThreadDebugInfo const*, impala::Promise*), > boost::_bi::list5 std::char_traits, std::allocator > >, > boost::_bi::value, > std::allocator > >, boost::_bi::value >, > boost::_bi::value, > boost::_bi::value*> > > >::run() () > #17 0x012d6dfa in ?? () > #18 0x003347007aa1 in start_thread () from /lib64/libpthread.so.0 > #19 0x003346ce893d in clone () from /lib64/libc.so.6 > {code} > {code} > CREATE TABLE adls.lineitem ( > l_orderkey BIGINT, > l_partkey BIGINT, > l_suppkey BIGINT, > l_linenumber BIGINT, > l_quantity DOUBLE, > l_extendedprice DOUBLE, > l_discount DOUBLE, > l_tax DOUBLE, > l_returnflag STRING, > l_linestatus STRING, > l_commitdate STRING, > l_receiptdate STRING, > l_shipinstruct STRING, > l_shipmode STRING, > l_comment STRING, > l_shipdate STRING > ) > STORED AS PARQUET > LOCATION 'adl://foo.azuredatalakestore.net/adls-test.db/lineitem' > {code} > select * from adls.lineitem limit 10;
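The workaround amounts to guarding the metrics lookup behind a check that the handle really belongs to an HDFS filesystem; a rough Python analogue (the function and scheme names are illustrative stand-ins, not the libHDFS API):

```python
def hedged_read_metrics(fs_scheme, fetch_metrics):
    """Skip the hedged-read metrics lookup for non-HDFS filesystems
    (e.g. s3a, adl, local), where calling it crashes per HDFS-13417.
    `fetch_metrics` stands in for the real hdfsGetHedgedReadMetrics() call."""
    if fs_scheme != "hdfs":
        return None  # metrics simply unavailable on this filesystem
    return fetch_metrics()

# ADLS scan: metrics are skipped rather than crashing the scanner thread.
assert hedged_read_metrics("adl", lambda: {"ops": 3}) is None
```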
[jira] [Commented] (IMPALA-7048) Failed test: query_test.test_parquet_page_index.TestHdfsParquetTableIndexWriter.test_write_index_many_columns_tables
[ https://issues.apache.org/jira/browse/IMPALA-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494089#comment-16494089 ] ASF subversion and git services commented on IMPALA-7048: - Commit a48bbfdf4692eb68f06a4cd192a98947bcc04aba in impala's branch refs/heads/2.x from [~boroknagyz] [ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=a48bbfd ] IMPALA-7048: Failed test: test_write_index_many_columns_tables The test in the title fails when the local filesystem is used. Looking at the error message it seems that the determined Parquet file size is too small when the local filesystem is used. There is already an annotation for that: 'SkipIfLocal.parquet_file_size' I added this annotation to the TestHdfsParquetTableIndexWriter class, therefore these tests won't be executed when the test-warehouse directory of Impala resides on the local filesystem. Change-Id: Idd3be70fb654a49dda44309a8914fe1f2b48a1af Reviewed-on: http://gerrit.cloudera.org:8080/10476 Reviewed-by: Zoltan Borok-Nagy Tested-by: Impala Public Jenkins > Failed test: > query_test.test_parquet_page_index.TestHdfsParquetTableIndexWriter.test_write_index_many_columns_tables > > > Key: IMPALA-7048 > URL: https://issues.apache.org/jira/browse/IMPALA-7048 > Project: IMPALA > Issue Type: Bug > Components: Backend >Reporter: Dimitris Tsirogiannis >Assignee: Zoltán Borók-Nagy >Priority: Blocker > Labels: broken-build > Fix For: Impala 2.13.0, Impala 3.1.0 > > > The following test fails when the filesystem is LOCAL: > {code:java} > query_test.test_parquet_page_index.TestHdfsParquetTableIndexWriter.test_write_index_many_columns_tables[exec_option: > \{'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, > 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, > 'exec_single_node_rows_threshold': 0} | table_format: parquet/none] (from > pytest) {code} > Zoltan, assigning to you since this looks suspiciously related to the fix for > 
IMPALA-5842.
[jira] [Commented] (IMPALA-3134) Admission controller should not assume all backends have same proc mem limit
[ https://issues.apache.org/jira/browse/IMPALA-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494086#comment-16494086 ] ASF subversion and git services commented on IMPALA-3134: - Commit 466188b3970595e2e04d7ecf6a5141a7d3012909 in impala's branch refs/heads/2.x from [~bikram.sngh91] [ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=466188b ] IMPALA-3134: Support different proc mem limits among impalads for admission control checks Currently the admission controller assumes that all backends have the same process mem limit as the impalad it is running on. With this patch the proc mem limit for each impalad is available to the admission controller and it uses it for making correct admission decisions. It currently works under the assumption that the per-process memory limit does not change dynamically. Testing: Added an e2e test. IMPALA-5662: Log the queuing reason for a query The queuing reason is now logged both while queuing for the first time and while trying to dequeue. Change-Id: Idb72eee790cc17466bbfa82e30f369a65f2b060e Reviewed-on: http://gerrit.cloudera.org:8080/10396 Reviewed-by: Bikramjeet Vig Tested-by: Impala Public Jenkins > Admission controller should not assume all backends have same proc mem limit > > > Key: IMPALA-3134 > URL: https://issues.apache.org/jira/browse/IMPALA-3134 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Affects Versions: Impala 2.5.0 >Reporter: Matthew Jacobs >Assignee: Bikramjeet Vig >Priority: Minor > Labels: admission-control, ramp-up, resource-management > Fix For: Impala 3.1.0 > > > The admission policy now checks that all backends have enough available > memory resources to execute the request, but it assumes that all backends > share the same process mem limit as the impalad it is running on. In a > heterogeneous environment, admission is wrong.
[jira] [Commented] (IMPALA-4025) add functions PERCENTILE_DISC(), PERCENTILE_CONT(), and MEDIAN()
[ https://issues.apache.org/jira/browse/IMPALA-4025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494091#comment-16494091 ] ASF subversion and git services commented on IMPALA-4025: - Commit 41d7cd908a05dabe31775dabf188d3b2136c25d2 in impala's branch refs/heads/2.x from [~tianyiwang] [ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=41d7cd9 ] IMPALA-4025: Part 1: Generalize and cleanup StmtRewriter This patch generalizes StmtRewriter, allowing it to be subclassed. The base class would traverse the stmt tree while the subclasses can install hooks to execute specific rewrite rules at certain places. Existing rewriting rules are moved into SubqueryRewriter. Change-Id: I9e7a6108d3d49be12ae032fdb54b5c3c23152a47 Reviewed-on: http://gerrit.cloudera.org:8080/10495 Reviewed-by: Vuk Ercegovac Tested-by: Impala Public Jenkins > add functions PERCENTILE_DISC(), PERCENTILE_CONT(), and MEDIAN() > > > Key: IMPALA-4025 > URL: https://issues.apache.org/jira/browse/IMPALA-4025 > Project: IMPALA > Issue Type: New Feature > Components: Backend, Frontend >Affects Versions: Impala 2.2.4 >Reporter: Greg Rahn >Assignee: Tianyi Wang >Priority: Major > Labels: built-in-function, sql-language > > Add the following functions as both an aggregate function and window/analytic > function: > * PERCENTILE_CONT > * PERCENTILE_DISC > * MEDIAN (implemented as PERCENTILE_CONT(0.5)) > h6. Syntax > {code} > PERCENTILE_CONT() WITHIN GROUP (ORDER BY [ASC|DESC] > [NULLS {FIRST | LAST}]) [ OVER ([])] > PERCENTILE_DISC() WITHIN GROUP (ORDER BY [ASC|DESC] > [NULLS {FIRST | LAST}]) [ OVER ([])] > MEDIAN(expr) [ OVER () ] > {code} > h6.
Notes from other systems > *Greenplum* > {code} > PERCENTILE_CONT(_percentage_) WITHIN GROUP (ORDER BY _expression_) > {code} > http://gpdb.docs.pivotal.io/4320/admin_guide/query.html > Greenplum Database provides the MEDIAN aggregate function, which returns the > fiftieth percentile of the PERCENTILE_CONT result and special aggregate > expressions for inverse distribution functions as follows: > Currently you can use only these two expressions with the keyword WITHIN > GROUP. > Note: aggregate function only > *Oracle* > {code} > PERCENTILE_CONT(expr) WITHIN GROUP (ORDER BY expr [ DESC | ASC ]) [ OVER > (query_partition_clause) ] > {code} > http://docs.oracle.com/database/121/SQLRF/functions141.htm#SQLRF00687 > Note: implemented as both an aggregate and window function > *Vertica* > {code} > PERCENTILE_CONT ( %_number ) WITHIN GROUP (... ORDER BY expression [ ASC | > DESC ] ) OVER (... [ window-partition-clause ] ) > {code} > https://my.vertica.com/docs/7.2.x/HTML/index.htm#Authoring/SQLReferenceManual/Functions/Analytic/PERCENTILE_CONTAnalytic.htm > Note: window function only > *Teradata* > {code} > PERCENTILE_CONT() WITHIN GROUP (ORDER BY > [asc | desc] [nulls {first | last}]) > {code} > Note: aggregate function only > *Netezza* > {code} > SELECT fn() WITHIN GROUP (ORDER BY [asc|desc] [nulls > {first | last}]) FROM [GROUP BY ]; > {code} > https://www.ibm.com/support/knowledgecenter/SSULQD_7.2.1/com.ibm.nz.dbu.doc/c_dbuser_inverse_distribution_funcs_family_syntax.html > Note: aggregate function only > *Redshift* > {code} > PERCENTILE_CONT ( percentile ) WITHIN GROUP (ORDER BY expr) OVER ( [ > PARTITION BY expr_list ] ) > {code} > https://www.ibm.com/support/knowledgecenter/SSULQD_7.2.1/com.ibm.nz.dbu.doc/c_dbuser_inverse_distribution_funcs_family_syntax.html > Note: window function only
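For reference, the continuous-vs-discrete semantics that separate these functions can be sketched in Python. Standard SQL inverse-distribution definitions are assumed; this is an illustration, not Impala's implementation:

```python
import math

def percentile_cont(values, p):
    """Continuous percentile: linearly interpolate between the two
    closest ranks, so the result need not be an actual input value."""
    xs = sorted(values)
    idx = p * (len(xs) - 1)
    lo = int(math.floor(idx))
    frac = idx - lo
    if frac == 0:
        return float(xs[lo])
    return xs[lo] + frac * (xs[lo + 1] - xs[lo])

def percentile_disc(values, p):
    """Discrete percentile: the first value in sorted order whose
    cumulative distribution is >= p (always an actual input value)."""
    xs = sorted(values)
    k = max(int(math.ceil(p * len(xs))), 1)
    return xs[k - 1]

def median(values):
    # Per the proposal, MEDIAN(expr) is PERCENTILE_CONT(0.5).
    return percentile_cont(values, 0.5)
```

On the values [1, 2, 3, 4], PERCENTILE_CONT(0.5) interpolates to 2.5, while PERCENTILE_DISC(0.5) returns the actual row value 2; the difference is exactly why both functions exist.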
[jira] [Assigned] (IMPALA-6900) Invalidate metadata operation is ignored at a coordinator if catalog is empty
[ https://issues.apache.org/jira/browse/IMPALA-6900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dimitris Tsirogiannis reassigned IMPALA-6900: - Assignee: Vuk Ercegovac (was: Dimitris Tsirogiannis) > Invalidate metadata operation is ignored at a coordinator if catalog is empty > - > > Key: IMPALA-6900 > URL: https://issues.apache.org/jira/browse/IMPALA-6900 > Project: IMPALA > Issue Type: Bug > Components: Catalog >Affects Versions: Impala 3.0, Impala 2.12.0 >Reporter: Dimitris Tsirogiannis >Assignee: Vuk Ercegovac >Priority: Major > > The following workflow may cause an impalad that issued an invalidate > metadata to falsely consider that the operation has taken > effect, thus causing subsequent queries to fail due to unresolved references > to tables or databases. > Steps to reproduce: > # Start an impala cluster connecting to an empty HMS (no databases). > # Create a database "db" in HMS outside of Impala (e.g. using Hive). > # Run INVALIDATE METADATA through Impala. > # Run "use db" statement in Impala. > > The while condition in the code snippet below causes the > WaitForMinCatalogUpdate function to return prematurely even though INVALIDATE > METADATA has not taken effect: > {code:java} > void ImpalaServer::WaitForMinCatalogUpdate(..) { > ... > VLOG_QUERY << "Waiting for minimum catalog object version: " ><< min_req_catalog_object_version << " current version: " ><< min_catalog_object_version; > while (catalog_update_info_.min_catalog_object_version < > min_req_catalog_object_version && catalog_update_info_.catalog_service_id == > catalog_service_id) { >catalog_version_update_cv_.Wait(unique_lock); > } > {code}
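The intended wait semantics can be sketched with a condition-variable predicate. This is a hypothetical Python model with invented names (the real code is C++ in impalad): the waiter should only stop waiting once the minimum catalog object version reaches the required version, or the catalog service id changes (i.e. catalogd restarted):

```python
import threading

class CatalogUpdateInfo:
    """Toy model of the intended WaitForMinCatalogUpdate behavior.
    Names are invented for illustration."""

    def __init__(self, service_id, version=0):
        self.cv = threading.Condition()
        self.min_version = version
        self.service_id = service_id

    def publish(self, version, service_id):
        """A catalog topic update arrives: record it and wake waiters."""
        with self.cv:
            self.min_version = version
            self.service_id = service_id
            self.cv.notify_all()

    def wait_for_min_version(self, required, expected_service_id, timeout=None):
        with self.cv:
            # wait_for() re-checks the predicate on every wakeup, so a
            # spurious or unrelated wakeup cannot cause a premature return.
            return self.cv.wait_for(
                lambda: (self.min_version >= required
                         or self.service_id != expected_service_id),
                timeout=timeout)
```

The key point of the sketch is that the exit condition is checked as a predicate on each wakeup, rather than relying on a loop condition that can be satisfied trivially when the catalog starts out empty.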
[jira] [Commented] (IMPALA-6776) Failed to assign hbase regions to servers
[ https://issues.apache.org/jira/browse/IMPALA-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494066#comment-16494066 ] Vuk Ercegovac commented on IMPALA-6776: --- [IMPALA-7061|http://issues.cloudera.org/browse/IMPALA-7061] changed how hbase is split for front-end planner tests. Now, the front-end planner test does the splitting/assigning, which means that the error reported here will affect only the planner tests (and not all others). In addition, the splitting/assigning was changed. I'm closing this bug -- it's already marked as being related to IMPALA-7061. Please open a new bug if a similar issue with region assignment comes up. > Failed to assign hbase regions to servers > - > > Key: IMPALA-6776 > URL: https://issues.apache.org/jira/browse/IMPALA-6776 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.0 >Reporter: Tianyi Wang >Assignee: Vuk Ercegovac >Priority: Blocker > Labels: broken-build > > After switching to hadoop 3 components, split-hbase.sh failed in > HBaseTestDataRegionAssigment: > {noformat} > 20:40:27 Splitting HBase (logging to > /data/jenkins/workspace/impala-asf-master-exhaustive-release/repos/Impala/logs/data_loading/create-hbase.log)... > > 20:41:51 FAILED (Took: 1 min 24 sec) > 20:41:51 > '/data/jenkins/workspace/impala-asf-master-exhaustive-release/repos/Impala/testdata/bin/split-hbase.sh' > failed. Tail of log: > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,3,1522381286429.7b13fefeda7afac230e22150deab5266. > 3 -> localhost:16202, expecting localhost,16202,1522374373018 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,5,1522381287511.7a243a822c5c4844a2a3d0f67a541961.
> 5 -> localhost:16202, expecting localhost,16202,1522374373018 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,7,1522381288718.80d6e4a799ad114a146dc3cb41e18e93. > 7 -> localhost:16203, expecting localhost,16203,1522374374705 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,9,1522381288718.d705a2ea635916f4bb510ca60764080a. > 9 -> localhost:16203, expecting localhost,16203,1522374374705 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,,1522381282868.a99b569f5417ea9e2561eb5566c31be0. > -> localhost:16203, expecting localhost,16201,1522374371810 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,1,1522381285023.5fb566ba94e5fbb8aeca39f3da0a6362. > 1 -> localhost:16201, expecting localhost,16201,1522374371810 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,3,1522381286429.7b13fefeda7afac230e22150deab5266. > 3 -> localhost:16202, expecting localhost,16202,1522374373018 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,5,1522381287511.7a243a822c5c4844a2a3d0f67a541961. > 5 -> localhost:16202, expecting localhost,16202,1522374373018 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,7,1522381288718.80d6e4a799ad114a146dc3cb41e18e93. > 7 -> localhost:16203, expecting localhost,16203,1522374374705 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,9,1522381288718.d705a2ea635916f4bb510ca60764080a. 
> 9 -> localhost:16203, expecting localhost,16203,1522374374705 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,,1522381282868.a99b569f5417ea9e2561eb5566c31be0. > -> localhost:16203, expecting localhost,16201,1522374371810 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,1,1522381285023.5fb566ba94e5fbb8aeca39f3da0a6362. > 1 -> localhost:16201, expecting localhost,16201,1522374371810 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,3,1522381286429.7b13fefeda7afac230e22150deab5266. > 3 -> localhost:16202, expecting localhost,16202,1522374373018 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,5,1522381287511.7a243a822c5c4844a2a3d0f67a541961. > 5 -> localhost:16202, expecting localhost,16202,1522374373018 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,7,1522381288718.80d6e4a799ad114a146dc3cb41e1
[jira] [Updated] (IMPALA-6338) Runtime profile for query with limit may be missing pieces
[ https://issues.apache.org/jira/browse/IMPALA-6338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Tauber-Marshall updated IMPALA-6338: --- Summary: Runtime profile for query with limit may be missing pieces (was: test_profile_fragment_instances failing) > Runtime profile for query with limit may be missing pieces > -- > > Key: IMPALA-6338 > URL: https://issues.apache.org/jira/browse/IMPALA-6338 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.11.0, Impala 2.12.0 >Reporter: David Knupp >Assignee: Thomas Tauber-Marshall >Priority: Critical > Labels: broken-build, flaky > Attachments: profile-failure.txt, profile-success.txt > > > Stack trace: > {noformat} > query_test/test_observability.py:123: in test_profile_fragment_instances > assert results.runtime_profile.count("HDFS_SCAN_NODE") == 12 > E assert 11 == 12 > E+ where 11 = 0x68bd2f0>('HDFS_SCAN_NODE') > E+where = 'Query > (id=ae4cee91aafc5c6c:11b545c6):\n DEBUG MODE WARNING: Query profile > created while running a DEBUG buil...ontextSwitches: 0 (0)\n - > TotalRawHdfsReadTime(*): 5s784ms\n - TotalReadThroughput: 17.33 > MB/sec\n'.count > E+ where 'Query (id=ae4cee91aafc5c6c:11b545c6):\n DEBUG > MODE WARNING: Query profile created while running a DEBUG > buil...ontextSwitches: 0 (0)\n - TotalRawHdfsReadTime(*): 5s784ms\n > - TotalReadThroughput: 17.33 MB/sec\n' = > 0x6322e10>.runtime_profile > {noformat} > Query: > {noformat} > with l as (select * from tpch.lineitem UNION ALL select * from tpch.lineitem) > select STRAIGHT_JOIN count(*) from (select * from tpch.lineitem a > LIMIT 1) a > join (select * from l LIMIT 200) b on a.l_orderkey = > -b.l_orderkey; > {noformat} > Summary: > {noformat} > Operator #Hosts Avg Time Max Time #Rows Est. #Rows Peak Mem > Est. 
Peak Mem Detail > > 05:AGGREGATE 1 0.000ns 0.000ns 1 1 28.00 KB > 10.00 MB FINALIZE > 04:HASH JOIN 1 15.000ms 15.000ms 0 1 141.06 MB > 17.00 MB INNER JOIN, BROADCAST > |--08:EXCHANGE1 4s153ms 4s153ms 2.00M 2.00M 0 > 0 UNPARTITIONED > | 07:EXCHANGE1 3s783ms 3s783ms 2.00M 2.00M 0 > 0 UNPARTITIONED > | 01:UNION 3 17.000ms 28.001ms 3.03M 2.00M 0 > 0 > | |--03:SCAN HDFS3 0.000ns 0.000ns 0 6.00M 0 > 176.00 MB tpch.lineitem > | 02:SCAN HDFS 3 6s133ms 6s948ms 3.03M 6.00M 24.02 MB > 176.00 MB tpch.lineitem > 06:EXCHANGE 1 5s655ms 5s655ms 1 1 0 > 0 UNPARTITIONED > 00:SCAN HDFS 3 4s077ms 6s207ms 2 1 16.05 MB > 176.00 MB tpch.lineitem a > {noformat} > Plan: > {noformat} > > Max Per-Host Resource Reservation: Memory=17.00MB > Per-Host Resource Estimates: Memory=379.00MB > F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 > | Per-Host Resources: mem-estimate=27.00MB mem-reservation=17.00MB > PLAN-ROOT SINK > | mem-estimate=0B mem-reservation=0B > | > 05:AGGREGATE [FINALIZE] > | output: count(*) > | mem-estimate=10.00MB mem-reservation=0B spill-buffer=2.00MB > | tuple-ids=7 row-size=8B cardinality=1 > | > 04:HASH JOIN [INNER JOIN, BROADCAST] > | hash predicates: a.l_orderkey = -1 * l_orderkey > | fk/pk conjuncts: assumed fk/pk > | runtime filters: RF000[bloom] <- -1 * l_orderkey > | mem-estimate=17.00MB mem-reservation=17.00MB spill-buffer=1.00MB > | tuple-ids=0,4 row-size=16B cardinality=1 > | > |--08:EXCHANGE [UNPARTITIONED] > | | mem-estimate=0B mem-reservation=0B > | | tuple-ids=4 row-size=8B cardinality=200 > | | > | F05:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 > | Per-Host Resources: mem-estimate=0B mem-reservation=0B > | 07:EXCHANGE [UNPARTITIONED] > | | limit: 200 > | | mem-estimate=0B mem-reservation=0B > | | tuple-ids=4 row-size=8B cardinality=200 > | | > | F04:PLAN FRAGMENT [RANDOM] hosts=3 instances=3 > | Per-Host Resources: mem-estimate=176.00MB mem-reservation=0B > | 01:UNION > | | pass-through-operands: all > | | limit: 200 > | | mem-estimate=0B 
mem-reservation=0B > | | tuple-ids=4 row-size=8B cardinality=200 > | | > | |--03:SCAN HDFS [tpch.lineitem, RANDOM] > | | partitions
[jira] [Resolved] (IMPALA-6928) test_bloom_filters failing on ASAN build: did not find "Runtime Filter Published" in profile
[ https://issues.apache.org/jira/browse/IMPALA-6928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Tauber-Marshall resolved IMPALA-6928. Resolution: Duplicate dup of IMPALA-6338 > test_bloom_filters failing on ASAN build: did not find "Runtime Filter > Published" in profile > > > Key: IMPALA-6928 > URL: https://issues.apache.org/jira/browse/IMPALA-6928 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.12.0 >Reporter: David Knupp >Assignee: Thomas Tauber-Marshall >Priority: Blocker > > Stacktrace > {noformat} > query_test/test_runtime_filters.py:81: in test_bloom_filters > self.run_test_case('QueryTest/bloom_filters', vector) > common/impala_test_suite.py:444: in run_test_case > verify_runtime_profile(test_section['RUNTIME_PROFILE'], > result.runtime_profile) > common/test_result_verifier.py:560: in verify_runtime_profile > actual)) > E AssertionError: Did not find matches for lines in runtime profile: > E EXPECTED LINES: > E row_regex: .*1 of 1 Runtime Filter Published.* > E > E ACTUAL PROFILE: > E Query (id=a64a18654d28e0c3:e6220f6c): > E DEBUG MODE WARNING: Query profile created while running a DEBUG build > of Impala. Use RELEASE builds to measure query performance. 
> E Summary: > E Session ID: 244e6109f4226b2b:39160855c64ad4a1 > E Session Type: BEESWAX > E Start Time: 2018-04-23 23:31:59.326883000 > E End Time: > E Query Type: QUERY > E Query State: FINISHED > E Query Status: OK > E Impala Version: impalad version 2.12.0-cdh5.15.0 DEBUG (build > 3d60947b813429cd1db59f9a342498982d341de9) > E User: jenkins > E Connected User: jenkins > E Delegated User: > E Network Address: 127.0.0.1:55776 > E Default Db: functional > E Sql Statement: with l as (select * from tpch.lineitem UNION ALL > select * from tpch.lineitem) > E select STRAIGHT_JOIN count(*) from (select * from tpch.lineitem a LIMIT > 1) a > E join (select * from l LIMIT 200) b on a.l_orderkey = -b.l_orderkey > E Coordinator: ec2-m2-4xlarge-centos-6-4-0f06.vpc.cloudera.com:22000 > E Query Options (set by configuration): > ABORT_ON_ERROR=1,EXEC_SINGLE_NODE_ROWS_THRESHOLD=0,RUNTIME_FILTER_WAIT_TIME_MS=3,RUNTIME_FILTER_MIN_SIZE=65536,DISABLE_CODEGEN_ROWS_THRESHOLD=0 > E Query Options (set by configuration and planner): > ABORT_ON_ERROR=1,EXEC_SINGLE_NODE_ROWS_THRESHOLD=0,RUNTIME_FILTER_WAIT_TIME_MS=3,MT_DOP=0,RUNTIME_FILTER_MIN_SIZE=65536,DISABLE_CODEGEN_ROWS_THRESHOLD=0 > E Plan: > E > E Max Per-Host Resource Reservation: Memory=19.00MB > E Per-Host Resource Estimates: Memory=557.00MB > E > E F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 > E | Per-Host Resources: mem-estimate=28.00MB mem-reservation=18.00MB > runtime-filters-memory=1.00MB > E PLAN-ROOT SINK > E | mem-estimate=0B mem-reservation=0B > E | > E 05:AGGREGATE [FINALIZE] > E | output: count(*) > E | mem-estimate=10.00MB mem-reservation=0B spill-buffer=2.00MB > E | tuple-ids=7 row-size=8B cardinality=1 > E | > E 04:HASH JOIN [INNER JOIN, BROADCAST] > E | hash predicates: a.l_orderkey = -1 * l_orderkey > E | fk/pk conjuncts: assumed fk/pk > E | runtime filters: RF000[bloom] <- -1 * l_orderkey > E | mem-estimate=17.00MB mem-reservation=17.00MB spill-buffer=1.00MB > E | tuple-ids=0,4 row-size=16B cardinality=1 
> E | > E |--08:EXCHANGE [UNPARTITIONED] > E | | mem-estimate=0B mem-reservation=0B > E | | tuple-ids=4 row-size=8B cardinality=200 > E | | > E | F05:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 > E | Per-Host Resources: mem-estimate=0B mem-reservation=0B > E | 07:EXCHANGE [UNPARTITIONED] > E | | limit: 200 > E | | mem-estimate=0B mem-reservation=0B > E | | tuple-ids=4 row-size=8B cardinality=200 > E | | > E | F04:PLAN FRAGMENT [RANDOM] hosts=3 instances=3 > E | Per-Host Resources: mem-estimate=264.00MB mem-reservation=0B > E | 01:UNION > E | | pass-through-operands: all > E | | limit: 200 > E | | mem-estimate=0B mem-reservation=0B > E | | tuple-ids=4 row-size=8B cardinality=200 > E | | > E | |--03:SCAN HDFS [tpch.lineitem, RANDOM] > E | | partitions=1/1 files=1 size=718.94MB > E | | stored statistics: > E | | table: rows=6001215 size=718.94MB > E | | columns: all > E | | extrapolated-rows=disabled > E | | mem-estimate=264.00MB mem-reservation=0B > E | | tuple-ids=3 row-size=8B cardinality=6001215 > E | | > E | 02:SCAN HDFS [tpch.lineitem, RANDOM] > E | partitions=1/
[jira] [Assigned] (IMPALA-7088) Parallel data load breaks load-data.py if loading data on a real cluster
[ https://issues.apache.org/jira/browse/IMPALA-7088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe McDonnell reassigned IMPALA-7088: - Assignee: Joe McDonnell > Parallel data load breaks load-data.py if loading data on a real cluster > > > Key: IMPALA-7088 > URL: https://issues.apache.org/jira/browse/IMPALA-7088 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.0 >Reporter: David Knupp >Assignee: Joe McDonnell >Priority: Blocker > > {{Impala/bin/load-data.py}} is most commonly used to load test data onto a > simulated standalone cluster running on the local host. However, with the > correct inputs, it can also be used to load data onto an actual cluster > running on remote hosts. > A recent enhancement in the load-data.py script to parallelize parts of the > data loading process -- https://github.com/apache/impala/commit/d481cd48 -- > has introduced a regression in the latter use case: > From {{$IMPALA_HOME/logs/data_loading/data-load-functional-exhaustive.log}}: > {noformat} > Created table functional_hbase.widetable_1000_cols > Took 0.7121 seconds > 09:48:01 Beginning execution of hive SQL: > /home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/logs/data_loading/sql/functional/load-functional-query-exhaustive-hive-generated-text-none-none.sql > Traceback (most recent call last): > File > "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", > line 494, in > if __name__ == "__main__": main() > File > "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", > line 468, in main > hive_exec_query_files_parallel(thread_pool, hive_load_text_files) > File > "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", > line 299, in hive_exec_query_files_parallel > 
exec_query_files_parallel(thread_pool, query_files, 'hive') > File > "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", > line 290, in exec_query_files_parallel > for result in thread_pool.imap_unordered(execution_function, query_files): > File "/usr/lib/python2.7/multiprocessing/pool.py", line 659, in next > raise value > TypeError: coercing to Unicode: need string or buffer, NoneType found > {noformat}
[jira] [Created] (IMPALA-7089) test_kudu_dml_reporting failing
Thomas Tauber-Marshall created IMPALA-7089: -- Summary: test_kudu_dml_reporting failing Key: IMPALA-7089 URL: https://issues.apache.org/jira/browse/IMPALA-7089 Project: IMPALA Issue Type: Bug Reporter: Thomas Tauber-Marshall Assignee: Thomas Tauber-Marshall See in numerous builds: {noformat} 00:07:23 ___ TestImpalaShell.test_kudu_dml_reporting 00:07:23 [gw1] linux2 -- Python 2.6.6 /data/jenkins/workspace/impala-asf-master-core/repos/Impala/bin/../infra/python/env/bin/python 00:07:23 shell/test_shell_commandline.py:601: in test_kudu_dml_reporting 00:07:23 "with y as (values(7)) insert into %s.dml_test (id) select * from y" % db, 1, 0) 00:07:23 shell/test_shell_commandline.py:580: in _validate_dml_stmt 00:07:23 assert expected_output in results.stderr 00:07:23 E assert 'Modified 1 row(s), 0 row error(s)' in 'Starting Impala Shell without Kerberos authentication\nConnected to localhost:21000\nServer version: impalad version ...tos-6-4-0895.vpc.cloudera.com:25000/query_plan?query_id=d94f04135c4d25f9:ec1089e8\nFetched 0 row(s) in 0.12s\n' 00:07:23 E+ where 'Starting Impala Shell without Kerberos authentication\nConnected to localhost:21000\nServer version: impalad version ...tos-6-4-0895.vpc.cloudera.com:25000/query_plan?query_id=d94f04135c4d25f9:ec1089e8\nFetched 0 row(s) in 0.12s\n' = .stderr 00:07:23 Captured stderr setup - 00:07:23 SET sync_ddl=False; 00:07:23 -- executing against localhost:21000 00:07:23 DROP DATABASE IF EXISTS `test_kudu_dml_reporting_256dcf63` CASCADE; 00:07:23 00:07:23 SET sync_ddl=False; 00:07:23 -- executing against localhost:21000 00:07:23 CREATE DATABASE `test_kudu_dml_reporting_256dcf63`; 00:07:23 00:07:23 MainThread: Created database "test_kudu_dml_reporting_256dcf63" for test ID "shell/test_shell_commandline.py::TestImpalaShell::()::test_kudu_dml_reporting" 00:07:23 = 1 failed, 1932 passed, 63 skipped, 45 xfailed, 1 xpassed in 6985.36 seconds == {noformat}
[jira] [Commented] (IMPALA-7088) Parallel data load breaks load-data.py if loading data on a real cluster
[ https://issues.apache.org/jira/browse/IMPALA-7088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493907#comment-16493907 ] Joe McDonnell commented on IMPALA-7088: --- [~dknupp] I think it is due to the removal of unique_dir. In minicluster operation, unique_dir is set, but in a real cluster it would not be: {noformat} unique_dir = None if options.hive_hs2_hostport.startswith("localhost:"): unique_dir = tempfile.mkdtemp(prefix="hive-data-load-") ... shutil.rmtree(unique_dir){noformat} The shutil.rmtree(unique_dir) should only happen if unique_dir is not None. > Parallel data load breaks load-data.py if loading data on a real cluster > > > Key: IMPALA-7088 > URL: https://issues.apache.org/jira/browse/IMPALA-7088 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.0 >Reporter: David Knupp >Priority: Blocker > > {{Impala/bin/load-data.py}} is most commonly used to load test data onto a > simulated standalone cluster running on the local host. However, with the > correct inputs, it can also be used to load data onto an actual cluster > running on remote hosts.
> A recent enhancement in the load-data.py script to parallelize parts of the > data loading process -- https://github.com/apache/impala/commit/d481cd48 -- > has introduced a regression in the latter use case: > From {{$IMPALA_HOME/logs/data_loading/data-load-functional-exhaustive.log}}: > {noformat} > Created table functional_hbase.widetable_1000_cols > Took 0.7121 seconds > 09:48:01 Beginning execution of hive SQL: > /home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/logs/data_loading/sql/functional/load-functional-query-exhaustive-hive-generated-text-none-none.sql > Traceback (most recent call last): > File > "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", > line 494, in > if __name__ == "__main__": main() > File > "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", > line 468, in main > hive_exec_query_files_parallel(thread_pool, hive_load_text_files) > File > "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", > line 299, in hive_exec_query_files_parallel > exec_query_files_parallel(thread_pool, query_files, 'hive') > File > "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", > line 290, in exec_query_files_parallel > for result in thread_pool.imap_unordered(execution_function, query_files): > File "/usr/lib/python2.7/multiprocessing/pool.py", line 659, in next > raise value > TypeError: coercing to Unicode: need string or buffer, NoneType found > {noformat}
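The guard Joe suggests can be sketched like this. The wrapper function is hypothetical (the real logic lives inline in bin/load-data.py's main()), but it shows why shutil.rmtree must be skipped when unique_dir stays None on a real-cluster run:

```python
import os
import shutil
import tempfile

def run_load(hive_hs2_hostport, load_fn):
    """Hypothetical wrapper sketching the fix: create a scratch dir only
    for a local minicluster load, and guard the cleanup so a real-cluster
    run (unique_dir stays None) no longer crashes in shutil.rmtree."""
    unique_dir = None
    if hive_hs2_hostport.startswith("localhost:"):
        unique_dir = tempfile.mkdtemp(prefix="hive-data-load-")
    try:
        load_fn(unique_dir)
    finally:
        # The missing check: shutil.rmtree(None) raises the TypeError
        # ("coercing to Unicode") seen in the traceback above.
        if unique_dir is not None:
            shutil.rmtree(unique_dir)
```

With the guard in place, a remote-cluster load simply never touches a local scratch directory, while the minicluster path still creates and removes one as before.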
[jira] [Updated] (IMPALA-7088) Parallel data load breaks load-data.py if loading data on a real cluster
[ https://issues.apache.org/jira/browse/IMPALA-7088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Knupp updated IMPALA-7088: Description: {{Impala/bin/load-data.py}} is most commonly used to load test data onto a simulated standalone cluster running on the local host. However, with the correct inputs, it can also be used to load data onto an actual cluster running on remote hosts. A recent enhancement in the load-data.py script to parallelize parts of the data loading process -- https://github.com/apache/impala/commit/d481cd48 -- has introduced a regression in the latter use case: From {{$IMPALA_HOME/logs/data_loading/data-load-functional-exhaustive.log}}: {noformat} Created table functional_hbase.widetable_1000_cols Took 0.7121 seconds 09:48:01 Beginning execution of hive SQL: /home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/logs/data_loading/sql/functional/load-functional-query-exhaustive-hive-generated-text-none-none.sql Traceback (most recent call last): File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 494, in if __name__ == "__main__": main() File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 468, in main hive_exec_query_files_parallel(thread_pool, hive_load_text_files) File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 299, in hive_exec_query_files_parallel exec_query_files_parallel(thread_pool, query_files, 'hive') File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 290, in exec_query_files_parallel for result in thread_pool.imap_unordered(execution_function, query_files): File "/usr/lib/python2.7/multiprocessing/pool.py", line 659, in next raise value TypeError: coercing to Unicode: need
string or buffer, NoneType found {noformat} was: {{Impala/bin/load-data.py}} is most commonly used to load test data onto a simulated standalone cluster running on the local host. However, with the correct inputs, it can also be used to load data onto an actual remote cluster. A recent enhancement in the load-data.py script to parallelize parts of the data loading process -- https://github.com/apache/impala/commit/d481cd48 -- has introduced a regression in the latter use case: From {{$IMPALA_HOME/logs/data_loading/data-load-functional-exhaustive.log}}: {noformat} Created table functional_hbase.widetable_1000_cols Took 0.7121 seconds 09:48:01 Beginning execution of hive SQL: /home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/logs/data_loading/sql/functional/load-functional-query-exhaustive-hive-generated-text-none-none.sql Traceback (most recent call last): File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 494, in if __name__ == "__main__": main() File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 468, in main hive_exec_query_files_parallel(thread_pool, hive_load_text_files) File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 299, in hive_exec_query_files_parallel exec_query_files_parallel(thread_pool, query_files, 'hive') File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 290, in exec_query_files_parallel for result in thread_pool.imap_unordered(execution_function, query_files): File "/usr/lib/python2.7/multiprocessing/pool.py", line 659, in next raise value TypeError: coercing to Unicode: need string or buffer, NoneType found {noformat} > Parallel data load breaks load-data.py if loading data on a real cluster > > > Key:
IMPALA-7088 > URL: https://issues.apache.org/jira/browse/IMPALA-7088 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.0 >Reporter: David Knupp >Priority: Blocker > > {{Impala/bin/load-data.py}} is most commonly used to load test data onto a > simulated standalone cluster running on the local host. However, with the > correct inputs, it can also be used to load data onto an actual cluster > running on remote hosts. > A recent enhancement in the load-data.py script to parallelize parts of the > data loading process -- https://github.com/apache/impala/commit/d481cd48 -- > has introduced a regression in the latter use case: > From {{$IMPALA_HOME/log
[jira] [Commented] (IMPALA-7088) Parallel data load breaks load-data.py if loading data on a real cluster
[ https://issues.apache.org/jira/browse/IMPALA-7088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493882#comment-16493882 ]

David Knupp commented on IMPALA-7088:
-------------------------------------

Cc: [~joemcdonnell], [~njanarthanan], [~mikesbrown]

> Parallel data load breaks load-data.py if loading data on a real cluster
> ------------------------------------------------------------------------
>
>                 Key: IMPALA-7088
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7088
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Infrastructure
>    Affects Versions: Impala 3.0
>            Reporter: David Knupp
>            Priority: Blocker
>
> {{Impala/bin/load-data.py}} is most commonly used to load test data onto a
> simulated standalone cluster running on the local host. However, with the
> correct inputs, it can also be used to load data onto an actual remote
> cluster.
> A recent enhancement in the load-data.py script to parallelize parts of the
> data loading process -- https://github.com/apache/impala/commit/d481cd48 --
> has introduced a regression in the latter use case.
> From {{$IMPALA_HOME/logs/data_loading/data-load-functional-exhaustive.log}}:
> {noformat}
> Created table functional_hbase.widetable_1000_cols
> Took 0.7121 seconds
> 09:48:01 Beginning execution of hive SQL: /home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/logs/data_loading/sql/functional/load-functional-query-exhaustive-hive-generated-text-none-none.sql
> Traceback (most recent call last):
>   File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 494, in <module>
>     if __name__ == "__main__": main()
>   File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 468, in main
>     hive_exec_query_files_parallel(thread_pool, hive_load_text_files)
>   File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 299, in hive_exec_query_files_parallel
>     exec_query_files_parallel(thread_pool, query_files, 'hive')
>   File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 290, in exec_query_files_parallel
>     for result in thread_pool.imap_unordered(execution_function, query_files):
>   File "/usr/lib/python2.7/multiprocessing/pool.py", line 659, in next
>     raise value
> TypeError: coercing to Unicode: need string or buffer, NoneType found
> {noformat}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-7088) Parallel data load breaks load-data.py if loading data on a real cluster
[ https://issues.apache.org/jira/browse/IMPALA-7088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Knupp updated IMPALA-7088:
--------------------------------
    Description:
{{Impala/bin/load-data.py}} is most commonly used to load test data onto a simulated standalone cluster running on the local host. However, with the correct inputs, it can also be used to load data onto an actual remote cluster.

A recent enhancement in the load-data.py script to parallelize parts of the data loading process -- https://github.com/apache/impala/commit/d481cd48 -- has introduced a regression in the latter use case.

From {{$IMPALA_HOME/logs/data_loading/data-load-functional-exhaustive.log}}:
{noformat}
Created table functional_hbase.widetable_1000_cols
Took 0.7121 seconds
09:48:01 Beginning execution of hive SQL: /home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/logs/data_loading/sql/functional/load-functional-query-exhaustive-hive-generated-text-none-none.sql
Traceback (most recent call last):
  File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 494, in <module>
    if __name__ == "__main__": main()
  File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 468, in main
    hive_exec_query_files_parallel(thread_pool, hive_load_text_files)
  File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 299, in hive_exec_query_files_parallel
    exec_query_files_parallel(thread_pool, query_files, 'hive')
  File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 290, in exec_query_files_parallel
    for result in thread_pool.imap_unordered(execution_function, query_files):
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 659, in next
    raise value
TypeError: coercing to Unicode: need string or buffer, NoneType found
{noformat}

  was:
Impala/bin/load-data.py is most commonly used to load test data onto a simulated standalone cluster running on the local host. However, with the correct inputs, it can also be used to load data onto an actual remote cluster.

A recent enhancement in the load-data.py script to parallelize parts of the data loading process -- https://github.com/apache/impala/commit/d481cd48 -- has introduced a regression in the latter use case.

From *$IMPALA_HOME/logs/data_loading/data-load-functional-exhaustive.log*:
{noformat}
Created table functional_hbase.widetable_1000_cols
Took 0.7121 seconds
09:48:01 Beginning execution of hive SQL: /home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/logs/data_loading/sql/functional/load-functional-query-exhaustive-hive-generated-text-none-none.sql
Traceback (most recent call last):
  File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 494, in <module>
    if __name__ == "__main__": main()
  File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 468, in main
    hive_exec_query_files_parallel(thread_pool, hive_load_text_files)
  File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 299, in hive_exec_query_files_parallel
    exec_query_files_parallel(thread_pool, query_files, 'hive')
  File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 290, in exec_query_files_parallel
    for result in thread_pool.imap_unordered(execution_function, query_files):
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 659, in next
    raise value
TypeError: coercing to Unicode: need string or buffer, NoneType found
{noformat}


> Parallel data load breaks load-data.py if loading data on a real cluster
> ------------------------------------------------------------------------
>
>                 Key: IMPALA-7088
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7088
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Infrastructure
>    Affects Versions: Impala 3.0
>            Reporter: David Knupp
>            Priority: Blocker
>
> {{Impala/bin/load-data.py}} is most commonly used to load test data onto a
> simulated standalone cluster running on the local host. However, with the
> correct inputs, it can also be used to load data onto an actual remote
> cluster.
> A recent enhancement in the load-data.py script to parallelize parts of the
> data loading process -- https://github.com/apache/impala/commit/d481cd48 --
> has introduced a regression in the latter use case.
> From {{$IMPALA_HOME/logs/data_loading/data-load-functional-exhaustive.log}}:
[jira] [Created] (IMPALA-7088) Parallel data load breaks load-data.py if loading data on a real cluster
David Knupp created IMPALA-7088:
-------------------------------

             Summary: Parallel data load breaks load-data.py if loading data on a real cluster
                 Key: IMPALA-7088
                 URL: https://issues.apache.org/jira/browse/IMPALA-7088
             Project: IMPALA
          Issue Type: Bug
          Components: Infrastructure
    Affects Versions: Impala 3.0
            Reporter: David Knupp


Impala/bin/load-data.py is most commonly used to load test data onto a simulated standalone cluster running on the local host. However, with the correct inputs, it can also be used to load data onto an actual remote cluster.

A recent enhancement in the load-data.py script to parallelize parts of the data loading process -- https://github.com/apache/impala/commit/d481cd48 -- has introduced a regression in the latter use case.

From *$IMPALA_HOME/logs/data_loading/data-load-functional-exhaustive.log*:
{noformat}
Created table functional_hbase.widetable_1000_cols
Took 0.7121 seconds
09:48:01 Beginning execution of hive SQL: /home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/logs/data_loading/sql/functional/load-functional-query-exhaustive-hive-generated-text-none-none.sql
Traceback (most recent call last):
  File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 494, in <module>
    if __name__ == "__main__": main()
  File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 468, in main
    hive_exec_query_files_parallel(thread_pool, hive_load_text_files)
  File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 299, in hive_exec_query_files_parallel
    exec_query_files_parallel(thread_pool, query_files, 'hive')
  File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 290, in exec_query_files_parallel
    for result in thread_pool.imap_unordered(execution_function, query_files):
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 659, in next
    raise value
TypeError: coercing to Unicode: need string or buffer, NoneType found
{noformat}
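The failure mode in the traceback above can be reproduced with a small sketch: a worker function receives {{None}} where a file path is expected, and the parent process re-raises the worker's exception when it advances the {{imap_unordered}} result iterator (the {{raise value}} frame). This is illustrative code, not the actual load-data.py: {{exec_query_file}} and the {{None}} entry in the file list are hypothetical stand-ins, and on Python 3 the TypeError text differs from Python 2.7's "coercing to Unicode" wording.

```python
from multiprocessing.dummy import Pool  # thread-backed Pool; same API as multiprocessing.Pool

def exec_query_file(path):
    # Stand-in for the per-file work in load-data.py. Opening a None path
    # raises TypeError inside the worker.
    with open(path) as f:
        return f.read()

def exec_query_files_parallel(pool, query_files):
    # The worker's exception is stored by the pool and re-raised in the
    # parent when the result iterator is advanced -- the "raise value"
    # frame seen in the IMPALA-7088 traceback.
    try:
        return list(pool.imap_unordered(exec_query_file, query_files))
    except TypeError as e:
        return "data load failed: %s" % e

pool = Pool(4)
print(exec_query_files_parallel(pool, [None]))
```

With a file list containing a bad entry, the error surfaces in the driver loop rather than at the point the bad value was produced, which is why the traceback points at {{pool.py}} instead of the misconfigured input.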
[jira] [Commented] (IMPALA-7084) Partition doesn't exist after attempting to ALTER partition location to non-existing path
[ https://issues.apache.org/jira/browse/IMPALA-7084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493850#comment-16493850 ]

Fredy Wijaya commented on IMPALA-7084:
--------------------------------------

It doesn't seem to be an issue anymore in the latest master (3.x) and 2.x branches.

{noformat}
[localhost:21000] default> create table test (a int) partitioned by (b int);
Query: create table test (a int) partitioned by (b int)
Fetched 1 row(s) in 0.02s
[localhost:21000] default> insert into test partition (b=1) values (1);
Query: insert into test partition (b=1) values (1)
Query submitted at: 2018-05-29 08:35:14 (Coordinator: http://impala-dev:25000)
Query progress can be monitored at: http://impala-dev:25000/query_plan?query_id=84adf7dc8062947:7c9176fc
Modified 1 row(s) in 4.10s
[localhost:21000] default> alter table test add partition (b=2) location 'hdfs://localhost:20500/test-warehouse/test/b=2/';
Query: alter table test add partition (b=2) location 'hdfs://localhost:20500/test-warehouse/test/b=2/'
+--------------------------------------------+
| summary                                    |
+--------------------------------------------+
| New partition has been added to the table. |
+--------------------------------------------+
Fetched 1 row(s) in 0.03s
[localhost:21000] default> alter table test partition (b=1) set location 'hdfs://localhost:20500/test-warehouse/test/b=5/';
Query: alter table test partition (b=1) set location 'hdfs://localhost:20500/test-warehouse/test/b=5/'
+--------------------------------------------------------+
| summary                                                |
+--------------------------------------------------------+
| New location has been set for the specified partition. |
+--------------------------------------------------------+
Fetched 1 row(s) in 0.07s
[localhost:21000] default> show partitions test;
Query: show partitions test
+-------+-------+--------+------+--------------+-------------------+--------+-------------------+------------------------------------------------+
| b     | #Rows | #Files | Size | Bytes Cached | Cache Replication | Format | Incremental stats | Location                                       |
+-------+-------+--------+------+--------------+-------------------+--------+-------------------+------------------------------------------------+
| 1     | -1    | 0      | 0B   | NOT CACHED   | NOT CACHED        | TEXT   | false             | hdfs://localhost:20500/test-warehouse/test/b=5 |
| 2     | -1    | 0      | 0B   | NOT CACHED   | NOT CACHED        | TEXT   | false             | hdfs://localhost:20500/test-warehouse/test/b=2 |
| Total | -1    | 0      | 0B   | 0B           |                   |        |                   |                                                |
+-------+-------+--------+------+--------------+-------------------+--------+-------------------+------------------------------------------------+
Fetched 3 row(s) in 0.01s
{noformat}

> Partition doesn't exist after attempting to ALTER partition location to
> non-existing path
> -----------------------------------------------------------------------
>
>                 Key: IMPALA-7084
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7084
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>    Affects Versions: Impala 2.12.0
>            Reporter: Gabor Kaszab
>            Priority: Major
>              Labels: correctness, ramp-up
>
> {code:java}
> create table test (a int) partitioned by (b int);
> insert into test partition (b=1) values (1);
> // create another partition that points to a different location.
> alter table test add partition (b=2) location 'hdfs://localhost:20500/test-warehouse/test/b=2/';
> // setting the first partition to a non-existing location. This fails as expected.
> // The error message is not exactly user friendly, though. Could have said "Invalid location" or such.
> alter table test partition (b=1) set location 'hdfs://localhost:20500/test-warehouse/test/b=5/';
> Query: alter table test partition (b=1) set location 'hdfs://localhost:20500/test-warehouse/test/b=5/'
> ERROR: TableLoadingException: Failed to load metadata for table: default.test
> CAUSED BY: NullPointerException: null
> // Setting the first partition to an existing location. This surprisingly fails as the partition doesn't exist.
> alter table test partition (b=1) set location 'hdfs://localhost:20500/test-warehouse/test/b=2/';
> Query: alter table test partition (b=1) set location 'hdfs://localhost:20500/test-warehouse/test/b=1/'
> ERROR: PartitionNotFoundException: Partition not found: TPartitionKeyValue(name:b, value:1)
> // However, show partitions displays the b=1 partition as well.
> show partitions test;
> Query: show partitions test
> +---+---++--+--+
[jira] [Resolved] (IMPALA-6317) Expose -cmake_only flag to buildall.sh
[ https://issues.apache.org/jira/browse/IMPALA-6317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Knupp resolved IMPALA-6317.
---------------------------------
    Resolution: Fixed

> Expose -cmake_only flag to buildall.sh
> --------------------------------------
>
>                 Key: IMPALA-6317
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6317
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Infrastructure
>    Affects Versions: Impala 2.11.0
>            Reporter: David Knupp
>            Assignee: David Knupp
>            Priority: Minor
>
> Impala/bin/make_impala.sh has a {{-cmake_only}} command line option:
> {noformat}
> -cmake_only)
>   CMAKE_ONLY=1
> {noformat}
> Passing this flag means that only makefiles will be generated during the
> build. However, this flag is not provided in buildall.sh (the caller of
> make_impala.sh), which effectively renders it useless.
> It turns out that if one has no intention of running the Impala cluster
> locally (e.g., when trying to build just enough of the toolchain and dev
> environment to run the data load scripts for loading data onto a remote
> cluster), then being able to generate only the makefiles is useful.
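The fix the ticket asks for amounts to having buildall.sh recognize the flag and pass it through to make_impala.sh. A minimal sketch of that wiring follows; the {{-cmake_only}} flag name comes from the ticket, but the helper function, the argument loop, and the variable names are hypothetical, not the actual buildall.sh code.

```shell
#!/bin/sh
# Hypothetical sketch of forwarding -cmake_only from buildall.sh to
# make_impala.sh. Only the flag name is taken from the ticket.
build_args_for_make_impala() {
  MAKE_IMPALA_ARGS=""
  for ARG in "$@"; do
    case "$ARG" in
      # Generate makefiles only; skip compiling the backend.
      -cmake_only) MAKE_IMPALA_ARGS="-cmake_only" ;;
    esac
  done
  echo "$MAKE_IMPALA_ARGS"
}

# The collected flags would then be handed to make_impala.sh, e.g.:
#   ./bin/make_impala.sh $(build_args_for_make_impala "$@")
build_args_for_make_impala -cmake_only -noclean
```

This keeps buildall.sh as the single entry point while letting a data-load-only workflow stop after CMake generation, which is the use case the ticket describes.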
[jira] [Updated] (IMPALA-7087) Impala is unable to read Parquet decimal columns with lower precision/scale than table metadata
[ https://issues.apache.org/jira/browse/IMPALA-7087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong updated IMPALA-7087:
----------------------------------
    Description:
This is similar to IMPALA-2515, except it relates to a different precision/scale in the file metadata rather than just a mismatch in the bytes used to store the data. In a lot of cases we should be able to convert the decimal type on the fly to the higher-precision type.
{noformat}
ERROR: File '/hdfs/path/00_0_x_2' column 'alterd_decimal' has an invalid type length. Expecting: 11 len in file: 8
{noformat}
It would be convenient to allow reading Parquet files where the precision/scale in the file can be converted to the precision/scale in the table metadata without loss of precision.

  was:
This is similar to IMPALA-2515, except it relates to a different precision/scale in the file metadata rather than just a mismatch in the bytes used to store the data. In a lot of cases we should be able to convert the decimal type on the fly to the higher-precision type.
{noformat}
ERROR: File '/hdfs/path/00_0_x_2' column 'alterd_decimal' has an invalid type length. Expecting: 11 len in file: 8
{noformat}


> Impala is unable to read Parquet decimal columns with lower precision/scale
> than table metadata
> ---------------------------------------------------------------------------
>
>                 Key: IMPALA-7087
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7087
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Backend
>            Reporter: Tim Armstrong
>            Priority: Major
>              Labels: parquet
>
> This is similar to IMPALA-2515, except it relates to a different
> precision/scale in the file metadata rather than just a mismatch in the
> bytes used to store the data. In a lot of cases we should be able to convert
> the decimal type on the fly to the higher-precision type.
> {noformat}
> ERROR: File '/hdfs/path/00_0_x_2' column 'alterd_decimal' has an invalid
> type length. Expecting: 11 len in file: 8
> {noformat}
> It would be convenient to allow reading Parquet files where the
> precision/scale in the file can be converted to the precision/scale in the
> table metadata without loss of precision.
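The "without loss of precision" condition can be stated concretely: a value stored as decimal(p1, s1) in the file fits a table column decimal(p2, s2) whenever the scale does not shrink and the integer-digit budget (precision minus scale) does not shrink. A small sketch of that check follows; {{can_widen}} is a hypothetical helper for illustration, not an Impala function.

```python
def can_widen(file_prec, file_scale, table_prec, table_scale):
    # Lossless iff no fractional digits are dropped (scale must not shrink)
    # and no integer digits are dropped (precision - scale must not shrink).
    return (table_scale >= file_scale and
            table_prec - table_scale >= file_prec - file_scale)

# The error above: file stores DECIMAL(8, 2) (8 bytes) while the table says
# DECIMAL(11, 2) (11 bytes). The values themselves are convertible.
print(can_widen(8, 2, 11, 2))   # True
# The reverse direction would drop integer digits, so it stays an error.
print(can_widen(11, 2, 8, 2))   # False
```

Under this rule the reader could accept files whose declared decimal type is narrower than the table's, widening each value on the fly, while still rejecting genuinely incompatible files.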
[jira] [Created] (IMPALA-7087) Impala is unable to read Parquet decimal columns with lower precision/scale than table metadata
Tim Armstrong created IMPALA-7087:
----------------------------------

             Summary: Impala is unable to read Parquet decimal columns with lower precision/scale than table metadata
                 Key: IMPALA-7087
                 URL: https://issues.apache.org/jira/browse/IMPALA-7087
             Project: IMPALA
          Issue Type: Sub-task
          Components: Backend
            Reporter: Tim Armstrong


This is similar to IMPALA-2515, except it relates to a different precision/scale in the file metadata rather than just a mismatch in the bytes used to store the data. In a lot of cases we should be able to convert the decimal type on the fly to the higher-precision type.
{noformat}
ERROR: File '/hdfs/path/00_0_x_2' column 'alterd_decimal' has an invalid type length. Expecting: 11 len in file: 8
{noformat}