[jira] [Commented] (IMPALA-2751) quote in WITH block's comment breaks shell
[ https://issues.apache.org/jira/browse/IMPALA-2751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494695#comment-16494695 ]

Fredy Wijaya commented on IMPALA-2751:
--------------------------------------
I ran the whole set of shell tests on Python 2.6 and the fix works. I'll submit a new patch after the revert.

> quote in WITH block's comment breaks shell
> ------------------------------------------
>
>                 Key: IMPALA-2751
>                 URL: https://issues.apache.org/jira/browse/IMPALA-2751
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Clients
>    Affects Versions: Impala 2.2
>         Environment: CDH5.4.8
>            Reporter: Marcell Szabo
>            Assignee: Fredy Wijaya
>            Priority: Minor
>              Labels: impala-shell, shell, usability
>             Fix For: Impala 2.13.0, Impala 3.1.0
>
> Steps to reproduce:
> $ cat > test.sql
> with a as (
> select 'a'
> -- shouldn't matter
> )
> select * from a;
> $ impala-shell -f test.sql
> /usr/bin/impala-shell: line 32: warning: setlocale: LC_CTYPE: cannot change locale (UTF-8): No such file or directory
> /usr/bin/impala-shell: line 32: warning: setlocale: LC_CTYPE: cannot change locale (UTF-8): No such file or directory
> Starting Impala Shell without Kerberos authentication
> Connected to host:21000
> Server version: impalad version 2.2.0-cdh5 RELEASE (build 1d0b017e2441dd8950924743d839f14b3995e259)
> Traceback (most recent call last):
>   File "/usr/lib/impala-shell/impala_shell.py", line 1006, in
>     execute_queries_non_interactive_mode(options)
>   File "/usr/lib/impala-shell/impala_shell.py", line 922, in execute_queries_non_interactive_mode
>     if shell.onecmd(query) is CmdStatus.ERROR:
>   File "/usr/lib64/python2.6/cmd.py", line 219, in onecmd
>     return func(arg)
>   File "/usr/lib/impala-shell/impala_shell.py", line 762, in do_with
>     tokens = list(lexer)
>   File "/usr/lib64/python2.6/shlex.py", line 269, in next
>     token = self.get_token()
>   File "/usr/lib64/python2.6/shlex.py", line 96, in get_token
>     raw = self.read_token()
>   File "/usr/lib64/python2.6/shlex.py", line 172, in read_token
>     raise ValueError, "No closing quotation"
> ValueError: No closing quotation
> Also, when copy-pasting the query interactively, the line never closes.
> Strangely, the issue only seems to occur in the presence of the WITH block.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
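The root cause is visible with shlex alone: the apostrophe in the SQL comment opens a shell-style quote that is never closed. A minimal sketch (Python 3 here; the report above is from Python 2.6, but shlex behaves the same way for this input):

```python
import shlex

# The WITH block's comment contains an apostrophe ("shouldn't"), which
# shlex treats as the start of a quoted string that never terminates.
query = "with a as ( select 'a' -- shouldn't matter ) select * from a"
lexer = shlex.shlex(query, posix=True)
try:
    tokens = list(lexer)
except ValueError as e:
    print(e)  # No closing quotation
```

shlex knows nothing about SQL's `--` line comments, so it tokenizes straight through them, which is why a quote character inside a comment can break the parse.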
[jira] [Created] (IMPALA-7094) Fallback to an unoptimized query if ExprRewriter fails
Tianyi Wang created IMPALA-7094:
-----------------------------------

             Summary: Fallback to an unoptimized query if ExprRewriter fails
                 Key: IMPALA-7094
                 URL: https://issues.apache.org/jira/browse/IMPALA-7094
             Project: IMPALA
          Issue Type: Improvement
          Components: Frontend
    Affects Versions: Impala 3.1.0
            Reporter: Tianyi Wang

The ExprRewriter in Impala is growing more complex and sometimes fails. Currently the user has to disable the rewrite to work around the failure until the bug is fixed. To provide a smoother user experience if there are ExprRewriter bugs in the future, Impala should automatically fall back to the unrewritten query and print a warning.
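The proposed behavior amounts to a guarded rewrite pass: try the rewrite, and on failure run the original statement with a warning. A sketch under assumed names (analyze_with_fallback and the rewrite callback are illustrations, not Impala's actual frontend API):

```python
def analyze_with_fallback(stmt, rewrite, warnings):
    """Try the expression rewrite; on any rewriter bug, fall back to the
    unrewritten statement and record a warning instead of failing the query."""
    try:
        return rewrite(stmt)
    except Exception as e:  # a rewriter bug should not fail the query
        warnings.append("Expression rewrite failed (%s); running the "
                        "unoptimized query instead." % e)
        return stmt

def broken_rewrite(stmt):
    # Stand-in for an ExprRewriter hitting a bug.
    raise RuntimeError("rewriter bug")

warnings = []
print(analyze_with_fallback("select 1 + 1", broken_rewrite, warnings))
print(warnings)
```

The trade-off is that a fallback can mask rewriter regressions, which is why the sketch surfaces a warning rather than failing silently.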
[jira] [Commented] (IMPALA-2751) quote in WITH block's comment breaks shell
[ https://issues.apache.org/jira/browse/IMPALA-2751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494610#comment-16494610 ]

Fredy Wijaya commented on IMPALA-2751:
--------------------------------------
Ah, the infamous unicode issue in Python 2.6. I don't have access to the build machines. Can you add .encode('utf-8') to the sqlparse.format result and run the tests again? I built Python 2.6.9 manually from source, added .encode('utf-8'), and the query tokenizes correctly:
{noformat}
>>> query = "with y as (values(7)) insert into test_kudu_dml_reporting_256dcf63.dml_test (id) select * from y"
>>> formatted_query = sqlparse.format(query.lstrip(), strip_comments=True).encode('utf-8')
>>> lexer = shlex.shlex(formatted_query, posix=True)
>>> print(list(lexer))
['with', 'y', 'as', '(', 'values', '(', '7', ')', ')', 'insert', 'into', 'test_kudu_dml_reporting_256dcf63', '.', 'dml_test', '(', 'id', ')', 'select', '*', 'from', 'y']
{noformat}
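On Python 3 the same pipeline — strip SQL comments first, then tokenize — works without the encode step, which was only needed because Python 2.6's shlex mishandles unicode input. A sketch that uses a naive regex in place of sqlparse.format(..., strip_comments=True) (sqlparse handles far more cases, e.g. `--` inside string literals, so the regex is only an illustration):

```python
import re
import shlex

def strip_sql_comments(query):
    # Naive stand-in for sqlparse.format(query, strip_comments=True):
    # drop "-- ..." line comments. (Incorrect for '--' inside strings.)
    return re.sub(r"--[^\n]*", "", query)

query = ("with y as (values(7)) -- a comment with a quote: don't\n"
         "insert into dml_test (id) select * from y")
tokens = list(shlex.shlex(strip_sql_comments(query), posix=True))
print(tokens)
```

Feeding the unstripped query straight to shlex would raise `ValueError: No closing quotation` because of the apostrophe in the comment; stripping first yields a clean token list.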
[jira] [Commented] (IMPALA-6953) Improve encapsulation within DiskIoMgr
[ https://issues.apache.org/jira/browse/IMPALA-6953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494549#comment-16494549 ]

ASF subversion and git services commented on IMPALA-6953:
---------------------------------------------------------
Commit 564687265247d957fb8cda26bcf86fcf6a80f87a in impala's branch refs/heads/2.x from [~tarmstr...@cloudera.com]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=5646872 ]

IMPALA-6953: part 1: clean up DiskIoMgr

There should be no behavioural changes as a result of this refactoring.

Make DiskQueue an encapsulated class. Remove friend classes where possible, either by using public methods or moving code between classes. Move methods into protected sections in some cases.

Split GetNextRequestRange() into two methods that operate on DiskQueue and RequestContext state, respectively. The methods belong to the respective classes.

Reduce transitive #include dependencies to hopefully help with build time.

Testing: Ran core tests.

Change-Id: I50b448834b832a0ee0dc5d85541691cd2f308e12
Reviewed-on: http://gerrit.cloudera.org:8080/10538
Reviewed-by: Thomas Marshall
Tested-by: Thomas Marshall

> Improve encapsulation within DiskIoMgr
> --------------------------------------
>
>                 Key: IMPALA-6953
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6953
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>            Reporter: Tim Armstrong
>            Assignee: Tim Armstrong
>            Priority: Major
>
> While DiskIoMgr is still fresh in my mind, I should do some refactoring to improve the encapsulation within io::. Currently lots of classes are friends with each other and some code is not in the most appropriate class.
[jira] [Commented] (IMPALA-7093) Tables briefly appear to not exist after INVALIDATE METADATA or catalog restart
[ https://issues.apache.org/jira/browse/IMPALA-7093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494542#comment-16494542 ]

Todd Lipcon commented on IMPALA-7093:
-------------------------------------
Testing on Impala 2.10, I haven't been able to reproduce this, so it appears this might be a regression, though I'll keep trying to reproduce it there.

> Tables briefly appear to not exist after INVALIDATE METADATA or catalog restart
> -------------------------------------------------------------------------------
>
>                 Key: IMPALA-7093
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7093
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>    Affects Versions: Impala 2.12.0, Impala 2.13.0
>            Reporter: Todd Lipcon
>            Priority: Major
>
> I'm doing some stress testing of Impala 2.13 (a recent snapshot build) and hit the following sequence:
> {code}
> {"query": "SHOW TABLES in consistency_test", "type": "call", "id": 3}
> {"type": "response", "id": 3, "results": [["t1"]]}
> {"query": "INVALIDATE METADATA", "type": "call", "id": 7}
> {"type": "response", "id": 7}
> {"query": "DESCRIBE consistency_test.t1", "type": "call", "id": 9}
> {"type": "response", "id": 9, "error": "AnalysisException: Could not resolve path: 'consistency_test.t1'\n"}
> {code}
> i.e. 'SHOW TABLES' shows that a table exists, but shortly after an INVALIDATE METADATA, an attempt to describe the table indicates that it does not exist. This is a single-threaded test case against a single impalad.
> I also saw similar behavior where issuing queries to an impalad shortly after a catalogd restart could transiently show tables as not existing that in fact exist.
[jira] [Created] (IMPALA-7093) Tables briefly appear to not exist after INVALIDATE METADATA or catalog restart
Todd Lipcon created IMPALA-7093:
-----------------------------------

             Summary: Tables briefly appear to not exist after INVALIDATE METADATA or catalog restart
                 Key: IMPALA-7093
                 URL: https://issues.apache.org/jira/browse/IMPALA-7093
             Project: IMPALA
          Issue Type: Bug
          Components: Catalog
    Affects Versions: Impala 2.12.0, Impala 2.13.0
            Reporter: Todd Lipcon

I'm doing some stress testing of Impala 2.13 (a recent snapshot build) and hit the following sequence:
{code}
{"query": "SHOW TABLES in consistency_test", "type": "call", "id": 3}
{"type": "response", "id": 3, "results": [["t1"]]}
{"query": "INVALIDATE METADATA", "type": "call", "id": 7}
{"type": "response", "id": 7}
{"query": "DESCRIBE consistency_test.t1", "type": "call", "id": 9}
{"type": "response", "id": 9, "error": "AnalysisException: Could not resolve path: 'consistency_test.t1'\n"}
{code}
i.e. 'SHOW TABLES' shows that a table exists, but shortly after an INVALIDATE METADATA, an attempt to describe the table indicates that it does not exist. This is a single-threaded test case against a single impalad.

I also saw similar behavior where issuing queries to an impalad shortly after a catalogd restart could transiently show tables as not existing that in fact exist.
[jira] [Commented] (IMPALA-7073) Failed test: query_test.test_scanners.TestScannerReservation.test_scanners
[ https://issues.apache.org/jira/browse/IMPALA-7073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494532#comment-16494532 ] ASF subversion and git services commented on IMPALA-7073: - Commit e9bd917a218b5e1717fced983f70f64850c6e02f in impala's branch refs/heads/master from [~tarmstr...@cloudera.com] [ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=e9bd917 ] IMPALA-7073: skip TestScannerReservation on non-miniclusters The test is (sort of) tuned for miniclusters and is very targeted to testing a specific code path, rather than general functional correctness, so we don't really need coverage on all filesystems. Change-Id: I7952f780cff80c08a6cbef898bf7b95c9bba5f6a Reviewed-on: http://gerrit.cloudera.org:8080/10533 Reviewed-by: Thomas Marshall Tested-by: Impala Public Jenkins > Failed test: query_test.test_scanners.TestScannerReservation.test_scanners > -- > > Key: IMPALA-7073 > URL: https://issues.apache.org/jira/browse/IMPALA-7073 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.0 >Reporter: Dimitris Tsirogiannis >Assignee: Tim Armstrong >Priority: Blocker > Labels: broken-build, test-failure > > Possibly flaky test: > {code:java} > Stacktrace > query_test/test_scanners.py:1064: in test_scanners > self.run_test_case('QueryTest/scanner-reservation', vector) > common/impala_test_suite.py:451: in run_test_case > verify_runtime_profile(test_section['RUNTIME_PROFILE'], > result.runtime_profile) > common/test_result_verifier.py:590: in verify_runtime_profile > actual)) > E AssertionError: Did not find matches for lines in runtime profile: > E EXPECTED LINES: > E row_regex:.*InitialRangeActualReservation.*Avg: 4.00 MB.*{code} > {noformat} > 11:27:13 -- executing against localhost:21000 > 11:27:13 set debug_action="-1:OPEN:SET_DENY_RESERVATION_PROBABILITY@1.0"; > 11:27:13 > 11:27:13 -- executing against localhost:21000 > 11:27:13 > 11:27:13 select count(*) > 11:27:13 from tpch.customer; > 
11:27:13 > 11:27:13 -- executing against localhost:21000 > 11:27:13 SET DEBUG_ACTION=""; > 11:27:13 > 11:27:13 -- executing against localhost:21000 > 11:27:13 select min(l_comment) > 11:27:13 from tpch_parquet.lineitem; > {noformat} > {noformat} > 11:27:13 E - SpilledPartitions: 0 (0) > 11:27:13 E HDFS_SCAN_NODE (id=0):(Total: 2s295ms, non-child: > 2s295ms, % non-child: 100.00%) > 11:27:13 E Hdfs split stats (:<# splits>/ lengths>): -1:8/193.99 MB > 11:27:13 E ExecOption: PARQUET Codegen Enabled, Codegen enabled: > 8 out of 8 > 11:27:13 E Hdfs Read Thread Concurrency Bucket: 0:80% 1:20% 2:0% > 3:0% 4:0% 5:0% 6:0% > 11:27:13 E File Formats: PARQUET/NONE:5 PARQUET/SNAPPY:3 > 11:27:13 E BytesRead(500.000ms): 0, 21.31 MB, 21.60 MB, 47.94 MB, > 56.23 MB, 74.46 MB > 11:27:13 E - FooterProcessingTime: (Avg: 3.624ms ; Min: > 999.979us ; Max: 9.999ms ; Number of samples: 8) > 11:27:13 E - InitialRangeActualReservation: (Avg: 21.50 MB > (22544384) ; Min: 4.00 MB (4194304) ; Max: 24.00 MB (25165824) ; Number of > samples: 8) > 11:27:13 E - InitialRangeIdealReservation: (Avg: 128.00 KB > (131072) ; Min: 128.00 KB (131072) ; Max: 128.00 KB (131072) ; Number of > samples: 8) > 11:27:13 E - ParquetRowGroupActualReservation: (Avg: 24.00 MB > (25165824) ; Min: 24.00 MB (25165824) ; Max: 24.00 MB (25165824) ; Number of > samples: 3) > 11:27:13 E - ParquetRowGroupIdealReservation: (Avg: 24.00 MB > (25165824) ; Min: 24.00 MB (25165824) ; Max: 24.00 MB (25165824) ; Number of > samples: 3) > 11:27:13 E - AverageHdfsReadThreadConcurrency: 0.20 > 11:27:13 E - AverageScannerThreadConcurrency: 1.00 > 11:27:13 E - BytesRead: 74.55 MB (78175787) > 11:27:13 E - BytesReadDataNodeCache: 0 > 11:27:13 E - BytesReadLocal: 0 > 11:27:13 E - BytesReadRemoteUnexpected: 0 > 11:27:13 E - BytesReadShortCircuit: 0 > 11:27:13 E - CachedFileHandlesHitCount: 0 (0) > 11:27:13 E - CachedFileHandlesMissCount: 11 (11) > 11:27:13 E - CollectionItemsRead: 0 (0) > 11:27:13 E - DecompressionTime: 345.992ms > 11:27:13 
E - MaxCompressedTextFileLength: 0 > 11:27:13 E - NumColumns: 1 (1) > 11:27:13 E - NumDictFilteredRowGroups: 0 (0) > 11:27:13 E - NumDisksAccessed: 2 (2) > 11:27:13 E - NumRowGroups: 3 (3) > 11:27:13 E - NumScannerThreadReservationsDenied: 0 (0) > 11:27:13 E - NumScannerThreadsStarted: 1 (1) > 11:27:13 E
[jira] [Reopened] (IMPALA-2751) quote in WITH block's comment breaks shell
[ https://issues.apache.org/jira/browse/IMPALA-2751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Tauber-Marshall reopened IMPALA-2751: I put out a patch to revert this: https://gerrit.cloudera.org/#/c/10537/ This patch is causing test_kudu_dml_reporting to fail deterministically on the build machines. The failure doesn't repro for me locally, so I'm guessing its a python version issue, but I can repro it by logging into a build machine: {noformat} $ impala-python --version Python 2.6.6 $ impala-python >>> query="with y as (values(7)) insert into >>> test_kudu_dml_reporting_256dcf63.dml_test (id) select * from y" >>> lexer=shlex.shlex(query.lstrip(), posix=True) >>> print(list(lexer)) # The old way of parsing works ['with', 'y', 'as', '(', 'values', '(', '7', ')', ')', 'insert', 'into', 'test_kudu_dml_reporting_256dcf63', '.', 'dml_test', '(', 'id', ')', 'select', '*', 'from', 'y'] >>> lexer=shlex.shlex(sqlparse.format(query.lstrip(), strip_comments=True), >>> posix=True) >>> print(list(lexer)) # The new addition is causing weird parsing errors ['w', '\x00', '\x00', '\x00', 'i', '\x00', '\x00', '\x00', 't', '\x00', '\x00', '\x00', 'h', '\x00', '\x00', '\x00', '\x00', '\x00', '\x00', 'y', '\x00', '\x00', '\x00', '\x00', '\x00', '\x00', 'a', '\x00', '\x00', '\x00', 's', '\x00', '\x00', '\x00', '\x00', '\x00', '\x00', '(', '\x00', '\x00', '\x00', 'v', '\x00', '\x00', '\x00', 'a', '\x00', '\x00', '\x00', 'l', '\x00', '\x00', '\x00', 'u', '\x00', '\x00', '\x00', 'e', '\x00', '\x00', '\x00', 's', '\x00', '\x00', '\x00', '(', '\x00', '\x00', '\x00', '7', '\x00', '\x00', '\x00', ')', '\x00', '\x00', '\x00', ')', '\x00', '\x00', '\x00', '\x00', '\x00', '\x00', 'i', '\x00', '\x00', '\x00', 'n', '\x00', '\x00', '\x00', 's', '\x00', '\x00', '\x00', 'e', '\x00', '\x00', '\x00', 'r', '\x00', '\x00', '\x00', 't', '\x00', '\x00', '\x00', '\x00', '\x00', '\x00', 'i', '\x00', '\x00', '\x00', 'n', '\x00', '\x00', '\x00', 't', '\x00', '\x00', '\x00', 'o', 
'\x00', '\x00', '\x00', '\x00', '\x00', '\x00', 't', '\x00', '\x00', '\x00', 'e', '\x00', '\x00', '\x00', 's', '\x00', '\x00', '\x00', 't', '\x00', '\x00', '\x00', '_', '\x00', '\x00', '\x00', 'k', '\x00', '\x00', '\x00', 'u', '\x00', '\x00', '\x00', 'd', '\x00', '\x00', '\x00', 'u', '\x00', '\x00', '\x00', '_', '\x00', '\x00', '\x00', 'd', '\x00', '\x00', '\x00', 'm', '\x00', '\x00', '\x00', 'l', '\x00', '\x00', '\x00', '_', '\x00', '\x00', '\x00', 'r', '\x00', '\x00', '\x00', 'e', '\x00', '\x00', '\x00', 'p', '\x00', '\x00', '\x00', 'o', '\x00', '\x00', '\x00', 'r', '\x00', '\x00', '\x00', 't', '\x00', '\x00', '\x00', 'i', '\x00', '\x00', '\x00', 'n', '\x00', '\x00', '\x00', 'g', '\x00', '\x00', '\x00', '_', '\x00', '\x00', '\x00', '2', '\x00', '\x00', '\x00', '5', '\x00', '\x00', '\x00', '6', '\x00', '\x00', '\x00', 'd', '\x00', '\x00', '\x00', 'c', '\x00', '\x00', '\x00', 'f', '\x00', '\x00', '\x00', '6', '\x00', '\x00', '\x00', '3', '\x00', '\x00', '\x00', '.', '\x00', '\x00', '\x00', 'd', '\x00', '\x00', '\x00', 'm', '\x00', '\x00', '\x00', 'l', '\x00', '\x00', '\x00', '_', '\x00', '\x00', '\x00', 't', '\x00', '\x00', '\x00', 'e', '\x00', '\x00', '\x00', 's', '\x00', '\x00', '\x00', 't', '\x00', '\x00', '\x00', '\x00', '\x00', '\x00', '(', '\x00', '\x00', '\x00', 'i', '\x00', '\x00', '\x00', 'd', '\x00', '\x00', '\x00', ')', '\x00', '\x00', '\x00', '\x00', '\x00', '\x00', 's', '\x00', '\x00', '\x00', 'e', '\x00', '\x00', '\x00', 'l', '\x00', '\x00', '\x00', 'e', '\x00', '\x00', '\x00', 'c', '\x00', '\x00', '\x00', 't', '\x00', '\x00', '\x00', '\x00', '\x00', '\x00', '*', '\x00', '\x00', '\x00', '\x00', '\x00', '\x00', 'f', '\x00', '\x00', '\x00', 'r', '\x00', '\x00', '\x00', 'o', '\x00', '\x00', '\x00', 'm', '\x00', '\x00', '\x00', '\x00', '\x00', '\x00', 'y', '\x00', '\x00', '\x00'] {noformat} > quote in WITH block's comment breaks shell > -- > > Key: IMPALA-2751 > URL: https://issues.apache.org/jira/browse/IMPALA-2751 > Project: IMPALA > Issue Type: Bug > 
Components: Clients >Affects Versions: Impala 2.2 > Environment: CDH5.4.8 >Reporter: Marcell Szabo >Assignee: Fredy Wijaya >Priority: Minor > Labels: impala-shell, shell, usability > Fix For: Impala 2.13.0, Impala 3.1.0 > > > Steps to reproduce: > $ cat > test.sql > with a as ( > select 'a' > -- shouldn't matter > ) > select * from a; > $ impala-shell -f test.sql > /usr/bin/impala-shell: line 32: warning: setlocale: LC_CTYPE: cannot change > locale (UTF-8): No such file or directory > /usr/bin/impala-shell: line 32: warning: setlocale: LC_CTYPE: cannot change > locale (UTF-8): No such file or directory > Starting Impala Shell without Kerberos authentication > Connected to host:21000 > Server version: impalad versi
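The interleaved NULs in the output above are the signature of a wide, 4-bytes-per-code-point string being consumed byte by byte by shlex. The pattern can be reproduced on Python 3 by widening a string to UTF-32-LE and handing the result to shlex character-wise (an illustration of the symptom, not of Python 2.6's internals):

```python
import shlex

# Re-encode "with y" so every code point occupies 4 bytes, then let shlex
# tokenize the result one (now mostly NUL) character at a time, as the
# Python 2.6 build machine appeared to do with sqlparse's unicode output.
wide = "with y".encode("utf-32-le").decode("latin-1")
tokens = list(shlex.shlex(wide, posix=True))
print(tokens[:8])  # ['w', '\x00', '\x00', '\x00', 'i', '\x00', '\x00', '\x00']
```

Each letter becomes its own token followed by three NUL tokens, matching the broken output in the comment above and explaining why .encode('utf-8') on the sqlparse result restored normal tokenization.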
[jira] [Created] (IMPALA-7092) Re-enable EC tests broken by HDFS-13539
Tianyi Wang created IMPALA-7092:
-----------------------------------

             Summary: Re-enable EC tests broken by HDFS-13539
                 Key: IMPALA-7092
                 URL: https://issues.apache.org/jira/browse/IMPALA-7092
             Project: IMPALA
          Issue Type: Sub-task
          Components: Frontend, Infrastructure
    Affects Versions: Impala 3.1.0
            Reporter: Tianyi Wang
            Assignee: Tianyi Wang

With HDFS-13539 and HDFS-13540 fixed, we should be able to re-enable some tests and diagnose the causes of the remaining failed tests without much noise.
[jira] [Commented] (IMPALA-6086) Use of permanent function should require SELECT privilege on DB
[ https://issues.apache.org/jira/browse/IMPALA-6086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494461#comment-16494461 ]

Zoram Thanga commented on IMPALA-6086:
--------------------------------------
The issue seems to be that the function call "trim('abcd')" gets re-written/constant-folded into a string literal. We don't require any privileges for literals, hence the regression(?). I haven't actually tested this on an older release, but IMPALA-4574 and IMPALA-4586 touch on related issues. [~tarmstr...@cloudera.com], does this issue look familiar?

> Use of permanent function should require SELECT privilege on DB
> ---------------------------------------------------------------
>
>                 Key: IMPALA-6086
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6086
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog, Security
>    Affects Versions: Impala 2.9.0
>            Reporter: Zoram Thanga
>            Assignee: Zoram Thanga
>            Priority: Minor
>
> A user that has no privilege on a database should not be able to execute any permanent functions in that database. This is currently possible and should be fixed, so that the user must have SELECT privilege to execute permanent functions.
[jira] [Commented] (IMPALA-7070) Failed test: query_test.test_nested_types.TestParquetArrayEncodings.test_thrift_array_of_arrays on S3
[ https://issues.apache.org/jira/browse/IMPALA-7070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494413#comment-16494413 ]

Lars Volker commented on IMPALA-7070:
-------------------------------------
I tried to figure out where this error came from. The string "Input/output error" could come from a PathIOException or a subclass thereof. All of the subclasses add more detail to the error message, so I think it was a plain PathIOException. The error seems to come from [fs/shell/CommandWithDestination.java:526|https://github.com/apache/hadoop/blob/dd7916d3cd5d880d0b257d229f43f10feff04c93/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CommandWithDestination.java#L526]:
{code}
if (!rename(src.path, target.path)) {
  // too bad we don't know why it failed
  PathIOException e = new PathIOException(src.toString());
  e.setOperation("rename");
  e.setTargetPath(target.toString());
  throw e;
}
{code}
This in turn ends up calling {{rename()}} in [fs/s3a/S3AFileSystem.java:690|https://github.com/apache/hadoop/blob/dd7916d3cd5d880d0b257d229f43f10feff04c93/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java?utf8=%E2%9C%93#L690], which makes a copy (L806) and then deletes the original (L808). This looks like a generic HDFS / S3 error to me and could be related to IMPALA-6910. Since we keep seeing this, I will reach out to the HDFS folks and ask how to debug it.
> Failed test: > query_test.test_nested_types.TestParquetArrayEncodings.test_thrift_array_of_arrays > on S3 > - > > Key: IMPALA-7070 > URL: https://issues.apache.org/jira/browse/IMPALA-7070 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.0 >Reporter: Dimitris Tsirogiannis >Assignee: Lars Volker >Priority: Blocker > Labels: broken-build, test-failure > > > {code:java} > Error Message > query_test/test_nested_types.py:406: in test_thrift_array_of_arrays "col1 > array>") query_test/test_nested_types.py:579: in > _create_test_table check_call(["hadoop", "fs", "-put", local_path, > location], shell=False) /usr/lib64/python2.6/subprocess.py:505: in check_call > raise CalledProcessError(retcode, cmd) E CalledProcessError: Command > '['hadoop', 'fs', '-put', > '/data/jenkins/workspace/impala-asf-2.x-core-s3/repos/Impala/testdata/parquet_nested_types_encodings/bad-thrift.parquet', > > 's3a://impala-cdh5-s3-test/test-warehouse/test_thrift_array_of_arrays_11da5fde.db/ThriftArrayOfArrays']' > returned non-zero exit status 1 > Stacktrace > query_test/test_nested_types.py:406: in test_thrift_array_of_arrays > "col1 array>") > query_test/test_nested_types.py:579: in _create_test_table > check_call(["hadoop", "fs", "-put", local_path, location], shell=False) > /usr/lib64/python2.6/subprocess.py:505: in check_call > raise CalledProcessError(retcode, cmd) > E CalledProcessError: Command '['hadoop', 'fs', '-put', > '/data/jenkins/workspace/impala-asf-2.x-core-s3/repos/Impala/testdata/parquet_nested_types_encodings/bad-thrift.parquet', > > 's3a://impala-cdh5-s3-test/test-warehouse/test_thrift_array_of_arrays_11da5fde.db/ThriftArrayOfArrays']' > returned non-zero exit status 1 > Standard Error > SET sync_ddl=False; > -- executing against localhost:21000 > DROP DATABASE IF EXISTS `test_thrift_array_of_arrays_11da5fde` CASCADE; > SET sync_ddl=False; > -- executing against localhost:21000 > CREATE DATABASE `test_thrift_array_of_arrays_11da5fde`; > 
MainThread: Created database "test_thrift_array_of_arrays_11da5fde" for test > ID > "query_test/test_nested_types.py::TestParquetArrayEncodings::()::test_thrift_array_of_arrays[exec_option: > {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, > 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, > 'exec_single_node_rows_threshold': 0} | table_format: parquet/none]" > -- executing against localhost:21000 > create table test_thrift_array_of_arrays_11da5fde.ThriftArrayOfArrays (col1 > array>) stored as parquet location > 's3a://impala-cdh5-s3-test/test-warehouse/test_thrift_array_of_arrays_11da5fde.db/ThriftArrayOfArrays'; > 18/05/20 18:31:03 WARN impl.MetricsConfig: Cannot locate configuration: tried > hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties > 18/05/20 18:31:03 INFO impl.MetricsSystemImpl: Scheduled snapshot period at > 10 second(s). > 18/05/20 18:31:03 INFO impl.MetricsSystemImpl: s3a-file-system metrics system > started > 18/05/20 18:31:06 INFO Configuration.deprecation: > fs.s3a.server-side-encryption-key is deprecated. Instead, use > fs.s3a.server-side-encryption.key > put: rename > `s3a://impala-cdh
[jira] [Commented] (IMPALA-6086) Use of permanent function should require SELECT privilege on DB
[ https://issues.apache.org/jira/browse/IMPALA-6086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494379#comment-16494379 ]

Zoram Thanga commented on IMPALA-6086:
--------------------------------------
Finally had some cycles to dig into this. It looks like expression rewriting may be the culprit, as we seem to be losing the privilege requirements (privilegeReqs) on re-analysis after the rewrite. Here's a sample interaction on a Sentry-enabled Impala:
{quote}
[localhost:21000] default> show tables;
Query: show tables
ERROR: AuthorizationException: User 'zoram' does not have privileges to access: default.*.*
[localhost:21000] default> select trim('abcd');
Query: select trim('abcd')
Query submitted at: 2018-05-29 15:22:47 (Coordinator: http://zoram-desktop:25000)
Query progress can be monitored at: http://zoram-desktop:25000/query_plan?query_id=3f48cb729a94afd4:6692d423
+--------------+
| trim('abcd') |
+--------------+
| abcd         |
+--------------+
Fetched 1 row(s) in 4.91s
[localhost:21000] default> set ENABLE_EXPR_REWRITES = FALSE;
ENABLE_EXPR_REWRITES set to FALSE
[localhost:21000] default> select trim('abcd');
Query: select trim('abcd')
Query submitted at: 2018-05-29 15:23:07 (Coordinator: http://zoram-desktop:25000)
ERROR: AuthorizationException: Cannot modify system database.
[localhost:21000] default>
{quote}
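The interaction above is consistent with constant folding: trim('abcd') is evaluated at analysis time and replaced by the literal 'abcd', and re-analysis of the rewritten statement registers no privilege requirement because literals need none. A toy illustration of that kind of rewrite (this is not Impala's ExprRewriter — expressions are modeled as simple tuples for the sketch):

```python
def fold_constants(expr):
    """Replace a function call whose arguments are all literals with the
    computed literal. After this rewrite, a later analysis pass sees only
    a literal and records no privilege requirement for the function."""
    builtins = {"trim": lambda s: s.strip()}  # toy builtin table
    if expr[0] == "call" and all(arg[0] == "lit" for arg in expr[2]):
        fn = builtins[expr[1]]
        return ("lit", fn(*[arg[1] for arg in expr[2]]))
    return expr  # non-constant arguments: leave the call in place

print(fold_constants(("call", "trim", [("lit", "  abcd  ")])))  # ('lit', 'abcd')
```

The sketch shows why disabling the rewrite (ENABLE_EXPR_REWRITES = FALSE) restores the authorization error: the call node survives analysis and its privilege check fires.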
[jira] [Updated] (IMPALA-6843) Responses to prioritizedLoad() requests should be returned directly and not via the statestore
[ https://issues.apache.org/jira/browse/IMPALA-6843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated IMPALA-6843:
--------------------------------
    Summary: Responses to prioritizedLoad() requests should be returned directly and not via the statestore  (was: Responses to prioritizedLoad() requests are returned directly and not via the statestore)

> Responses to prioritizedLoad() requests should be returned directly and not via the statestore
> ----------------------------------------------------------------------------------------------
>
>                 Key: IMPALA-6843
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6843
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Catalog
>    Affects Versions: Impala 2.11.0
>            Reporter: Dimitris Tsirogiannis
>            Assignee: Dimitris Tsirogiannis
>            Priority: Major
>              Labels: catalog, frontend, latency, perfomance
>
> Currently, when a statement (e.g. SELECT) needs to access some unloaded tables, it issues a prioritizedLoad() request to the catalog. The catalog loads the table metadata but does not respond directly to the coordinator that issued the request. Instead, the metadata for the newly loaded tables is broadcast via the statestore. The problem with this approach is that the latency of the response may vary significantly and may depend on the latencies of other unrelated metadata operations (e.g. REFRESH) that happen to be in the same topic update.
> The response to a prioritizedLoad() request should come directly to the issuing coordinator. Other coordinators will receive the metadata of the newly loaded table via the statestore.
[jira] [Commented] (IMPALA-7089) test_kudu_dml_reporting failing
[ https://issues.apache.org/jira/browse/IMPALA-7089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494325#comment-16494325 ] ASF subversion and git services commented on IMPALA-7089: - Commit e660149670e7d2d18b74a6eb3bc06cb929887ca1 in impala's branch refs/heads/master from [~twmarshall] [ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=e660149 ] IMPALA-7089: xfail test_kudu_dml_reporting test_kudu_dml_reporting has been causing a large number of build failures. Temporarily disable it while we figure out what's going on. Also improve output of test_kudu_dml_reporting on failure. Change-Id: I222e4c86a50f2450201fbad8b937e8fcf4fac31d Reviewed-on: http://gerrit.cloudera.org:8080/10527 Reviewed-by: Joe McDonnell Tested-by: Impala Public Jenkins > test_kudu_dml_reporting failing > --- > > Key: IMPALA-7089 > URL: https://issues.apache.org/jira/browse/IMPALA-7089 > Project: IMPALA > Issue Type: Bug >Reporter: Thomas Tauber-Marshall >Assignee: Thomas Tauber-Marshall >Priority: Blocker > Labels: broken-build > > See in numerous builds: > {noformat} > 00:07:23 ___ TestImpalaShell.test_kudu_dml_reporting > > 00:07:23 [gw1] linux2 -- Python 2.6.6 > /data/jenkins/workspace/impala-asf-master-core/repos/Impala/bin/../infra/python/env/bin/python > 00:07:23 shell/test_shell_commandline.py:601: in test_kudu_dml_reporting > 00:07:23 "with y as (values(7)) insert into %s.dml_test (id) select * > from y" % db, 1, 0) > 00:07:23 shell/test_shell_commandline.py:580: in _validate_dml_stmt > 00:07:23 assert expected_output in results.stderr > 00:07:23 E assert 'Modified 1 row(s), 0 row error(s)' in 'Starting Impala > Shell without Kerberos authentication\nConnected to localhost:21000\nServer > version: impalad version > ...tos-6-4-0895.vpc.cloudera.com:25000/query_plan?query_id=d94f04135c4d25f9:ec1089e8\nFetched > 0 row(s) in 0.12s\n' > 00:07:23 E+ where 'Starting Impala Shell without Kerberos > authentication\nConnected to 
localhost:21000\nServer version: impalad version > ...tos-6-4-0895.vpc.cloudera.com:25000/query_plan?query_id=d94f04135c4d25f9:ec1089e8\nFetched > 0 row(s) in 0.12s\n' = 0x7193b10>.stderr > 00:07:23 Captured stderr setup > - > 00:07:23 SET sync_ddl=False; > 00:07:23 -- executing against localhost:21000 > 00:07:23 DROP DATABASE IF EXISTS `test_kudu_dml_reporting_256dcf63` CASCADE; > 00:07:23 > 00:07:23 SET sync_ddl=False; > 00:07:23 -- executing against localhost:21000 > 00:07:23 CREATE DATABASE `test_kudu_dml_reporting_256dcf63`; > 00:07:23 > 00:07:23 MainThread: Created database "test_kudu_dml_reporting_256dcf63" for > test ID > "shell/test_shell_commandline.py::TestImpalaShell::()::test_kudu_dml_reporting" > 00:07:23 = 1 failed, 1932 passed, 63 skipped, 45 xfailed, 1 xpassed in > 6985.36 seconds == > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
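The "xfail" in the commit above refers to pytest's expected-failure marker, which keeps a known-flaky test visible without breaking the build. As a self-contained illustration of those semantics (this is a toy re-implementation, not Impala's test code or pytest itself; the test body is a stand-in):

```python
import functools

def xfail(reason):
    """Toy stand-in for pytest.mark.xfail: an AssertionError inside the
    test becomes an 'expected failure' outcome instead of a build break."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                func(*args, **kwargs)
            except AssertionError:
                return "xfail: " + reason
            return "xpass: " + reason
        return wrapper
    return decorate

@xfail("IMPALA-7089: DML reporting is flaky")
def test_kudu_dml_reporting():
    # Stand-in for the real assertion on the shell's stderr output.
    assert "Modified 1 row(s)" in "Fetched 0 row(s) in 0.12s"

print(test_kudu_dml_reporting())  # xfail: IMPALA-7089: DML reporting is flaky
```

The real marker additionally reports "xpassed" when the test unexpectedly succeeds, which is why the run summary above counts both xfailed and xpassed tests.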
[jira] [Commented] (IMPALA-7088) Parallel data load breaks load-data.py if loading data on a real cluster
[ https://issues.apache.org/jira/browse/IMPALA-7088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494309#comment-16494309 ] ASF subversion and git services commented on IMPALA-7088: - Commit 573550ca2f781ff5cb781a6c6dcdfcbfc25edf04 in impala's branch refs/heads/master from [~joemcdonnell] [ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=573550c ] IMPALA-7088: Fix uninitialized variable in cluster dataload bin/load-data.py uses a unique directory for local Hive execution to avoid a race condition when executing multiple Hive commands at once. This unique directory is not needed when loading on a real cluster. However, the code to remove the unique directory at the end does not handle this correctly. This skips the code to remove the unique directory when it is uninitialized. Change-Id: I5581a45460dc341842d77eaa09647e50f35be6c7 Reviewed-on: http://gerrit.cloudera.org:8080/10526 Reviewed-by: Joe McDonnell Tested-by: Impala Public Jenkins > Parallel data load breaks load-data.py if loading data on a real cluster > > > Key: IMPALA-7088 > URL: https://issues.apache.org/jira/browse/IMPALA-7088 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.0 >Reporter: David Knupp >Assignee: Joe McDonnell >Priority: Blocker > > {{Impala/bin/load-data.py}} is most commonly used to load test data onto a > simulated standalone cluster running on the local host. However, with the > correct inputs, it can also be used to load data onto an actual cluster > running on remote hosts. 
> A recent enhancement in the load-data.py script to parallelize parts of the > data loading process -- https://github.com/apache/impala/commit/d481cd48 -- > has introduced a regression in the latter use case: > From {{$IMPALA_HOME/logs/data_loading/data-load-functional-exhaustive.log}}: > {noformat} > Created table functional_hbase.widetable_1000_cols > Took 0.7121 seconds > 09:48:01 Beginning execution of hive SQL: > /home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/logs/data_loading/sql/functional/load-functional-query-exhaustive-hive-generated-text-none-none.sql > Traceback (most recent call last): > File > "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", > line 494, in > if __name__ == "__main__": main() > File > "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", > line 468, in main > hive_exec_query_files_parallel(thread_pool, hive_load_text_files) > File > "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", > line 299, in hive_exec_query_files_parallel > exec_query_files_parallel(thread_pool, query_files, 'hive') > File > "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", > line 290, in exec_query_files_parallel > for result in thread_pool.imap_unordered(execution_function, query_files): > File "/usr/lib/python2.7/multiprocessing/pool.py", line 659, in next > raise value > TypeError: coercing to Unicode: need string or buffer, NoneType found > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
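The crash above ("coercing to Unicode: need string or buffer, NoneType found") is the classic symptom of passing an uninitialized (None) path to a filesystem call. The fix described in the commit boils down to guarding the cleanup; a minimal sketch, with function and variable names assumed for illustration (not load-data.py's actual code):

```python
import os
import shutil
import tempfile

def run_hive_load(use_local_hive):
    # The unique scratch directory is only needed for local Hive execution
    # (to avoid races between parallel Hive commands); on a real cluster
    # it stays None.
    scratch_dir = tempfile.mkdtemp(prefix="hive-load-") if use_local_hive else None
    try:
        pass  # ... execute the Hive SQL files here ...
    finally:
        # Guard the cleanup: rmtree(None) raises TypeError ("coercing to
        # Unicode" on Python 2), which is the failure quoted above.
        if scratch_dir is not None:
            shutil.rmtree(scratch_dir, ignore_errors=True)
    return scratch_dir

print(run_hive_load(False))  # None: no directory created, no cleanup attempted
```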
[jira] [Resolved] (IMPALA-6712) TestRuntimeRowFilters.test_row_filters fails on a 2.x ASAN build
[ https://issues.apache.org/jira/browse/IMPALA-6712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tianyi Wang resolved IMPALA-6712. - Resolution: Cannot Reproduce Since the only known path forward is still simply increasing the timeout, it may be unnecessary to do so given that the failure hasn't recurred recently. I will close this for now. Please reopen if it happens again. > TestRuntimeRowFilters.test_row_filters fails on a 2.x ASAN build > > > Key: IMPALA-6712 > URL: https://issues.apache.org/jira/browse/IMPALA-6712 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.12.0 >Reporter: Taras Bobrovytsky >Assignee: Tianyi Wang >Priority: Critical > > It looks like the query profile does not contain what we are looking for: > {noformat} > E AssertionError: Did not find matches for lines in runtime profile: > E EXPECTED LINES: > E row_regex: .*Rows processed: 16.38K.*{noformat} > > This happened several times. The latest failure was on this commit: > 7336839dbb2d609005362fdff174a822462f05fb
[jira] [Closed] (IMPALA-6776) Failed to assign hbase regions to servers
[ https://issues.apache.org/jira/browse/IMPALA-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vuk Ercegovac closed IMPALA-6776. - Resolution: Cannot Reproduce Fix Version/s: Impala 3.1.0 Impala 2.13.0 > Failed to assign hbase regions to servers > - > > Key: IMPALA-6776 > URL: https://issues.apache.org/jira/browse/IMPALA-6776 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.0 >Reporter: Tianyi Wang >Assignee: Vuk Ercegovac >Priority: Blocker > Labels: broken-build > Fix For: Impala 2.13.0, Impala 3.1.0 > > > After switching to hadoop 3 components, split-hbase.sh failed in > HBaseTestDataRegionAssigment: > {noformat} > 20:40:27 Splitting HBase (logging to > /data/jenkins/workspace/impala-asf-master-exhaustive-release/repos/Impala/logs/data_loading/create-hbase.log)... > > 20:41:51 FAILED (Took: 1 min 24 sec) > 20:41:51 > '/data/jenkins/workspace/impala-asf-master-exhaustive-release/repos/Impala/testdata/bin/split-hbase.sh' > failed. Tail of log: > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,3,1522381286429.7b13fefeda7afac230e22150deab5266. > 3 -> localhost:16202, expecting localhost,16202,1522374373018 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,5,1522381287511.7a243a822c5c4844a2a3d0f67a541961. > 5 -> localhost:16202, expecting localhost,16202,1522374373018 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,7,1522381288718.80d6e4a799ad114a146dc3cb41e18e93. > 7 -> localhost:16203, expecting localhost,16203,1522374374705 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,9,1522381288718.d705a2ea635916f4bb510ca60764080a. 
> 9 -> localhost:16203, expecting localhost,16203,1522374374705 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,,1522381282868.a99b569f5417ea9e2561eb5566c31be0. > -> localhost:16203, expecting localhost,16201,1522374371810 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,1,1522381285023.5fb566ba94e5fbb8aeca39f3da0a6362. > 1 -> localhost:16201, expecting localhost,16201,1522374371810 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,3,1522381286429.7b13fefeda7afac230e22150deab5266. > 3 -> localhost:16202, expecting localhost,16202,1522374373018 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,5,1522381287511.7a243a822c5c4844a2a3d0f67a541961. > 5 -> localhost:16202, expecting localhost,16202,1522374373018 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,7,1522381288718.80d6e4a799ad114a146dc3cb41e18e93. > 7 -> localhost:16203, expecting localhost,16203,1522374374705 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,9,1522381288718.d705a2ea635916f4bb510ca60764080a. > 9 -> localhost:16203, expecting localhost,16203,1522374374705 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,,1522381282868.a99b569f5417ea9e2561eb5566c31be0. > -> localhost:16203, expecting localhost,16201,1522374371810 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,1,1522381285023.5fb566ba94e5fbb8aeca39f3da0a6362. 
> 1 -> localhost:16201, expecting localhost,16201,1522374371810 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,3,1522381286429.7b13fefeda7afac230e22150deab5266. > 3 -> localhost:16202, expecting localhost,16202,1522374373018 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,5,1522381287511.7a243a822c5c4844a2a3d0f67a541961. > 5 -> localhost:16202, expecting localhost,16202,1522374373018 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,7,1522381288718.80d6e4a799ad114a146dc3cb41e18e93. > 7 -> localhost:16203, expecting localhost,16203,1522374374705 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,9,1522381288718.d705a2ea635916f4bb510ca60764080a. > 9 -> localhost:16203, expecting localhost,16203,1522374374705 > 20:41:51 18/03/29 20:41:51 INFO datagenerator.HBaseTestDataRegionAssigment: > function
[jira] [Resolved] (IMPALA-6933) test_kudu.TestCreateExternalTable on S3 failing with "AlreadyExistsException: Database already exists"
[ https://issues.apache.org/jira/browse/IMPALA-6933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vuk Ercegovac resolved IMPALA-6933. --- Resolution: Fixed Fix Version/s: Impala 3.1.0 Impala 2.13.0 > test_kudu.TestCreateExternalTable on S3 failing with "AlreadyExistsException: > Database already exists" > -- > > Key: IMPALA-6933 > URL: https://issues.apache.org/jira/browse/IMPALA-6933 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.0 >Reporter: David Knupp >Assignee: Vuk Ercegovac >Priority: Critical > Labels: kudu, test-infra > Fix For: Impala 2.13.0, Impala 3.1.0 > > > Error Message > {noformat} > test setup failure > {noformat} > Stacktrace > {noformat} > conftest.py:347: in conn > with __unique_conn(db_name=db_name, timeout=timeout) as conn: > /usr/lib64/python2.6/contextlib.py:16: in __enter__ > return self.gen.next() > conftest.py:380: in __unique_conn > cur.execute("CREATE DATABASE %s" % db_name) > ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:302: in > execute > configuration=configuration) > ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:343: in > execute_async > self._execute_async(op) > ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:362: in > _execute_async > operation_fn() > ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:340: in > op > async=True) > ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:1027: > in execute > return self._operation('ExecuteStatement', req) > ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:957: in > _operation > resp = self._rpc(kind, request) > ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:925: in > _rpc > err_if_rpc_not_ok(response) > ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:704: in > err_if_rpc_not_ok > raise HiveServer2Error(resp.status.errorMessage) > E HiveServer2Error: 
ImpalaRuntimeException: Error making 'createDatabase' > RPC to Hive Metastore: > E CAUSED BY: AlreadyExistsException: Database f0mraw already exists > {noformat} > Tests affected: > * query_test.test_kudu.TestCreateExternalTable.test_unsupported_binary_col > * query_test.test_kudu.TestCreateExternalTable.test_drop_external_table > * query_test.test_kudu.TestCreateExternalTable.test_explicit_name > * query_test.test_kudu.TestCreateExternalTable.test_explicit_name_preference > * query_test.test_kudu.TestCreateExternalTable.test_explicit_name_doesnt_exist > * > query_test.test_kudu.TestCreateExternalTable.test_explicit_name_doesnt_exist_but_implicit_does > * query_test.test_kudu.TestCreateExternalTable.test_table_without_partitioning > * query_test.test_kudu.TestCreateExternalTable.test_column_name_case > * query_test.test_kudu.TestCreateExternalTable.test_conflicting_column_name -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-6933) test_kudu.TestCreateExternalTable on S3 failing with "AlreadyExistsException: Database already exists"
[ https://issues.apache.org/jira/browse/IMPALA-6933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494182#comment-16494182 ] ASF subversion and git services commented on IMPALA-6933: - Commit 4653637b9e2ee573f3ad7a76da8941a0a4870bd8 in impala's branch refs/heads/master from [~vercego] [ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=4653637 ] IMPALA-6933: Avoids db name collisions for Kudu tests Kudu tests generate temporary db names in a way so that its unlikely, yet possible to collide. A recent test failure indicates such a collision came up. The fix changes the way that the name is generated so that it includes the classes name for which the db name is generated. This db name will make it easier to see which test created it and the name will not collide with other names generated by other tests. Testing: - ran the updated test locally Change-Id: I7c2f8a35fec90ae0dabe80237d83954668b47f6e Reviewed-on: http://gerrit.cloudera.org:8080/10513 Reviewed-by: Michael Brown Tested-by: Impala Public Jenkins > test_kudu.TestCreateExternalTable on S3 failing with "AlreadyExistsException: > Database already exists" > -- > > Key: IMPALA-6933 > URL: https://issues.apache.org/jira/browse/IMPALA-6933 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.0 >Reporter: David Knupp >Assignee: Vuk Ercegovac >Priority: Critical > Labels: kudu, test-infra > > Error Message > {noformat} > test setup failure > {noformat} > Stacktrace > {noformat} > conftest.py:347: in conn > with __unique_conn(db_name=db_name, timeout=timeout) as conn: > /usr/lib64/python2.6/contextlib.py:16: in __enter__ > return self.gen.next() > conftest.py:380: in __unique_conn > cur.execute("CREATE DATABASE %s" % db_name) > ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:302: in > execute > configuration=configuration) > ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:343: in > 
execute_async > self._execute_async(op) > ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:362: in > _execute_async > operation_fn() > ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:340: in > op > async=True) > ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:1027: > in execute > return self._operation('ExecuteStatement', req) > ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:957: in > _operation > resp = self._rpc(kind, request) > ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:925: in > _rpc > err_if_rpc_not_ok(response) > ../infra/python/env/lib/python2.6/site-packages/impala/hiveserver2.py:704: in > err_if_rpc_not_ok > raise HiveServer2Error(resp.status.errorMessage) > E HiveServer2Error: ImpalaRuntimeException: Error making 'createDatabase' > RPC to Hive Metastore: > E CAUSED BY: AlreadyExistsException: Database f0mraw already exists > {noformat} > Tests affected: > * query_test.test_kudu.TestCreateExternalTable.test_unsupported_binary_col > * query_test.test_kudu.TestCreateExternalTable.test_drop_external_table > * query_test.test_kudu.TestCreateExternalTable.test_explicit_name > * query_test.test_kudu.TestCreateExternalTable.test_explicit_name_preference > * query_test.test_kudu.TestCreateExternalTable.test_explicit_name_doesnt_exist > * > query_test.test_kudu.TestCreateExternalTable.test_explicit_name_doesnt_exist_but_implicit_does > * query_test.test_kudu.TestCreateExternalTable.test_table_without_partitioning > * query_test.test_kudu.TestCreateExternalTable.test_column_name_case > * query_test.test_kudu.TestCreateExternalTable.test_conflicting_column_name -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
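The fix described in the commit, generating the temporary db name from the test class name plus a random suffix, can be sketched as follows (helper name and suffix length are assumptions, not the actual conftest.py code):

```python
import random
import string

def unique_db_name(cls_name, suffix_len=8):
    """Collision-resistant test db naming: embed the test class name so
    that random suffixes generated by different tests can never collide
    with each other, and a leftover db is attributable to its test."""
    alphabet = string.ascii_lowercase + string.digits
    suffix = "".join(random.choice(alphabet) for _ in range(suffix_len))
    return "%s_%s" % (cls_name.lower(), suffix)

print(unique_db_name("TestCreateExternalTable"))
# e.g. testcreateexternaltable_4k2c9x1q
```

Compare the failing name above ("f0mraw"): a short, purely random name carries no owner information and gives two concurrent tests a real chance of drawing the same string.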
[jira] [Work started] (IMPALA-7090) EqualityDisjunctsToInRule should respect the limit on the number of children in an expr
[ https://issues.apache.org/jira/browse/IMPALA-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-7090 started by Tianyi Wang. --- > EqualityDisjunctsToInRule should respect the limit on the number of children > in an expr > --- > > Key: IMPALA-7090 > URL: https://issues.apache.org/jira/browse/IMPALA-7090 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 3.0, Impala 2.12.0 >Reporter: Tianyi Wang >Assignee: Tianyi Wang >Priority: Critical > > Currently, EqualityDisjunctsToInRule, introduced in IMPALA-5280, may create an > expr with an unbounded number of children and fail a query; this should be > avoided. The easy solution is to not apply the rewrite when the number of > children is large.
[jira] [Commented] (IMPALA-6990) TestClientSsl.test_tls_v12 failing due to Python SSL error
[ https://issues.apache.org/jira/browse/IMPALA-6990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494139#comment-16494139 ] Philip Zeyliger commented on IMPALA-6990: - Is this user-visible? Let's say that a user had impala-shell working on RH6 or RH7 before. Does it still work? Does it work when using the same {{ssl-minimum-version}} and {{ssl-cipher-list}} flags? I think this test is saying that these flags don't work for the Python shipped in RH7. I suspect they didn't work before either: did they somehow work before? Surely before the Thrift change, we were using the same RH image? Once we've figured this out, I think the easier thing to do is to disable the test when using a too-old version of Python. We already have a "skip if legacy SSL" flag on the test; this is just one more skip if. We still want to run the test for Ubuntu16 or whatever. I think we can assume that the Python running the test and the python running impala-shell are the same for our purposes. Is there a weaker test that we'd want to add? > TestClientSsl.test_tls_v12 failing due to Python SSL error > -- > > Key: IMPALA-6990 > URL: https://issues.apache.org/jira/browse/IMPALA-6990 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 3.0 >Reporter: Sailesh Mukil >Assignee: Sailesh Mukil >Priority: Blocker > Labels: broken-build, flaky > > We've seen quite a few jobs fail with the following error: > *_ssl.c:504: EOF occurred in violation of protocol* > {code:java} > custom_cluster/test_client_ssl.py:128: in test_tls_v12 > self._validate_positive_cases("%s/server-cert.pem" % self.CERT_DIR) > custom_cluster/test_client_ssl.py:181: in _validate_positive_cases > result = run_impala_shell_cmd(shell_options) > shell/util.py:97: in run_impala_shell_cmd > result.stderr) > E AssertionError: Cmd --ssl -q 'select 1 + 2' was expected to succeed: > Starting Impala Shell without Kerberos authentication > E SSL is enabled. 
Impala server certificates will NOT be verified (set > --ca_cert to change) > E > /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:80: > DeprecationWarning: 3th positional argument is deprecated. Use keyward > argument insteand. > E DeprecationWarning) > E > /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:80: > DeprecationWarning: 4th positional argument is deprecated. Use keyward > argument insteand. > E DeprecationWarning) > E > /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:80: > DeprecationWarning: 5th positional argument is deprecated. Use keyward > argument insteand. > E DeprecationWarning) > E > /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:216: > DeprecationWarning: validate is deprecated. Use cert_reqs=ssl.CERT_NONE > instead > E DeprecationWarning) > E No handlers could be found for logger "thrift.transport.TSSLSocket" > E Error connecting: TTransportException, Could not connect to > localhost:21000: [Errno 8] _ssl.c:504: EOF occurred in violation of protocol > E Not connected to Impala, could not execute queries. > {code} > We need to investigate why this is happening and fix it. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-6990) TestClientSsl.test_tls_v12 failing due to Python SSL error
[ https://issues.apache.org/jira/browse/IMPALA-6990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494124#comment-16494124 ] Sailesh Mukil commented on IMPALA-6990: --- Spent some more time looking at this and found that 'requests' wasn't the culprit. When we upgraded to thrift-0.9.3, the TSSLSocket.py logic changed quite a bit. Our RHEL7 machines come equipped with Python 2.7.5. Looking at these comments, that means that we'll be unable to create a 'SSLContext' but able to explicitly specify ciphers: https://github.com/apache/thrift/blob/master/lib/py/src/transport/TSSLSocket.py#L37-L41 {code:java} # SSLContext is not available for Python < 2.7.9 _has_ssl_context = sys.hexversion >= 0x020709F0 # ciphers argument is not available for Python < 2.7.0 _has_ciphers = sys.hexversion >= 0x020700F0 {code} If we cannot create a 'SSLContext', then we cannot use TLSv1.2 and have to use TLSv1: https://github.com/apache/thrift/blob/master/lib/py/src/transport/TSSLSocket.py#L48-L49 {code:java} # For python >= 2.7.9, use latest TLS that both client and server # supports. # SSL 2.0 and 3.0 are disabled via ssl.OP_NO_SSLv2 and ssl.OP_NO_SSLv3. # For python < 2.7.9, use TLS 1.0 since TLSv1_X nor OP_NO_SSLvX is # unavailable. _default_protocol = ssl.PROTOCOL_SSLv23 if _has_ssl_context else \ ssl.PROTOCOL_TLSv1 {code} Our custom cluster test forces the server to use TLSv1.2 and also forces a specific cipher: https://github.com/apache/impala/blob/master/tests/custom_cluster/test_client_ssl.py#L118-L119 So this combination of configurations causes a failure in RHEL7 because we only allow a specific cipher which works with TLSv1.2, but the client cannot use TLSv1.2 due to the Python version as mentioned above. On systems lower than RHEL7, the machines come equipped with Python 2.6.6, which does not force the use of specific ciphers, so we get away without a failure. 
To fix this, we either need to change the Python version on RHEL 7 to be >= Python 2.7.9, or reduce the 'test_client_ssl' limitation to run TLSv1. The second option is the quickest, although not ideal, but it should at least unblock our builds while we can upgrade the AMIs for RHEL7. > TestClientSsl.test_tls_v12 failing due to Python SSL error > -- > > Key: IMPALA-6990 > URL: https://issues.apache.org/jira/browse/IMPALA-6990 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 3.0 >Reporter: Sailesh Mukil >Assignee: Sailesh Mukil >Priority: Blocker > Labels: broken-build, flaky > > We've seen quite a few jobs fail with the following error: > *_ssl.c:504: EOF occurred in violation of protocol* > {code:java} > custom_cluster/test_client_ssl.py:128: in test_tls_v12 > self._validate_positive_cases("%s/server-cert.pem" % self.CERT_DIR) > custom_cluster/test_client_ssl.py:181: in _validate_positive_cases > result = run_impala_shell_cmd(shell_options) > shell/util.py:97: in run_impala_shell_cmd > result.stderr) > E AssertionError: Cmd --ssl -q 'select 1 + 2' was expected to succeed: > Starting Impala Shell without Kerberos authentication > E SSL is enabled. Impala server certificates will NOT be verified (set > --ca_cert to change) > E > /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:80: > DeprecationWarning: 3th positional argument is deprecated. Use keyward > argument insteand. > E DeprecationWarning) > E > /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:80: > DeprecationWarning: 4th positional argument is deprecated. Use keyward > argument insteand. 
> E DeprecationWarning) > E > /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:80: > DeprecationWarning: 5th positional argument is deprecated. Use keyward > argument insteand. > E DeprecationWarning) > E > /data/jenkins/workspace/impala-cdh6.x-exhaustive-rhel7/Impala-Toolchain/thrift-0.9.3-p4/python/lib64/python2.7/site-packages/thrift/transport/TSSLSocket.py:216: > DeprecationWarning: validate is deprecated. Use cert_reqs=ssl.CERT_NONE > instead > E DeprecationWarning) > E No handlers could be found for logger "thrift.transport.TSSLSocket" > E Error connecting: TTransportException, Could not connect to > localhost:21000: [Errno 8] _ssl.c:504: EOF occurred in violation of protocol > E Not connected to Impala, could not execute queries. > {code} > We need to investigate why this is happening and fix it. -- This message w
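The interpreter-version gates quoted from thrift's TSSLSocket.py, together with the "one more skip if" suggested earlier in the thread, combine into a small pattern; a sketch (the helper name is hypothetical, only the hexversion checks are taken from the quoted thrift source):

```python
import sys

# Mirrors the thrift TSSLSocket checks quoted above: SSLContext (and with
# it TLS negotiation up to 1.2) needs Python >= 2.7.9; the ciphers
# argument needs Python >= 2.7.0.
HAS_SSL_CONTEXT = sys.hexversion >= 0x020709F0
HAS_CIPHERS = sys.hexversion >= 0x020700F0

def skip_reason_for_tls12():
    """Hypothetical helper a test like test_tls_v12 could consult for
    one more 'skip if': on pre-2.7.9 Python the client can only offer
    TLSv1, so forcing a TLSv1.2-only cipher on the server must fail."""
    if not HAS_SSL_CONTEXT:
        return ("Python %s cannot create an SSLContext; client falls "
                "back to TLSv1" % sys.version.split()[0])
    return None

print(skip_reason_for_tls12())  # None on Python >= 2.7.9
```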
[jira] [Created] (IMPALA-7091) Occasional errors with failure.test_failpoints.TestFailpoints.test_failpoints in hbase tests
Philip Zeyliger created IMPALA-7091: --- Summary: Occasional errors with failure.test_failpoints.TestFailpoints.test_failpoints in hbase tests Key: IMPALA-7091 URL: https://issues.apache.org/jira/browse/IMPALA-7091 Project: IMPALA Issue Type: Task Components: Frontend Reporter: Philip Zeyliger Assignee: Philip Zeyliger When running the following test with "test-with-docker", I sometimes (but not always) see it fail. {code:java} failure.test_failpoints.TestFailpoints.test_failpoints[table_format: hbase/none | exec_option: {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, 'exec_single_node_rows_threshold': 0} | mt_dop: 4 | location: OPEN | action: MEM_LIMIT_EXCEEDED | query: select * from alltypessmall union all select * from alltypessmall]{code} The error I see is an NPE, and, correlating some logs, I think it's this: {code} 26420:I0524 14:30:14.696190 12271 jni-util.cc:230] java.lang.NullPointerException 26421- at org.apache.impala.catalog.HBaseTable.getRegionSize(HBaseTable.java:652) 26422- at org.apache.impala.catalog.HBaseTable.getEstimatedRowStatsForRegion(HBaseTable.java:520) 26423- at org.apache.impala.catalog.HBaseTable.getEstimatedRowStats(HBaseTable.java:605) 26424- at org.apache.impala.planner.HBaseScanNode.computeStats(HBaseScanNode.java:203) 26425- at org.apache.impala.planner.HBaseScanNode.init(HBaseScanNode.java:127) 26426- at org.apache.impala.planner.SingleNodePlanner.createScanNode(SingleNodePlanner.java:1344) 26427- at org.apache.impala.planner.SingleNodePlanner.createTableRefNode(SingleNodePlanner.java:1514) 26428- at org.apache.impala.planner.SingleNodePlanner.createTableRefsPlan(SingleNodePlanner.java:776) 26429- at org.apache.impala.planner.SingleNodePlanner.createSelectPlan(SingleNodePlanner.java:614) 26430- at org.apache.impala.planner.SingleNodePlanner.createQueryPlan(SingleNodePlanner.java:257) 26431- at 
org.apache.impala.planner.SingleNodePlanner.createUnionPlan(SingleNodePlanner.java:1563) 26432- at org.apache.impala.planner.SingleNodePlanner.createUnionPlan(SingleNodePlanner.java:1630) 26433- at org.apache.impala.planner.SingleNodePlanner.createQueryPlan(SingleNodePlanner.java:275) 26434- at org.apache.impala.planner.SingleNodePlanner.createSingleNodePlan(SingleNodePlanner.java:147) 26435- at org.apache.impala.planner.Planner.createPlan(Planner.java:101) 26436- at org.apache.impala.planner.Planner.createParallelPlans(Planner.java:230) 26437- at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:938) 26438- at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1062) 26439- at org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:156) 26440:I0524 14:30:14.796514 12271 status.cc:125] NullPointerException: null 26441-@ 0x1891839 impala::Status::Status() {code} The test-with-docker stuff starts HBase at run time independently of data load in a way that our other tests don't, and I suspect HBase simply hasn't loaded the tables. I have a change forthcoming to address this. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-7090) EqualityDisjunctsToInRule should respect the limit on the number of children in an expr
Tianyi Wang created IMPALA-7090: --- Summary: EqualityDisjunctsToInRule should respect the limit on the number of children in an expr Key: IMPALA-7090 URL: https://issues.apache.org/jira/browse/IMPALA-7090 Project: IMPALA Issue Type: Bug Affects Versions: Impala 2.12.0, Impala 3.0 Reporter: Tianyi Wang Assignee: Tianyi Wang Currently, EqualityDisjunctsToInRule, introduced in IMPALA-5280, may create an expr with an unbounded number of children and fail a query; this should be avoided. The easy solution is to not apply the rewrite when the number of children is large.
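The proposed guard, skip the OR-to-IN rewrite once the resulting expression would exceed the child limit, can be modeled in a few lines. This is an illustrative Python model only; the real rule is Java in the planner, and the cap value here is an assumption:

```python
CHILD_LIMIT = 10000  # assumed cap; the real limit lives in the frontend

def disjuncts_to_in(column, values, limit=CHILD_LIMIT):
    """Model of EqualityDisjunctsToInRule with the proposed guard:
    rewrite c = v1 OR c = v2 OR ... into c IN (v1, v2, ...) only while
    the rewritten expr stays under the child limit; otherwise leave the
    original OR chain alone (returned here as None for 'no rewrite')."""
    # The +1 accounts for the column reference itself being a child
    # of the IN predicate alongside the value list.
    if len(values) + 1 > limit:
        return None
    return "%s IN (%s)" % (column, ", ".join(str(v) for v in values))

print(disjuncts_to_in("id", [1, 2, 3]))            # id IN (1, 2, 3)
print(disjuncts_to_in("id", range(20), limit=10))  # None: too many children
```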
[jira] [Commented] (IMPALA-7067) sleep(100000) command from test_shell_commandline.py can hang around and cause test_metrics_are_zero to fail
[ https://issues.apache.org/jira/browse/IMPALA-7067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494093#comment-16494093 ] ASF subversion and git services commented on IMPALA-7067: - Commit 0e7b075923cbecce4db2fd2e4fa3edf63afef06f in impala's branch refs/heads/2.x from [~tarmstr...@cloudera.com] [ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=0e7b075 ] IMPALA-7067: deflake test_cancellation Tweak the query so that it still runs for a long time but can cancel the fragment quicker instead of being stuck in a long sleep() call. Change-Id: I0c90d4f5c277f7b0d5561637944b454f7a44c76e Reviewed-on: http://gerrit.cloudera.org:8080/10499 Reviewed-by: Tim Armstrong Tested-by: Tim Armstrong > sleep(100000) command from test_shell_commandline.py can hang around and > cause test_metrics_are_zero to fail > > > Key: IMPALA-7067 > URL: https://issues.apache.org/jira/browse/IMPALA-7067 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.1.0 >Reporter: Tim Armstrong >Assignee: Tim Armstrong >Priority: Critical > Labels: flaky > Fix For: Impala 2.13.0, Impala 3.1.0 > > > {noformat} > 03:25:47 [gw6] PASSED > shell/test_shell_commandline.py::TestImpalaShell::test_cancellation > ...
> 03:27:01 verifiers/test_verify_metrics.py:34: in test_metrics_are_zero > 03:27:01 verifier.verify_metrics_are_zero() > 03:27:01 verifiers/metric_verifier.py:47: in verify_metrics_are_zero > 03:27:01 self.wait_for_metric(metric, 0, timeout) > 03:27:01 verifiers/metric_verifier.py:62: in wait_for_metric > 03:27:01 self.impalad_service.wait_for_metric_value(metric_name, > expected_value, timeout) > 03:27:01 common/impala_service.py:135: in wait_for_metric_value > 03:27:01 json.dumps(self.read_debug_webpage('rpcz?json'))) > 03:27:01 E AssertionError: Metric value impala-server.mem-pool.total-bytes > did not reach value 0 in 60s > {noformat} > I used the json dump from memz and the logs to trace it back to the > sleep(10) query hanging around -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-7058) RC and Seq fuzz tests cause crash
[ https://issues.apache.org/jira/browse/IMPALA-7058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494090#comment-16494090 ] ASF subversion and git services commented on IMPALA-7058: - Commit 47606806a478ea003d6487d375bf683682c16298 in impala's branch refs/heads/2.x from [~tarmstr...@cloudera.com] [ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=4760680 ] IMPALA-7058: disable fuzz test for RC and Seq There appear to still be some rare crashes. Let's disable the test until we can sort those out Change-Id: I10eb184ab2f27ca9b2d286630ceb37b71affcc27 Reviewed-on: http://gerrit.cloudera.org:8080/10485 Reviewed-by: Alex Behm Tested-by: Impala Public Jenkins > RC and Seq fuzz tests cause crash > - > > Key: IMPALA-7058 > URL: https://issues.apache.org/jira/browse/IMPALA-7058 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.0 >Reporter: Dimitris Tsirogiannis >Assignee: Tim Armstrong >Priority: Blocker > Labels: broken-build, crash > Fix For: Impala 2.13.0, Impala 3.1.0 > > > The backtrace is here: > {code:java} > #7 0x02d89a84 in > impala::DelimitedTextParser::ParseFieldLocations (this=0xcf539a0, > max_tuples=1, remaining_len=-102, byte_buffer_ptr=0x7fc6b764dad0, > row_end_locations=0x7fc6b764dac0, field_locations=0x10034000, > num_tuples=0x7fc6b764dacc, num_fields=0x7fc6b764dac8, > next_column_start=0x7fc6b764dad8) at > /data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/repos/Impala/be/src/exec/delimited-text-parser.cc:205 > #8 0x01fdb641 in impala::HdfsSequenceScanner::ProcessRange > (this=0x15515f80, row_batch=0xcf54800) at > /data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/repos/Impala/be/src/exec/hdfs-sequence-scanner.cc:352 > #9 0x02d7a20e in impala::BaseSequenceScanner::GetNextInternal > (this=0x15515f80, row_batch=0xcf54800) at > /data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/repos/Impala/be/src/exec/base-sequence-scanner.cc:181 > #10 0x01fb1ff0 in 
impala::HdfsScanner::ProcessSplit (this=0x15515f80) > at > /data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/repos/Impala/be/src/exec/hdfs-scanner.cc:134 > #11 0x01f89258 in impala::HdfsScanNode::ProcessSplit > (this=0x2a4a8800, filter_ctxs=..., expr_results_pool=0x7fc6b764e4b0, > scan_range=0x13f5f8700, scanner_thread_reservation=0x7fc6b764e428) at > /data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/repos/Impala/be/src/exec/hdfs-scan-node.cc:453 > #12 0x01f885f9 in impala::HdfsScanNode::ScannerThread > (this=0x2a4a8800, first_thread=false, scanner_thread_reservation=32768) at > /data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/repos/Impala/be/src/exec/hdfs-scan-node.cc:360 > #13 0x01f87a6c in impala::HdfsScanNodeoperator()(void) > const (__closure=0x7fc6b764ebe8) at > /data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/repos/Impala/be/src/exec/hdfs-scan-node.cc:292 > #14 0x01f89ac8 in > boost::detail::function::void_function_obj_invoker0, > void>::invoke(boost::detail::function::function_buffer &) > (function_obj_ptr=...) 
at > /data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/Impala-Toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:153 > #15 0x01bf0b28 in boost::function0::operator() > (this=0x7fc6b764ebe0) at > /data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/Impala-Toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:767 > #16 0x01edc57f in impala::Thread::SuperviseThread(std::string const&, > std::string const&, boost::function, impala::ThreadDebugInfo const*, > impala::Promise*) (name=..., category=..., functor=..., > parent_thread_info=0x7fc6b9e53890, thread_started=0x7fc6b9e527c0) at > /data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/repos/Impala/be/src/util/thread.cc:356 > #17 0x01ee471b in boost::_bi::list5, > boost::_bi::value, boost::_bi::value >, > boost::_bi::value, > boost::_bi::value*> >::operator() const&, std::string const&, boost::function, impala::ThreadDebugInfo > const*, impala::Promise*), boost::_bi::list0>(boost::_bi::type, > void (*&)(std::string const&, std::string const&, boost::function, > impala::ThreadDebugInfo const*, impala::Promise*), boost::_bi::list0&, > int) (this=0x2a370fc0, f=@0x2a370fb8: 0x1edc218 > boost::function, impala::ThreadDebugInfo const*, > impala::Promise*)>, a=...) at > /data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/Impala-Toolchain/boost-1.57.0-p3/include/boost/bind/bind.hpp:525 > #18 0x01ee463f in boost::_bi::bind_t const&, std::string const&, boost::function, impala::ThreadDebugInfo > const*, impala::Promise*), > boos
[jira] [Commented] (IMPALA-5662) Log all information relevant to admission control decision making
[ https://issues.apache.org/jira/browse/IMPALA-5662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494087#comment-16494087 ] ASF subversion and git services commented on IMPALA-5662: - Commit 466188b3970595e2e04d7ecf6a5141a7d3012909 in impala's branch refs/heads/2.x from [~bikram.sngh91] [ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=466188b ] IMPALA-3134: Support different proc mem limits among impalads for admission control checks Currently the admission controller assumes that all backends have the same process mem limit as the impalad it is running on. With this patch the proc mem limit for each impalad is available to the admission controller and it uses it for making correct admission decisions. It currently works under the assumption that the per-process memory limit does not change dynamically. Testing: Added an e2e test. IMPALA-5662: Log the queuing reason for a query The queuing reason is now logged both while queuing for the first time and while trying to dequeue.
Change-Id: Idb72eee790cc17466bbfa82e30f369a65f2b060e Reviewed-on: http://gerrit.cloudera.org:8080/10396 Reviewed-by: Bikramjeet Vig Tested-by: Impala Public Jenkins > Log all information relevant to admission control decision making > - > > Key: IMPALA-5662 > URL: https://issues.apache.org/jira/browse/IMPALA-5662 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Reporter: Balazs Jeszenszky >Assignee: Bikramjeet Vig >Priority: Major > Labels: admission-control, observability, resource-management, > supportability > Fix For: Impala 3.1.0 > > > Currently, when making a decision whether to admit a query or not, the log > has the following format: > {code:java} > I0705 14:43:04.031771 7388 admission-controller.cc:442] Stats: > agg_num_running=1, agg_num_queued=0, agg_mem_reserved=486.74 MB, > local_host(local_mem_admitted=0, num_admitted_running=0, num_queued=0, > backend_mem_reserved=56.07 MB) > {code} > Since it's also possible to queue queries due to one node not being able to > reserve the required memory, we should also log the max(backend_mem_reserved) > across all nodes.
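The per-backend check the patch describes can be sketched as follows (a hypothetical Python model, not Impala's C++ admission controller; the field names are invented):

```python
def can_admit(per_backend_mem_estimate, backends):
    """Admit a query only if every backend can fit it under *its own*
    proc mem limit, instead of assuming the coordinator's limit applies
    cluster-wide. Returns (admitted, queuing_reason)."""
    for b in backends:
        if b["mem_admitted"] + per_backend_mem_estimate > b["proc_mem_limit"]:
            # This reason string is what would be logged when the query
            # is queued, and again on each dequeue attempt.
            reason = ("not enough memory on %s: admitted=%d needed=%d "
                      "proc mem limit=%d" %
                      (b["host"], b["mem_admitted"],
                       per_backend_mem_estimate, b["proc_mem_limit"]))
            return False, reason
    return True, None

# A heterogeneous cluster: the coordinator has plenty of headroom, but a
# smaller backend forces the query to queue.
backends = [
    {"host": "coord", "mem_admitted": 0, "proc_mem_limit": 64 * 2**30},
    {"host": "small", "mem_admitted": 30 * 2**30, "proc_mem_limit": 32 * 2**30},
]
admitted, why = can_admit(4 * 2**30, backends)
```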
[jira] [Commented] (IMPALA-7055) test_avro_writer failing on upstream Jenkins (Expected exception: "Writing to table format AVRO is not supported")
[ https://issues.apache.org/jira/browse/IMPALA-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494092#comment-16494092 ] ASF subversion and git services commented on IMPALA-7055: - Commit 1ba8581ceeac4f3c8dbf2b56139dec420de6e967 in impala's branch refs/heads/2.x from [~tarmstr...@cloudera.com] [ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=1ba8581 ] IMPALA-7055: fix race with DML errors Error statuses could be lost because backend_exec_complete_barrier_ went to 0 before the query was transitioned to an error state. Reordering the UpdateExecState() and backend_exec_complete_barrier_ calls prevents this race. Change-Id: Idafd0b342e77a065be7cc28fa8c8a9df445622c2 Reviewed-on: http://gerrit.cloudera.org:8080/10491 Reviewed-by: Tim Armstrong Tested-by: Impala Public Jenkins > test_avro_writer failing on upstream Jenkins (Expected exception: "Writing to > table format AVRO is not supported") > -- > > Key: IMPALA-7055 > URL: https://issues.apache.org/jira/browse/IMPALA-7055 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.0 >Reporter: David Knupp >Assignee: Tim Armstrong >Priority: Blocker > Labels: correctness, flaky > Fix For: Impala 2.13.0, Impala 3.1.0 > > > This failure occurred while verifying https://gerrit.cloudera.org/c/10455/, > but it is not related to that patch. The failing build is > https://jenkins.impala.io/job/gerrit-verify-dryrun/2511/ > (https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/2232/) > Test appears to be (from > [avro-writer.test|https://github.com/apache/impala/blob/master/testdata/workloads/functional-query/queries/QueryTest/avro-writer.test]): > {noformat} > QUERY > SET ALLOW_UNSUPPORTED_FORMATS=0; > insert into __avro_write select 1, "b", 2.2; > CATCH > Writing to table format AVRO is not supported. 
Use query option > ALLOW_UNSUPPORTED_FORMATS > {noformat} > Error output: > {noformat} > 01:50:18 ] FAIL > query_test/test_compressed_formats.py::TestTableWriters::()::test_avro_writer[exec_option: > {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000, > 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, > 'exec_single_node_rows_threshold': 0} | table_format: text/none] > 01:50:18 ] === FAILURES > === > 01:50:18 ] TestTableWriters.test_avro_writer[exec_option: {'batch_size': 0, > 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000, 'disable_codegen': > False, 'abort_on_error': 1, 'debug_action': None, > 'exec_single_node_rows_threshold': 0} | table_format: text/none] > 01:50:18 ] [gw9] linux2 -- Python 2.7.12 > /home/ubuntu/Impala/bin/../infra/python/env/bin/python > 01:50:18 ] query_test/test_compressed_formats.py:189: in test_avro_writer > 01:50:18 ] self.run_test_case('QueryTest/avro-writer', vector) > 01:50:18 ] common/impala_test_suite.py:420: in run_test_case > 01:50:18 ] assert False, "Expected exception: %s" % expected_str > 01:50:18 ] E AssertionError: Expected exception: Writing to table format > AVRO is not supported. 
Use query option ALLOW_UNSUPPORTED_FORMATS > 01:50:18 ] Captured stderr setup > - > 01:50:18 ] -- connecting to: localhost:21000 > 01:50:18 ] - Captured stderr call > - > 01:50:18 ] -- executing against localhost:21000 > 01:50:18 ] use functional; > 01:50:18 ] > 01:50:18 ] SET batch_size=0; > 01:50:18 ] SET num_nodes=0; > 01:50:18 ] SET disable_codegen_rows_threshold=5000; > 01:50:18 ] SET disable_codegen=False; > 01:50:18 ] SET abort_on_error=1; > 01:50:18 ] SET exec_single_node_rows_threshold=0; > 01:50:18 ] -- executing against localhost:21000 > 01:50:18 ] drop table if exists __avro_write; > 01:50:18 ] > 01:50:18 ] -- executing against localhost:21000 > 01:50:18 ] SET COMPRESSION_CODEC=NONE; > 01:50:18 ] > 01:50:18 ] -- executing against localhost:21000 > 01:50:18 ] > 01:50:18 ] create table __avro_write (i int, s string, d double) > 01:50:18 ] stored as AVRO > 01:50:18 ] TBLPROPERTIES ('avro.schema.literal'='{ > 01:50:18 ] "name": "my_record", > 01:50:18 ] "type": "record", > 01:50:18 ] "fields": [ > 01:50:18 ] {"name":"i", "type":["int", "null"]}, > 01:50:18 ] {"name":"s", "type":["string", "null"]}, > 01:50:18 ] {"name":"d", "type":["double", "null"]}]}'); > 01:50:18 ] > 01:50:18 ] -- executing against localhost:21000 > 01:50:18 ] SET COMPRESSION_CODEC=""; > 01:50:18 ] > 01:50:18 ] --
[jira] [Commented] (IMPALA-7039) Frontend HBase tests cannot tolerate HBase running on a different port
[ https://issues.apache.org/jira/browse/IMPALA-7039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494085#comment-16494085 ] ASF subversion and git services commented on IMPALA-7039: - Commit b07bb2729df4aa92d68626f88afa7cd09733ec23 in impala's branch refs/heads/2.x from [~tarasbob] [ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=b07bb27 ] IMPALA-7039: Ignore the port in HBase planner tests Before this patch, we used to check the HBase port in the HBase planner tests. This caused a failure when HBase was running on a different port than expected. We fix the problem in this patch by not checking the HBase port. Testing: ran the FE tests and they passed. Change-Id: I8eb7628061b2ebaf84323b37424925e9a64f70a0 Reviewed-on: http://gerrit.cloudera.org:8080/10459 Reviewed-by: Tim Armstrong Tested-by: Impala Public Jenkins > Frontend HBase tests cannot tolerate HBase running on a different port > -- > > Key: IMPALA-7039 > URL: https://issues.apache.org/jira/browse/IMPALA-7039 > Project: IMPALA > Issue Type: Bug > Components: Frontend >Affects Versions: Impala 3.1.0 >Reporter: Joe McDonnell >Assignee: Taras Bobrovytsky >Priority: Blocker > Labels: broken-build > Fix For: Impala 3.1.0 > > > When HBase doesn't get the same ports as usual, > org.apache.impala.planner.PlannerTest.testHbase and > org.apache.impala.planner.PlannerTest.testJoins fail with the following > errors: > {noformat} > section SCANRANGELOCATIONS of query: > select * from functional_hbase.alltypessmall > where id < 5 > Actual does not match expected result: > HBASE KEYRANGE port=16020 :3 > ^ > HBASE KEYRANGE port=16022 3:7 > HBASE KEYRANGE port=16023 7: > NODE 0: > Expected: > HBASE KEYRANGE port=16201 :3 > HBASE KEYRANGE port=16202 3:7 > HBASE KEYRANGE port=16203 7: > NODE 0: > section SCANRANGELOCATIONS of query: > select * from functional_hbase.stringids > where id = '5' > and tinyint_col = 5 > Actual does not match expected result: > HBASE KEYRANGE 
port=16022 5:5\0 > ^ > NODE 0: > Expected: > HBASE KEYRANGE port=16202 5:5\0 > NODE 0: > section SCANRANGELOCATIONS of query: > select * from functional_hbase.stringids > where id > '5' > and tinyint_col = 5 > Actual does not match expected result: > HBASE KEYRANGE port=16022 5\0:7 > ^ > HBASE KEYRANGE port=16023 7: > NODE 0: > Expected: > HBASE KEYRANGE port=16202 5\0:7 > HBASE KEYRANGE port=16203 7: > NODE 0: > section SCANRANGELOCATIONS of query: > select * from functional_hbase.stringids > where id >= '5' > and tinyint_col = 5 > Actual does not match expected result: > HBASE KEYRANGE port=16022 5:7 > ^^^ > HBASE KEYRANGE port=16023 7: > NODE 0: > Expected: > HBASE KEYRANGE port=16202 5:7 > HBASE KEYRANGE port=16203 7: > NODE 0: > section SCANRANGELOCATIONS of query: > select * from functional_hbase.stringids > where id < '5' > and tinyint_col = 5 > Actual does not match expected result: > HBASE KEYRANGE port=16020 :3 > ^ > HBASE KEYRANGE port=16022 3:5 > NODE 0: > Expected: > HBASE KEYRANGE port=16201 :3 > HBASE KEYRANGE port=16202 3:5 > NODE 0: > section SCANRANGELOCATIONS of query: > select * from functional_hbase.stringids > where id > '4' and id < '5' > and tinyint_col = 5 > Actual does not match expected result: > HBASE KEYRANGE port=16022 4\0:5 > ^ > NODE 0: > Expected: > HBASE KEYRANGE port=16202 4\0:5 > NODE 0: > section SCANRANGELOCATIONS of query: > select * from functional_hbase.stringids > where id >= '4' and id < '5' > and tinyint_col = 5 > Actual does not match expected result: > HBASE KEYRANGE port=16022 4:5 > ^^^ > NODE 0: > Expected: > HBASE KEYRANGE port=16202 4:5 > NODE 0: > section SCANRANGELOCATIONS of query: > select * from functional_hbase.stringids > where id > '4' and id <= '5' > and tinyint_col = 5 > Actual does not match expected result: > HBASE KEYRANGE port=16022 4\0:5\0 > ^^^ > NODE 0: > Expected: > HBASE KEYRANGE port=16202 4\0:5\0 > NODE 0: > section SCANRANGELOCATIONS of query: > select * from functional_hbase.stringids > where 
id >= '4' and id <= '5' > and tinyint_col = 5 > Actual does not match expected result: > HBASE KEYRANGE port=16022 4:5\0 > ^ > NODE 0: > Expected: > HBASE KEYRANGE port=16202 4:5\0 > NODE 0: > section SCANRANGELOCATIONS of query: > select * from functional_hbase.stringids > where string_col = '4' and tinyint_col = 5 and id >= '4' and id <= '5' > Actual does not match expected re
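In the spirit of the fix (which simply stops checking the port), such planner tests can be made tolerant of shifting ports by normalizing them before comparing; a hypothetical sketch:

```python
import re

def mask_hbase_ports(plan_text):
    """Replace concrete port numbers in SCANRANGELOCATIONS output with a
    wildcard, so actual and expected plans compare equal regardless of
    which ports the HBase region servers happened to bind."""
    return re.sub(r"port=\d+", "port=*", plan_text)

# Different ports, same key ranges: the masked plans match.
actual = mask_hbase_ports("HBASE KEYRANGE port=16022 5:7")
expected = mask_hbase_ports("HBASE KEYRANGE port=16202 5:7")
assert actual == expected
```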
[jira] [Commented] (IMPALA-5642) [DOCS] Impala restrictions on using Hive UDFs
[ https://issues.apache.org/jira/browse/IMPALA-5642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494096#comment-16494096 ] ASF subversion and git services commented on IMPALA-5642: - Commit 0b9334a564dd7dd8d4a08e78876aba3fb0852e4d in impala's branch refs/heads/master from [~arodoni_cloudera] [ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=0b9334a ] IMPALA-5642: [DOCS] An additional restriction for Hive/Java UDFs Change-Id: I79f5fcbb570fda48f9ac03f6c3760366aa1859d2 Reviewed-on: http://gerrit.cloudera.org:8080/10520 Reviewed-by: Bharath Vissapragada Tested-by: Impala Public Jenkins > [DOCS] Impala restrictions on using Hive UDFs > -- > > Key: IMPALA-5642 > URL: https://issues.apache.org/jira/browse/IMPALA-5642 > Project: IMPALA > Issue Type: Improvement > Components: Docs >Reporter: bharath v >Assignee: Alex Rodoni >Priority: Minor > > Along with the already stated restrictions on which Hive (Java) UDFs Impala > accepts, we need to add the following. > {noformat} > Impala requires that the Hive/Java UDFs must extend > 'org.apache.hadoop.hive.ql.exec.UDF' class > {noformat}
[jira] [Commented] (IMPALA-7079) test_scanners.TestParquet.test_multiple_blocks fails in the erasure coding job
[ https://issues.apache.org/jira/browse/IMPALA-7079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494094#comment-16494094 ] ASF subversion and git services commented on IMPALA-7079: - Commit 56a740c07a6d80921e86fee769033fab5ad1ccf3 in impala's branch refs/heads/master from [~tarasbob] [ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=56a740c ] IMPALA-7079: Disable the multiple blocks test in erasure coding build The test is currently failing in erasure coding build, so disable it to make the build pass. Change-Id: I00af0914d907b8dcff69f687f71239e76b6ff335 Reviewed-on: http://gerrit.cloudera.org:8080/10521 Reviewed-by: Tianyi Wang Tested-by: Impala Public Jenkins > test_scanners.TestParquet.test_multiple_blocks fails in the erasure coding job > -- > > Key: IMPALA-7079 > URL: https://issues.apache.org/jira/browse/IMPALA-7079 > Project: IMPALA > Issue Type: Task >Affects Versions: Impala 3.1.0 >Reporter: Taras Bobrovytsky >Assignee: Taras Bobrovytsky >Priority: Major > > Several tests failed in TestParquet.test_multiple_blocks in a nightly erasure > coding run. 
> {code} > TestParquet.test_multiple_blocks[exec_option: {'batch_size': 0, 'num_nodes': > 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, > 'abort_on_error': 1, 'debug_action': None, 'exec_single_node_rows_threshold': > 0} | table_format: parquet/none] > [gw0] linux2 -- Python 2.6.6 > /data/jenkins/workspace/impala-asf-master-core-erasure-coding/repos/Impala/bin/../infra/python/env/bin/python > query_test/test_scanners.py:550: in test_multiple_blocks > self._multiple_blocks_helper(table_name, 2, ranges_per_node=1) > query_test/test_scanners.py:598: in _multiple_blocks_helper > assert len(num_row_groups_list) == 4 > E assert 2 == 4 > E+ where 2 = len(['200', '200']) > {code}
[jira] [Commented] (IMPALA-5502) "*DBC Connector for Impala" is without context
[ https://issues.apache.org/jira/browse/IMPALA-5502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494095#comment-16494095 ] ASF subversion and git services commented on IMPALA-5502: - Commit 6ee48b9a11709ecb6eb8554500f6245eb42f1f8b in impala's branch refs/heads/master from [~arodoni_cloudera] [ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=6ee48b9 ] IMPALA-5502: [DOCS] Removed JDBC and ODBC connectors without a context Removed the section on Complex types and JDBC/ODBC connectors described without a context. Change-Id: I329dc497f9dd9cbf446d96e68c55cfe290b9fc58 Reviewed-on: http://gerrit.cloudera.org:8080/10522 Reviewed-by: Jim Apple Tested-by: Impala Public Jenkins > "*DBC Connector for Impala" is without context > -- > > Key: IMPALA-5502 > URL: https://issues.apache.org/jira/browse/IMPALA-5502 > Project: IMPALA > Issue Type: Task > Components: Docs >Affects Versions: Impala 2.9.0 >Reporter: Jim Apple >Assignee: Alex Rodoni >Priority: Minor > > http://impala.incubator.apache.org/docs/build/html/topics/impala_jdbc.html > says to use the Hive JDBC driver, but then says "The Impala complex types > (STRUCT, ARRAY, or MAP) are available in Impala 2.3 and higher. To use these > types with JDBC requires version 2.5.28 or higher of the JDBC Connector for > Impala. To use these types with ODBC requires version 2.5.30 or higher of the > ODBC Connector for Impala." > These connectors could be described or explained above, or this line could be > removed.
[jira] [Commented] (IMPALA-6813) Hedged reads metrics broken when scanning non-HDFS based table
[ https://issues.apache.org/jira/browse/IMPALA-6813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494088#comment-16494088 ] ASF subversion and git services commented on IMPALA-6813: - Commit a3efde84a5e0ef17357d24c3e69aa3f255eb4865 in impala's branch refs/heads/2.x from [~sailesh] [ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=a3efde8 ] IMPALA-6813: Hedged reads metrics broken when scanning non-HDFS based table We realized that the libHDFS API call hdfsGetHedgedReadMetrics() crashes when the 'fs' argument passed to it is not a HDFS filesystem. There is an open bug for it on the HDFS side: HDFS-13417 However, it looks like we won't be getting a fix for it in the short term, so our only option at this point is to skip it. Testing: Made sure that enabling preads and scanning from S3 doesn't cause a crash. Also, added a custom cluster test to exercise the pread code path. We are unable to verify hedged reads in a minicluster, but we can at least exercise the code path to make sure that nothing breaks. 
Change-Id: I48fe80dfd9a1ed68a8f2b7038e5f42b5a3df3baa Reviewed-on: http://gerrit.cloudera.org:8080/9966 Reviewed-by: Sailesh Mukil Tested-by: Impala Public Jenkins > Hedged reads metrics broken when scanning non-HDFS based table > -- > > Key: IMPALA-6813 > URL: https://issues.apache.org/jira/browse/IMPALA-6813 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.12.0 >Reporter: Mostafa Mokhtar >Assignee: Sailesh Mukil >Priority: Blocker > Fix For: Impala 2.13.0, Impala 3.1.0 > > > When preads are enabled ADLS scans can fail updating the Hedged reads metrics > {code} > (gdb) bt > #0 0x003346c32625 in raise () from /lib64/libc.so.6 > #1 0x003346c33e05 in abort () from /lib64/libc.so.6 > #2 0x7f185be140b5 in os::abort(bool) () >from /usr/java/jdk1.8.0_121-cloudera/jre/lib/amd64/server/libjvm.so > #3 0x7f185bfb6443 in VMError::report_and_die() () >from /usr/java/jdk1.8.0_121-cloudera/jre/lib/amd64/server/libjvm.so > #4 0x7f185be195bf in JVM_handle_linux_signal () >from /usr/java/jdk1.8.0_121-cloudera/jre/lib/amd64/server/libjvm.so > #5 0x7f185be0fb03 in signalHandler(int, siginfo*, void*) () >from /usr/java/jdk1.8.0_121-cloudera/jre/lib/amd64/server/libjvm.so > #6 > #7 0x7f185bbc1a7b in jni_invoke_nonstatic(JNIEnv_*, JavaValue*, > _jobject*, JNICallType, _jmethodID*, JNI_ArgumentPusher*, Thread*) () >from /usr/java/jdk1.8.0_121-cloudera/jre/lib/amd64/server/libjvm.so > #8 0x7f185bbc7e81 in jni_CallObjectMethodV () >from /usr/java/jdk1.8.0_121-cloudera/jre/lib/amd64/server/libjvm.so > #9 0x0212e2b7 in invokeMethod () > #10 0x02131297 in hdfsGetHedgedReadMetrics () > #11 0x011601c0 in impala::io::ScanRange::Close() () > #12 0x01158a95 in > impala::io::DiskIoMgr::HandleReadFinished(impala::io::DiskIoMgr::DiskQueue*, > impala::io::RequestContext*, std::unique_ptr std::default_delete >) () > #13 0x01158e1c in > impala::io::DiskIoMgr::ReadRange(impala::io::DiskIoMgr::DiskQueue*, > impala:---Type to continue, or q to quit--- > 
:io::RequestContext*, impala::io::ScanRange*) () > #14 0x01159052 in > impala::io::DiskIoMgr::WorkLoop(impala::io::DiskIoMgr::DiskQueue*) () > #15 0x00d5fcaf in > impala::Thread::SuperviseThread(std::basic_string std::char_traits, std::allocator > const&, > std::basic_string, std::allocator > > const&, boost::function, impala::ThreadDebugInfo const*, > impala::Promise*) () > #16 0x00d604aa in boost::detail::thread_data void (*)(std::basic_string, std::allocator > > const&, std::basic_string, > std::allocator > const&, boost::function, > impala::ThreadDebugInfo const*, impala::Promise*), > boost::_bi::list5 std::char_traits, std::allocator > >, > boost::_bi::value, > std::allocator > >, boost::_bi::value >, > boost::_bi::value, > boost::_bi::value*> > > >::run() () > #17 0x012d6dfa in ?? () > #18 0x003347007aa1 in start_thread () from /lib64/libpthread.so.0 > #19 0x003346ce893d in clone () from /lib64/libc.so.6 > {code} > {code} > CREATE TABLE adls.lineitem ( > l_orderkey BIGINT, > l_partkey BIGINT, > l_suppkey BIGINT, > l_linenumber BIGINT, > l_quantity DOUBLE, > l_extendedprice DOUBLE, > l_discount DOUBLE, > l_tax DOUBLE, > l_returnflag STRING, > l_linestatus STRING, > l_commitdate STRING, > l_receiptdate STRING, > l_shipinstruct STRING, > l_shipmode STRING, > l_comment STRING, > l_shipdate STRING > ) > STORED AS PARQUET > LOCATION 'adl://foo.azuredatalakestore.net/adls-test.db/lineitem' > {code} > select * from adls.lineitem limit 10;
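The workaround amounts to guarding the metrics lookup behind a check that the handle really belongs to an HDFS filesystem; a rough Python analogue (the function and scheme names are illustrative stand-ins, not the libHDFS API):

```python
def hedged_read_metrics(fs_scheme, fetch_metrics):
    """Skip the hedged-read metrics lookup for non-HDFS filesystems
    (e.g. s3a, adl, local), where calling it crashes per HDFS-13417.
    `fetch_metrics` stands in for the real hdfsGetHedgedReadMetrics() call."""
    if fs_scheme != "hdfs":
        return None  # metrics simply unavailable on this filesystem
    return fetch_metrics()

# ADLS scan: metrics are skipped rather than crashing the scanner thread.
assert hedged_read_metrics("adl", lambda: {"ops": 3}) is None
```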
[jira] [Commented] (IMPALA-7048) Failed test: query_test.test_parquet_page_index.TestHdfsParquetTableIndexWriter.test_write_index_many_columns_tables
[ https://issues.apache.org/jira/browse/IMPALA-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494089#comment-16494089 ] ASF subversion and git services commented on IMPALA-7048: - Commit a48bbfdf4692eb68f06a4cd192a98947bcc04aba in impala's branch refs/heads/2.x from [~boroknagyz] [ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=a48bbfd ] IMPALA-7048: Failed test: test_write_index_many_columns_tables The test in the title fails when the local filesystem is used. Looking at the error message it seems that the determined Parquet file size is too small when the local filesystem is used. There is already an annotation for that: 'SkipIfLocal.parquet_file_size' I added this annotation to the TestHdfsParquetTableIndexWriter class, therefore these tests won't be executed when the test-warehouse directory of Impala resides on the local filesystem. Change-Id: Idd3be70fb654a49dda44309a8914fe1f2b48a1af Reviewed-on: http://gerrit.cloudera.org:8080/10476 Reviewed-by: Zoltan Borok-Nagy Tested-by: Impala Public Jenkins > Failed test: > query_test.test_parquet_page_index.TestHdfsParquetTableIndexWriter.test_write_index_many_columns_tables > > > Key: IMPALA-7048 > URL: https://issues.apache.org/jira/browse/IMPALA-7048 > Project: IMPALA > Issue Type: Bug > Components: Backend >Reporter: Dimitris Tsirogiannis >Assignee: Zoltán Borók-Nagy >Priority: Blocker > Labels: broken-build > Fix For: Impala 2.13.0, Impala 3.1.0 > > > The following test fails when the filesystem is LOCAL: > {code:java} > query_test.test_parquet_page_index.TestHdfsParquetTableIndexWriter.test_write_index_many_columns_tables[exec_option: > \{'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, > 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, > 'exec_single_node_rows_threshold': 0} | table_format: parquet/none] (from > pytest) {code} > Zoltan, assigning to you since this looks suspiciously related to the fix for > 
IMPALA-5842.
[jira] [Commented] (IMPALA-3134) Admission controller should not assume all backends have same proc mem limit
[ https://issues.apache.org/jira/browse/IMPALA-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494086#comment-16494086 ] ASF subversion and git services commented on IMPALA-3134: - Commit 466188b3970595e2e04d7ecf6a5141a7d3012909 in impala's branch refs/heads/2.x from [~bikram.sngh91] [ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=466188b ] IMPALA-3134: Support different proc mem limits among impalads for admission control checks Currently the admission controller assumes that all backends have the same process mem limit as the impalad it is running on. With this patch the proc mem limit for each impalad is available to the admission controller and it uses it for making correct admission decisions. It currently works under the assumption that the per-process memory limit does not change dynamically. Testing: Added an e2e test. IMPALA-5662: Log the queuing reason for a query The queuing reason is now logged both while queuing for the first time and while trying to dequeue. Change-Id: Idb72eee790cc17466bbfa82e30f369a65f2b060e Reviewed-on: http://gerrit.cloudera.org:8080/10396 Reviewed-by: Bikramjeet Vig Tested-by: Impala Public Jenkins > Admission controller should not assume all backends have same proc mem limit > > > Key: IMPALA-3134 > URL: https://issues.apache.org/jira/browse/IMPALA-3134 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Affects Versions: Impala 2.5.0 >Reporter: Matthew Jacobs >Assignee: Bikramjeet Vig >Priority: Minor > Labels: admission-control, ramp-up, resource-management > Fix For: Impala 3.1.0 > > > The admission policy now checks that all backends have enough available > memory resources to execute the request, but it assumes that all backends > share the same process mem limit as the impalad it is running on. In a > heterogeneous environment, admission is wrong.
[jira] [Commented] (IMPALA-4025) add functions PERCENTILE_DISC(), PERCENTILE_CONT(), and MEDIAN()
[ https://issues.apache.org/jira/browse/IMPALA-4025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494091#comment-16494091 ] ASF subversion and git services commented on IMPALA-4025: - Commit 41d7cd908a05dabe31775dabf188d3b2136c25d2 in impala's branch refs/heads/2.x from [~tianyiwang] [ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=41d7cd9 ] IMPALA-4025: Part 1: Generalize and cleanup StmtRewriter This patch generalizes StmtRewriter, allowing it to be subclassed. The base class would traverse the stmt tree while the subclasses can install hooks to execute specific rewrite rules at certain places. Existing rewriting rules are moved into SubqueryRewriter. Change-Id: I9e7a6108d3d49be12ae032fdb54b5c3c23152a47 Reviewed-on: http://gerrit.cloudera.org:8080/10495 Reviewed-by: Vuk Ercegovac Tested-by: Impala Public Jenkins > add functions PERCENTILE_DISC(), PERCENTILE_CONT(), and MEDIAN() > > > Key: IMPALA-4025 > URL: https://issues.apache.org/jira/browse/IMPALA-4025 > Project: IMPALA > Issue Type: New Feature > Components: Backend, Frontend >Affects Versions: Impala 2.2.4 >Reporter: Greg Rahn >Assignee: Tianyi Wang >Priority: Major > Labels: built-in-function, sql-language > > Add the following functions as both an aggregate function and window/analytic > function: > * PERCENTILE_CONT > * PERCENTILE_DISC > * MEDIAN (implemented as PERCENTILE_CONT(0.5)) > h6. Syntax > {code} > PERCENTILE_CONT() WITHIN GROUP (ORDER BY [ASC|DESC] > [NULLS {FIRST | LAST}]) [ OVER ([])] > PERCENTILE_DISC() WITHIN GROUP (ORDER BY [ASC|DESC] > [NULLS {FIRST | LAST}]) [ OVER ([])] > MEDIAN(expr) [ OVER () ] > {code} > h6.
Notes from other systems > *Greenplum* > {code} > PERCENTILE_CONT(_percentage_) WITHIN GROUP (ORDER BY _expression_) > {code} > http://gpdb.docs.pivotal.io/4320/admin_guide/query.html > Greenplum Database provides the MEDIAN aggregate function, which returns the > fiftieth percentile of the PERCENTILE_CONT result and special aggregate > expressions for inverse distribution functions as follows: > Currently you can use only these two expressions with the keyword WITHIN > GROUP. > Note: aggregate function only > *Oracle* > {code} > PERCENTILE_CONT(expr) WITHIN GROUP (ORDER BY expr [ DESC | ASC ]) [ OVER > (query_partition_clause) ] > {code} > http://docs.oracle.com/database/121/SQLRF/functions141.htm#SQLRF00687 > Note: implemented as both an aggregate and window function > *Vertica* > {code} > PERCENTILE_CONT ( %_number ) WITHIN GROUP (... ORDER BY expression [ ASC | > DESC ] ) OVER (... [ window-partition-clause ] ) > {code} > https://my.vertica.com/docs/7.2.x/HTML/index.htm#Authoring/SQLReferenceManual/Functions/Analytic/PERCENTILE_CONTAnalytic.htm > Note: window function only > *Teradata* > {code} > PERCENTILE_CONT() WITHIN GROUP (ORDER BY > [asc | desc] [nulls {first | last}]) > {code} > Note: aggregate function only > *Netezza* > {code} > SELECT fn() WITHIN GROUP (ORDER BY [asc|desc] [nulls > {first | last}]) FROM [GROUP BY ]; > {code} > https://www.ibm.com/support/knowledgecenter/SSULQD_7.2.1/com.ibm.nz.dbu.doc/c_dbuser_inverse_distribution_funcs_family_syntax.html > Note: aggregate function only > *Redshift* > {code} > PERCENTILE_CONT ( percentile ) WITHIN GROUP (ORDER BY expr) OVER ( [ > PARTITION BY expr_list ] ) > {code} > https://www.ibm.com/support/knowledgecenter/SSULQD_7.2.1/com.ibm.nz.dbu.doc/c_dbuser_inverse_distribution_funcs_family_syntax.html > Note: window function only
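For reference, the continuous-vs-discrete semantics that separate these functions can be sketched in Python. Standard SQL inverse-distribution definitions are assumed; this is an illustration, not Impala's implementation:

```python
import math

def percentile_cont(values, p):
    """Continuous percentile: linearly interpolate between the two
    closest ranks, so the result need not be an actual input value."""
    xs = sorted(values)
    idx = p * (len(xs) - 1)
    lo = int(math.floor(idx))
    frac = idx - lo
    if frac == 0:
        return float(xs[lo])
    return xs[lo] + frac * (xs[lo + 1] - xs[lo])

def percentile_disc(values, p):
    """Discrete percentile: the first value in sorted order whose
    cumulative distribution is >= p (always an actual input value)."""
    xs = sorted(values)
    k = max(int(math.ceil(p * len(xs))), 1)
    return xs[k - 1]

def median(values):
    # Per the proposal, MEDIAN(expr) is PERCENTILE_CONT(0.5).
    return percentile_cont(values, 0.5)
```

On the values [1, 2, 3, 4], PERCENTILE_CONT(0.5) interpolates to 2.5, while PERCENTILE_DISC(0.5) returns the actual row value 2; the difference is exactly why both functions exist.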
[jira] [Assigned] (IMPALA-6900) Invalidate metadata operation is ignored at a coordinator if catalog is empty
[ https://issues.apache.org/jira/browse/IMPALA-6900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dimitris Tsirogiannis reassigned IMPALA-6900: - Assignee: Vuk Ercegovac (was: Dimitris Tsirogiannis) > Invalidate metadata operation is ignored at a coordinator if catalog is empty > - > > Key: IMPALA-6900 > URL: https://issues.apache.org/jira/browse/IMPALA-6900 > Project: IMPALA > Issue Type: Bug > Components: Catalog >Affects Versions: Impala 3.0, Impala 2.12.0 >Reporter: Dimitris Tsirogiannis >Assignee: Vuk Ercegovac >Priority: Major > > The following workflow may cause an impalad that issued an invalidate > metadata to falsely consider that the operation has taken > effect, thus causing subsequent queries to fail due to unresolved references > to tables or databases. > Steps to reproduce: > # Start an impala cluster connecting to an empty HMS (no databases). > # Create a database "db" in HMS outside of Impala (e.g. using Hive). > # Run INVALIDATE METADATA through Impala. > # Run "use db" statement in Impala. > > The while condition in the code snippet below causes the > WaitForMinCatalogUpdate function to return prematurely even though INVALIDATE > METADATA has not taken effect: > {code:java} > void ImpalaServer::WaitForMinCatalogUpdate(..) { > ... > VLOG_QUERY << "Waiting for minimum catalog object version: " ><< min_req_catalog_object_version << " current version: " ><< min_catalog_object_version; > while (catalog_update_info_.min_catalog_object_version < > min_req_catalog_object_version && catalog_update_info_.catalog_service_id == > catalog_service_id) { >catalog_version_update_cv_.Wait(unique_lock); > } > {code}
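The intended wait semantics can be sketched with a condition-variable predicate. This is a hypothetical Python model with invented names (the real code is C++ in impalad): the waiter should only stop waiting once the minimum catalog object version reaches the required version, or the catalog service id changes (i.e. catalogd restarted):

```python
import threading

class CatalogUpdateInfo:
    """Toy model of the intended WaitForMinCatalogUpdate behavior.
    Names are invented for illustration."""

    def __init__(self, service_id, version=0):
        self.cv = threading.Condition()
        self.min_version = version
        self.service_id = service_id

    def publish(self, version, service_id):
        """A catalog topic update arrives: record it and wake waiters."""
        with self.cv:
            self.min_version = version
            self.service_id = service_id
            self.cv.notify_all()

    def wait_for_min_version(self, required, expected_service_id, timeout=None):
        with self.cv:
            # wait_for() re-checks the predicate on every wakeup, so a
            # spurious or unrelated wakeup cannot cause a premature return.
            return self.cv.wait_for(
                lambda: (self.min_version >= required
                         or self.service_id != expected_service_id),
                timeout=timeout)
```

The key point of the sketch is that the exit condition is checked as a predicate on each wakeup, rather than relying on a loop condition that can be satisfied trivially when the catalog starts out empty.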
[jira] [Commented] (IMPALA-6776) Failed to assign hbase regions to servers
[ https://issues.apache.org/jira/browse/IMPALA-6776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494066#comment-16494066 ] Vuk Ercegovac commented on IMPALA-6776: --- [IMPALA-7061|http://issues.cloudera.org/browse/IMPALA-7061] changed how hbase is split for front-end planner tests. Now, the front-end planner test does the splitting/assigning, which means that the error reported here will affect only the planner tests (and not all others). In addition, the splitting/assigning was changed. I'm closing this bug -- it's already marked as being related to IMPALA-7061. Please open a new bug if a similar issue with region assignment comes up. > Failed to assign hbase regions to servers > - > > Key: IMPALA-6776 > URL: https://issues.apache.org/jira/browse/IMPALA-6776 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.0 >Reporter: Tianyi Wang >Assignee: Vuk Ercegovac >Priority: Blocker > Labels: broken-build > > After switching to hadoop 3 components, split-hbase.sh failed in > HBaseTestDataRegionAssigment: > {noformat} > 20:40:27 Splitting HBase (logging to > /data/jenkins/workspace/impala-asf-master-exhaustive-release/repos/Impala/logs/data_loading/create-hbase.log)... > > 20:41:51 FAILED (Took: 1 min 24 sec) > 20:41:51 > '/data/jenkins/workspace/impala-asf-master-exhaustive-release/repos/Impala/testdata/bin/split-hbase.sh' > failed. Tail of log: > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,3,1522381286429.7b13fefeda7afac230e22150deab5266. > 3 -> localhost:16202, expecting localhost,16202,1522374373018 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,5,1522381287511.7a243a822c5c4844a2a3d0f67a541961.
> 5 -> localhost:16202, expecting localhost,16202,1522374373018 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,7,1522381288718.80d6e4a799ad114a146dc3cb41e18e93. > 7 -> localhost:16203, expecting localhost,16203,1522374374705 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,9,1522381288718.d705a2ea635916f4bb510ca60764080a. > 9 -> localhost:16203, expecting localhost,16203,1522374374705 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,,1522381282868.a99b569f5417ea9e2561eb5566c31be0. > -> localhost:16203, expecting localhost,16201,1522374371810 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,1,1522381285023.5fb566ba94e5fbb8aeca39f3da0a6362. > 1 -> localhost:16201, expecting localhost,16201,1522374371810 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,3,1522381286429.7b13fefeda7afac230e22150deab5266. > 3 -> localhost:16202, expecting localhost,16202,1522374373018 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,5,1522381287511.7a243a822c5c4844a2a3d0f67a541961. > 5 -> localhost:16202, expecting localhost,16202,1522374373018 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,7,1522381288718.80d6e4a799ad114a146dc3cb41e18e93. > 7 -> localhost:16203, expecting localhost,16203,1522374374705 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,9,1522381288718.d705a2ea635916f4bb510ca60764080a. 
> 9 -> localhost:16203, expecting localhost,16203,1522374374705 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,,1522381282868.a99b569f5417ea9e2561eb5566c31be0. > -> localhost:16203, expecting localhost,16201,1522374371810 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,1,1522381285023.5fb566ba94e5fbb8aeca39f3da0a6362. > 1 -> localhost:16201, expecting localhost,16201,1522374371810 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,3,1522381286429.7b13fefeda7afac230e22150deab5266. > 3 -> localhost:16202, expecting localhost,16202,1522374373018 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,5,1522381287511.7a243a822c5c4844a2a3d0f67a541961. > 5 -> localhost:16202, expecting localhost,16202,1522374373018 > 20:41:51 18/03/29 20:41:50 INFO datagenerator.HBaseTestDataRegionAssigment: > functional_hbase.alltypessmall,7,1522381288718.80d6e4a799ad114a146dc3cb41e1
[jira] [Updated] (IMPALA-6338) Runtime profile for query with limit may be missing pieces
[ https://issues.apache.org/jira/browse/IMPALA-6338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Tauber-Marshall updated IMPALA-6338: --- Summary: Runtime profile for query with limit may be missing pieces (was: test_profile_fragment_instances failing) > Runtime profile for query with limit may be missing pieces > -- > > Key: IMPALA-6338 > URL: https://issues.apache.org/jira/browse/IMPALA-6338 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.11.0, Impala 2.12.0 >Reporter: David Knupp >Assignee: Thomas Tauber-Marshall >Priority: Critical > Labels: broken-build, flaky > Attachments: profile-failure.txt, profile-success.txt > > > Stack trace: > {noformat} > query_test/test_observability.py:123: in test_profile_fragment_instances > assert results.runtime_profile.count("HDFS_SCAN_NODE") == 12 > E assert 11 == 12 > E+ where 11 = 0x68bd2f0>('HDFS_SCAN_NODE') > E+where = 'Query > (id=ae4cee91aafc5c6c:11b545c6):\n DEBUG MODE WARNING: Query profile > created while running a DEBUG buil...ontextSwitches: 0 (0)\n - > TotalRawHdfsReadTime(*): 5s784ms\n - TotalReadThroughput: 17.33 > MB/sec\n'.count > E+ where 'Query (id=ae4cee91aafc5c6c:11b545c6):\n DEBUG > MODE WARNING: Query profile created while running a DEBUG > buil...ontextSwitches: 0 (0)\n - TotalRawHdfsReadTime(*): 5s784ms\n > - TotalReadThroughput: 17.33 MB/sec\n' = > 0x6322e10>.runtime_profile > {noformat} > Query: > {noformat} > with l as (select * from tpch.lineitem UNION ALL select * from tpch.lineitem) > select STRAIGHT_JOIN count(*) from (select * from tpch.lineitem a > LIMIT 1) a > join (select * from l LIMIT 200) b on a.l_orderkey = > -b.l_orderkey; > {noformat} > Summary: > {noformat} > Operator #Hosts Avg Time Max Time #Rows Est. #Rows Peak Mem > Est. 
Peak Mem Detail > > 05:AGGREGATE 1 0.000ns 0.000ns 1 1 28.00 KB > 10.00 MB FINALIZE > 04:HASH JOIN 1 15.000ms 15.000ms 0 1 141.06 MB > 17.00 MB INNER JOIN, BROADCAST > |--08:EXCHANGE1 4s153ms 4s153ms 2.00M 2.00M 0 > 0 UNPARTITIONED > | 07:EXCHANGE1 3s783ms 3s783ms 2.00M 2.00M 0 > 0 UNPARTITIONED > | 01:UNION 3 17.000ms 28.001ms 3.03M 2.00M 0 > 0 > | |--03:SCAN HDFS3 0.000ns 0.000ns 0 6.00M 0 > 176.00 MB tpch.lineitem > | 02:SCAN HDFS 3 6s133ms 6s948ms 3.03M 6.00M 24.02 MB > 176.00 MB tpch.lineitem > 06:EXCHANGE 1 5s655ms 5s655ms 1 1 0 > 0 UNPARTITIONED > 00:SCAN HDFS 3 4s077ms 6s207ms 2 1 16.05 MB > 176.00 MB tpch.lineitem a > {noformat} > Plan: > {noformat} > > Max Per-Host Resource Reservation: Memory=17.00MB > Per-Host Resource Estimates: Memory=379.00MB > F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 > | Per-Host Resources: mem-estimate=27.00MB mem-reservation=17.00MB > PLAN-ROOT SINK > | mem-estimate=0B mem-reservation=0B > | > 05:AGGREGATE [FINALIZE] > | output: count(*) > | mem-estimate=10.00MB mem-reservation=0B spill-buffer=2.00MB > | tuple-ids=7 row-size=8B cardinality=1 > | > 04:HASH JOIN [INNER JOIN, BROADCAST] > | hash predicates: a.l_orderkey = -1 * l_orderkey > | fk/pk conjuncts: assumed fk/pk > | runtime filters: RF000[bloom] <- -1 * l_orderkey > | mem-estimate=17.00MB mem-reservation=17.00MB spill-buffer=1.00MB > | tuple-ids=0,4 row-size=16B cardinality=1 > | > |--08:EXCHANGE [UNPARTITIONED] > | | mem-estimate=0B mem-reservation=0B > | | tuple-ids=4 row-size=8B cardinality=200 > | | > | F05:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 > | Per-Host Resources: mem-estimate=0B mem-reservation=0B > | 07:EXCHANGE [UNPARTITIONED] > | | limit: 200 > | | mem-estimate=0B mem-reservation=0B > | | tuple-ids=4 row-size=8B cardinality=200 > | | > | F04:PLAN FRAGMENT [RANDOM] hosts=3 instances=3 > | Per-Host Resources: mem-estimate=176.00MB mem-reservation=0B > | 01:UNION > | | pass-through-operands: all > | | limit: 200 > | | mem-estimate=0B 
mem-reservation=0B > | | tuple-ids=4 row-size=8B cardinality=200 > | | > | |--03:SCAN HDFS [tpch.lineitem, RANDOM] > | | partitions
[jira] [Resolved] (IMPALA-6928) test_bloom_filters failing on ASAN build: did not find "Runtime Filter Published" in profile
[ https://issues.apache.org/jira/browse/IMPALA-6928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Tauber-Marshall resolved IMPALA-6928. Resolution: Duplicate dup of IMPALA-6338 > test_bloom_filters failing on ASAN build: did not find "Runtime Filter > Published" in profile > > > Key: IMPALA-6928 > URL: https://issues.apache.org/jira/browse/IMPALA-6928 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 2.12.0 >Reporter: David Knupp >Assignee: Thomas Tauber-Marshall >Priority: Blocker > > Stacktrace > {noformat} > query_test/test_runtime_filters.py:81: in test_bloom_filters > self.run_test_case('QueryTest/bloom_filters', vector) > common/impala_test_suite.py:444: in run_test_case > verify_runtime_profile(test_section['RUNTIME_PROFILE'], > result.runtime_profile) > common/test_result_verifier.py:560: in verify_runtime_profile > actual)) > E AssertionError: Did not find matches for lines in runtime profile: > E EXPECTED LINES: > E row_regex: .*1 of 1 Runtime Filter Published.* > E > E ACTUAL PROFILE: > E Query (id=a64a18654d28e0c3:e6220f6c): > E DEBUG MODE WARNING: Query profile created while running a DEBUG build > of Impala. Use RELEASE builds to measure query performance. 
> E Summary: > E Session ID: 244e6109f4226b2b:39160855c64ad4a1 > E Session Type: BEESWAX > E Start Time: 2018-04-23 23:31:59.326883000 > E End Time: > E Query Type: QUERY > E Query State: FINISHED > E Query Status: OK > E Impala Version: impalad version 2.12.0-cdh5.15.0 DEBUG (build > 3d60947b813429cd1db59f9a342498982d341de9) > E User: jenkins > E Connected User: jenkins > E Delegated User: > E Network Address: 127.0.0.1:55776 > E Default Db: functional > E Sql Statement: with l as (select * from tpch.lineitem UNION ALL > select * from tpch.lineitem) > E select STRAIGHT_JOIN count(*) from (select * from tpch.lineitem a LIMIT > 1) a > E join (select * from l LIMIT 200) b on a.l_orderkey = -b.l_orderkey > E Coordinator: ec2-m2-4xlarge-centos-6-4-0f06.vpc.cloudera.com:22000 > E Query Options (set by configuration): > ABORT_ON_ERROR=1,EXEC_SINGLE_NODE_ROWS_THRESHOLD=0,RUNTIME_FILTER_WAIT_TIME_MS=3,RUNTIME_FILTER_MIN_SIZE=65536,DISABLE_CODEGEN_ROWS_THRESHOLD=0 > E Query Options (set by configuration and planner): > ABORT_ON_ERROR=1,EXEC_SINGLE_NODE_ROWS_THRESHOLD=0,RUNTIME_FILTER_WAIT_TIME_MS=3,MT_DOP=0,RUNTIME_FILTER_MIN_SIZE=65536,DISABLE_CODEGEN_ROWS_THRESHOLD=0 > E Plan: > E > E Max Per-Host Resource Reservation: Memory=19.00MB > E Per-Host Resource Estimates: Memory=557.00MB > E > E F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 > E | Per-Host Resources: mem-estimate=28.00MB mem-reservation=18.00MB > runtime-filters-memory=1.00MB > E PLAN-ROOT SINK > E | mem-estimate=0B mem-reservation=0B > E | > E 05:AGGREGATE [FINALIZE] > E | output: count(*) > E | mem-estimate=10.00MB mem-reservation=0B spill-buffer=2.00MB > E | tuple-ids=7 row-size=8B cardinality=1 > E | > E 04:HASH JOIN [INNER JOIN, BROADCAST] > E | hash predicates: a.l_orderkey = -1 * l_orderkey > E | fk/pk conjuncts: assumed fk/pk > E | runtime filters: RF000[bloom] <- -1 * l_orderkey > E | mem-estimate=17.00MB mem-reservation=17.00MB spill-buffer=1.00MB > E | tuple-ids=0,4 row-size=16B cardinality=1 
> E | > E |--08:EXCHANGE [UNPARTITIONED] > E | | mem-estimate=0B mem-reservation=0B > E | | tuple-ids=4 row-size=8B cardinality=200 > E | | > E | F05:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 > E | Per-Host Resources: mem-estimate=0B mem-reservation=0B > E | 07:EXCHANGE [UNPARTITIONED] > E | | limit: 200 > E | | mem-estimate=0B mem-reservation=0B > E | | tuple-ids=4 row-size=8B cardinality=200 > E | | > E | F04:PLAN FRAGMENT [RANDOM] hosts=3 instances=3 > E | Per-Host Resources: mem-estimate=264.00MB mem-reservation=0B > E | 01:UNION > E | | pass-through-operands: all > E | | limit: 200 > E | | mem-estimate=0B mem-reservation=0B > E | | tuple-ids=4 row-size=8B cardinality=200 > E | | > E | |--03:SCAN HDFS [tpch.lineitem, RANDOM] > E | | partitions=1/1 files=1 size=718.94MB > E | | stored statistics: > E | | table: rows=6001215 size=718.94MB > E | | columns: all > E | | extrapolated-rows=disabled > E | | mem-estimate=264.00MB mem-reservation=0B > E | | tuple-ids=3 row-size=8B cardinality=6001215 > E | | > E | 02:SCAN HDFS [tpch.lineitem, RANDOM] > E | partitions=1/
[jira] [Assigned] (IMPALA-7088) Parallel data load breaks load-data.py if loading data on a real cluster
[ https://issues.apache.org/jira/browse/IMPALA-7088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joe McDonnell reassigned IMPALA-7088: - Assignee: Joe McDonnell > Parallel data load breaks load-data.py if loading data on a real cluster > > > Key: IMPALA-7088 > URL: https://issues.apache.org/jira/browse/IMPALA-7088 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.0 >Reporter: David Knupp >Assignee: Joe McDonnell >Priority: Blocker > > {{Impala/bin/load-data.py}} is most commonly used to load test data onto a > simulated standalone cluster running on the local host. However, with the > correct inputs, it can also be used to load data onto an actual cluster > running on remote hosts. > A recent enhancement in the load-data.py script to parallelize parts of the > data loading process -- https://github.com/apache/impala/commit/d481cd48 -- > has introduced a regression in the latter use case: > From {{$IMPALA_HOME/logs/data_loading/data-load-functional-exhaustive.log}}: > {noformat} > Created table functional_hbase.widetable_1000_cols > Took 0.7121 seconds > 09:48:01 Beginning execution of hive SQL: > /home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/logs/data_loading/sql/functional/load-functional-query-exhaustive-hive-generated-text-none-none.sql > Traceback (most recent call last): > File > "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", > line 494, in > if __name__ == "__main__": main() > File > "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", > line 468, in main > hive_exec_query_files_parallel(thread_pool, hive_load_text_files) > File > "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", > line 299, in hive_exec_query_files_parallel > 
exec_query_files_parallel(thread_pool, query_files, 'hive') > File > "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", > line 290, in exec_query_files_parallel > for result in thread_pool.imap_unordered(execution_function, query_files): > File "/usr/lib/python2.7/multiprocessing/pool.py", line 659, in next > raise value > TypeError: coercing to Unicode: need string or buffer, NoneType found > {noformat}
[jira] [Created] (IMPALA-7089) test_kudu_dml_reporting failing
Thomas Tauber-Marshall created IMPALA-7089: -- Summary: test_kudu_dml_reporting failing Key: IMPALA-7089 URL: https://issues.apache.org/jira/browse/IMPALA-7089 Project: IMPALA Issue Type: Bug Reporter: Thomas Tauber-Marshall Assignee: Thomas Tauber-Marshall See in numerous builds: {noformat} 00:07:23 ___ TestImpalaShell.test_kudu_dml_reporting 00:07:23 [gw1] linux2 -- Python 2.6.6 /data/jenkins/workspace/impala-asf-master-core/repos/Impala/bin/../infra/python/env/bin/python 00:07:23 shell/test_shell_commandline.py:601: in test_kudu_dml_reporting 00:07:23 "with y as (values(7)) insert into %s.dml_test (id) select * from y" % db, 1, 0) 00:07:23 shell/test_shell_commandline.py:580: in _validate_dml_stmt 00:07:23 assert expected_output in results.stderr 00:07:23 E assert 'Modified 1 row(s), 0 row error(s)' in 'Starting Impala Shell without Kerberos authentication\nConnected to localhost:21000\nServer version: impalad version ...tos-6-4-0895.vpc.cloudera.com:25000/query_plan?query_id=d94f04135c4d25f9:ec1089e8\nFetched 0 row(s) in 0.12s\n' 00:07:23 E+ where 'Starting Impala Shell without Kerberos authentication\nConnected to localhost:21000\nServer version: impalad version ...tos-6-4-0895.vpc.cloudera.com:25000/query_plan?query_id=d94f04135c4d25f9:ec1089e8\nFetched 0 row(s) in 0.12s\n' = .stderr 00:07:23 Captured stderr setup - 00:07:23 SET sync_ddl=False; 00:07:23 -- executing against localhost:21000 00:07:23 DROP DATABASE IF EXISTS `test_kudu_dml_reporting_256dcf63` CASCADE; 00:07:23 00:07:23 SET sync_ddl=False; 00:07:23 -- executing against localhost:21000 00:07:23 CREATE DATABASE `test_kudu_dml_reporting_256dcf63`; 00:07:23 00:07:23 MainThread: Created database "test_kudu_dml_reporting_256dcf63" for test ID "shell/test_shell_commandline.py::TestImpalaShell::()::test_kudu_dml_reporting" 00:07:23 = 1 failed, 1932 passed, 63 skipped, 45 xfailed, 1 xpassed in 6985.36 seconds == {noformat}
[jira] [Commented] (IMPALA-7088) Parallel data load breaks load-data.py if loading data on a real cluster
[ https://issues.apache.org/jira/browse/IMPALA-7088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493907#comment-16493907 ] Joe McDonnell commented on IMPALA-7088: --- [~dknupp] I think it is due to the removal of unique_dir. In minicluster operation, unique_dir is set, but in a real cluster it would not be: {noformat} unique_dir = None if options.hive_hs2_hostport.startswith("localhost:"): unique_dir = tempfile.mkdtemp(prefix="hive-data-load-") ... shutil.rmtree(unique_dir){noformat} The shutil.rmtree(unique_dir) should only happen if unique_dir is not None. > Parallel data load breaks load-data.py if loading data on a real cluster > > > Key: IMPALA-7088 > URL: https://issues.apache.org/jira/browse/IMPALA-7088 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.0 >Reporter: David Knupp >Priority: Blocker > > {{Impala/bin/load-data.py}} is most commonly used to load test data onto a > simulated standalone cluster running on the local host. However, with the > correct inputs, it can also be used to load data onto an actual cluster > running on remote hosts.
> A recent enhancement in the load-data.py script to parallelize parts of the > data loading process -- https://github.com/apache/impala/commit/d481cd48 -- > has introduced a regression in the latter use case: > From {{$IMPALA_HOME/logs/data_loading/data-load-functional-exhaustive.log}}: > {noformat} > Created table functional_hbase.widetable_1000_cols > Took 0.7121 seconds > 09:48:01 Beginning execution of hive SQL: > /home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/logs/data_loading/sql/functional/load-functional-query-exhaustive-hive-generated-text-none-none.sql > Traceback (most recent call last): > File > "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", > line 494, in > if __name__ == "__main__": main() > File > "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", > line 468, in main > hive_exec_query_files_parallel(thread_pool, hive_load_text_files) > File > "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", > line 299, in hive_exec_query_files_parallel > exec_query_files_parallel(thread_pool, query_files, 'hive') > File > "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", > line 290, in exec_query_files_parallel > for result in thread_pool.imap_unordered(execution_function, query_files): > File "/usr/lib/python2.7/multiprocessing/pool.py", line 659, in next > raise value > TypeError: coercing to Unicode: need string or buffer, NoneType found > {noformat}
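The guard Joe suggests can be sketched like this. The wrapper function is hypothetical (the real logic lives inline in bin/load-data.py's main()), but it shows why shutil.rmtree must be skipped when unique_dir stays None on a real-cluster run:

```python
import os
import shutil
import tempfile

def run_load(hive_hs2_hostport, load_fn):
    """Hypothetical wrapper sketching the fix: create a scratch dir only
    for a local minicluster load, and guard the cleanup so a real-cluster
    run (unique_dir stays None) no longer crashes in shutil.rmtree."""
    unique_dir = None
    if hive_hs2_hostport.startswith("localhost:"):
        unique_dir = tempfile.mkdtemp(prefix="hive-data-load-")
    try:
        load_fn(unique_dir)
    finally:
        # The missing check: shutil.rmtree(None) raises the TypeError
        # ("coercing to Unicode") seen in the traceback above.
        if unique_dir is not None:
            shutil.rmtree(unique_dir)
```

With the guard in place, a remote-cluster load simply never touches a local scratch directory, while the minicluster path still creates and removes one as before.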
[jira] [Updated] (IMPALA-7088) Parallel data load breaks load-data.py if loading data on a real cluster
[ https://issues.apache.org/jira/browse/IMPALA-7088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Knupp updated IMPALA-7088: Description: {{Impala/bin/load-data.py}} is most commonly used to load test data onto a simulated standalone cluster running on the local host. However, with the correct inputs, it can also be used to load data onto an actual cluster running on remote hosts. A recent enhancement in the load-data.py script to parallelize parts of the data loading process -- https://github.com/apache/impala/commit/d481cd48 -- has introduced a regression in the latter use case: From {{$IMPALA_HOME/logs/data_loading/data-load-functional-exhaustive.log}}: {noformat} Created table functional_hbase.widetable_1000_cols Took 0.7121 seconds 09:48:01 Beginning execution of hive SQL: /home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/logs/data_loading/sql/functional/load-functional-query-exhaustive-hive-generated-text-none-none.sql Traceback (most recent call last): File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 494, in if __name__ == "__main__": main() File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 468, in main hive_exec_query_files_parallel(thread_pool, hive_load_text_files) File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 299, in hive_exec_query_files_parallel exec_query_files_parallel(thread_pool, query_files, 'hive') File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 290, in exec_query_files_parallel for result in thread_pool.imap_unordered(execution_function, query_files): File "/usr/lib/python2.7/multiprocessing/pool.py", line 659, in next raise value TypeError: coercing to Unicode: need
string or buffer, NoneType found {noformat} was: {{Impala/bin/load-data.py}} is most commonly used to load test data onto a simulated standalone cluster running on the local host. However, with the correct inputs, it can also be used to load data onto an actual remote cluster. A recent enhancement in the load-data.py script to parallelize parts of the data loading process -- https://github.com/apache/impala/commit/d481cd48 -- has introduced a regression in the latter use case: From {{$IMPALA_HOME/logs/data_loading/data-load-functional-exhaustive.log}}: {noformat} Created table functional_hbase.widetable_1000_cols Took 0.7121 seconds 09:48:01 Beginning execution of hive SQL: /home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/logs/data_loading/sql/functional/load-functional-query-exhaustive-hive-generated-text-none-none.sql Traceback (most recent call last): File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 494, in if __name__ == "__main__": main() File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 468, in main hive_exec_query_files_parallel(thread_pool, hive_load_text_files) File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 299, in hive_exec_query_files_parallel exec_query_files_parallel(thread_pool, query_files, 'hive') File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 290, in exec_query_files_parallel for result in thread_pool.imap_unordered(execution_function, query_files): File "/usr/lib/python2.7/multiprocessing/pool.py", line 659, in next raise value TypeError: coercing to Unicode: need string or buffer, NoneType found {noformat} > Parallel data load breaks load-data.py if loading data on a real cluster > > > Key:
IMPALA-7088 > URL: https://issues.apache.org/jira/browse/IMPALA-7088 > Project: IMPALA > Issue Type: Bug > Components: Infrastructure >Affects Versions: Impala 3.0 >Reporter: David Knupp >Priority: Blocker > > {{Impala/bin/load-data.py}} is most commonly used to load test data onto a > simulated standalone cluster running on the local host. However, with the > correct inputs, it can also be used to load data onto an actual cluster > running on remote hosts. > A recent enhancement in the load-data.py script to parallelize parts of the > data loading process -- https://github.com/apache/impala/commit/d481cd48 -- > has introduced a regression in the latter use case: > From {{$IMPALA_HOME/log
[jira] [Commented] (IMPALA-7088) Parallel data load breaks load-data.py if loading data on a real cluster
[ https://issues.apache.org/jira/browse/IMPALA-7088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493882#comment-16493882 ]

David Knupp commented on IMPALA-7088:
-------------------------------------

Cc: [~joemcdonnell], [~njanarthanan], [~mikesbrown]

> Parallel data load breaks load-data.py if loading data on a real cluster
> ------------------------------------------------------------------------
>
>                 Key: IMPALA-7088
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7088
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Infrastructure
>    Affects Versions: Impala 3.0
>            Reporter: David Knupp
>            Priority: Blocker
>
> {{Impala/bin/load-data.py}} is most commonly used to load test data onto a
> simulated standalone cluster running on the local host. However, with the
> correct inputs, it can also be used to load data onto an actual remote
> cluster.
> A recent enhancement in the load-data.py script to parallelize parts of the
> data loading process -- https://github.com/apache/impala/commit/d481cd48 --
> has introduced a regression in the latter use case.
> From {{$IMPALA_HOME/logs/data_loading/data-load-functional-exhaustive.log}}:
> {noformat}
> Created table functional_hbase.widetable_1000_cols
> Took 0.7121 seconds
> 09:48:01 Beginning execution of hive SQL: /home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/logs/data_loading/sql/functional/load-functional-query-exhaustive-hive-generated-text-none-none.sql
> Traceback (most recent call last):
>   File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 494, in <module>
>     if __name__ == "__main__": main()
>   File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 468, in main
>     hive_exec_query_files_parallel(thread_pool, hive_load_text_files)
>   File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 299, in hive_exec_query_files_parallel
>     exec_query_files_parallel(thread_pool, query_files, 'hive')
>   File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 290, in exec_query_files_parallel
>     for result in thread_pool.imap_unordered(execution_function, query_files):
>   File "/usr/lib/python2.7/multiprocessing/pool.py", line 659, in next
>     raise value
> TypeError: coercing to Unicode: need string or buffer, NoneType found
> {noformat}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-7088) Parallel data load breaks load-data.py if loading data on a real cluster
[ https://issues.apache.org/jira/browse/IMPALA-7088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Knupp updated IMPALA-7088:
--------------------------------
    Description:
{{Impala/bin/load-data.py}} is most commonly used to load test data onto a simulated standalone cluster running on the local host. However, with the correct inputs, it can also be used to load data onto an actual remote cluster.

A recent enhancement in the load-data.py script to parallelize parts of the data loading process -- https://github.com/apache/impala/commit/d481cd48 -- has introduced a regression in the latter use case.

From {{$IMPALA_HOME/logs/data_loading/data-load-functional-exhaustive.log}}:
{noformat}
Created table functional_hbase.widetable_1000_cols
Took 0.7121 seconds
09:48:01 Beginning execution of hive SQL: /home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/logs/data_loading/sql/functional/load-functional-query-exhaustive-hive-generated-text-none-none.sql
Traceback (most recent call last):
  File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 494, in <module>
    if __name__ == "__main__": main()
  File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 468, in main
    hive_exec_query_files_parallel(thread_pool, hive_load_text_files)
  File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 299, in hive_exec_query_files_parallel
    exec_query_files_parallel(thread_pool, query_files, 'hive')
  File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 290, in exec_query_files_parallel
    for result in thread_pool.imap_unordered(execution_function, query_files):
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 659, in next
    raise value
TypeError: coercing to Unicode: need string or buffer, NoneType found
{noformat}

  was:
Impala/bin/load-data.py is most commonly used to load test data onto a simulated standalone cluster running on the local host. However, with the correct inputs, it can also be used to load data onto an actual remote cluster.

A recent enhancement in the load-data.py script to parallelize parts of the data loading process -- https://github.com/apache/impala/commit/d481cd48 -- has introduced a regression in the latter use case.

From *$IMPALA_HOME/logs/data_loading/data-load-functional-exhaustive.log*:
{noformat}
Created table functional_hbase.widetable_1000_cols
Took 0.7121 seconds
09:48:01 Beginning execution of hive SQL: /home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/logs/data_loading/sql/functional/load-functional-query-exhaustive-hive-generated-text-none-none.sql
Traceback (most recent call last):
  File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 494, in <module>
    if __name__ == "__main__": main()
  File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 468, in main
    hive_exec_query_files_parallel(thread_pool, hive_load_text_files)
  File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 299, in hive_exec_query_files_parallel
    exec_query_files_parallel(thread_pool, query_files, 'hive')
  File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 290, in exec_query_files_parallel
    for result in thread_pool.imap_unordered(execution_function, query_files):
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 659, in next
    raise value
TypeError: coercing to Unicode: need string or buffer, NoneType found
{noformat}


> Parallel data load breaks load-data.py if loading data on a real cluster
> ------------------------------------------------------------------------
>
>                 Key: IMPALA-7088
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7088
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Infrastructure
>    Affects Versions: Impala 3.0
>            Reporter: David Knupp
>            Priority: Blocker
>
> {{Impala/bin/load-data.py}} is most commonly used to load test data onto a
> simulated standalone cluster running on the local host. However, with the
> correct inputs, it can also be used to load data onto an actual remote
> cluster.
> A recent enhancement in the load-data.py script to parallelize parts of the
> data loading process -- https://github.com/apache/impala/commit/d481cd48 --
> has introduced a regression in the latter use case.
> From {{$IMPALA_HOME/logs/data_loading/data-load-functional-exhaustive.log}}:
[jira] [Created] (IMPALA-7088) Parallel data load breaks load-data.py if loading data on a real cluster
David Knupp created IMPALA-7088:
-------------------------------

             Summary: Parallel data load breaks load-data.py if loading data on a real cluster
                 Key: IMPALA-7088
                 URL: https://issues.apache.org/jira/browse/IMPALA-7088
             Project: IMPALA
          Issue Type: Bug
          Components: Infrastructure
    Affects Versions: Impala 3.0
            Reporter: David Knupp


Impala/bin/load-data.py is most commonly used to load test data onto a simulated standalone cluster running on the local host. However, with the correct inputs, it can also be used to load data onto an actual remote cluster.

A recent enhancement in the load-data.py script to parallelize parts of the data loading process -- https://github.com/apache/impala/commit/d481cd48 -- has introduced a regression in the latter use case.

From *$IMPALA_HOME/logs/data_loading/data-load-functional-exhaustive.log*:
{noformat}
Created table functional_hbase.widetable_1000_cols
Took 0.7121 seconds
09:48:01 Beginning execution of hive SQL: /home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/logs/data_loading/sql/functional/load-functional-query-exhaustive-hive-generated-text-none-none.sql
Traceback (most recent call last):
  File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 494, in <module>
    if __name__ == "__main__": main()
  File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 468, in main
    hive_exec_query_files_parallel(thread_pool, hive_load_text_files)
  File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 299, in hive_exec_query_files_parallel
    exec_query_files_parallel(thread_pool, query_files, 'hive')
  File "/home/systest/Impala-auxiliary-tests/tests/cdh_cluster/../../../Impala-cdh-cluster-test-runner/bin/load-data.py", line 290, in exec_query_files_parallel
    for result in thread_pool.imap_unordered(execution_function, query_files):
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 659, in next
    raise value
TypeError: coercing to Unicode: need string or buffer, NoneType found
{noformat}
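The failure mode in the traceback above can be reproduced with a small sketch: a worker function receives {{None}} where a file path is expected, and the parent process re-raises the worker's exception when it advances the {{imap_unordered}} result iterator (the {{raise value}} frame). This is illustrative code, not the actual load-data.py: {{exec_query_file}} and the {{None}} entry in the file list are hypothetical stand-ins, and on Python 3 the TypeError text differs from Python 2.7's "coercing to Unicode" wording.

```python
from multiprocessing.dummy import Pool  # thread-backed Pool; same API as multiprocessing.Pool

def exec_query_file(path):
    # Stand-in for the per-file work in load-data.py. Opening a None path
    # raises TypeError inside the worker.
    with open(path) as f:
        return f.read()

def exec_query_files_parallel(pool, query_files):
    # The worker's exception is stored by the pool and re-raised in the
    # parent when the result iterator is advanced -- the "raise value"
    # frame seen in the IMPALA-7088 traceback.
    try:
        return list(pool.imap_unordered(exec_query_file, query_files))
    except TypeError as e:
        return "data load failed: %s" % e

pool = Pool(4)
print(exec_query_files_parallel(pool, [None]))
```

With a file list containing a bad entry, the error surfaces in the driver loop rather than at the point the bad value was produced, which is why the traceback points at {{pool.py}} instead of the misconfigured input.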
[jira] [Commented] (IMPALA-7084) Partition doesn't exist after attempting to ALTER partition location to non-existing path
[ https://issues.apache.org/jira/browse/IMPALA-7084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493850#comment-16493850 ]

Fredy Wijaya commented on IMPALA-7084:
--------------------------------------

It doesn't seem to be an issue anymore in the latest master (3.x) and 2.x branches.

{noformat}
[localhost:21000] default> create table test (a int) partitioned by (b int);
Query: create table test (a int) partitioned by (b int)
Fetched 1 row(s) in 0.02s
[localhost:21000] default> insert into test partition (b=1) values (1);
Query: insert into test partition (b=1) values (1)
Query submitted at: 2018-05-29 08:35:14 (Coordinator: http://impala-dev:25000)
Query progress can be monitored at: http://impala-dev:25000/query_plan?query_id=84adf7dc8062947:7c9176fc
Modified 1 row(s) in 4.10s
[localhost:21000] default> alter table test add partition (b=2) location 'hdfs://localhost:20500/test-warehouse/test/b=2/';
Query: alter table test add partition (b=2) location 'hdfs://localhost:20500/test-warehouse/test/b=2/'
+--------------------------------------------+
| summary                                    |
+--------------------------------------------+
| New partition has been added to the table. |
+--------------------------------------------+
Fetched 1 row(s) in 0.03s
[localhost:21000] default> alter table test partition (b=1) set location 'hdfs://localhost:20500/test-warehouse/test/b=5/';
Query: alter table test partition (b=1) set location 'hdfs://localhost:20500/test-warehouse/test/b=5/'
+--------------------------------------------------------+
| summary                                                |
+--------------------------------------------------------+
| New location has been set for the specified partition. |
+--------------------------------------------------------+
Fetched 1 row(s) in 0.07s
[localhost:21000] default> show partitions test;
Query: show partitions test
+-------+-------+--------+------+--------------+-------------------+--------+-------------------+------------------------------------------------+
| b     | #Rows | #Files | Size | Bytes Cached | Cache Replication | Format | Incremental stats | Location                                       |
+-------+-------+--------+------+--------------+-------------------+--------+-------------------+------------------------------------------------+
| 1     | -1    | 0      | 0B   | NOT CACHED   | NOT CACHED        | TEXT   | false             | hdfs://localhost:20500/test-warehouse/test/b=5 |
| 2     | -1    | 0      | 0B   | NOT CACHED   | NOT CACHED        | TEXT   | false             | hdfs://localhost:20500/test-warehouse/test/b=2 |
| Total | -1    | 0      | 0B   | 0B           |                   |        |                   |                                                |
+-------+-------+--------+------+--------------+-------------------+--------+-------------------+------------------------------------------------+
Fetched 3 row(s) in 0.01s
{noformat}

> Partition doesn't exist after attempting to ALTER partition location to
> non-existing path
> -----------------------------------------------------------------------
>
>                 Key: IMPALA-7084
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7084
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>    Affects Versions: Impala 2.12.0
>            Reporter: Gabor Kaszab
>            Priority: Major
>              Labels: correctness, ramp-up
>
> {code:java}
> create table test (a int) partitioned by (b int);
> insert into test partition (b=1) values (1);
> // create another partition that points to a different location.
> alter table test add partition (b=2) location 'hdfs://localhost:20500/test-warehouse/test/b=2/';
> // setting the first partition to a non-existing location. This fails as expected.
> // The error message is not exactly user friendly, though. Could have said "Invalid location" or such.
> alter table test partition (b=1) set location 'hdfs://localhost:20500/test-warehouse/test/b=5/';
> Query: alter table test partition (b=1) set location 'hdfs://localhost:20500/test-warehouse/test/b=5/'
> ERROR: TableLoadingException: Failed to load metadata for table: default.test
> CAUSED BY: NullPointerException: null
> // Setting the first partition to an existing location. This surprisingly fails as the partition doesn't exist.
> alter table test partition (b=1) set location 'hdfs://localhost:20500/test-warehouse/test/b=2/';
> Query: alter table test partition (b=1) set location 'hdfs://localhost:20500/test-warehouse/test/b=1/'
> ERROR: PartitionNotFoundException: Partition not found: TPartitionKeyValue(name:b, value:1)
> // However, show partitions displays the b=1 partition as well.
> show partitions test;
> Query: show partitions test
> +---+---++--+--+
[jira] [Resolved] (IMPALA-6317) Expose -cmake_only flag to buildall.sh
[ https://issues.apache.org/jira/browse/IMPALA-6317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Knupp resolved IMPALA-6317.
---------------------------------
    Resolution: Fixed

> Expose -cmake_only flag to buildall.sh
> --------------------------------------
>
>                 Key: IMPALA-6317
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6317
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Infrastructure
>    Affects Versions: Impala 2.11.0
>            Reporter: David Knupp
>            Assignee: David Knupp
>            Priority: Minor
>
> Impala/bin/make_impala.sh has a {{-cmake_only}} command line option:
> {noformat}
> -cmake_only)
>   CMAKE_ONLY=1
> {noformat}
> Passing this flag means that only makefiles will be generated during the
> build. However, this flag is not provided in buildall.sh (the caller of
> make_impala.sh), which effectively renders it useless.
> It turns out that if one has no intention of running the Impala cluster
> locally (e.g., when trying to build just enough of the toolchain and dev
> environment to run the data load scripts for loading data onto a remote
> cluster), then being able to generate only the makefiles is useful.
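The fix the ticket asks for amounts to having buildall.sh recognize the flag and pass it through to make_impala.sh. A minimal sketch of that wiring follows; the {{-cmake_only}} flag name comes from the ticket, but the helper function, the argument loop, and the variable names are hypothetical, not the actual buildall.sh code.

```shell
#!/bin/sh
# Hypothetical sketch of forwarding -cmake_only from buildall.sh to
# make_impala.sh. Only the flag name is taken from the ticket.
build_args_for_make_impala() {
  MAKE_IMPALA_ARGS=""
  for ARG in "$@"; do
    case "$ARG" in
      # Generate makefiles only; skip compiling the backend.
      -cmake_only) MAKE_IMPALA_ARGS="-cmake_only" ;;
    esac
  done
  echo "$MAKE_IMPALA_ARGS"
}

# The collected flags would then be handed to make_impala.sh, e.g.:
#   ./bin/make_impala.sh $(build_args_for_make_impala "$@")
build_args_for_make_impala -cmake_only -noclean
```

This keeps buildall.sh as the single entry point while letting a data-load-only workflow stop after CMake generation, which is the use case the ticket describes.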
[jira] [Updated] (IMPALA-7087) Impala is unable to read Parquet decimal columns with lower precision/scale than table metadata
[ https://issues.apache.org/jira/browse/IMPALA-7087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong updated IMPALA-7087:
----------------------------------
    Description:
This is similar to IMPALA-2515, except it relates to a different precision/scale in the file metadata rather than just a mismatch in the bytes used to store the data. In a lot of cases we should be able to convert the decimal type on the fly to the higher-precision type.
{noformat}
ERROR: File '/hdfs/path/00_0_x_2' column 'alterd_decimal' has an invalid type length. Expecting: 11 len in file: 8
{noformat}
It would be convenient to allow reading Parquet files where the precision/scale in the file can be converted to the precision/scale in the table metadata without loss of precision.

  was:
This is similar to IMPALA-2515, except it relates to a different precision/scale in the file metadata rather than just a mismatch in the bytes used to store the data. In a lot of cases we should be able to convert the decimal type on the fly to the higher-precision type.
{noformat}
ERROR: File '/hdfs/path/00_0_x_2' column 'alterd_decimal' has an invalid type length. Expecting: 11 len in file: 8
{noformat}


> Impala is unable to read Parquet decimal columns with lower precision/scale
> than table metadata
> ---------------------------------------------------------------------------
>
>                 Key: IMPALA-7087
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7087
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Backend
>            Reporter: Tim Armstrong
>            Priority: Major
>              Labels: parquet
>
> This is similar to IMPALA-2515, except it relates to a different
> precision/scale in the file metadata rather than just a mismatch in the
> bytes used to store the data. In a lot of cases we should be able to convert
> the decimal type on the fly to the higher-precision type.
> {noformat}
> ERROR: File '/hdfs/path/00_0_x_2' column 'alterd_decimal' has an invalid
> type length. Expecting: 11 len in file: 8
> {noformat}
> It would be convenient to allow reading Parquet files where the
> precision/scale in the file can be converted to the precision/scale in the
> table metadata without loss of precision.
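The "without loss of precision" condition can be stated concretely: a value stored as decimal(p1, s1) in the file fits a table column decimal(p2, s2) whenever the scale does not shrink and the integer-digit budget (precision minus scale) does not shrink. A small sketch of that check follows; {{can_widen}} is a hypothetical helper for illustration, not an Impala function.

```python
def can_widen(file_prec, file_scale, table_prec, table_scale):
    # Lossless iff no fractional digits are dropped (scale must not shrink)
    # and no integer digits are dropped (precision - scale must not shrink).
    return (table_scale >= file_scale and
            table_prec - table_scale >= file_prec - file_scale)

# The error above: file stores DECIMAL(8, 2) (8 bytes) while the table says
# DECIMAL(11, 2) (11 bytes). The values themselves are convertible.
print(can_widen(8, 2, 11, 2))   # True
# The reverse direction would drop integer digits, so it stays an error.
print(can_widen(11, 2, 8, 2))   # False
```

Under this rule the reader could accept files whose declared decimal type is narrower than the table's, widening each value on the fly, while still rejecting genuinely incompatible files.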
[jira] [Created] (IMPALA-7087) Impala is unable to read Parquet decimal columns with lower precision/scale than table metadata
Tim Armstrong created IMPALA-7087:
----------------------------------

             Summary: Impala is unable to read Parquet decimal columns with lower precision/scale than table metadata
                 Key: IMPALA-7087
                 URL: https://issues.apache.org/jira/browse/IMPALA-7087
             Project: IMPALA
          Issue Type: Sub-task
          Components: Backend
            Reporter: Tim Armstrong


This is similar to IMPALA-2515, except it relates to a different precision/scale in the file metadata rather than just a mismatch in the bytes used to store the data. In a lot of cases we should be able to convert the decimal type on the fly to the higher-precision type.
{noformat}
ERROR: File '/hdfs/path/00_0_x_2' column 'alterd_decimal' has an invalid type length. Expecting: 11 len in file: 8
{noformat}