[jira] [Created] (IMPALA-9054) Flaky test: test_misformatted_profile_text in query_test/test_cancellation.py

2019-10-16 Thread Quanlong Huang (Jira)
Quanlong Huang created IMPALA-9054:
--

 Summary: Flaky test: test_misformatted_profile_text in 
query_test/test_cancellation.py
 Key: IMPALA-9054
 URL: https://issues.apache.org/jira/browse/IMPALA-9054
 Project: IMPALA
  Issue Type: Bug
Reporter: Quanlong Huang


Saw this in several builds in ubuntu-16.04-dockerised-tests:
{code}
FAIL 
query_test/test_cancellation.py::TestCancellationParallel::()::test_misformatted_profile_text
=== FAILURES ===
___ TestCancellationParallel.test_misformatted_profile_text 
[gw8] linux2 -- Python 2.7.12 
/home/ubuntu/Impala/bin/../infra/python/env/bin/python
query_test/test_cancellation.py:171: in test_misformatted_profile_text
assert any(client.get_state(handle) == 'RUNNING_STATE' or sleep(1)
E   AssertionError: Query failed to start
E   assert any(<generator object <genexpr> at 0x7f99c462acd0>)
 Captured stderr setup -
SET 
client_identifier=query_test/test_cancellation.py::TestCancellationParallel::()::test_misformatted_profile_text;
-- connecting to: localhost:21000
-- connecting to localhost:21050 with impyla
-- 2019-10-16 03:20:40,776 INFO MainThread: Closing active operation
- Captured stderr call -
-- executing against Impala at localhost:21050

select count(*) from functional_parquet.alltypes where bool_col = sleep(100);

-- getting state for operation: 
-- getting state for operation: 
-- getting state for operation: 
-- getting state for operation: 
-- getting state for operation: 
== 1 failed, 2607 passed, 151 skipped, 54 xfailed in 3706.37 seconds ===
{code}

The test waits 5 seconds for the query to start running and then tests 
cancelling it. But somehow the query failed to start within 5 seconds. Maybe 5 
seconds is too short for a dockerised env.
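
To make the test less sensitive to a slow environment, the wait could be made 
longer or configurable. Below is a minimal, hypothetical sketch of that polling 
pattern in Java (the real test is Python, and the names waitForRunning and 
getState here are made up):
{code:java}
import java.util.function.Supplier;

// Hypothetical sketch of the polling pattern only; not the actual test code.
public class WaitForRunning {
  // Poll until the query reports RUNNING_STATE or the deadline passes.
  static boolean waitForRunning(Supplier<String> getState, long timeoutMs)
      throws InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (System.currentTimeMillis() < deadline) {
      if ("RUNNING_STATE".equals(getState.get())) return true;
      Thread.sleep(1000);  // back off for a second between polls, like the test's sleep(1)
    }
    return false;  // caller fails the test here, as in the assertion above
  }
}
{code}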

Test failures can be found in:
https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/1427/
https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/1424/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (IMPALA-8498) Write column index for floating types when NaN is not present

2019-10-16 Thread Norbert Luksa (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Norbert Luksa resolved IMPALA-8498.
---
Fix Version/s: Impala 3.4.0
   Resolution: Fixed

> Write column index for floating types when NaN is not present
> -
>
> Key: IMPALA-8498
> URL: https://issues.apache.org/jira/browse/IMPALA-8498
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Zoltán Borók-Nagy
>Assignee: Norbert Luksa
>Priority: Major
>  Labels: ramp-up
> Fix For: Impala 3.4.0
>
>
> IMPALA-7304 disabled column index writing for floating point columns until 
> PARQUET-1222 is resolved.
> PARQUET-1222 is responsible for defining a total order for floating values, 
> but the problematic values are only the NaNs. Therefore we can write the 
> column index if NaNs are not present in the data. Parquet-MR also does this, 
> following the principles in 
> [https://github.com/apache/parquet-format/blob/75eb7a7b84e6e62bfb09668b6d8d40b12597456e/src/main/thrift/parquet.thrift#L827-L834]
>  
> Impala should follow this behavior. In addition, when a zero is stored as a 
> bound, it should store -0.0 as the minimum and +0.0 as the maximum.
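
For illustration only (Impala's Parquet writer is C++; this is not its actual 
code, and the class and method names are invented), the rule above amounts to 
tracking whether any NaN was seen and normalizing zero bounds:
{code:java}
// Hypothetical sketch of the rule described above, not Impala's writer code.
public class FloatColumnIndexState {
  private boolean sawNan = false;
  private double min = Double.POSITIVE_INFINITY;
  private double max = Double.NEGATIVE_INFINITY;

  void update(double v) {
    if (Double.isNaN(v)) { sawNan = true; return; }
    min = Math.min(min, v);
    max = Math.max(max, v);
  }

  // Only write the column index when no NaN was present in the data.
  boolean canWriteColumnIndex() { return !sawNan; }

  // Zero bounds are normalized: -0.0 as the minimum, +0.0 as the maximum.
  double minToWrite() { return min == 0.0 ? -0.0 : min; }
  double maxToWrite() { return max == 0.0 ? 0.0 : max; }
}
{code}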



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IMPALA-9055) HDFS Caching with Impala: Expiration 26687997791:19:48:13.951 exceeds the max relative expiration time of

2019-10-16 Thread Adriano (Jira)
Adriano created IMPALA-9055:
---

 Summary: HDFS Caching with Impala: Expiration 
26687997791:19:48:13.951 exceeds the max relative expiration time of 
 Key: IMPALA-9055
 URL: https://issues.apache.org/jira/browse/IMPALA-9055
 Project: IMPALA
  Issue Type: Bug
  Components: Catalog
Reporter: Adriano


HDFS Caching with Impala:
If we create a pool specifying the maxTtl with the hdfs command, e.g.:
{{sudo -u hdfs hdfs cacheadmin -addPool case422446 -owner impala -group hdfs 
-mode 755 -limit 1000 -maxTtl 7d}}

and then try to cache a table partition from Impala, e.g.:
{{alter table foo partition (p1=1) set cached in 'foo'}}

the statement fails with the exception:
ERROR: ImpalaRuntimeException: Expiration 26687997791:19:48:13.951 exceeds the 
max relative expiration time of 60480 ms.
at 
org.apache.hadoop.hdfs.server.namenode.CacheManager.validateExpiryTime(CacheManager.java:378)
at 
org.apache.hadoop.hdfs.server.namenode.CacheManager.addDirective(CacheManager.java:528)
at 
org.apache.hadoop.hdfs.server.namenode.FSNDNCacheOp.addCacheDirective(FSNDNCacheOp.java:45)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.addCacheDirective(FSNamesystem.java:6782)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addCacheDirective(NameNodeRpcServer.java:1883)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addCacheDirective(ClientNamenodeProtocolServerSideTranslatorPB.java:1265)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1685)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)

CAUSED BY: InvalidRequestException: Expiration 26687997791:19:48:13.951 exceeds 
the max relative expiration time of 60480 ms.
at 
org.apache.hadoop.hdfs.server.namenode.CacheManager.validateExpiryTime(CacheManager.java:378)
at 
org.apache.hadoop.hdfs.server.namenode.CacheManager.addDirective(CacheManager.java:528)
at 
org.apache.hadoop.hdfs.server.namenode.FSNDNCacheOp.addCacheDirective(FSNDNCacheOp.java:45)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.addCacheDirective(FSNamesystem.java:6782)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addCacheDirective(NameNodeRpcServer.java:1883)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addCacheDirective(ClientNamenodeProtocolServerSideTranslatorPB.java:1265)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1685)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)

CAUSED BY: RemoteException: Expiration 26687997791:19:48:13.951 exceeds the max 
relative expiration time of 60480 ms.
at 
org.apache.hadoop.hdfs.server.namenode.CacheManager.validateExpiryTime(CacheManager.java:378)
at 
org.apache.hadoop.hdfs.server.namenode.CacheManager.addDirective(CacheManager.java:528)
at 
org.apache.hadoop.hdfs.server.namenode.FSNDNCacheOp.addCacheDirective(FSNDNCacheOp.java:45)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.addCacheDirective(FSNamesystem.java:6782)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addCacheDirective(NameNodeRpcServer.java:1883)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addCacheDirective(ClientNamenodeProtocolServerSideTranslatorPB.java:1265)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.Pro
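
The very large expiration in the error suggests the cache directive is being 
submitted with an effectively unbounded ("never expire") relative expiration, 
which the NameNode rejects because the pool was created with a maxTtl. One 
possible direction, sketched below against the public HDFS client API, is to 
clamp the requested relative expiration to the pool's maxTtl (this is only an 
assumption about a fix, not Impala's actual code):
{code:java}
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.CacheDirectiveInfo;
import org.apache.hadoop.hdfs.protocol.CachePoolInfo;

// Hypothetical sketch: cap the requested relative expiration at the pool's
// maxTtl so CacheManager.validateExpiryTime() does not reject the directive.
public class CacheDirectiveHelper {
  static long addClampedDirective(DistributedFileSystem dfs, CachePoolInfo pool,
      Path partitionPath, long requestedRelativeMs) throws java.io.IOException {
    Long maxTtlMs = pool.getMaxRelativeExpiryMs();  // e.g. 7 days for the pool above
    long effectiveMs = (maxTtlMs == null)
        ? requestedRelativeMs : Math.min(requestedRelativeMs, maxTtlMs);
    CacheDirectiveInfo directive = new CacheDirectiveInfo.Builder()
        .setPath(partitionPath)
        .setPool(pool.getPoolName())
        .setExpiration(CacheDirectiveInfo.Expiration.newRelative(effectiveMs))
        .build();
    return dfs.addCacheDirective(directive);  // returns the new directive id
  }
}
{code}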

[jira] [Resolved] (IMPALA-9002) Add flag to only check SELECT priviledge in GET_TABLES

2019-10-16 Thread Quanlong Huang (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang resolved IMPALA-9002.

Resolution: Fixed

> Add flag to only check SELECT priviledge in GET_TABLES
> --
>
> Key: IMPALA-9002
> URL: https://issues.apache.org/jira/browse/IMPALA-9002
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Security
>Reporter: Quanlong Huang
>Assignee: Quanlong Huang
>Priority: Major
>
> In Frontend.doGetTableNames(), if authorization is enabled, we only return 
> tables on which the current user has the ANY privilege:
> {code:java}
>   private List<String> doGetTableNames(String dbName, PatternMatcher matcher,
>       User user) throws ImpalaException {
>     FeCatalog catalog = getCatalog();
>     List<String> tblNames = catalog.getTableNames(dbName, matcher);
>     if (authzFactory_.getAuthorizationConfig().isEnabled()) {
>       Iterator<String> iter = tblNames.iterator();
>       while (iter.hasNext()) {
>         ..
>         PrivilegeRequest privilegeRequest = new PrivilegeRequestBuilder(
>             authzFactory_.getAuthorizableFactory())
>             .any().onAnyColumn(dbName, tblName, tableOwner).build();  <-- requires the ANY privilege here
>         if (!authzChecker_.get().hasAccess(user, privilegeRequest)) {
>           iter.remove();
>         }
>       }
>     }
>     return tblNames;
>   } {code}
> In the Sentry integration, checking the ANY privilege checks all possible 
> privileges, i.e. ALL, OWNER, ALTER, DROP, CREATE, INSERT, SELECT and REFRESH, 
> until one is permitted. In the worst case, when the current user has no 
> privilege on a table, we need to perform 8 checks for that table.
> {code:java}
> public enum Privilege {
>   ...
>   static {
>     ...
>     ANY.implied_ = EnumSet.of(ALL, OWNER, ALTER, DROP, CREATE, INSERT, SELECT,
>         REFRESH); {code}
> GET_TABLES performance is poor when there are thousands of tables. It's 
> reasonable to only return tables on which the current user has the SELECT 
> privilege. Checking only the SELECT privilege can make this check up to 8 
> times faster. In my experiment on impala-2.12-cdh5.16.2 with 40k tables, 
> GET_TABLES originally takes 16s when the current user only has privileges on 
> 6 tables. With this change, the time drops to 2s.
> We can add a flag to check only the SELECT privilege for table visibility.
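
A minimal sketch of the proposed flag, based on the doGetTableNames() snippet 
quoted above. The flag name (checkSelectOnly) and the allOf(Privilege.SELECT) 
builder call are assumptions, not the actual patch:
{code:java}
// Hypothetical sketch extending the snippet above, not the actual change.
// With the (assumed) flag set, only the SELECT privilege is checked, which
// needs one Sentry check instead of up to 8 implied-privilege checks for ANY.
PrivilegeRequestBuilder builder =
    new PrivilegeRequestBuilder(authzFactory_.getAuthorizableFactory());
PrivilegeRequest privilegeRequest = checkSelectOnly  // assumed backend flag
    ? builder.allOf(Privilege.SELECT)                // assumed builder method
        .onAnyColumn(dbName, tblName, tableOwner).build()
    : builder.any()                                  // current behavior
        .onAnyColumn(dbName, tblName, tableOwner).build();
if (!authzChecker_.get().hasAccess(user, privilegeRequest)) {
  iter.remove();  // hide tables the user cannot access, as in doGetTableNames()
}
{code}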



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IMPALA-9056) Handle more cases of set limit on SQL statement

2019-10-16 Thread Yongzhi Chen (Jira)
Yongzhi Chen created IMPALA-9056:


 Summary: Handle more cases of set limit on SQL statement
 Key: IMPALA-9056
 URL: https://issues.apache.org/jira/browse/IMPALA-9056
 Project: IMPALA
  Issue Type: Bug
Reporter: Yongzhi Chen
 Attachments: repro.sql.txt

This is a follow-on of IMPALA-4551. The attached repro causes 
java.lang.OutOfMemoryError: Java heap space, or, if the cluster has large 
enough memory, the query gets stuck with the following stack:
{noformat}
Thread 1964045: (state = BLOCKED)
 - org.apache.impala.catalog.Type.toThrift() @bci=0, line=233 (Compiled frame)
 - 
org.apache.impala.analysis.Expr.treeToThriftHelper(org.apache.impala.thrift.TExpr)
 @bci=68, line=610 (Compiled frame)
 - 
org.apache.impala.analysis.Expr.treeToThriftHelper(org.apache.impala.thrift.TExpr)
 @bci=191, line=622 (Compiled frame)
 - 
org.apache.impala.analysis.Expr.treeToThriftHelper(org.apache.impala.thrift.TExpr)
 @bci=191, line=622 (Compiled frame)
 - 
org.apache.impala.analysis.Expr.treeToThriftHelper(org.apache.impala.thrift.TExpr)
 @bci=191, line=622 (Compiled frame)
 - 
org.apache.impala.analysis.Expr.treeToThriftHelper(org.apache.impala.thrift.TExpr)
 @bci=191, line=622 (Compiled frame)
 - 
org.apache.impala.analysis.Expr.treeToThriftHelper(org.apache.impala.thrift.TExpr)
 @bci=191, line=622 (Compiled frame)
 - 
org.apache.impala.analysis.Expr.treeToThriftHelper(org.apache.impala.thrift.TExpr)
 @bci=191, line=622 (Compiled frame)
 - 
org.apache.impala.analysis.Expr.treeToThriftHelper(org.apache.impala.thrift.TExpr)
 @bci=191, line=622 (Compiled frame)
 - 
org.apache.impala.analysis.Expr.treeToThriftHelper(org.apache.impala.thrift.TExpr)
 @bci=191, line=622 (Compiled frame)
 - 
org.apache.impala.analysis.Expr.treeToThriftHelper(org.apache.impala.thrift.TExpr)
 @bci=191, line=622 (Compiled frame)
 - 
org.apache.impala.analysis.Expr.treeToThriftHelper(org.apache.impala.thrift.TExpr)
 @bci=191, line=622 (Compiled frame)
 - 
org.apache.impala.analysis.Expr.treeToThriftHelper(org.apache.impala.thrift.TExpr)
 @bci=191, line=622 (Compiled frame)
 - 
org.apache.impala.analysis.Expr.treeToThriftHelper(org.apache.impala.thrift.TExpr)
 @bci=191, line=622 (Compiled frame)
 - 
org.apache.impala.analysis.Expr.treeToThriftHelper(org.apache.impala.thrift.TExpr)
 @bci=191, line=622 (Compiled frame)
 - 
org.apache.impala.analysis.Expr.treeToThriftHelper(org.apache.impala.thrift.TExpr)
 @bci=191, line=622 (Compiled frame)
 - 
org.apache.impala.analysis.Expr.treeToThriftHelper(org.apache.impala.thrift.TExpr)
 @bci=191, line=622 (Compiled frame)
 - 
org.apache.impala.analysis.Expr.treeToThriftHelper(org.apache.impala.thrift.TExpr)
 @bci=191, line=622 (Compiled frame)
 - 
org.apache.impala.analysis.Expr.treeToThriftHelper(org.apache.impala.thrift.TExpr)
 @bci=191, line=622 (Compiled frame)
 - 
org.apache.impala.analysis.Expr.treeToThriftHelper(org.apache.impala.thrift.TExpr)
 @bci=191, line=622 (Compiled frame)
 - 
org.apache.impala.analysis.Expr.treeToThriftHelper(org.apache.impala.thrift.TExpr)
 @bci=191, line=622 (Compiled frame)
 - 
org.apache.impala.analysis.Expr.treeToThriftHelper(org.apache.impala.thrift.TExpr)
 @bci=191, line=622 (Compiled frame)
 - 
org.apache.impala.analysis.Expr.treeToThriftHelper(org.apache.impala.thrift.TExpr)
 @bci=191, line=622 (Compiled frame)
 - 
org.apache.impala.analysis.Expr.treeToThriftHelper(org.apache.impala.thrift.TExpr)
 @bci=191, line=622 (Compiled frame)
 - 
org.apache.impala.analysis.Expr.treeToThriftHelper(org.apache.impala.thrift.TExpr)
 @bci=191, line=622 (Compiled frame)
 - 
org.apache.impala.analysis.Expr.treeToThriftHelper(org.apache.impala.thrift.TExpr)
 @bci=191, line=622 (Compiled frame)
 - 
org.apache.impala.analysis.Expr.treeToThriftHelper(org.apache.impala.thrift.TExpr)
 @bci=191, line=622 (Compiled frame)
 - 
org.apache.impala.analysis.Expr.treeToThriftHelper(org.apache.impala.thrift.TExpr)
 @bci=191, line=622 (Compiled frame)
 - org.apache.impala.analysis.Expr.treeToThrift() @bci=52, line=598 (Compiled 
frame)
 - org.apache.impala.analysis.Expr.treesToThrift(java.util.List) @bci=32, 
line=650 (Compiled frame)
 - org.apache.impala.planner.PlanFragment.toThrift() @bci=51, line=335 
(Compiled frame)
 - 
org.apache.impala.service.Frontend.createPlanExecInfo(org.apache.impala.planner.PlanFragment,
 org.apache.impala.planner.Planner, org.apache.impala.thrift.TQueryCtx, 
org.apache.impala.thrift.TQueryExecRequest) @bci=392, line=881 (Compiled frame)
 - 
org.apache.impala.service.Frontend.createExecRequest(org.apache.impala.planner.Planner,
 java.lang.StringBuilder) @bci=173, line=916 (Compiled frame)
 - 
org.apache.impala.service.Frontend.createExecRequest(org.apache.impala.thrift.TQueryCtx,
 java.lang.StringBuilder) @bci=593, line=1027 (Compiled frame)
 - org.apache.impala.service.JniFrontend.createExecRequest(byte[]) @bci=30, 
line=157 (Compiled frame)
{noformat}

[jira] [Created] (IMPALA-9057) TestEventProcessing.test_insert_events_transactional is flaky

2019-10-16 Thread Alice Fan (Jira)
Alice Fan created IMPALA-9057:
-

 Summary: TestEventProcessing.test_insert_events_transactional is 
flaky
 Key: IMPALA-9057
 URL: https://issues.apache.org/jira/browse/IMPALA-9057
 Project: IMPALA
  Issue Type: Bug
  Components: Frontend
Affects Versions: Impala 3.4.0
Reporter: Alice Fan
Assignee: Vihang Karajgaonkar


Assertion failure for 
custom_cluster.test_event_processing.TestEventProcessing.test_insert_events_transactional
 

{code:java}
Error Message
assert ['101', 'x', ..., '3', '2019'] == ['101', 'z', '28', '3', '2019']   At 
index 1 diff: 'x' != 'z'   Full diff:   - ['101', 'x', '28', '3', '2019']   ?   
   ^   + ['101', 'z', '28', '3', '2019']   ?  ^
Stacktrace
custom_cluster/test_event_processing.py:49: in test_insert_events_transactional
self.run_test_insert_events(is_transactional=True)
custom_cluster/test_event_processing.py:131: in run_test_insert_events
assert data.split('\t') == ['101', 'z', '28', '3', '2019']
E   assert ['101', 'x', ..., '3', '2019'] == ['101', 'z', '28', '3', '2019']
E At index 1 diff: 'x' != 'z'
E Full diff:
E - ['101', 'x', '28', '3', '2019']
E ?  ^
E + ['101', 'z', '28', '3', '2019']
E ?  ^
{code}




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IMPALA-9058) S3 tests failing with FileNotFoundException getVersionMarkerItem on ../VERSION

2019-10-16 Thread Sahil Takiar (Jira)
Sahil Takiar created IMPALA-9058:


 Summary: S3 tests failing with FileNotFoundException 
getVersionMarkerItem on ../VERSION
 Key: IMPALA-9058
 URL: https://issues.apache.org/jira/browse/IMPALA-9058
 Project: IMPALA
  Issue Type: Test
Reporter: Sahil Takiar
Assignee: Sahil Takiar


I've seen this happen several times now: S3 tests intermittently fail with an 
error such as:
{code:java}
Query aborted:InternalException: Error adding partitions E   CAUSED BY: 
MetaException: java.io.IOException: Got exception: 
java.io.FileNotFoundException getVersionMarkerItem on ../VERSION: 
com.amazonaws.services.dynamodbv2.model.ResourceNotFoundException: Requested 
resource not found (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: 
ResourceNotFoundException; Request ID: 
8T9IS939MDI7ASOB0IJCC34J3NVV4KQNSO5AEMVJF66Q9ASUAAJG) {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IMPALA-9059) Add UNPIVOT operator

2019-10-16 Thread Greg Rahn (Jira)
Greg Rahn created IMPALA-9059:
-

 Summary: Add UNPIVOT operator
 Key: IMPALA-9059
 URL: https://issues.apache.org/jira/browse/IMPALA-9059
 Project: IMPALA
  Issue Type: New Feature
Reporter: Greg Rahn


References:
* 
https://docs.oracle.com/cd/B28359_01/server.111/b28286/statements_10002.htm#CHDCEJJE
* 
https://docs.microsoft.com/en-us/sql/t-sql/queries/from-using-pivot-and-unpivot?view=sql-server-ver15



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (IMPALA-9054) Flaky test: test_misformatted_profile_text in query_test/test_cancellation.py

2019-10-16 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-9054.
---
Fix Version/s: Impala 3.4.0
   Resolution: Fixed

> Flaky test: test_misformatted_profile_text in query_test/test_cancellation.py
> -
>
> Key: IMPALA-9054
> URL: https://issues.apache.org/jira/browse/IMPALA-9054
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Quanlong Huang
>Assignee: Tim Armstrong
>Priority: Major
> Fix For: Impala 3.4.0
>
>
> Saw this in several builds in ubuntu-16.04-dockerised-tests:
> {code}
> FAIL 
> query_test/test_cancellation.py::TestCancellationParallel::()::test_misformatted_profile_text
> === FAILURES 
> ===
> ___ TestCancellationParallel.test_misformatted_profile_text 
> 
> [gw8] linux2 -- Python 2.7.12 
> /home/ubuntu/Impala/bin/../infra/python/env/bin/python
> query_test/test_cancellation.py:171: in test_misformatted_profile_text
> assert any(client.get_state(handle) == 'RUNNING_STATE' or sleep(1)
> E   AssertionError: Query failed to start
> E   assert any(<generator object <genexpr> at 0x7f99c462acd0>)
>  Captured stderr setup 
> -
> SET 
> client_identifier=query_test/test_cancellation.py::TestCancellationParallel::()::test_misformatted_profile_text;
> -- connecting to: localhost:21000
> -- connecting to localhost:21050 with impyla
> -- 2019-10-16 03:20:40,776 INFO MainThread: Closing active operation
> - Captured stderr call 
> -
> -- executing against Impala at localhost:21050
> select count(*) from functional_parquet.alltypes where bool_col = sleep(100);
> -- getting state for operation: 
> 
> -- getting state for operation: 
> 
> -- getting state for operation: 
> 
> -- getting state for operation: 
> 
> -- getting state for operation: 
> 
> == 1 failed, 2607 passed, 151 skipped, 54 xfailed in 3706.37 seconds 
> ===
> {code}
> The test waits 5 seconds for the query to start running and then tests 
> cancelling it. But somehow the query failed to start within 5 seconds. Maybe 
> 5 seconds is too short for a dockerised env.
> Test failures can be found in:
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/1427/
> https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/1424/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (IMPALA-8998) Admission control accounting for mt_dop

2019-10-16 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-8998.
---
Fix Version/s: Impala 3.4.0
   Resolution: Fixed

> Admission control accounting for mt_dop
> ---
>
> Key: IMPALA-8998
> URL: https://issues.apache.org/jira/browse/IMPALA-8998
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Major
> Fix For: Impala 3.4.0
>
>
> We should account for the degree of parallelism that the query runs with on a 
> backend, to avoid over-admitting parallel queries.
> We could probably simply count the effective degree of parallelism (max # 
> instances of a fragment on that backend) toward the number of slots in 
> admission control (although slots are not enabled for the default group yet - 
> see IMPALA-8757).
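
A rough illustration of that accounting (the real admission controller is C++; 
the names below are invented): the effective degree of parallelism on a backend 
is the maximum number of instances of any single fragment scheduled there, and 
that number is counted against the backend's admission slots.
{code:java}
import java.util.Map;

// Hypothetical sketch of the accounting described above, not Impala's code.
public class SlotAccounting {
  // fragmentInstanceCounts: fragment id -> number of instances on one backend.
  static int effectiveDop(Map<Integer, Integer> fragmentInstanceCounts) {
    int maxInstances = 1;
    for (int n : fragmentInstanceCounts.values()) {
      maxInstances = Math.max(maxInstances, n);
    }
    return maxInstances;
  }

  // Admit only if the query's effective parallelism fits in the remaining slots.
  static boolean canAdmit(Map<Integer, Integer> fragmentInstanceCounts,
      int slotsInUse, int totalSlots) {
    return slotsInUse + effectiveDop(fragmentInstanceCounts) <= totalSlots;
  }
}
{code}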



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (IMPALA-8884) Track Read(), Open() and Write() operation time per disk queue

2019-10-16 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-8884.
---
Fix Version/s: Impala 3.4.0
   Resolution: Fixed

> Track Read(), Open() and Write() operation time per disk queue
> --
>
> Key: IMPALA-8884
> URL: https://issues.apache.org/jira/browse/IMPALA-8884
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Major
>  Labels: observability
> Fix For: Impala 3.4.0
>
>
> It would be useful for debugging I/O performance problems if we had histogram 
> stats for the time taken for various operations so that we could see if there 
> were slow operations on a particular disk (e.g. because of disk failure) or 
> from a particular remote filesystem.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (IMPALA-5755) Undefined symbol in catalog in Ubuntu 16.04 with -so

2019-10-16 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-5755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-5755.
---
Resolution: Duplicate

Duplicate of IMPALA-3926.

> Undefined symbol in catalog in Ubuntu 16.04 with -so
> 
>
> Key: IMPALA-5755
> URL: https://issues.apache.org/jira/browse/IMPALA-5755
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Infrastructure
> Environment: ec2, c4.8xlarge, ubuntu 16.04, Linux ip-172-31-7-214 
> 4.4.0-1022-aws #31-Ubuntu SMP Tue Jun 27 11:27:55 UTC 2017 x86_64 x86_64 
> x86_64 GNU/Linux
>Reporter: Jim Apple
>Priority: Major
>
> A build with the {{-so}} flag fails in 
> {{/home/ubuntu/Impala/logs/data_loading/catalogd-error.log}} with:
> {noformat}
> /home/ubuntu/Impala/be/build/latest/catalog/catalogd: symbol lookup error: 
> /home/ubuntu/Impala/be/src/kudu/util/libversion_info_proto.so: undefined 
> symbol: _ZN6google8protobuf8internal13empty_string_E
> {noformat}
> The last things printed to stdout are:
> {noformat}
> Creating /test-warehouse HDFS directory (logging to 
> /home/ubuntu/Impala/logs/data_loading/create-test-warehouse-dir.log)... 
> OK (Took: 0 min 2 sec)
> Derived params for create-load-data.sh:
> EXPLORATION_STRATEGY=exhaustive
> SKIP_METADATA_LOAD=0
> SKIP_SNAPSHOT_LOAD=0
> SNAPSHOT_FILE=
> CM_HOST=
> REMOTE_LOAD=
> Starting Impala cluster (logging to 
> /home/ubuntu/Impala/logs/data_loading/start-impala-cluster.log)... 
> FAILED (Took: 0 min 11 sec)
> '/home/ubuntu/Impala/bin/start-impala-cluster.py 
> --log_dir=/home/ubuntu/Impala/logs/data_loading -s 3' failed. Tail of log:
> Log for command '/home/ubuntu/Impala/bin/start-impala-cluster.py 
> --log_dir=/home/ubuntu/Impala/logs/data_loading -s 3'
> Starting State Store logging to 
> /home/ubuntu/Impala/logs/data_loading/statestored.INFO
> Starting Catalog Service logging to 
> /home/ubuntu/Impala/logs/data_loading/catalogd.INFO
> Error starting cluster: Unable to start catalogd. Check log or file 
> permissions for more details.
> Error in /home/ubuntu/Impala/testdata/bin/create-load-data.sh at line 48: 
> LOAD_DATA_ARGS=""
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (IMPALA-5368) test_table_is_cached fails when upgrading from Ubuntu 14.04 to 16.04

2019-10-16 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-5368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-5368.
---
Resolution: Cannot Reproduce

We run this test on 16.04 now.

> test_table_is_cached fails when upgrading from Ubuntu 14.04 to 16.04
> 
>
> Key: IMPALA-5368
> URL: https://issues.apache.org/jira/browse/IMPALA-5368
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Infrastructure
>Reporter: Jim Apple
>Priority: Major
>
> {noformat}
> FAIL 
> query_test/test_hdfs_caching.py::TestHdfsCaching::()::test_table_is_cached[exec_option:
>  {'disable_codegen': False, 'abort_on_error': 1, 
> 'exec_single_node_rows_threshold': 0, 'batch_size': 0, 'num_nodes': 0} | 
> table_format: text/none]
> FAIL 
> query_test/test_hdfs_caching.py::TestHdfsCaching::()::test_table_is_cached[exec_option:
>  {'disable_codegen': False, 'abort_on_error': 1, 
> 'exec_single_node_rows_threshold': 0, 'batch_size': 0, 'num_nodes': 0} | 
> table_format: text/gzip/
> block]
> === FAILURES 
> ===
>  TestHdfsCaching.test_table_is_cached[exec_option: {'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0, 'batch_size': 0, 
> 'num_nodes': 0} | table_format: text/none] 
> query_test/test_hdfs_caching.py:87: in test_table_is_cached
> assert(False)
> E   assert False
> - Captured stdout call 
> -
> 0 0
> 0 0
> 0 0
> - Captured stderr call 
> -
> MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
>  TestHdfsCaching.test_table_is_cached[exec_option: {'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0, 'batch_size': 0, 
> 'num_nodes': 0} | table_format: text/gzip/block] 
> query_test/test_hdfs_caching.py:87: in test_table_is_cached
> assert(False)
> E   assert False
> - Captured stdout call 
> -
> 0 0
> 0 0
> 0 0
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (IMPALA-5366) Missing datetime in python when upgrading from Ubuntu 14.04 to 16.04

2019-10-16 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-5366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-5366.
---
Resolution: Cannot Reproduce

Hasn't been reported for a long time

> Missing datetime in python when upgrading from Ubuntu 14.04 to 16.04
> 
>
> Key: IMPALA-5366
> URL: https://issues.apache.org/jira/browse/IMPALA-5366
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Infrastructure
>Affects Versions: Impala 2.8.0
>Reporter: Jim Apple
>Priority: Major
>
> When upgrading from Ubuntu 14.04 to 16.04, starting a minicluster fails with 
> a missing Python {{datetime}} module. This is surprising, since datetime is 
> built in. See these related bugs:
> https://askubuntu.com/questions/808749/importerror-no-module-named-datetime-upgrade-to-ubuntu-16-04-lts-aws-cli?rq=1
> https://askubuntu.com/questions/509283/python-no-module-named-datetime



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (IMPALA-5365) Broken builds and tests when upgrading from Ubuntu 14.04 to Ubuntu 16.04

2019-10-16 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-5365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-5365.
---
Resolution: Fixed

We seem to be running fine with Ubuntu 16.04 now.

> Broken builds and tests when upgrading from Ubuntu 14.04 to Ubuntu 16.04
> 
>
> Key: IMPALA-5365
> URL: https://issues.apache.org/jira/browse/IMPALA-5365
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 2.8.0
>Reporter: Jim Apple
>Priority: Major
>
> This is a parent bug for issues when upgrading from Ubuntu 14.04 to 16.04. A 
> 16.04 fresh install can pass the builds and tests, but upgrading will likely 
> be an important use case for many Impala community members.
> For a working fresh-install script, see 
> https://cwiki.apache.org/confluence/display/IMPALA/Bootstrapping+an+Impala+Development+Environment+From+Scratch



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (IMPALA-5700) Improve error message when catalogd fails to start

2019-10-16 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-5700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-5700.
---
Resolution: Cannot Reproduce

> Improve error message when catalogd fails to start
> --
>
> Key: IMPALA-5700
> URL: https://issues.apache.org/jira/browse/IMPALA-5700
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Infrastructure
>Reporter: Jim Apple
>Priority: Major
>
> When catalogd fails to start, the error message is
> {noformat}
> Starting Catalog Service logging to 
> /home/ubuntu/Impala/logs/cluster/catalogd.INFO
> Error starting cluster: Unable to start catalogd. Check log or file 
> permissions for more details.
> {noformat}
> But sometimes that log shows no useful information and the other message 
> doesn't explain which file permissions to check. An example log without much 
> information:
> {noformat}
> I0723 21:42:15.238229  5972 init.cc:218] Cpu Info:
>   Model: Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz
>   Cores: 8
>   Max Possible Cores: 8
>   L1 Cache: 32.00 KB (Line: 64.00 B)
>   L2 Cache: 256.00 KB (Line: 64.00 B)
>   L3 Cache: 45.00 MB (Line: 64.00 B)
>   Hardware Supports:
> ssse3
> sse4_1
> sse4_2
> popcnt
> avx
> avx2
>   Numa Nodes: 1
>   Numa Nodes of Cores: 0->0 | 1->0 | 2->0 | 3->0 | 4->0 | 5->0 | 6->0 | 7->0 |
> I0723 21:42:15.238239  5972 init.cc:219] Disk Info: 
>   Num disks 2: 
> ram (rotational=true)
> xvda (rotational=false)
> I0723 21:42:15.238245  5972 init.cc:220] Physical Memory: 59.97 GB
> I0723 21:42:15.238250  5972 init.cc:221] OS version: Linux version 
> 4.4.0-1020-aws (buildd@lgw01-14) (gcc version 5.4.0 20160609 (Ubuntu 
> 5.4.0-6ubuntu1~16.04.4) ) #
> {noformat}
> Changing the ownership of all the files in the home directory of the user did 
> not alleviate the problem.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IMPALA-9061) Update ant version for centos in bootstrap_system.sh

2019-10-16 Thread Fucun Chu (Jira)
Fucun Chu created IMPALA-9061:
-

 Summary: Update ant version for centos in bootstrap_system.sh
 Key: IMPALA-9061
 URL: https://issues.apache.org/jira/browse/IMPALA-9061
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
 Environment: Centos 7
Reporter: Fucun Chu


{{bootstrap_system.sh}} currently uses [ant 
1.9.13|https://github.com/apache/impala/blob/b0c6740faec6b0a00dcfee126ab39324026c0ca9/bin/bootstrap_system.sh#L239]
 on CentOS/Red Hat environments. The ant-1.9.13-bin.tar.gz release can no 
longer be downloaded; the earliest available version is 1.9.14. Please see 
[here|https://www-us.apache.org/dist/ant/binaries/].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (IMPALA-9062) Don't need to acquire table locks in gathering catalog topic updates in minimal topic mode

2019-10-16 Thread Quanlong Huang (Jira)
Quanlong Huang created IMPALA-9062:
--

 Summary: Don't need to acquire table locks in gathering catalog 
topic updates in minimal topic mode
 Key: IMPALA-9062
 URL: https://issues.apache.org/jira/browse/IMPALA-9062
 Project: IMPALA
  Issue Type: Sub-task
Reporter: Quanlong Huang


If catalog_topic_mode is minimal, for table updates, catalogd only propagates 
the database name, table name and catalog version associated with the table:
 
[https://github.com/apache/impala/blob/3.3.0/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L619]
{code:java}
private TCatalogObject getMinimalObjectForV2(TCatalogObject obj) {
  Preconditions.checkState(topicMode_ == TopicMode.MINIMAL ||
  topicMode_ == TopicMode.MIXED);
  TCatalogObject min = new TCatalogObject(obj.type, obj.catalog_version);
  switch (obj.type) {
  case DATABASE:
min.setDb(new TDatabase(obj.db.db_name));
break;
  case TABLE:
  case VIEW:
min.setTable(new TTable(obj.table.db_name, obj.table.tbl_name));
break;{code}
We acquire the table lock to avoid reading partial results written by other 
concurrent DDLs: 
[https://github.com/apache/impala/blob/3.3.0/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L1078]
{code:java}
  private void addTableToCatalogDeltaHelper(Table tbl, GetCatalogDeltaContext 
ctx)
  throws TException {
TCatalogObject catalogTbl =
new TCatalogObject(TCatalogObjectType.TABLE, 
Catalog.INITIAL_CATALOG_VERSION);
tbl.getLock().lock();  <-- acquiring the table lock here can be blocked by DDLs
try {
  long tblVersion = tbl.getCatalogVersion();
  if (tblVersion <= ctx.fromVersion) return;
  String tableUniqueName = tbl.getUniqueName();
  TopicUpdateLog.Entry topicUpdateEntry =
  topicUpdateLog_.getOrCreateLogEntry(tableUniqueName);
  if (tblVersion > ctx.toVersion &&
  topicUpdateEntry.getNumSkippedTopicUpdates() < 
MAX_NUM_SKIPPED_TOPIC_UPDATES) {
LOG.info("Table " + tbl.getFullName() + " is skipping topic update " +
ctx.toVersion);
topicUpdateLog_.add(tableUniqueName,
new TopicUpdateLog.Entry(
topicUpdateEntry.getNumSkippedTopicUpdates() + 1,
topicUpdateEntry.getLastSentVersion(),
topicUpdateEntry.getLastSentCatalogUpdate()));
return;
  }
  try {
catalogTbl.setTable(tbl.toThrift());
  } catch (Exception e) {
LOG.error(String.format("Error calling toThrift() on table %s: %s",
tbl.getFullName(), e.getMessage()), e);
return;
  }
  catalogTbl.setCatalog_version(tbl.getCatalogVersion());
  ctx.addCatalogObject(catalogTbl, false);
} finally {
  tbl.getLock().unlock();
}
  } {code}
Acquiring the table lock here can block on slow concurrent DDLs like REFRESH, 
causing problems like IMPALA-6671. Actually, in minimal topic mode we only need 
the database name, table name and catalog version of a table. The first two 
won't change during DDLs (renames are treated as drop+create). For the last 
one, the catalog version, it's acceptable to propagate a value older than the 
latest version, since it's still newer than or equal to the version cached in 
coordinators. Thus, we don't need to acquire the table lock here.
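
A minimal sketch of the idea, placed at the top of 
addTableToCatalogDeltaHelper(); it reuses only names visible in the snippets 
above, plus assumed accessors for the db and table names, and is not the 
actual patch:
{code:java}
// Hypothetical sketch, not the actual change: in MINIMAL topic mode, emit the
// minimal table object (db name, table name, catalog version) without taking
// the table lock, since none of these fields need a consistent snapshot.
if (topicMode_ == TopicMode.MINIMAL) {
  TCatalogObject minimalTbl =
      new TCatalogObject(TCatalogObjectType.TABLE, tbl.getCatalogVersion());
  // getDb().getName() and getName() are assumed accessors for the names.
  minimalTbl.setTable(new TTable(tbl.getDb().getName(), tbl.getName()));
  ctx.addCatalogObject(minimalTbl, false);
  return;  // skips tbl.getLock().lock() and tbl.toThrift() entirely
}
{code}
Note that the topic-update-log bookkeeping from the original method is omitted 
here for brevity.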



--
This message was sent by Atlassian Jira
(v8.3.4#803005)