[jira] [Commented] (IMPALA-8778) Support read/write Apache Hudi tables

2019-11-05 Thread Yanjia Gary Li (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16968038#comment-16968038
 ] 

Yanjia Gary Li commented on IMPALA-8778:


Hello [~tarmstrong], I'd like to resume the discussion on this topic. Yuanbin 
finished his internship a few months ago, so please assign this ticket to me.

After reading some code on both the Impala and Hudi sides, these are the 
approaches I can think of:
 * As discussed above, create a new class similar to HdfsTable that carries a 
Hudi dependency and filters the file paths.
 * Implement everything on the Hudi side and send a sequence of queries to the 
Impala server to ALTER the table. The Hive sync tool in the Hudi repo uses this 
method. I think this approach could be easier than the one above because we 
could follow a similar strategy to the Hive sync tool, and we wouldn't need to 
wait for the next Impala release to use this feature.

To make sure this method is possible, I'd like to know which queries could 
handle this situation:
 * First stage: in HDFS partition year=2019/month=10/day=1 we have 
file1_v1.parquet and file2_v1.parquet.
 * Second stage: we run a Hudi job that updates the partition 
year=2019/month=10/day=1, so it now contains file1_v1.parquet, 
file1_v2.parquet, and file2_v1.parquet.

If we want to *drop* file1_v1.parquet and *load* file1_v2.parquet into the 
table, what query should I run? And what happens if another user submits a 
query while the metadata is being updated?
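
For illustration, a rough sketch of the sync-tool-style statement flow I have 
in mind (the table name hudi_tbl and the paths are made up):
{code:sql}
-- Register the partition once (the external table is assumed to already exist):
ALTER TABLE hudi_tbl ADD IF NOT EXISTS PARTITION (year=2019, month=10, day=1)
  LOCATION '/data/hudi_tbl/year=2019/month=10/day=1';

-- After the Hudi job writes file1_v2.parquet, reload the partition's file list:
REFRESH hudi_tbl PARTITION (year=2019, month=10, day=1);

-- Note: REFRESH picks up *all* files in the partition directory, so
-- file1_v1.parquet would still be visible; there is no statement that drops a
-- single file from a partition, which is exactly the gap I am asking about.
{code}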

Thanks

> Support read/write Apache Hudi tables
> -
>
> Key: IMPALA-8778
> URL: https://issues.apache.org/jira/browse/IMPALA-8778
> Project: IMPALA
>  Issue Type: New Feature
>Reporter: Yuanbin Cheng
>Assignee: Yuanbin Cheng
>Priority: Major
>
> Apache Impala currently does not support Apache Hudi and cannot even pull 
> its metadata from Hive.
> Related issues: 
> [https://github.com/apache/incubator-hudi/issues/179] 
> [https://issues.apache.org/jira/projects/HUDI/issues/HUDI-146|https://issues.apache.org/jira/projects/HUDI/issues/HUDI-146?filter=allopenissues]
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Resolved] (IMPALA-8692) Gracefully fail complex type inserts

2019-11-05 Thread Abhishek Rawat (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Rawat resolved IMPALA-8692.

Fix Version/s: Impala 3.4.0
   Resolution: Fixed

> Gracefully fail complex type inserts
> 
>
> Key: IMPALA-8692
> URL: https://issues.apache.org/jira/browse/IMPALA-8692
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Abhishek Rawat
>Assignee: Abhishek Rawat
>Priority: Blocker
>  Labels: analysis, crash, front-end, parquet
> Fix For: Impala 3.4.0
>
>
> Block such insert statements in the analysis phase.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-8692) Gracefully fail complex type inserts

2019-11-05 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16967974#comment-16967974
 ] 

ASF subversion and git services commented on IMPALA-8692:
-

Commit 4bffd12b20aa6ddce136f34ad423eb5dd9e4d05f in impala's branch 
refs/heads/master from Abhishek Rawat
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=4bffd12 ]

IMPALA-8692: Gracefully fail complex type inserts

Impala doesn't support writing new data files containing complex type
columns. Block such statements during analysis.

Added new tests in AnalyzeStmtsTest for the blocked scenarios.

Change-Id: I81bd04b2f1cd9873462098f67a45cd3974094d96
Reviewed-on: http://gerrit.cloudera.org:8080/14634
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 
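
For illustration, a rough sketch of the kind of statement that is now rejected 
at analysis time (the table and column names are hypothetical, and the exact 
error text may differ):
{code:sql}
-- Creating a Parquet table with a complex-typed column is allowed:
CREATE TABLE complex_sink (id BIGINT, tags ARRAY<STRING>) STORED AS PARQUET;

-- Impala cannot write complex-typed columns, so this INSERT should now fail
-- cleanly during analysis instead of crashing the impalad:
INSERT INTO complex_sink
SELECT id, tags FROM some_source_table;  -- some_source_table is hypothetical
{code}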


> Gracefully fail complex type inserts
> 
>
> Key: IMPALA-8692
> URL: https://issues.apache.org/jira/browse/IMPALA-8692
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Abhishek Rawat
>Assignee: Abhishek Rawat
>Priority: Blocker
>  Labels: analysis, crash, front-end, parquet
>
> Block such insert statements in the analysis phase.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-9025) An Impala query fails with error "ERROR: IllegalStateException: null"

2019-11-05 Thread Greg Rahn (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Rahn updated IMPALA-9025:
--
Description: 
Synopsis:
 =
 An Impala query fails with error "ERROR: IllegalStateException: null"

Problem:
 
 Customer has provided the following reproduction case:
{noformat}
create table tab1( tdate decimal(10,0) )
   partitioned by (p_tdate int);
 
select
   *
from
   (SELECT
   t.*
FROM
tab1 t
INNER JOIN
(SELECT 20191004 tdate ) d
WHERE
t.p_tdate < d.tdate
) q
where
   p_tdate <> tdate
   and P_TDATE = 20191004;
{noformat}
Impala logs show the following warning:
{noformat}
W1007 10:04:42.954334 1945974 Expr.java:1075] Not able to analyze after rewrite:
org.apache.impala.common.AnalysisException: Column/field reference is 
ambiguous: 'tdate' conjuncts: BinaryPredicate{op=!=, 
NumericLiteral{value=20191004, type=BIGINT} CastExpr{isImplicit=true, 
target=BIGINT, SlotRef{path=tdate, type=DECIMAL(10,0), id=1}}} 
BinaryPredicate{op==, SlotRef{path=p_tdate, type=INT, id=2} 
NumericLiteral{value=20191004, type=INT}}
{noformat}

  was:
Synopsis:
=
An Impala query fails with error "ERROR: IllegalStateException: null"

Problem:

Customer has provided the following reproduction case:
{noformat}
create table tab1( tdate decimal(10,0) )
   partitioned by (p_tdate int);
 
select
   *
from
   (SELECT
   t.*
FROM
tab1 t
INNER JOIN
(SELECT 20191004 tdate ) d
WHERE
t.p_tdate < d.tdate
) q
where
   p_tdate <> tdate
   and P_TDATE = 20191004;
{noformat}
Impala logs show the following warning:
{noformat}
W1007 10:04:42.954334 1945974 Expr.java:1075] Not able to analyze after 
rewrite: org.apache.impala.common.AnalysisException: Column/field reference is 
ambiguous: 'tdate' conjuncts: BinaryPredicate{op=!=, 
NumericLiteral{value=20191004, type=BIGINT} CastExpr{isImplicit=true, 
target=BIGINT, SlotRef{path=tdate, type=DECIMAL(10,0), id=1}}} 
BinaryPredicate{op==, SlotRef{path=p_tdate, type=INT, id=2} 
NumericLiteral{value=20191004, type=INT}}
{noformat}


> An Impala query fails with error "ERROR: IllegalStateException: null"
> -
>
> Key: IMPALA-9025
> URL: https://issues.apache.org/jira/browse/IMPALA-9025
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
>Priority: Major
> Fix For: Impala 3.4.0
>
>
> Synopsis:
>  =
>  An Impala query fails with error "ERROR: IllegalStateException: null"
> Problem:
>  
>  Customer has provided the following reproduction case:
> {noformat}
> create table tab1( tdate decimal(10,0) )
>partitioned by (p_tdate int);
>  
> select
>*
> from
>(SELECT
>t.*
> FROM
> tab1 t
> INNER JOIN
> (SELECT 20191004 tdate ) d
> WHERE
> t.p_tdate < d.tdate
> ) q
> where
>p_tdate <> tdate
>and P_TDATE = 20191004;
> {noformat}
> Impala logs show the following warning:
> {noformat}
> W1007 10:04:42.954334 1945974 Expr.java:1075] Not able to analyze after 
> rewrite:
> org.apache.impala.common.AnalysisException: Column/field reference is 
> ambiguous: 'tdate' conjuncts: BinaryPredicate{op=!=, 
> NumericLiteral{value=20191004, type=BIGINT} CastExpr{isImplicit=true, 
> target=BIGINT, SlotRef{path=tdate, type=DECIMAL(10,0), id=1}}} 
> BinaryPredicate{op==, SlotRef{path=p_tdate, type=INT, id=2} 
> NumericLiteral{value=20191004, type=INT}}
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-9025) An Impala query fails with error "ERROR: IllegalStateException: null"

2019-11-05 Thread Greg Rahn (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Rahn updated IMPALA-9025:
--
Description: 
Synopsis:
=
An Impala query fails with error "ERROR: IllegalStateException: null"

Problem:

Customer has provided the following reproduction case:
{noformat}
create table tab1( tdate decimal(10,0) )
   partitioned by (p_tdate int);
 
select
   *
from
   (SELECT
   t.*
FROM
tab1 t
INNER JOIN
(SELECT 20191004 tdate ) d
WHERE
t.p_tdate < d.tdate
) q
where
   p_tdate <> tdate
   and P_TDATE = 20191004;
{noformat}
Impala logs show the following warning:
{noformat}
W1007 10:04:42.954334 1945974 Expr.java:1075] Not able to analyze after 
rewrite: org.apache.impala.common.AnalysisException: Column/field reference is 
ambiguous: 'tdate' conjuncts: BinaryPredicate{op=!=, 
NumericLiteral{value=20191004, type=BIGINT} CastExpr{isImplicit=true, 
target=BIGINT, SlotRef{path=tdate, type=DECIMAL(10,0), id=1}}} 
BinaryPredicate{op==, SlotRef{path=p_tdate, type=INT, id=2} 
NumericLiteral{value=20191004, type=INT}}
{noformat}

  was:
Synopsis:
=
An Impala query fails with error "ERROR: IllegalStateException: null"

Problem:

Customer has provided the following reproduction case:

create table tab1( tdate decimal(10,0) )
   partitioned by (p_tdate int);
 
select
   *
from
   (SELECT
   t.*
FROM
tab1 t
INNER JOIN
(SELECT 20191004 tdate ) d
WHERE
t.p_tdate < d.tdate
) q
where
   p_tdate <> tdate
   and P_TDATE = 20191004;
   
Impala logs show the following warning:

W1007 10:04:42.954334 1945974 Expr.java:1075] Not able to analyze after 
rewrite: org.apache.impala.common.AnalysisException: Column/field reference is 
ambiguous: 'tdate' conjuncts: BinaryPredicate{op=!=, 
NumericLiteral{value=20191004, type=BIGINT} CastExpr{isImplicit=true, 
target=BIGINT, SlotRef{path=tdate, type=DECIMAL(10,0), id=1}}} 
BinaryPredicate{op==, SlotRef{path=p_tdate, type=INT, id=2} 
NumericLiteral{value=20191004, type=INT}}


> An Impala query fails with error "ERROR: IllegalStateException: null"
> -
>
> Key: IMPALA-9025
> URL: https://issues.apache.org/jira/browse/IMPALA-9025
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
>Priority: Major
> Fix For: Impala 3.4.0
>
>
> Synopsis:
> =
> An Impala query fails with error "ERROR: IllegalStateException: null"
> Problem:
> 
> Customer has provided the following reproduction case:
> {noformat}
> create table tab1( tdate decimal(10,0) )
>partitioned by (p_tdate int);
>  
> select
>*
> from
>(SELECT
>t.*
> FROM
> tab1 t
> INNER JOIN
> (SELECT 20191004 tdate ) d
> WHERE
> t.p_tdate < d.tdate
> ) q
> where
>p_tdate <> tdate
>and P_TDATE = 20191004;
> {noformat}
> Impala logs show the following warning:
> {noformat}
> W1007 10:04:42.954334 1945974 Expr.java:1075] Not able to analyze after 
> rewrite: org.apache.impala.common.AnalysisException: Column/field reference 
> is ambiguous: 'tdate' conjuncts: BinaryPredicate{op=!=, 
> NumericLiteral{value=20191004, type=BIGINT} CastExpr{isImplicit=true, 
> target=BIGINT, SlotRef{path=tdate, type=DECIMAL(10,0), id=1}}} 
> BinaryPredicate{op==, SlotRef{path=p_tdate, type=INT, id=2} 
> NumericLiteral{value=20191004, type=INT}}
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Work started] (IMPALA-9129) Provide a way for negative tests to remove intentionally generated core dumps

2019-11-05 Thread David Knupp (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-9129 started by David Knupp.
---
> Provide a way for negative tests to remove intentionally generated core dumps
> -
>
> Key: IMPALA-9129
> URL: https://issues.apache.org/jira/browse/IMPALA-9129
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Reporter: David Knupp
>Assignee: David Knupp
>Priority: Major
>
> Occasionally, tests (esp. custom cluster tests) will inject an error or set 
> some invalid config, expecting Impala to generate a core dump.
> We should have a general way for such tests to delete the bogus core dumps; 
> otherwise they can complicate/confuse later triaging of legitimate test 
> failures.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-9025) An Impala query fails with error "ERROR: IllegalStateException: null"

2019-11-05 Thread Greg Rahn (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16967885#comment-16967885
 ] 

Greg Rahn commented on IMPALA-9025:
---

This issue may be worked around using 
{{set ENABLE_EXPR_REWRITES=0;}}
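
For example, with the reproduction case from the description (a sketch; output 
not verified):
{code:sql}
set ENABLE_EXPR_REWRITES=0;

select *
from (select t.*
      from tab1 t
      inner join (select 20191004 tdate) d
      where t.p_tdate < d.tdate) q
where p_tdate <> tdate
  and p_tdate = 20191004;
{code}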
 

> An Impala query fails with error "ERROR: IllegalStateException: null"
> -
>
> Key: IMPALA-9025
> URL: https://issues.apache.org/jira/browse/IMPALA-9025
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
>Priority: Major
> Fix For: Impala 3.4.0
>
>
> Synopsis:
> =
> An Impala query fails with error "ERROR: IllegalStateException: null"
> Problem:
> 
> Customer has provided the following reproduction case:
> create table tab1( tdate decimal(10,0) )
>partitioned by (p_tdate int);
>  
> select
>*
> from
>(SELECT
>t.*
> FROM
> tab1 t
> INNER JOIN
> (SELECT 20191004 tdate ) d
> WHERE
> t.p_tdate < d.tdate
> ) q
> where
>p_tdate <> tdate
>and P_TDATE = 20191004;
>
> Impala logs show the following warning:
> W1007 10:04:42.954334 1945974 Expr.java:1075] Not able to analyze after 
> rewrite: org.apache.impala.common.AnalysisException: Column/field reference 
> is ambiguous: 'tdate' conjuncts: BinaryPredicate{op=!=, 
> NumericLiteral{value=20191004, type=BIGINT} CastExpr{isImplicit=true, 
> target=BIGINT, SlotRef{path=tdate, type=DECIMAL(10,0), id=1}}} 
> BinaryPredicate{op==, SlotRef{path=p_tdate, type=INT, id=2} 
> NumericLiteral{value=20191004, type=INT}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Assigned] (IMPALA-9129) Provide a way for negative tests to remove intentionally generated core dumps

2019-11-05 Thread David Knupp (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Knupp reassigned IMPALA-9129:
---

Assignee: David Knupp

> Provide a way for negative tests to remove intentionally generated core dumps
> -
>
> Key: IMPALA-9129
> URL: https://issues.apache.org/jira/browse/IMPALA-9129
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Infrastructure
>Reporter: David Knupp
>Assignee: David Knupp
>Priority: Major
>
> Occasionally, tests (esp. custom cluster tests) will perform some action, 
> expecting Impala to generate a core dump.
> We should have a general way for such tests to delete the bogus core dumps; 
> otherwise they can complicate/confuse later test triaging efforts.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-9129) Provide a way for negative tests to remove intentionally generated core dumps

2019-11-05 Thread David Knupp (Jira)
David Knupp created IMPALA-9129:
---

 Summary: Provide a way for negative tests to remove intentionally 
generated core dumps
 Key: IMPALA-9129
 URL: https://issues.apache.org/jira/browse/IMPALA-9129
 Project: IMPALA
  Issue Type: Improvement
  Components: Infrastructure
Reporter: David Knupp


Occasionally, tests (esp. custom cluster tests) will perform some action, 
expecting Impala to generate a core dump.

We should have a general way for such tests to delete the bogus core dumps; 
otherwise they can complicate/confuse later test triaging efforts.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-9122) TestEventProcessing.test_insert_events flaky in precommit

2019-11-05 Thread Vihang Karajgaonkar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16967830#comment-16967830
 ] 

Vihang Karajgaonkar commented on IMPALA-9122:
-

Thanks

> TestEventProcessing.test_insert_events flaky in precommit
> -
>
> Key: IMPALA-9122
> URL: https://issues.apache.org/jira/browse/IMPALA-9122
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Reporter: Tim Armstrong
>Assignee: Vihang Karajgaonkar
>Priority: Critical
>  Labels: broken-build, flaky
>
> https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/8613/
> {noformat}
> custom_cluster.test_event_processing.TestEventProcessing.test_insert_events 
> (from pytest)
> Failing for the past 1 build (Since Failed#8613 )
> Took 1 min 20 sec.
> add description
> Error Message
> assert  >(18421) 
> is True  +  where  TestEventProcessing.wait_for_insert_event_processing of 
> > = 
>  0x7f7fa86ec6d0>.wait_for_insert_event_processing
> Stacktrace
> custom_cluster/test_event_processing.py:82: in test_insert_events
> self.run_test_insert_events()
> custom_cluster/test_event_processing.py:143: in run_test_insert_events
> assert self.wait_for_insert_event_processing(last_synced_event_id) is True
> E   assert  of  0x7f7fa86ec6d0>>(18421) is True
> E+  where  TestEventProcessing.wait_for_insert_event_processing of 
> > = 
>  0x7f7fa86ec6d0>.wait_for_insert_event_processing
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-9128) Improve debugging for slow sends in KrpcDataStreamSender

2019-11-05 Thread Tim Armstrong (Jira)
Tim Armstrong created IMPALA-9128:
-

 Summary: Improve debugging for slow sends in KrpcDataStreamSender
 Key: IMPALA-9128
 URL: https://issues.apache.org/jira/browse/IMPALA-9128
 Project: IMPALA
  Issue Type: Bug
  Components: Distributed Exec
Reporter: Tim Armstrong
Assignee: Tim Armstrong


I'm trying to debug a problem that appears to be caused by a slow RPC:
{noformat}
Fragment F00
  Instance 754fc21ba4744310:d58fd0420020 (host=x)
Hdfs split stats (<volume id>:<# splits>/<split lengths>): 0:1/120.48 MB 
- AverageThreadTokens: 1.00 (1.0)
- BloomFilterBytes: 0 B (0)
- InactiveTotalTime: 0ns (0)
- PeakMemoryUsage: 3.2 MiB (3337546)
- PeakReservation: 2.0 MiB (2097152)
- PeakUsedReservation: 0 B (0)
- PerHostPeakMemUsage: 6.7 MiB (6987376)
- RowsProduced: 7 (7)
- TotalNetworkReceiveTime: 0ns (0)
- TotalNetworkSendTime: 3.6m (215354065071)
- TotalStorageWaitTime: 4ms (4552708)
- TotalThreadsInvoluntaryContextSwitches: 2 (2)
- TotalThreadsTotalWallClockTime: 3.6m (215924079474)
  - TotalThreadsSysTime: 24ms (24386000)
  - TotalThreadsUserTime: 505ms (505714000)
- TotalThreadsVoluntaryContextSwitches: 3,623 (3623)
- TotalTime: 3.6m (215801961705)
Fragment Instance Lifecycle Event Timeline
  Prepare Finished: 1ms (1812344)
  Open Finished: 322ms (322905753)
  First Batch Produced: 447ms (447050377)
  First Batch Sent: 447ms (447054546)
  ExecInternal Finished: 3.6m (215802284852)
Buffer pool
  - AllocTime: 0ns (0)
  - CumulativeAllocationBytes: 0 B (0)
  - CumulativeAllocations: 0 (0)
  - InactiveTotalTime: 0ns (0)
  - PeakReservation: 0 B (0)
  - PeakUnpinnedBytes: 0 B (0)
  - PeakUsedReservation: 0 B (0)
  - ReadIoBytes: 0 B (0)
  - ReadIoOps: 0 (0)
  - ReadIoWaitTime: 0ns (0)
  - ReservationLimit: 0 B (0)
  - TotalTime: 0ns (0)
  - WriteIoBytes: 0 B (0)
  - WriteIoOps: 0 (0)
  - WriteIoWaitTime: 0ns (0)
Fragment Instance Lifecycle Timings
  - ExecTime: 3.6m (215479380267)
- ExecTreeExecTime: 124ms (124299400)
  - InactiveTotalTime: 0ns (0)
  - OpenTime: 321ms (321088906)
- ExecTreeOpenTime: 572.04us (572045)
  - PrepareTime: 1ms (1426412)
- ExecTreePrepareTime: 233.32us (233318)
  - TotalTime: 0ns (0)
KrpcDataStreamSender (dst_id=3)
  - EosSent: 58 (58)
  - InactiveTotalTime: 3.6m (215354085858)
  - PeakMemoryUsage: 464.4 KiB (475504)
  - RowsSent: 7 (7)
  - RpcFailure: 0 (0)
  - RpcRetry: 0 (0)
  - SerializeBatchTime: 99.87us (99867)
  - TotalBytesSent: 207 B (207)
  - TotalTime: 3.6m (215355336381)
  - UncompressedRowBatchSize: 267 B (267)

{noformat}

We should add some diagnostics that will allow us to figure out which RPCs are 
slow and whether there's a pattern in which hosts are the problem. E.g. maybe 
we should log when the RPC time exceeds a configured threshold.

It may also be useful to include some stats about the wait time, e.g. a 
histogram of the wait times, so that we can see if it's an outlier or general 
slowness.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-9128) Improve debugging for slow sends in KrpcDataStreamSender

2019-11-05 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-9128:
--
Labels: observability  (was: )

> Improve debugging for slow sends in KrpcDataStreamSender
> 
>
> Key: IMPALA-9128
> URL: https://issues.apache.org/jira/browse/IMPALA-9128
> Project: IMPALA
>  Issue Type: Bug
>  Components: Distributed Exec
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Major
>  Labels: observability
>
> I'm trying to debug a problem that appears to be caused by a slow RPC:
> {noformat}
> Fragment F00
>   Instance 754fc21ba4744310:d58fd0420020 (host=x)
> Hdfs split stats (<volume id>:<# splits>/<split lengths>): 0:1/120.48 MB 
> - AverageThreadTokens: 1.00 (1.0)
> - BloomFilterBytes: 0 B (0)
> - InactiveTotalTime: 0ns (0)
> - PeakMemoryUsage: 3.2 MiB (3337546)
> - PeakReservation: 2.0 MiB (2097152)
> - PeakUsedReservation: 0 B (0)
> - PerHostPeakMemUsage: 6.7 MiB (6987376)
> - RowsProduced: 7 (7)
> - TotalNetworkReceiveTime: 0ns (0)
> - TotalNetworkSendTime: 3.6m (215354065071)
> - TotalStorageWaitTime: 4ms (4552708)
> - TotalThreadsInvoluntaryContextSwitches: 2 (2)
> - TotalThreadsTotalWallClockTime: 3.6m (215924079474)
>   - TotalThreadsSysTime: 24ms (24386000)
>   - TotalThreadsUserTime: 505ms (505714000)
> - TotalThreadsVoluntaryContextSwitches: 3,623 (3623)
> - TotalTime: 3.6m (215801961705)
> Fragment Instance Lifecycle Event Timeline
>   Prepare Finished: 1ms (1812344)
>   Open Finished: 322ms (322905753)
>   First Batch Produced: 447ms (447050377)
>   First Batch Sent: 447ms (447054546)
>   ExecInternal Finished: 3.6m (215802284852)
> Buffer pool
>   - AllocTime: 0ns (0)
>   - CumulativeAllocationBytes: 0 B (0)
>   - CumulativeAllocations: 0 (0)
>   - InactiveTotalTime: 0ns (0)
>   - PeakReservation: 0 B (0)
>   - PeakUnpinnedBytes: 0 B (0)
>   - PeakUsedReservation: 0 B (0)
>   - ReadIoBytes: 0 B (0)
>   - ReadIoOps: 0 (0)
>   - ReadIoWaitTime: 0ns (0)
>   - ReservationLimit: 0 B (0)
>   - TotalTime: 0ns (0)
>   - WriteIoBytes: 0 B (0)
>   - WriteIoOps: 0 (0)
>   - WriteIoWaitTime: 0ns (0)
> Fragment Instance Lifecycle Timings
>   - ExecTime: 3.6m (215479380267)
> - ExecTreeExecTime: 124ms (124299400)
>   - InactiveTotalTime: 0ns (0)
>   - OpenTime: 321ms (321088906)
> - ExecTreeOpenTime: 572.04us (572045)
>   - PrepareTime: 1ms (1426412)
> - ExecTreePrepareTime: 233.32us (233318)
>   - TotalTime: 0ns (0)
> KrpcDataStreamSender (dst_id=3)
>   - EosSent: 58 (58)
>   - InactiveTotalTime: 3.6m (215354085858)
>   - PeakMemoryUsage: 464.4 KiB (475504)
>   - RowsSent: 7 (7)
>   - RpcFailure: 0 (0)
>   - RpcRetry: 0 (0)
>   - SerializeBatchTime: 99.87us (99867)
>   - TotalBytesSent: 207 B (207)
>   - TotalTime: 3.6m (215355336381)
>   - UncompressedRowBatchSize: 267 B (267)
> {noformat}
> We should add some diagnostics that will allow us to figure out which RPCs 
> are slow and whether there's a pattern in which hosts are the problem. E.g. 
> maybe we should log when the RPC time exceeds a configured threshold.
> It may also be useful to include some stats about the wait time, e.g. a 
> histogram of the wait times, so that we can see if it's an outlier or general 
> slowness.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Work started] (IMPALA-7613) Support round(DECIMAL) with non-constant second argument

2019-11-05 Thread Abhishek Rawat (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-7613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-7613 started by Abhishek Rawat.
--
> Support round(DECIMAL) with non-constant second argument
> 
>
> Key: IMPALA-7613
> URL: https://issues.apache.org/jira/browse/IMPALA-7613
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Tim Armstrong
>Assignee: Abhishek Rawat
>Priority: Major
>  Labels: decimal, ramp-up
>
> Sometimes users want to round to a precision that is data-driven (e.g. using 
> a lookup table). They can't currently do this with decimal. I think we could 
> support this by just using the input decimal type as the output type when the 
> second argument is non-constant.
> {noformat}
> select round(l_tax, l_linenumber) from tpch.lineitem limit 5;
> Query: select round(l_tax, l_linenumber) from tpch.lineitem limit 5
> Query submitted at: 2018-09-24 11:03:10 (Coordinator: 
> http://tarmstrong-box:25000)
> ERROR: AnalysisException: round() must be called with a constant second 
> argument.
> {noformat}
> Motivated by a user trying to do something like this; 
> http://community.cloudera.com/t5/Interactive-Short-cycle-SQL/Impala-round-function-does-not-return-expected-result/m-p/80200#M4906
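
Until that is supported, one possible workaround (a rough sketch, not from this 
ticket, and only practical when the scale column takes a few known values) is 
to enumerate the scales so that each round() call sees a constant second 
argument:
{code:sql}
SELECT CASE l_linenumber
         WHEN 1 THEN round(l_tax, 1)
         WHEN 2 THEN round(l_tax, 2)
         ELSE round(l_tax, 0)   -- every branch uses a constant scale
       END AS rounded_tax
FROM tpch.lineitem
LIMIT 5;
{code}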



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-9127) Clean up probe-side state machine in hash join

2019-11-05 Thread Tim Armstrong (Jira)
Tim Armstrong created IMPALA-9127:
-

 Summary: Clean up probe-side state machine in hash join
 Key: IMPALA-9127
 URL: https://issues.apache.org/jira/browse/IMPALA-9127
 Project: IMPALA
  Issue Type: Sub-task
  Components: Backend
Reporter: Tim Armstrong


There's an implicit state machine in the main loop in  
PartitionedHashJoinNode::GetNext() 
https://github.com/apache/impala/blob/eea617b/be/src/exec/partitioned-hash-join-node.cc#L510

The state is implicitly defined based on the following conditions:
* !output_build_partitions_.empty() -> "outputting build rows after probing"
* builder_->null_aware_partition() == NULL -> "eos, because the null-aware 
partition is processed after all other partitions"
* null_probe_output_idx_ >= 0 -> "null probe rows being processed"
* output_null_aware_probe_rows_running_ -> "null-aware partition being 
processed"
* probe_batch_pos_ != -1 -> "processing probe batch"
* builder_->num_hash_partitions() != 0 -> "have active hash partitions that are 
being probed"
* spilled_partitions_.empty() -> "no more spilled partitions"

I think this would be a lot easier to follow if the state machine was explicit 
and documented, and would make separating out the build side of a spilling hash 
join easier to get right.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-9122) TestEventProcessing.test_insert_events flaky in precommit

2019-11-05 Thread Anurag Mantripragada (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16967818#comment-16967818
 ] 

Anurag Mantripragada commented on IMPALA-9122:
--

Sure, taking a look at it.

> TestEventProcessing.test_insert_events flaky in precommit
> -
>
> Key: IMPALA-9122
> URL: https://issues.apache.org/jira/browse/IMPALA-9122
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Reporter: Tim Armstrong
>Assignee: Vihang Karajgaonkar
>Priority: Critical
>  Labels: broken-build, flaky
>
> https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/8613/
> {noformat}
> custom_cluster.test_event_processing.TestEventProcessing.test_insert_events 
> (from pytest)
> Failing for the past 1 build (Since Failed#8613 )
> Took 1 min 20 sec.
> add description
> Error Message
> assert  >(18421) 
> is True  +  where  TestEventProcessing.wait_for_insert_event_processing of 
> > = 
>  0x7f7fa86ec6d0>.wait_for_insert_event_processing
> Stacktrace
> custom_cluster/test_event_processing.py:82: in test_insert_events
> self.run_test_insert_events()
> custom_cluster/test_event_processing.py:143: in run_test_insert_events
> assert self.wait_for_insert_event_processing(last_synced_event_id) is True
> E   assert  of  0x7f7fa86ec6d0>>(18421) is True
> E+  where  TestEventProcessing.wait_for_insert_event_processing of 
> > = 
>  0x7f7fa86ec6d0>.wait_for_insert_event_processing
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Assigned] (IMPALA-9126) Cleanly separate build and probe state in hash join node

2019-11-05 Thread Tim Armstrong (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reassigned IMPALA-9126:
-

Assignee: Tim Armstrong

> Cleanly separate build and probe state in hash join node
> 
>
> Key: IMPALA-9126
> URL: https://issues.apache.org/jira/browse/IMPALA-9126
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Backend
>Reporter: Tim Armstrong
>Assignee: Tim Armstrong
>Priority: Major
>
> As a precursor to IMPALA-4224, we should clean up the hash join 
> implementation so that the build and probe state is better separated. The 
> builder should not deal with probe-side data structures (like the probe 
> streams that it allocates), and all accesses to the build-side data 
> structures should go through APIs that are as narrow as possible.
> The nested loop join is already pretty clean.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-9109) Create Catalog debug page top-k average table loading ranking

2019-11-05 Thread Jiawei Wang (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16967811#comment-16967811
 ] 

Jiawei Wang commented on IMPALA-9109:
-

Code review: [https://gerrit.cloudera.org/#/c/14600/]

> Create Catalog debug page top-k average table loading ranking
> -
>
> Key: IMPALA-9109
> URL: https://issues.apache.org/jira/browse/IMPALA-9109
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Catalog, Frontend
>Reporter: Jiawei Wang
>Assignee: Dinesh Garg
>Priority: Critical
>
> Right now we have top-k tables for memory requirements, number of operations, 
> and number of files. It would be great if we could also have a ranking by 
> table metadata loading time.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-9110) Add table loading time break-down metrics for HdfsTable

2019-11-05 Thread Jiawei Wang (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16967812#comment-16967812
 ] 

Jiawei Wang commented on IMPALA-9110:
-

Code review: [https://gerrit.cloudera.org/#/c/14611/]

> Add table loading time break-down metrics for HdfsTable
> ---
>
> Key: IMPALA-9110
> URL: https://issues.apache.org/jira/browse/IMPALA-9110
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Catalog, Frontend
>Reporter: Jiawei Wang
>Assignee: Dinesh Garg
>Priority: Critical
>
> We are only able to get the total table loading time right now, which makes 
> it really hard for us to debug why table loading is sometimes slow. Therefore, 
> it would be good to have break-down metrics on how much time each function 
> costs when loading tables.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-9126) Cleanly separate build and probe state in hash join node

2019-11-05 Thread Tim Armstrong (Jira)
Tim Armstrong created IMPALA-9126:
-

 Summary: Cleanly separate build and probe state in hash join node
 Key: IMPALA-9126
 URL: https://issues.apache.org/jira/browse/IMPALA-9126
 Project: IMPALA
  Issue Type: Sub-task
  Components: Backend
Reporter: Tim Armstrong


As a precursor to IMPALA-4224, we should clean up the hash join implementation 
so that the build and probe state is better separated. The builder should not 
deal with probe-side data structures (like the probe streams that it allocates), 
and all accesses to the build-side data structures should go through APIs that 
are as narrow as possible.

The nested loop join is already pretty clean.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-9125) Add general mechanism to find DataSink from other fragments

2019-11-05 Thread Tim Armstrong (Jira)
Tim Armstrong created IMPALA-9125:
-

 Summary: Add general mechanism to find DataSink from other 
fragments
 Key: IMPALA-9125
 URL: https://issues.apache.org/jira/browse/IMPALA-9125
 Project: IMPALA
  Issue Type: Sub-task
  Components: Backend
Reporter: Tim Armstrong
Assignee: Tim Armstrong


As a precursor to IMPALA-4224, we should add a mechanism for a finstance to 
discover the join build sink from another finstance.

We already have a related single-purpose mechanism in the coordinator to find 
PlanRootSink. We should generalise this to allow looking up the sink of any 
other finstance and move it into QueryState.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Work started] (IMPALA-8692) Gracefully fail complex type inserts

2019-11-05 Thread Abhishek Rawat (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-8692 started by Abhishek Rawat.
--
> Gracefully fail complex type inserts
> 
>
> Key: IMPALA-8692
> URL: https://issues.apache.org/jira/browse/IMPALA-8692
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Reporter: Abhishek Rawat
>Assignee: Abhishek Rawat
>Priority: Blocker
>  Labels: analysis, crash, front-end, parquet
>
> Block such insert statements in the analysis phase.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Work started] (IMPALA-9110) Add table loading time break-down metrics for HdfsTable

2019-11-05 Thread Dinesh Garg (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-9110 started by Dinesh Garg.
---
> Add table loading time break-down metrics for HdfsTable
> ---
>
> Key: IMPALA-9110
> URL: https://issues.apache.org/jira/browse/IMPALA-9110
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Catalog, Frontend
>Reporter: Jiawei Wang
>Assignee: Dinesh Garg
>Priority: Critical
>
> We are only able to get the total table loading time right now, which makes 
> it really hard for us to debug why table loading is sometimes slow. Therefore, 
> it would be good to have break-down metrics on how much time each function 
> costs when loading tables.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-9122) TestEventProcessing.test_insert_events flaky in precommit

2019-11-05 Thread Vihang Karajgaonkar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16967789#comment-16967789
 ] 

Vihang Karajgaonkar commented on IMPALA-9122:
-

[~anuragmantri] Can you please help take a look at this?

> TestEventProcessing.test_insert_events flaky in precommit
> -
>
> Key: IMPALA-9122
> URL: https://issues.apache.org/jira/browse/IMPALA-9122
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Reporter: Tim Armstrong
>Assignee: Vihang Karajgaonkar
>Priority: Critical
>  Labels: broken-build, flaky
>
> https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/8613/
> {noformat}
> custom_cluster.test_event_processing.TestEventProcessing.test_insert_events 
> (from pytest)
> Failing for the past 1 build (Since Failed#8613 )
> Took 1 min 20 sec.
> add description
> Error Message
> assert  >(18421) 
> is True  +  where  TestEventProcessing.wait_for_insert_event_processing of 
> > = 
>  0x7f7fa86ec6d0>.wait_for_insert_event_processing
> Stacktrace
> custom_cluster/test_event_processing.py:82: in test_insert_events
> self.run_test_insert_events()
> custom_cluster/test_event_processing.py:143: in run_test_insert_events
> assert self.wait_for_insert_event_processing(last_synced_event_id) is True
> E   assert  of  0x7f7fa86ec6d0>>(18421) is True
> E+  where  TestEventProcessing.wait_for_insert_event_processing of 
> > = 
>  0x7f7fa86ec6d0>.wait_for_insert_event_processing
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Work started] (IMPALA-9090) Scan node in HDFS profile should include name of table being scanned

2019-11-05 Thread Xiaomeng Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-9090 started by Xiaomeng Zhang.
--
> Scan node in HDFS profile should include name of table being scanned
> 
>
> Key: IMPALA-9090
> URL: https://issues.apache.org/jira/browse/IMPALA-9090
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Reporter: Tim Armstrong
>Assignee: Xiaomeng Zhang
>Priority: Critical
>
> The only way to figure out the table being scanned by a scan node in the 
> profile is to pull the string out of the explain plan or execsummary. This is 
> awkward, both for manual and automated analysis of the profiles. We should 
> include the table name as a string in the SCAN_NODE implementation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Work started] (IMPALA-8065) OSInfo produces somewhat misleading output when running in container

2019-11-05 Thread Xiaomeng Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-8065 started by Xiaomeng Zhang.
--
> OSInfo produces somewhat misleading output when running in container
> 
>
> Key: IMPALA-8065
> URL: https://issues.apache.org/jira/browse/IMPALA-8065
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 3.1.0
>Reporter: Tim Armstrong
>Assignee: Xiaomeng Zhang
>Priority: Critical
>
> It uses /proc/version, which returns the host kernel version. It would be 
> good to also get the version from lsb-release inside the Ubuntu container 
> we're running in and disambiguate the two on the debug page.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-9101) Unnecessary REFRESH due to wrong self-event detection

2019-11-05 Thread Vihang Karajgaonkar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16967783#comment-16967783
 ] 

Vihang Karajgaonkar commented on IMPALA-9101:
-

I am going to use this JIRA to fix another issue with self-events, which 
happens if the table or partition is reloaded before the event is received.

Currently there is a problem with partitions: if a refresh of the partition 
happens between the two steps below, we lose the information about in-flight 
events, and that causes an unnecessary refresh (a sketch follows the steps).

1. DDL on the partition; an event is generated.
2. The event is received; the events processor checks whether this is a 
self-event and refreshes the partition if needed.
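
A rough sketch of that interleaving, reusing the table from the reproduction 
in the description (the partition values here are made up):
{code:sql}
-- Step 1: the DDL generates an ADD_PARTITION event in HMS and records it as an
-- in-flight event on the table.
ALTER TABLE testtbl ADD PARTITION (p1=2, p2=8);

-- A refresh of the new partition sneaks in before the event is received and
-- reloads the partition, dropping the in-flight event information:
REFRESH testtbl PARTITION (p1=2, p2=8);

-- Step 2: when the events processor later receives the ADD_PARTITION event,
-- the self-event check fails and the partition is refreshed again unnecessarily.
{code}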


> Unnecessary REFRESH due to wrong self-event detection
> --
>
> Key: IMPALA-9101
> URL: https://issues.apache.org/jira/browse/IMPALA-9101
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Quanlong Huang
>Assignee: Vihang Karajgaonkar
>Priority: Critical
>
> In {{CatalogOpExecutor.alterTable()}}, we call 
> {{addVersionsForInflightEvents()}} whether the AlterTable operation changes 
> anything or not. If nothing changes, no HMS RPCs are sent. The event 
> processor ends up waiting on a non-existent self-event. Then all self-events 
> are treated as outside events and unnecessary REFRESH/INVALIDATE operations 
> will be performed on this table.
> Code:
> {code:java}
>   private void alterTable(TAlterTableParams params, TDdlExecResponse response)
>   throws ImpalaException {
> 
> tryLock(tbl);
> // Get a new catalog version to assign to the table being altered.
> long newCatalogVersion = catalog_.incrementAndGetCatalogVersion();
> addCatalogServiceIdentifiers(tbl, catalog_.getCatalogServiceId(), 
> newCatalogVersion);
> 
>   // now that HMS alter operation has succeeded, add this version to list 
> of inflight
>   // events in catalog table if event processing is enabled
>   catalog_.addVersionsForInflightEvents(tbl, newCatalogVersion); <-- 
> We should check before calling this.
>   }
> {code}
> Reproduce:
> {code:sql}
> create table testtbl (col int) partitioned by (p1 int, p2 int);
> alter table testtbl add partition (p1=2,p2=6);
> alter table testtbl add if not exists partition (p1=2,p2=6);
> -- After this point, can't detect self-events on this table
> alter table testtbl add partition (p1=2,p2=7);
> {code}
> Catalogd logs:
> {code:bash}
> I1029 07:41:15.310956  8546 HdfsTable.java:630] Loaded file and block 
> metadata for default.testtbl partitions: p1=2/p2=6
> I1029 07:41:15.892410  8321 MetastoreEventsProcessor.java:480] Received 1 
> events. Start event id : 11463
> I1029 07:41:15.895717  8321 MetastoreEvents.java:396] EventId: 11464 
> EventType: ADD_PARTITION Creating event 11464 of type ADD_PARTITION on table 
> default.testtbl
> I1029 07:41:15.940225  8321 MetastoreEvents.java:241] Total number of events 
> received: 1 Total number of events filtered out: 0
> I1029 07:41:15.940414  8321 MetastoreEvents.java:385] EventId: 11464 
> EventType: ADD_PARTITION Not processing the event as it is a self-event
>  Correctly recognize self-event 
> I1029 07:41:16.829824  8329 catalog-server.cc:641] Collected update: 
> 1:TABLE:default.testtbl, version=1385, original size=4438, compressed 
> size=1216
> I1029 07:41:16.831853  8329 catalog-server.cc:641] Collected update: 
> 1:CATALOG_SERVICE_ID, version=1385, original size=60, compressed size=58
> I1029 07:41:18.827137  8339 catalog-server.cc:337] A catalog update with 2 
> entries is assembled. Catalog version: 1385 Last sent catalog version: 1384
>  No events for adding partition p1=2,p2=6 again. But we still bump the 
> catalog version.
> I1029 07:45:38.900974  8329 catalog-server.cc:641] Collected update: 
> 1:CATALOG_SERVICE_ID, version=1386, original size=60, compressed size=58
> I1029 07:45:40.899353  8339 catalog-server.cc:337] A catalog update with 1 
> entries is assembled. Catalog version: 1386 Last sent catalog version: 1385
>  Creating partition p1=2,p2=7
> I1029 07:45:48.827221  8546 HdfsTable.java:630] Loaded file and block 
> metadata for default.testtbl partitions: p1=2/p2=7
> I1029 07:45:48.904234  8329 catalog-server.cc:641] Collected update: 
> 1:TABLE:default.testtbl, version=1387, original size=4886, compressed 
> size=1251
> I1029 07:45:48.905262  8329 catalog-server.cc:641] Collected update: 
> 1:CATALOG_SERVICE_ID, version=1387, original size=60, compressed size=58
> I1029 07:45:49.523567  8321 MetastoreEventsProcessor.java:480] Received 1 
> events. Start event id : 11464
> I1029 07:45:49.524150  8321 MetastoreEvents.java:396] EventId: 11465 
> EventType: ADD_PARTITION Creating event 11465 of type ADD_PARTITION on table 
> default.testtbl
> I1029 07:45:49.527262  8321 MetastoreEvents.java:241] 

[jira] [Updated] (IMPALA-9101) Unnecessary REFRESH due to wrong self-event detection

2019-11-05 Thread Vihang Karajgaonkar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated IMPALA-9101:

Priority: Critical  (was: Minor)

> Unnecessary REFRESH due to wrong self-event detection
> --
>
> Key: IMPALA-9101
> URL: https://issues.apache.org/jira/browse/IMPALA-9101
> Project: IMPALA
>  Issue Type: Bug
>Reporter: Quanlong Huang
>Assignee: Vihang Karajgaonkar
>Priority: Critical
>
> In {{CatalogOpExecutor.alterTable()}}, we call 
> {{addVersionsForInflightEvents()}} whether the AlterTable operation changes 
> anything or not. If nothing changes, no HMS RPCs are sent. The event 
> processor ends up waiting on a non-existent self-event. Then all self-events 
> are treated as outside events and unnecessary REFRESH/INVALIDATE operations 
> will be performed on this table.
> Code:
> {code:java}
>   private void alterTable(TAlterTableParams params, TDdlExecResponse response)
>   throws ImpalaException {
> 
> tryLock(tbl);
> // Get a new catalog version to assign to the table being altered.
> long newCatalogVersion = catalog_.incrementAndGetCatalogVersion();
> addCatalogServiceIdentifiers(tbl, catalog_.getCatalogServiceId(), 
> newCatalogVersion);
> 
>   // now that HMS alter operation has succeeded, add this version to list 
> of inflight
>   // events in catalog table if event processing is enabled
>   catalog_.addVersionsForInflightEvents(tbl, newCatalogVersion); <-- 
> We should check before calling this.
>   }
> {code}
> Reproduce:
> {code:sql}
> create table testtbl (col int) partitioned by (p1 int, p2 int);
> alter table testtbl add partition (p1=2,p2=6);
> alter table testtbl add if not exists partition (p1=2,p2=6);
> -- After this point, can't detect self-events on this table
> alter table testtbl add partition (p1=2,p2=7);
> {code}
> Catalogd logs:
> {code:bash}
> I1029 07:41:15.310956  8546 HdfsTable.java:630] Loaded file and block 
> metadata for default.testtbl partitions: p1=2/p2=6
> I1029 07:41:15.892410  8321 MetastoreEventsProcessor.java:480] Received 1 
> events. Start event id : 11463
> I1029 07:41:15.895717  8321 MetastoreEvents.java:396] EventId: 11464 
> EventType: ADD_PARTITION Creating event 11464 of type ADD_PARTITION on table 
> default.testtbl
> I1029 07:41:15.940225  8321 MetastoreEvents.java:241] Total number of events 
> received: 1 Total number of events filtered out: 0
> I1029 07:41:15.940414  8321 MetastoreEvents.java:385] EventId: 11464 
> EventType: ADD_PARTITION Not processing the event as it is a self-event
>  Correctly recognize self-event 
> I1029 07:41:16.829824  8329 catalog-server.cc:641] Collected update: 
> 1:TABLE:default.testtbl, version=1385, original size=4438, compressed 
> size=1216
> I1029 07:41:16.831853  8329 catalog-server.cc:641] Collected update: 
> 1:CATALOG_SERVICE_ID, version=1385, original size=60, compressed size=58
> I1029 07:41:18.827137  8339 catalog-server.cc:337] A catalog update with 2 
> entries is assembled. Catalog version: 1385 Last sent catalog version: 1384
>  No events for adding partition p1=2,p2=6 again. But we still bump the 
> catalog version.
> I1029 07:45:38.900974  8329 catalog-server.cc:641] Collected update: 
> 1:CATALOG_SERVICE_ID, version=1386, original size=60, compressed size=58
> I1029 07:45:40.899353  8339 catalog-server.cc:337] A catalog update with 1 
> entries is assembled. Catalog version: 1386 Last sent catalog version: 1385
>  Creating partition p1=2,p2=7
> I1029 07:45:48.827221  8546 HdfsTable.java:630] Loaded file and block 
> metadata for default.testtbl partitions: p1=2/p2=7
> I1029 07:45:48.904234  8329 catalog-server.cc:641] Collected update: 
> 1:TABLE:default.testtbl, version=1387, original size=4886, compressed 
> size=1251
> I1029 07:45:48.905262  8329 catalog-server.cc:641] Collected update: 
> 1:CATALOG_SERVICE_ID, version=1387, original size=60, compressed size=58
> I1029 07:45:49.523567  8321 MetastoreEventsProcessor.java:480] Received 1 
> events. Start event id : 11464
> I1029 07:45:49.524150  8321 MetastoreEvents.java:396] EventId: 11465 
> EventType: ADD_PARTITION Creating event 11465 of type ADD_PARTITION on table 
> default.testtbl
> I1029 07:45:49.527262  8321 MetastoreEvents.java:241] Total number of events 
> received: 1 Total number of events filtered out: 0
> I1029 07:45:49.530278  8321 MetastoreEvents.java:385] EventId: 11465 
> EventType: ADD_PARTITION Trying to refresh 1 partitions added to table 
> default.testtbl in the event
> I1029 07:45:49.531026  8321 CatalogServiceCatalog.java:2572] Refreshing 
> partition metadata: default.testtbl p1=2/p2=7 (processing ADD_PARTITION event 
> from HMS)
>  Unnecessary REFRESH 
> I1029 07:45:49.604936  8321 HdfsTable.java:630] 

[jira] [Updated] (IMPALA-8065) OSInfo produces somewhat misleading output when running in container

2019-11-05 Thread Xiaomeng Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-8065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaomeng Zhang updated IMPALA-8065:
---
Priority: Major  (was: Minor)

> OSInfo produces somewhat misleading output when running in container
> 
>
> Key: IMPALA-8065
> URL: https://issues.apache.org/jira/browse/IMPALA-8065
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Backend
>Affects Versions: Impala 3.1.0
>Reporter: Tim Armstrong
>Assignee: Xiaomeng Zhang
>Priority: Major
>
> It uses /proc/version, which returns the host version. It would be good to 
> also get the version from lsb-release from the Ubuntu container we're running 
> in and disambiguate on the debug page.
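
As a rough, generic illustration of the disambiguation being asked for (the real
OSInfo code lives in the C++ backend; /etc/os-release is assumed to exist inside
the container):

{code:java}
import java.nio.file.Files;
import java.nio.file.Path;

public class OsVersions {
  public static void main(String[] args) throws Exception {
    // /proc/version reflects the host kernel even inside a container.
    String hostKernel = Files.readString(Path.of("/proc/version")).trim();
    // /etc/os-release (or lsb-release) describes the distribution of the running image.
    String containerOs = Files.readString(Path.of("/etc/os-release")).trim();
    System.out.println("Host kernel:  " + hostKernel);
    System.out.println("Container OS: " + containerOs);
  }
}
{code}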



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Work started] (IMPALA-9104) Add support for retrieval of primary keys and foreign keys in impala-hs2-server.

2019-11-05 Thread Anurag Mantripragada (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on IMPALA-9104 started by Anurag Mantripragada.

> Add support for retrieval of primary keys and foreign keys in 
> impala-hs2-server.
> 
>
> Key: IMPALA-9104
> URL: https://issues.apache.org/jira/browse/IMPALA-9104
> Project: IMPALA
>  Issue Type: Sub-task
>Reporter: Anurag Mantripragada
>Assignee: Anurag Mantripragada
>Priority: Critical
>
> Impala's TCLIService.thrift file was taken from the Hive 1.x days and does 
> not contain the primary key and foreign key APIs that were added to the Hive 
> JDBC driver starting with Hive 2.1. We need to port these APIs in order to be 
> able to use them in Impala JDBC clients.
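
For context, a hedged sketch of the client-side view these APIs would back,
using the standard JDBC DatabaseMetaData interface (the JDBC URL and the table
name below are placeholders only):

{code:java}
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.ResultSet;

public class ShowKeys {
  public static void main(String[] args) throws Exception {
    // Placeholder URL; any HS2-compatible JDBC driver/URL would do here.
    try (Connection conn = DriverManager.getConnection("jdbc:impala://host:21050/default")) {
      DatabaseMetaData meta = conn.getMetaData();
      // Primary key columns of a table (catalog = null, schema = "default").
      try (ResultSet pk = meta.getPrimaryKeys(null, "default", "orders")) {
        while (pk.next()) {
          System.out.println("PK column: " + pk.getString("COLUMN_NAME"));
        }
      }
      // Foreign keys defined on the same table.
      try (ResultSet fk = meta.getImportedKeys(null, "default", "orders")) {
        while (fk.next()) {
          System.out.println("FK column: " + fk.getString("FKCOLUMN_NAME")
              + " -> " + fk.getString("PKTABLE_NAME") + "." + fk.getString("PKCOLUMN_NAME"));
        }
      }
    }
  }
}
{code}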



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-9109) Create Catalog debug page top-k average table loading ranking

2019-11-05 Thread Jiawei Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiawei Wang updated IMPALA-9109:

Priority: Critical  (was: Major)

> Create Catalog debug page top-k average table loading ranking
> -
>
> Key: IMPALA-9109
> URL: https://issues.apache.org/jira/browse/IMPALA-9109
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Catalog, Frontend
>Reporter: Jiawei Wang
>Assignee: Jiawei Wang
>Priority: Critical
>
> Right now we have top-k rankings of tables by memory requirements, number of 
> operations, and number of files. It would be great if we could also have a 
> ranking by table metadata loading time.
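
A generic sketch of how such a ranking could be tracked (not the existing
catalog code): record per-table load times and sort on demand when the debug
page is rendered:

{code:java}
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.stream.Collectors;

/** Sketch only: tracks per-table load time and renders a top-k ranking on demand. */
public class TableLoadStats {
  private static class Stat {
    final LongAdder totalNanos = new LongAdder();
    final LongAdder loads = new LongAdder();
    double avgMillis() {
      long n = loads.sum();
      return n == 0 ? 0.0 : totalNanos.sum() / 1e6 / n;
    }
  }

  private final Map<String, Stat> stats = new ConcurrentHashMap<>();

  /** Called from the table loading path with the measured duration. */
  public void recordLoad(String tableName, long durationNanos) {
    Stat s = stats.computeIfAbsent(tableName, k -> new Stat());
    s.totalNanos.add(durationNanos);
    s.loads.increment();
  }

  /** Called from the debug page handler to render the ranking. */
  public List<String> topKByAvgLoadTime(int k) {
    return stats.entrySet().stream()
        .sorted(Comparator.comparingDouble(
            (Map.Entry<String, Stat> e) -> e.getValue().avgMillis()).reversed())
        .limit(k)
        .map(e -> String.format("%s: %.1f ms avg over %d loads",
            e.getKey(), e.getValue().avgMillis(), e.getValue().loads.sum()))
        .collect(Collectors.toList());
  }
}
{code}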



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-9110) Add table loading time break-down metrics for HdfsTable

2019-11-05 Thread Jiawei Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiawei Wang updated IMPALA-9110:

Priority: Critical  (was: Major)

> Add table loading time break-down metrics for HdfsTable
> ---
>
> Key: IMPALA-9110
> URL: https://issues.apache.org/jira/browse/IMPALA-9110
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Catalog, Frontend
>Reporter: Jiawei Wang
>Assignee: Jiawei Wang
>Priority: Critical
>
> We are only able to get the total table loading time right now, which makes 
> it really hard to debug why table loading is sometimes slow. Therefore, it 
> would be good to have break-down metrics on how much time each function costs 
> when loading tables.
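
A rough sketch of what a break-down could look like (generic, not the existing
HdfsTable code): time each named phase and report the accumulated values. The
loading path would then wrap its steps, e.g.
breakdown.timePhase("loadFileMetadata", ...), and publish the result as metrics
or a log line.

{code:java}
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;

/** Sketch only: accumulates per-phase wall-clock time for one table load. */
public class LoadTimeBreakdown {
  private final Map<String, Long> phaseNanos = new LinkedHashMap<>();

  /** Runs one named phase and records how long it took. */
  public <T> T timePhase(String phase, Callable<T> work) throws Exception {
    long start = System.nanoTime();
    try {
      return work.call();
    } finally {
      phaseNanos.merge(phase, System.nanoTime() - start, Long::sum);
    }
  }

  @Override
  public String toString() {
    StringBuilder sb = new StringBuilder("table load breakdown:");
    phaseNanos.forEach((phase, nanos) -> sb.append(String.format(
        " %s=%dms", phase, TimeUnit.NANOSECONDS.toMillis(nanos))));
    return sb.toString();
  }
}
{code}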



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-9120) Refreshing an ABFS table with a deleted directory fails

2019-11-05 Thread Sahil Takiar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated IMPALA-9120:
-
Priority: Critical  (was: Major)

> Refreshing an ABFS table with a deleted directory fails
> ---
>
> Key: IMPALA-9120
> URL: https://issues.apache.org/jira/browse/IMPALA-9120
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Critical
>
> The following fails on ABFS (but succeeds on HDFS):
> {code:java}
> hdfs dfs -mkdir /test-external-table
> ./bin/impala-shell.sh
> [localhost:21000] default> create external table test (col int) location 
> '/test-external-table'; 
> [localhost:21000] default> select * from test;
> hdfs dfs -rm -r -skipTrash /test-external-table
> ./bin/impala-shell.sh
> [localhost:21000] default> refresh test;
> ERROR: TableLoadingException: Refreshing file and block metadata for 1 paths 
> for table default.test: failed to load 1 paths. Check the catalog server log 
> for more details.{code}
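
A minimal sketch of the kind of guard that would let the refresh degrade
gracefully instead of failing, assuming a hypothetical helper around the
listing step (the real fix would live in FileMetadataLoader /
ParallelFileMetadataLoader; see the catalogd stack trace below):

{code:java}
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/** Sketch only: treat a vanished table/partition directory as empty on refresh. */
public class SafeListing {
  static List<FileStatus> listOrEmpty(FileSystem fs, Path dir) throws IOException {
    try {
      return Arrays.asList(fs.listStatus(dir));
    } catch (FileNotFoundException e) {
      // On ABFS the directory may have been deleted out from under us. The HDFS-style
      // behavior is to simply see no files, so mirror that instead of failing the refresh.
      return Collections.emptyList();
    }
  }
}
{code}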
> This causes the test 
> tests/query_test/test_hdfs_file_mods.py::TestHdfsFileMods::test_file_modifications[modification_type:
>  delete_directory | ...] to fail on ABFS as well.
> The error from catalogd is:
> {code:java}
> E1104 22:38:53.748571 87486 ParallelFileMetadataLoader.java:102] Loading file 
> and block metadata for 1 paths for table test_file_modifications_d0471c2c.t1 
> encountered an error loading data for path 
> abfss://[]@[].dfs.core.windows.net/test-warehouse/test_file_modifications_d0471c2c
> Java exception follows:
> java.util.concurrent.ExecutionException: java.io.FileNotFoundException: GET 
> https://[].dfs.core.windows.net/[]?resource=filesystem=5000=test-warehouse/test_file_modifications_d0471c2c=90=false
> StatusCode=404
> StatusDescription=The specified path does not exist.
> ErrorCode=PathNotFound
> ErrorMessage=The specified path does not exist.
> RequestId:[]
> Time:2019-11-04T22:38:53.7469083Z
> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> at 
> org.apache.impala.catalog.ParallelFileMetadataLoader.load(ParallelFileMetadataLoader.java:99)
> at 
> org.apache.impala.catalog.HdfsTable.loadFileMetadataForPartitions(HdfsTable.java:606)
> at 
> org.apache.impala.catalog.HdfsTable.loadAllPartitions(HdfsTable.java:547)
> at org.apache.impala.catalog.HdfsTable.load(HdfsTable.java:973)
> at org.apache.impala.catalog.HdfsTable.load(HdfsTable.java:896)
> at org.apache.impala.catalog.TableLoader.load(TableLoader.java:83)
> at 
> org.apache.impala.catalog.TableLoadingMgr$2.call(TableLoadingMgr.java:244)
> at 
> org.apache.impala.catalog.TableLoadingMgr$2.call(TableLoadingMgr.java:241)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.FileNotFoundException: GET 
> https://[].dfs.core.windows.net/[]?resource=filesystem=5000=test-warehouse/test_file_modifications_d0471c2c=90=false
> StatusCode=404
> StatusDescription=The specified path does not exist.
> ErrorCode=PathNotFound
> ErrorMessage=The specified path does not exist.
> RequestId:[]
> Time:2019-11-04T22:38:53.7469083Z
> at 
> org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.checkException(AzureBlobFileSystem.java:957)
> at 
> org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.listStatus(AzureBlobFileSystem.java:351)
> at 
> org.apache.hadoop.fs.FileSystem.listStatusBatch(FileSystem.java:1790)
> at 
> org.apache.hadoop.fs.FileSystem$DirListingIterator.fetchMore(FileSystem.java:2058)
> at 
> org.apache.hadoop.fs.FileSystem$DirListingIterator.hasNext(FileSystem.java:2047)
> at 
> org.apache.impala.common.FileSystemUtil$RecursingIterator.hasNext(FileSystemUtil.java:722)
> at 
> org.apache.impala.common.FileSystemUtil$FilterIterator.hasNext(FileSystemUtil.java:679)
> at 
> org.apache.impala.catalog.FileMetadataLoader.load(FileMetadataLoader.java:166)
> at 
> org.apache.impala.catalog.ParallelFileMetadataLoader.lambda$load$0(ParallelFileMetadataLoader.java:93)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:293)
> at 
> com.google.common.util.concurrent.AbstractListeningExecutorService.submit(AbstractListeningExecutorService.java:61)
> at 
> 

[jira] [Commented] (IMPALA-9110) Add table loading time break-down metrics for HdfsTable

2019-11-05 Thread Jiawei Wang (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16967750#comment-16967750
 ] 

Jiawei Wang commented on IMPALA-9110:
-

The table loading process is kind of confusing to anyone reading it for the 
first time. Here is a good explanation from Quanlong:

" There are two kinds of table loading requests: 1) Async: load requests from 
invalidate metadata when --load_table_in_background=true. PrioritizedLoad 
requests from coordinators when missing table meta in analysis. 2) Sync: load 
requests from DDLs and some RPCs like getPartitionStats and 
getPartialCatalogObject.

All async requests are put in tableLoadingDeque_ and deduplicated by 
tableLoadingBarrier_. tableLoadingDeque_ is consumed by threads launched in 
startTableLoadingThreads(). These threads call 
CatalogServiceCatalog.getOrLoadTable() finally to load the table. Note that 
getOrLoadTable() is sync. It only returns when table already loaded or loading 
finish. All sync requests will also go to 
CatalogServiceCatalog.getOrLoadTable(). All loading threads are running in 
tblLoadingPool_.

So threads in startTableLoadingThreads() are just submitters for background 
load requests. All table loading threads are in tblLoadingPool "
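
A stripped-down sketch of the dedup-and-wait pattern described above (not the
actual catalog code): async load requests are deduplicated by a barrier map,
and sync callers wait on the same in-flight future:

{code:java}
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

/** Sketch only: one in-flight load per table, shared by async and sync callers. */
public class TableLoadBarrier<T> {
  private final Map<String, CompletableFuture<T>> inFlight = new ConcurrentHashMap<>();
  private final ExecutorService pool = Executors.newFixedThreadPool(4);  // tblLoadingPool_ analogue

  /** Sync path: returns when the table is loaded, reusing an in-flight load if present. */
  public T getOrLoad(String tableName, Function<String, T> loader) {
    return submitIfAbsent(tableName, loader).join();
  }

  /** Async path (e.g. background or prioritized load): schedules but does not wait. */
  public void scheduleLoad(String tableName, Function<String, T> loader) {
    submitIfAbsent(tableName, loader);
  }

  private CompletableFuture<T> submitIfAbsent(String tableName, Function<String, T> loader) {
    // computeIfAbsent acts as the barrier: concurrent requests for the same table
    // share one future instead of triggering duplicate loads.
    return inFlight.computeIfAbsent(tableName,
        name -> CompletableFuture.supplyAsync(() -> loader.apply(name), pool)
            .whenComplete((result, error) -> inFlight.remove(name)));
  }
}
{code}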

> Add table loading time break-down metrics for HdfsTable
> ---
>
> Key: IMPALA-9110
> URL: https://issues.apache.org/jira/browse/IMPALA-9110
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Catalog, Frontend
>Reporter: Jiawei Wang
>Assignee: Jiawei Wang
>Priority: Major
>
> We are only able to get the total table loading time right now, which makes 
> it really hard to debug why table loading is sometimes slow. Therefore, it 
> would be good to have break-down metrics on how much time each function costs 
> when loading tables.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-9124) Transparently retry queries that fail due to cluster membership changes

2019-11-05 Thread Sahil Takiar (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16967694#comment-16967694
 ] 

Sahil Takiar commented on IMPALA-9124:
--

Attached design doc. Comments welcome.

> Transparently retry queries that fail due to cluster membership changes
> ---
>
> Key: IMPALA-9124
> URL: https://issues.apache.org/jira/browse/IMPALA-9124
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend, Clients
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: Impala Transparent Query Retries.pdf
>
>
> Currently, if the Impala Coordinator or any Executors run into errors during 
> query execution, Impala will fail the entire query. It would improve user 
> experience to transparently retry the query for some transient, recoverable 
> errors.
> This JIRA focuses on retrying queries that would otherwise fail due to 
> cluster membership changes. Specifically, node failures that cause changes in 
> the cluster membership (currently the Coordinator cancels all queries running 
> on a node if it detects that the node is no longer part of the cluster) and 
> node blacklisting (the Coordinator blacklists a node because it detects a 
> problem with that node - can’t execute RPCs against the node). It is not 
> focused on retrying general errors (e.g. any frontend errors, 
> MemLimitExceeded exceptions, etc.).
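
As a purely generic illustration of the transparent-retry idea (not Impala's
actual mechanism; isRecoverableMembershipFailure below is a placeholder
predicate):

{code:java}
/** Sketch only: retry a unit of work a bounded number of times on recoverable failures. */
public class TransparentRetry {
  interface Attempt<T> { T run() throws Exception; }

  static <T> T runWithRetry(Attempt<T> query, int maxAttempts) throws Exception {
    for (int attempt = 1; ; attempt++) {
      try {
        return query.run();
      } catch (Exception e) {
        // Only membership/blacklisting failures are retried; analysis errors,
        // MemLimitExceeded, etc. are rethrown immediately.
        if (!isRecoverableMembershipFailure(e) || attempt >= maxAttempts) throw e;
      }
    }
  }

  // Placeholder predicate; a real implementation would inspect a typed status.
  static boolean isRecoverableMembershipFailure(Exception e) {
    return e.getMessage() != null && e.getMessage().contains("blacklisted");
  }
}
{code}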



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Updated] (IMPALA-9124) Transparently retry queries that fail due to cluster membership changes

2019-11-05 Thread Sahil Takiar (Jira)


 [ 
https://issues.apache.org/jira/browse/IMPALA-9124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated IMPALA-9124:
-
Attachment: Impala Transparent Query Retries.pdf

> Transparently retry queries that fail due to cluster membership changes
> ---
>
> Key: IMPALA-9124
> URL: https://issues.apache.org/jira/browse/IMPALA-9124
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend, Clients
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: Impala Transparent Query Retries.pdf
>
>
> Currently, if the Impala Coordinator or any Executors run into errors during 
> query execution, Impala will fail the entire query. It would improve user 
> experience to transparently retry the query for some transient, recoverable 
> errors.
> This JIRA focuses on retrying queries that would otherwise fail due to 
> cluster membership changes. Specifically, node failures that cause changes in 
> the cluster membership (currently the Coordinator cancels all queries running 
> on a node if it detects that the node is no longer part of the cluster) and 
> node blacklisting (the Coordinator blacklists a node because it detects a 
> problem with that node - can’t execute RPCs against the node). It is not 
> focused on retrying general errors (e.g. any frontend errors, 
> MemLimitExceeded exceptions, etc.).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-9124) Transparently retry queries that fail due to cluster membership changes

2019-11-05 Thread Sahil Takiar (Jira)
Sahil Takiar created IMPALA-9124:


 Summary: Transparently retry queries that fail due to cluster 
membership changes
 Key: IMPALA-9124
 URL: https://issues.apache.org/jira/browse/IMPALA-9124
 Project: IMPALA
  Issue Type: New Feature
  Components: Backend, Clients
Reporter: Sahil Takiar
Assignee: Sahil Takiar


Currently, if the Impala Coordinator or any Executors run into errors during 
query execution, Impala will fail the entire query. It would improve user 
experience to transparently retry the query for some transient, recoverable 
errors.

This JIRA focuses on retrying queries that would otherwise fail due to cluster 
membership changes. Specifically, node failures that cause changes in the 
cluster membership (currently the Coordinator cancels all queries running on a 
node if it detects that the node is no longer part of the cluster) and node 
blacklisting (the Coordinator blacklists a node because it detects a problem 
with that node - can’t execute RPCs against the node). It is not focused on 
retrying general errors (e.g. any frontend errors, MemLimitExceeded exceptions, 
etc.).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)



-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-3933) Time zone definitions of Hive/Spark and Impala differ for historical dates

2019-11-05 Thread Attila Jeges (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16967675#comment-16967675
 ] 

Attila Jeges commented on IMPALA-3933:
--

[~mylogi...@gmail.com]  The Java TZ database and the IANA TZ database (used by 
the OS) have different binary formats, so making Impala use the Java TZ 
database is not a trivial task. 

We could package an IANA TZ database that is compatible with the current 
version of the Java TZ database and make it publicly available for Impala 
users. The problem with this approach is that timezone rules change frequently 
and Java's TZ db gets updated from time to time (e.g. when the admin runs a 
system update), so we would end up out of sync again.
 
I haven't tested it yet, but there's a tool to convert the IANA TZ database to 
Java's TZ database: https://github.com/akashche/tzdbgen . Perhaps we should 
point users to this (or a similar) tool if they want to keep the two databases 
in sync. 
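
One quick way to see which tzdata version the JVM itself is using, and which
offset it applies to a historical date (the zone name below is just an
example), is java.time's ZoneRulesProvider; comparing this with the OS view
(e.g. zdump -v Europe/Budapest) shows whether the two databases agree:

{code:java}
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.zone.ZoneRules;
import java.time.zone.ZoneRulesProvider;
import java.util.NavigableMap;

public class TzdbCheck {
  public static void main(String[] args) {
    // tzdb version(s) bundled with / installed into this JVM, e.g. "2018g".
    NavigableMap<String, ZoneRules> versions = ZoneRulesProvider.getVersions("Europe/Budapest");
    System.out.println("JVM tzdata versions: " + versions.keySet());

    // Offset the JVM applies to a historical local time in that zone.
    ZoneId zone = ZoneId.of("Europe/Budapest");
    System.out.println("1890-04-15 offset: "
        + zone.getRules().getOffset(LocalDateTime.of(1890, 4, 15, 12, 0)));
  }
}
{code}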




> Time zone definitions of Hive/Spark and Impala differ for historical dates
> --
>
> Key: IMPALA-3933
> URL: https://issues.apache.org/jira/browse/IMPALA-3933
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend
>Affects Versions: impala 2.3
>Reporter: Adriano Simone
>Priority: Minor
>
> How TIMESTAMP values skew with convert_legacy_hive_parquet_utc_timestamps=true
> Enabling --convert_legacy_hive_parquet_utc_timestamps=true seems to cause 
> data skew (improper conversion) when reading dates earlier than 1900 (I'm not 
> sure about the exact cut-off date).
> The following example was run on a server in the CET/CEST timezone, so the 
> time difference is GMT+1 for dates before 1900 (I'm not sure, I haven't 
> checked the exact starting date of DST computation) and GMT+2 when summer 
> daylight saving time was applied.
> create table itst (col1 int, myts timestamp) stored as parquet;
> From impala:
> {code:java}
> insert into itst values (1,'2016-04-15 12:34:45');
> insert into itst values (2,'1949-04-15 12:34:45');
> insert into itst values (3,'1753-04-15 12:34:45');
> insert into itst values (4,'1752-04-15 12:34:45');
> {code}
> from hive
> {code:java}
> insert into itst values (5,'2016-04-15 12:34:45');
> insert into itst values (6,'1949-04-15 12:34:45');
> insert into itst values (7,'1753-04-15 12:34:45');
> insert into itst values (8,'1752-04-15 12:34:45');
> {code}
> From impala
> {code:java}
> select * from itst order by col1;
> {code}
> Result:
> {code:java}
> Query: select * from itst
> +------+---------------------+
> | col1 | myts                |
> +------+---------------------+
> | 1    | 2016-04-15 12:34:45 |
> | 2    | 1949-04-15 12:34:45 |
> | 3    | 1753-04-15 12:34:45 |
> | 4    | 1752-04-15 12:34:45 |
> | 5    | 2016-04-15 10:34:45 |
> | 6    | 1949-04-15 10:34:45 |
> | 7    | 1753-04-15 11:34:45 |
> | 8    | 1752-04-15 11:34:45 |
> +------+---------------------+
> {code}
> The timestamps are looking good, the DST differences can be seen (hive 
> inserted it in local time, but impala shows it in UTC)
> From impala after setting the command line argument 
> "--convert_legacy_hive_parquet_utc_timestamps=true"
> {code:java}
> select * from itst order by col1;
> {code}
> The result in this case:
> {code:java}
> Query: select * from itst order by col1
> +------+---------------------+
> | col1 | myts                |
> +------+---------------------+
> | 1    | 2016-04-15 12:34:45 |
> | 2    | 1949-04-15 12:34:45 |
> | 3    | 1753-04-15 12:34:45 |
> | 4    | 1752-04-15 12:34:45 |
> | 5    | 2016-04-15 12:34:45 |
> | 6    | 1949-04-15 12:34:45 |
> | 7    | 1753-04-15 12:51:05 |
> | 8    | 1752-04-15 12:51:05 |
> +------+---------------------+
> {code}
> It seems that instead of 11:34:45 it is showing 12:51:05.
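
One plausible reading of the last two rows, assuming the server's zone was
something like Europe/Budapest: for dates before standard time was adopted, the
IANA database applies local mean time (+01:16:20 for Budapest) rather than a
whole-hour offset, and 11:34:45 + 1:16:20 is exactly 12:51:05, which would
explain the skew. The offsets the IANA rules apply can be checked with
java.time:

{code:java}
import java.time.LocalDateTime;
import java.time.ZoneId;

public class HistoricalOffset {
  public static void main(String[] args) {
    ZoneId zone = ZoneId.of("Europe/Budapest");  // assumption: the server's zone
    // Pre-standardization dates fall back to local mean time in the IANA database.
    System.out.println(zone.getRules().getOffset(LocalDateTime.of(1753, 4, 15, 12, 34, 45)));
    // Modern dates use the standard +01:00 / +02:00 (DST) rules.
    System.out.println(zone.getRules().getOffset(LocalDateTime.of(2016, 4, 15, 12, 34, 45)));
  }
}
{code}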



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-8587) Show inherited privileges in show grant w/ Ranger

2019-11-05 Thread Karthik Manamcheri (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-8587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16967670#comment-16967670
 ] 

Karthik Manamcheri commented on IMPALA-8587:


[~fangyurao] [~dgarg] What's the status on this? This has been In Progress for 
almost a quarter now! 

> Show inherited privileges in show grant w/ Ranger
> -
>
> Key: IMPALA-8587
> URL: https://issues.apache.org/jira/browse/IMPALA-8587
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Frontend
>Reporter: Austin Nobis
>Assignee: Fang-Yu Rao
>Priority: Critical
>
> If an admin has privileges from:
> *grant all on server to user admin;*
>  
> Currently the command below will show no results:
> *show grant user admin on database functional;*
>  
> After the change, the user should see server level privileges from:
> *show grant user admin on database functional;*
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Comment Edited] (IMPALA-3933) Time zone definitions of Hive/Spark and Impala differ for historical dates

2019-11-05 Thread Manish Maheshwari (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16967598#comment-16967598
 ] 

Manish Maheshwari edited comment on IMPALA-3933 at 11/5/19 3:10 PM:


Thanks [~attilaj], replies below:

1 - This is quite well known. 

2 - OK. So there isn't a solution possible for this at all until the customer 
upgrades to CDP DC, when the dates will match.

3 - OK, understood. Is there a plan for Impala to also use the Java TZ 
database? I am guessing this is difficult to implement in the C++ BE. The 
reason for this ask is to avoid having to debug TZ issues caused by a mismatch 
between Java and the OS. Alternatively, can we package the Java TZ database and 
make it publicly available / part of the Impala binary to make it easy to use?


was (Author: mylogi...@gmail.com):
[~attilaj]

1 - This is quite well known. 

2 - Ok. So there isn't a solution possible this at all till the customer 
upgrades to CDP DC when the dates will match.

3 - Ok, understood. Is there a plan for Impala to also use the Java TZ 
database? I am guessing this is difficult to implement in the C++ BE. The 
reason for this ask is to avoid debugging TZ issues due to Java and OS 
mismatch. Alternatively can be package the Java TZ database and make it 
publicly available / part of the Impala binary to make it easy to use

> Time zone definitions of Hive/Spark and Impala differ for historical dates
> --
>
> Key: IMPALA-3933
> URL: https://issues.apache.org/jira/browse/IMPALA-3933
> Project: IMPALA
>  Issue Type: New Feature
>  Components: Backend
>Affects Versions: impala 2.3
>Reporter: Adriano Simone
>Priority: Minor
>
> How TIMESTAMP values skew with convert_legacy_hive_parquet_utc_timestamps=true
> Enabling --convert_legacy_hive_parquet_utc_timestamps=true seems to cause 
> data skew (improper conversion) when reading dates earlier than 1900 (I'm not 
> sure about the exact cut-off date).
> The following example was run on a server in the CET/CEST timezone, so the 
> time difference is GMT+1 for dates before 1900 (I'm not sure, I haven't 
> checked the exact starting date of DST computation) and GMT+2 when summer 
> daylight saving time was applied.
> create table itst (col1 int, myts timestamp) stored as parquet;
> From impala:
> {code:java}
> insert into itst values (1,'2016-04-15 12:34:45');
> insert into itst values (2,'1949-04-15 12:34:45');
> insert into itst values (3,'1753-04-15 12:34:45');
> insert into itst values (4,'1752-04-15 12:34:45');
> {code}
> from hive
> {code:java}
> insert into itst values (5,'2016-04-15 12:34:45');
> insert into itst values (6,'1949-04-15 12:34:45');
> insert into itst values (7,'1753-04-15 12:34:45');
> insert into itst values (8,'1752-04-15 12:34:45');
> {code}
> From impala
> {code:java}
> select * from itst order by col1;
> {code}
> Result:
> {code:java}
> Query: select * from itst
> +------+---------------------+
> | col1 | myts                |
> +------+---------------------+
> | 1    | 2016-04-15 12:34:45 |
> | 2    | 1949-04-15 12:34:45 |
> | 3    | 1753-04-15 12:34:45 |
> | 4    | 1752-04-15 12:34:45 |
> | 5    | 2016-04-15 10:34:45 |
> | 6    | 1949-04-15 10:34:45 |
> | 7    | 1753-04-15 11:34:45 |
> | 8    | 1752-04-15 11:34:45 |
> +------+---------------------+
> {code}
> The timestamps are looking good, the DST differences can be seen (hive 
> inserted it in local time, but impala shows it in UTC)
> From impala after setting the command line argument 
> "--convert_legacy_hive_parquet_utc_timestamps=true"
> {code:java}
> select * from itst order by col1;
> {code}
> The result in this case:
> {code:java}
> Query: select * from itst order by col1
> +------+---------------------+
> | col1 | myts                |
> +------+---------------------+
> | 1    | 2016-04-15 12:34:45 |
> | 2    | 1949-04-15 12:34:45 |
> | 3    | 1753-04-15 12:34:45 |
> | 4    | 1752-04-15 12:34:45 |
> | 5    | 2016-04-15 12:34:45 |
> | 6    | 1949-04-15 12:34:45 |
> | 7    | 1753-04-15 12:51:05 |
> | 8    | 1752-04-15 12:51:05 |
> +------+---------------------+
> {code}
> It seems that instead of 11:34:45 it is showing 12:51:05.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Created] (IMPALA-9123) Detect DDL hangs in tests

2019-11-05 Thread Quanlong Huang (Jira)
Quanlong Huang created IMPALA-9123:
--

 Summary: Detect DDL hangs in tests
 Key: IMPALA-9123
 URL: https://issues.apache.org/jira/browse/IMPALA-9123
 Project: IMPALA
  Issue Type: Test
Reporter: Quanlong Huang


Currently, we detect query hangs in tests by using execute_async() and 
wait_for_finished_timeout() of BeeswaxConnection together.

E.g. 
[https://github.com/apache/impala/blob/3.3.0/tests/authorization/test_grant_revoke.py#L334-L335]
{code:python}
handle = self.client.execute_async("invalidate metadata")
assert self.client.wait_for_finished_timeout(handle, timeout=60)
{code}

However, execute_async() won't return if the DDL is stuck in the CREATED state, 
which is usually the case when a DDL hangs. See the implementation of the 
query() interface for the Beeswax protocol: 
https://github.com/apache/impala/blob/3.3.0/be/src/service/impala-beeswax-server.cc#L52
So wait_for_finished_timeout() doesn't run and the test is stuck in 
execute_async().

We need to find an elegant way to detect DDL hangs and cancel DDLs that are in 
the CREATED state.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)



-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org